Matches in SemOpenAlex for { <https://semopenalex.org/work/W1495442604> ?p ?o ?g. }
- W1495442604 abstract "This thesis argues that successful semi-supervised learning is improved by learning many functions at once in a coupled manner. Given knowledge about constraints between functions to be learned (e.g., f1(x) → ¬ f2(x)), forcing the models that are learned to obey these constraints can yield a more constrained, and therefore easier, set of learning problems. We apply these ideas to bootstrap learning methods as well as semi-supervised logistic regression models, and show that considerable improvements are achieved in both settings. In experimental work, we focus on the problem of extracting factual knowledge from the web. This problem is an ideal case study for the general problems that we study because there is an abundance of unlabeled web page data available, and because thousands or millions of functions are discussed on the web. Chapter 3 focuses on coupling the semi-supervised learning of information extractors that extract information from free text using textual extraction patterns (e.g., 'mayor of X' and 'Y star quarterback X'). We present an approach in which the input to the learner is an ontology defining a set of target categories and relations to be learned, a handful of seed examples for each, and a set of constraints that couple the various categories and relations (e.g., Person and Sport are mutually exclusive). We show that given this input and millions of unlabeled documents, a semi-supervised learning procedure can, by avoiding violations of the constraints in how its learned extractors label unlabeled data, achieve very significant accuracy improvements over semi-supervised methods that do not avoid such violations. In Chapter 4, we apply the ideas from Chapter 3 to a different type of extraction method, wrapper induction for semi-structured web pages. We also consider how to couple multiple extraction methods that typically make independent errors.
To couple pattern-based extraction and wrapper-based extraction, we use a strategy that only promotes instances extracted by both methods. Experimental results on dozens of categories and relations demonstrate that coupling wrapper induction improves the precision of the promoted facts, and that coupling multiple extraction methods leads to higher precision than either of the methods alone. In Chapter 5, we consider two questions: (1) Can we scale up the number and variety of predicates in our ontology and still maintain high precision with coupled semi-supervised learning methods? and (2) Should we consider adding additional extraction methods beyond textual patterns and wrappers? We first describe a general architecture that can exploit many different extraction methods. We then describe a prototype implementation of our architecture, called Multi-Extractor Coupler (MEC). With an extended ontology of 123 categories and 55 relations, MEC has learned to extract a knowledge base containing over 242,000 beliefs with an estimated precision of 74%. Chapter 6 considers how to couple the semi-supervised learning of logistic regression models. Specifically, we consider learning many binary logistic regression classifiers when many pairs of classes are known to be mutually exclusive. We present a method that uses unlabeled data through a penalty function that regularizes the training of classifiers by penalizing violations of mutual exclusion. We apply this idea to training classifiers which decide if a noun phrase is a member of some specific category. Semi-supervised training of such classifiers is shown to improve performance relative to supervised-only training. 
We speculate that use of similar penalty functions could provide an alternative to the methods for coupled semi-supervised learning presented in previous chapters, with the advantage that the models being learned are principled, probabilistic models that are easy to train and can be applied to any example to provide a prediction of posterior probabilities." @default.
- W1495442604 created "2016-06-24" @default.
- W1495442604 creator A5012544239 @default.
- W1495442604 creator A5034266240 @default.
- W1495442604 date "2010-01-01" @default.
- W1495442604 modified "2023-09-26" @default.
- W1495442604 title "Coupled semi-supervised learning" @default.
- W1495442604 cites W1489949474 @default.
- W1495442604 cites W1493490255 @default.
- W1495442604 cites W1499297345 @default.
- W1495442604 cites W1505544955 @default.
- W1495442604 cites W1512387364 @default.
- W1495442604 cites W1520377376 @default.
- W1495442604 cites W1531743498 @default.
- W1495442604 cites W1533057952 @default.
- W1495442604 cites W1534477342 @default.
- W1495442604 cites W1553019137 @default.
- W1495442604 cites W1567365482 @default.
- W1495442604 cites W157725869 @default.
- W1495442604 cites W1585743408 @default.
- W1495442604 cites W167355512 @default.
- W1495442604 cites W171093852 @default.
- W1495442604 cites W1766290689 @default.
- W1495442604 cites W1822246767 @default.
- W1495442604 cites W1890589545 @default.
- W1495442604 cites W1970381522 @default.
- W1495442604 cites W197270748 @default.
- W1495442604 cites W1981082061 @default.
- W1495442604 cites W1990190154 @default.
- W1495442604 cites W1991564165 @default.
- W1495442604 cites W1992250294 @default.
- W1495442604 cites W1998284155 @default.
- W1495442604 cites W2012179495 @default.
- W1495442604 cites W2020278455 @default.
- W1495442604 cites W2026080185 @default.
- W1495442604 cites W2027081355 @default.
- W1495442604 cites W2029580144 @default.
- W1495442604 cites W2048679005 @default.
- W1495442604 cites W2049633694 @default.
- W1495442604 cites W2050712820 @default.
- W1495442604 cites W2051768896 @default.
- W1495442604 cites W2053237598 @default.
- W1495442604 cites W2057052429 @default.
- W1495442604 cites W2068431743 @default.
- W1495442604 cites W2068737686 @default.
- W1495442604 cites W2088762045 @default.
- W1495442604 cites W2093559286 @default.
- W1495442604 cites W2101210369 @default.
- W1495442604 cites W2103296194 @default.
- W1495442604 cites W2103931177 @default.
- W1495442604 cites W2104987630 @default.
- W1495442604 cites W2107598941 @default.
- W1495442604 cites W2111284344 @default.
- W1495442604 cites W2113958117 @default.
- W1495442604 cites W2117130368 @default.
- W1495442604 cites W2117510361 @default.
- W1495442604 cites W2117729721 @default.
- W1495442604 cites W2119598708 @default.
- W1495442604 cites W2123084125 @default.
- W1495442604 cites W2125327503 @default.
- W1495442604 cites W2125464431 @default.
- W1495442604 cites W2126539437 @default.
- W1495442604 cites W2130903752 @default.
- W1495442604 cites W2132174753 @default.
- W1495442604 cites W2132655161 @default.
- W1495442604 cites W2133013156 @default.
- W1495442604 cites W2133348086 @default.
- W1495442604 cites W2136504847 @default.
- W1495442604 cites W2136518234 @default.
- W1495442604 cites W2137382827 @default.
- W1495442604 cites W2139823104 @default.
- W1495442604 cites W2140602286 @default.
- W1495442604 cites W2141416357 @default.
- W1495442604 cites W2145494108 @default.
- W1495442604 cites W2146057216 @default.
- W1495442604 cites W2147444342 @default.
- W1495442604 cites W2148252020 @default.
- W1495442604 cites W2148540243 @default.
- W1495442604 cites W2148603752 @default.
- W1495442604 cites W2148738951 @default.
- W1495442604 cites W2150102617 @default.
- W1495442604 cites W2150588363 @default.
- W1495442604 cites W2152553986 @default.
- W1495442604 cites W2153653572 @default.
- W1495442604 cites W2154368244 @default.
- W1495442604 cites W2163568299 @default.
- W1495442604 cites W2165968176 @default.
- W1495442604 cites W2166983036 @default.
- W1495442604 cites W2171674772 @default.
- W1495442604 cites W2173213060 @default.
- W1495442604 cites W2180809782 @default.
- W1495442604 cites W2337480916 @default.
- W1495442604 cites W2486125749 @default.
- W1495442604 cites W2785349534 @default.
- W1495442604 cites W2882319491 @default.
- W1495442604 cites W2903158431 @default.
- W1495442604 cites W314565566 @default.
- W1495442604 cites W1614862348 @default.
- W1495442604 cites W3151309449 @default.
- W1495442604 cites W3201816866 @default.