Matches in SemOpenAlex for { <https://semopenalex.org/work/W113837456> ?p ?o ?g. }
- W113837456 abstract "Statistical language processing models are being applied to an ever wider and more varied range of linguistic domains. Collecting and curating training sets for each different domain is prohibitively expensive, and at the same time differences in vocabulary and writing style across domains can cause state-of-the-art supervised models to dramatically increase in error. The first part of this thesis describes structural correspondence learning (SCL), a method for adapting linear discriminative models from resource-rich source domains to resource-poor target domains. The key idea is the use of pivot features which occur frequently and behave similarly in both the source and target domains. SCL builds a shared representation by searching for a low-dimensional feature subspace that allows us to accurately predict the presence or absence of pivot features on unlabeled data. We demonstrate SCL on two text processing problems: sentiment classification of product reviews and part-of-speech tagging. For both tasks, SCL significantly improves over state-of-the-art supervised models using only unlabeled target data. In the second part of the thesis, we develop a formal framework for analyzing domain adaptation tasks. We first describe a measure of divergence, the HΔH-divergence, that depends on the hypothesis class H from which we estimate our supervised model. We then use this measure to state an upper bound on the true target error of a model trained to minimize a convex combination of empirical source and target errors. The bound characterizes the tradeoff inherent in training on both the large quantity of biased source data and the small quantity of unbiased target data, and we can compute it from finite labeled and unlabeled samples of the source and target distributions under relatively weak assumptions. Finally, we confirm experimentally that the bound corresponds well to empirical target error for the task of sentiment classification." @default.
- W113837456 created "2016-06-24" @default.
- W113837456 creator A5035351931 @default.
- W113837456 creator A5053208168 @default.
- W113837456 date "2008-01-01" @default.
- W113837456 modified "2023-09-26" @default.
- W113837456 title "Domain adaptation of natural language processing systems" @default.
- W113837456 cites W118545087 @default.
- W113837456 cites W122553268 @default.
- W113837456 cites W1480376833 @default.
- W113837456 cites W1500614872 @default.
- W113837456 cites W1508165687 @default.
- W113837456 cites W1510073064 @default.
- W113837456 cites W1520252399 @default.
- W113837456 cites W1536675765 @default.
- W113837456 cites W1542886316 @default.
- W113837456 cites W1560143607 @default.
- W113837456 cites W1571969069 @default.
- W113837456 cites W1574901103 @default.
- W113837456 cites W1632114991 @default.
- W113837456 cites W1773803948 @default.
- W113837456 cites W1818857488 @default.
- W113837456 cites W1932968309 @default.
- W113837456 cites W1966026565 @default.
- W113837456 cites W1967807490 @default.
- W113837456 cites W1968934255 @default.
- W113837456 cites W1996430422 @default.
- W113837456 cites W2008652694 @default.
- W113837456 cites W2023719791 @default.
- W113837456 cites W2029538739 @default.
- W113837456 cites W2032235985 @default.
- W113837456 cites W2035411645 @default.
- W113837456 cites W2045218416 @default.
- W113837456 cites W2048679005 @default.
- W113837456 cites W2071085454 @default.
- W113837456 cites W2091825929 @default.
- W113837456 cites W2097826433 @default.
- W113837456 cites W2100235303 @default.
- W113837456 cites W2101210369 @default.
- W113837456 cites W2103012681 @default.
- W113837456 cites W2104290444 @default.
- W113837456 cites W2110091014 @default.
- W113837456 cites W2111362445 @default.
- W113837456 cites W2112483442 @default.
- W113837456 cites W2116316001 @default.
- W113837456 cites W2116410915 @default.
- W113837456 cites W2118670840 @default.
- W113837456 cites W2120354757 @default.
- W113837456 cites W2120587290 @default.
- W113837456 cites W2120708938 @default.
- W113837456 cites W2122838776 @default.
- W113837456 cites W2130903752 @default.
- W113837456 cites W2131953535 @default.
- W113837456 cites W2136504847 @default.
- W113837456 cites W2138505392 @default.
- W113837456 cites W2139578439 @default.
- W113837456 cites W2139823104 @default.
- W113837456 cites W2140076625 @default.
- W113837456 cites W2142604614 @default.
- W113837456 cites W2144578941 @default.
- W113837456 cites W2145234365 @default.
- W113837456 cites W2147880316 @default.
- W113837456 cites W2148603752 @default.
- W113837456 cites W2152005244 @default.
- W113837456 cites W2156515921 @default.
- W113837456 cites W2158108973 @default.
- W113837456 cites W2160218441 @default.
- W113837456 cites W2163302275 @default.
- W113837456 cites W2163455955 @default.
- W113837456 cites W2163568299 @default.
- W113837456 cites W2166706824 @default.
- W113837456 cites W2169569713 @default.
- W113837456 cites W2255883267 @default.
- W113837456 cites W2579923771 @default.
- W113837456 cites W2785349534 @default.
- W113837456 cites W2811380766 @default.
- W113837456 cites W3146306708 @default.
- W113837456 hasPublicationYear "2008" @default.
- W113837456 type Work @default.
- W113837456 sameAs 113837456 @default.
- W113837456 citedByCount "45" @default.
- W113837456 countsByYear W1138374562012 @default.
- W113837456 countsByYear W1138374562013 @default.
- W113837456 countsByYear W1138374562014 @default.
- W113837456 countsByYear W1138374562015 @default.
- W113837456 countsByYear W1138374562016 @default.
- W113837456 countsByYear W1138374562017 @default.
- W113837456 countsByYear W1138374562018 @default.
- W113837456 countsByYear W1138374562019 @default.
- W113837456 countsByYear W1138374562020 @default.
- W113837456 countsByYear W1138374562021 @default.
- W113837456 crossrefType "journal-article" @default.
- W113837456 hasAuthorship W113837456A5035351931 @default.
- W113837456 hasAuthorship W113837456A5053208168 @default.
- W113837456 hasConcept C119857082 @default.
- W113837456 hasConcept C124101348 @default.
- W113837456 hasConcept C134306372 @default.
- W113837456 hasConcept C138885662 @default.
- W113837456 hasConcept C153180895 @default.
- W113837456 hasConcept C154945302 @default.