Matches in SemOpenAlex for { <https://semopenalex.org/work/W1488348033> ?p ?o ?g. }
- W1488348033 abstract "The preservation of meaning between inputs and outputs is perhaps the most ambitious and, often, the most elusive goal of systems that attempt to process natural language. Nowhere is this goal of more obvious importance than for the tasks of machine translation and paraphrase generation. Preserving meaning between the input and the output is paramount for both, the monolingual vs bilingual distinction notwithstanding. In this thesis, I present a novel, symbiotic relationship between these two tasks that I term the “circle of meaning”. Today’s statistical machine translation (SMT) systems require high quality human translations for parameter tuning, in addition to large bi-texts for learning the translation units. This parameter tuning usually involves generating translations at different points in the parameter space and obtaining feedback against human-authored reference translations as to how good the translations are. This feedback then dictates what point in the parameter space should be explored next. To measure this feedback, it is generally considered wise to have multiple (usually 4) reference translations to avoid unfair penalization of translation hypotheses which could easily happen given the large number of ways in which a sentence can be translated from one language to another. However, this reliance on multiple reference translations creates a problem since they are labor intensive and expensive to obtain. Therefore, most current MT datasets only contain a single reference. This leads to the problem of reference sparsity—the primary open problem that I address in this dissertation—one that has a serious effect on the SMT parameter tuning process. Bannard and Callison-Burch (2005) were the first to provide a practical connection between phrase-based statistical machine translation and paraphrase generation. However, their technique is restricted to generating phrasal paraphrases. 
I build upon their approach and augment a phrasal paraphrase extractor into a sentential paraphraser with extremely broad coverage. The novelty in this augmentation lies in the further strengthening of the connection between statistical machine translation and paraphrase generation; whereas Bannard and Callison-Burch only relied on SMT machinery to extract phrasal paraphrase rules and stopped there, I take it a few steps further and build a full English-to-English SMT system. This system can, as expected, “translate” any English input sentence into a new English sentence with the same degree of meaning preservation that exists in a bilingual SMT system. In fact, being a state-of-the-art SMT system, it is able to generate n-best “translations” for any given input sentence. This sentential paraphraser, built almost entirely from existing SMT machinery, represents the first 180 degrees of the circle of meaning. To complete the circle, I describe a novel connection in the other direction. I claim that the sentential paraphraser, once built in this fashion, can provide a solution to the reference sparsity problem and, hence, be used to improve the performance of a bilingual SMT system. I discuss two different instantiations of the sentential paraphraser and show several results that provide empirical validation for this connection." @default.
- W1488348033 created "2016-06-24" @default.
- W1488348033 creator A5015235744 @default.
- W1488348033 creator A5054060679 @default.
- W1488348033 date "2010-01-01" @default.
- W1488348033 modified "2023-09-23" @default.
- W1488348033 title "The circle of meaning: from translation to paraphrasing and back" @default.
- W1488348033 cites W131533222 @default.
- W1488348033 cites W137105571 @default.
- W1488348033 cites W1508900626 @default.
- W1488348033 cites W1520917717 @default.
- W1488348033 cites W1521365985 @default.
- W1488348033 cites W1549339229 @default.
- W1488348033 cites W1573537589 @default.
- W1488348033 cites W1581790469 @default.
- W1488348033 cites W1587169500 @default.
- W1488348033 cites W1623072288 @default.
- W1488348033 cites W1646006088 @default.
- W1488348033 cites W1658658360 @default.
- W1488348033 cites W16933249 @default.
- W1488348033 cites W174630521 @default.
- W1488348033 cites W1859564223 @default.
- W1488348033 cites W190711684 @default.
- W1488348033 cites W1916559533 @default.
- W1488348033 cites W1947252915 @default.
- W1488348033 cites W19551284 @default.
- W1488348033 cites W1965605789 @default.
- W1488348033 cites W1969135428 @default.
- W1488348033 cites W1972645849 @default.
- W1488348033 cites W1973923101 @default.
- W1488348033 cites W1980776243 @default.
- W1488348033 cites W1983013908 @default.
- W1488348033 cites W1990387894 @default.
- W1488348033 cites W1990524510 @default.
- W1488348033 cites W2003677545 @default.
- W1488348033 cites W2006969979 @default.
- W1488348033 cites W2009570821 @default.
- W1488348033 cites W2012561700 @default.
- W1488348033 cites W2012833704 @default.
- W1488348033 cites W2020777959 @default.
- W1488348033 cites W2021738903 @default.
- W1488348033 cites W2027979924 @default.
- W1488348033 cites W2032175749 @default.
- W1488348033 cites W2032494091 @default.
- W1488348033 cites W2034742893 @default.
- W1488348033 cites W2038721957 @default.
- W1488348033 cites W2051593977 @default.
- W1488348033 cites W2052182675 @default.
- W1488348033 cites W2052893955 @default.
- W1488348033 cites W2053154970 @default.
- W1488348033 cites W2053482511 @default.
- W1488348033 cites W2056320444 @default.
- W1488348033 cites W2073449354 @default.
- W1488348033 cites W2079783326 @default.
- W1488348033 cites W2086039194 @default.
- W1488348033 cites W2086202918 @default.
- W1488348033 cites W2092527610 @default.
- W1488348033 cites W2097333193 @default.
- W1488348033 cites W2099884836 @default.
- W1488348033 cites W2101105183 @default.
- W1488348033 cites W2103081392 @default.
- W1488348033 cites W2105051853 @default.
- W1488348033 cites W2106068492 @default.
- W1488348033 cites W2106291855 @default.
- W1488348033 cites W2107130271 @default.
- W1488348033 cites W2108701407 @default.
- W1488348033 cites W2108869098 @default.
- W1488348033 cites W2109163800 @default.
- W1488348033 cites W2110104386 @default.
- W1488348033 cites W2110481933 @default.
- W1488348033 cites W2111329418 @default.
- W1488348033 cites W2112514265 @default.
- W1488348033 cites W2112740777 @default.
- W1488348033 cites W2113343614 @default.
- W1488348033 cites W2114013702 @default.
- W1488348033 cites W2115410424 @default.
- W1488348033 cites W2118021410 @default.
- W1488348033 cites W2119168550 @default.
- W1488348033 cites W2119532295 @default.
- W1488348033 cites W2122056984 @default.
- W1488348033 cites W2122300644 @default.
- W1488348033 cites W2124732071 @default.
- W1488348033 cites W2126649078 @default.
- W1488348033 cites W2127314673 @default.
- W1488348033 cites W2129049643 @default.
- W1488348033 cites W2129468719 @default.
- W1488348033 cites W2129765547 @default.
- W1488348033 cites W2131526828 @default.
- W1488348033 cites W2132019450 @default.
- W1488348033 cites W2132069633 @default.
- W1488348033 cites W2133137812 @default.
- W1488348033 cites W2133512280 @default.
- W1488348033 cites W2133798162 @default.
- W1488348033 cites W2137007854 @default.
- W1488348033 cites W2138553032 @default.
- W1488348033 cites W2138787466 @default.
- W1488348033 cites W2138810035 @default.
- W1488348033 cites W2143927888 @default.
- W1488348033 cites W2145685230 @default.
- W1488348033 cites W2146574666 @default.