Matches in SemOpenAlex for { <https://semopenalex.org/work/W137929102> ?p ?o ?g. }
- W137929102 abstract "Machine Translation (MT) is a task with multiple components, each of which can be very challenging. This thesis focuses on a difficult language pair—Chinese to English—and works on several language-specific aspects that make translation more difficult. The first challenge this thesis focuses on is the differences in the writing systems. In Chinese there are no explicit boundaries between words, and even the definition of a word is unclear. We build a general purpose Chinese word segmenter with linguistically inspired features that performs very well on the SIGHAN 2005 bakeoff data. Then we study how Chinese word segmenter performance is related to MT performance, and provide a way to tune the word unit in Chinese so that it can better match up with the English word granularity, and therefore improve MT performance. The second challenge we address is different word order between Chinese and English. We first perform error analysis on three state-of-the-art MT systems to see what the most prominent problems are, especially how different word orders cause translation errors. According to our findings, we propose two solutions to improve Chinese-to-English MT systems. First, word reordering, especially over longer distances, has caused many errors. Even though Chinese and English are both Subject-Verb-Object (SVO) languages, they usually use different word orders in noun phrases, prepositional phrases, etc. Many of these different word orders can be long distance ones and cause difficulty for MT systems. There have been many previous studies on this. In this thesis, we introduce a richer set of Chinese grammatical relations that describes more semantically abstract relations between words. We are able to integrate these Chinese grammatical relations into the most used, state-of-the-art phrase-based MT system and to improve its performance. Second, we study the behavior of the most common Chinese word 的 (DE), which does not have a direct mapping to English.
DE serves different functions in Chinese, and therefore can be ambiguous when translating to English. It might also cause longer distance reordering when translating to English. We propose a classifier to disambiguate DEs in Chinese text. Using this classifier, we improve the English translation quality because we can make the Chinese word orders much more similar to English, and we also disambiguate when a DE should be translated to different constructions (e.g., relative clause, prepositional phrase, etc.)." @default.
- W137929102 created "2016-06-24" @default.
- W137929102 creator A5046006076 @default.
- W137929102 creator A5056695196 @default.
- W137929102 date "2009-01-01" @default.
- W137929102 modified "2023-09-25" @default.
- W137929102 title "Improving Chinese-English machine translation through better source-side linguistic processing" @default.
- W137929102 cites W12923685 @default.
- W137929102 cites W149848483 @default.
- W137929102 cites W1508977358 @default.
- W137929102 cites W1510052640 @default.
- W137929102 cites W1517947178 @default.
- W137929102 cites W1551202288 @default.
- W137929102 cites W1575907248 @default.
- W137929102 cites W1588242179 @default.
- W137929102 cites W1969974515 @default.
- W137929102 cites W1979145089 @default.
- W137929102 cites W2001064229 @default.
- W137929102 cites W2006969979 @default.
- W137929102 cites W201231365 @default.
- W137929102 cites W2016856586 @default.
- W137929102 cites W2032772856 @default.
- W137929102 cites W2033295622 @default.
- W137929102 cites W2035807240 @default.
- W137929102 cites W2036516910 @default.
- W137929102 cites W2038881539 @default.
- W137929102 cites W2048808408 @default.
- W137929102 cites W2056469463 @default.
- W137929102 cites W2069598966 @default.
- W137929102 cites W2086202918 @default.
- W137929102 cites W2086825453 @default.
- W137929102 cites W2096765155 @default.
- W137929102 cites W2098594428 @default.
- W137929102 cites W2098875891 @default.
- W137929102 cites W2101105183 @default.
- W137929102 cites W2102301788 @default.
- W137929102 cites W2108301419 @default.
- W137929102 cites W2108460050 @default.
- W137929102 cites W2109494378 @default.
- W137929102 cites W2117642127 @default.
- W137929102 cites W2118563017 @default.
- W137929102 cites W2122609803 @default.
- W137929102 cites W2126241965 @default.
- W137929102 cites W2131988669 @default.
- W137929102 cites W2135161317 @default.
- W137929102 cites W2136925175 @default.
- W137929102 cites W2140016149 @default.
- W137929102 cites W2143263475 @default.
- W137929102 cites W2144091461 @default.
- W137929102 cites W2144279206 @default.
- W137929102 cites W2146113428 @default.
- W137929102 cites W2146574666 @default.
- W137929102 cites W2146798791 @default.
- W137929102 cites W2147302173 @default.
- W137929102 cites W2147880316 @default.
- W137929102 cites W2152263452 @default.
- W137929102 cites W2153653739 @default.
- W137929102 cites W2153800732 @default.
- W137929102 cites W2154124206 @default.
- W137929102 cites W2156985047 @default.
- W137929102 cites W2157435188 @default.
- W137929102 cites W2158065314 @default.
- W137929102 cites W2161227214 @default.
- W137929102 cites W2162245945 @default.
- W137929102 cites W2164766438 @default.
- W137929102 cites W2166905217 @default.
- W137929102 cites W2169724380 @default.
- W137929102 cites W2171421863 @default.
- W137929102 cites W2242975712 @default.
- W137929102 cites W2252066972 @default.
- W137929102 cites W23077562 @default.
- W137929102 cites W2406273306 @default.
- W137929102 cites W2437005631 @default.
- W137929102 cites W2467575451 @default.
- W137929102 cites W25062297 @default.
- W137929102 cites W2787109023 @default.
- W137929102 cites W44506435 @default.
- W137929102 cites W786936280 @default.
- W137929102 cites W82340936 @default.
- W137929102 cites W97972967 @default.
- W137929102 hasPublicationYear "2009" @default.
- W137929102 type Work @default.
- W137929102 sameAs 137929102 @default.
- W137929102 citedByCount "1" @default.
- W137929102 countsByYear W1379291022016 @default.
- W137929102 crossrefType "journal-article" @default.
- W137929102 hasAuthorship W137929102A5046006076 @default.
- W137929102 hasAuthorship W137929102A5056695196 @default.
- W137929102 hasConcept C121934690 @default.
- W137929102 hasConcept C138885662 @default.
- W137929102 hasConcept C153962237 @default.
- W137929102 hasConcept C154945302 @default.
- W137929102 hasConcept C162324750 @default.
- W137929102 hasConcept C187736073 @default.
- W137929102 hasConcept C203005215 @default.
- W137929102 hasConcept C204321447 @default.
- W137929102 hasConcept C2776397901 @default.
- W137929102 hasConcept C2780451532 @default.
- W137929102 hasConcept C41008148 @default.
- W137929102 hasConcept C41895202 @default.