Matches in SemOpenAlex for { <https://semopenalex.org/work/W610424550> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W610424550 abstract "Alignment of words, i.e., detection of corresponding units between two sentences that are translations of each other, has been shown to be crucial for the success of many NLP applications such as statistical machine translation (MT), construction of bilingual lexicons, word-sense disambiguation, and projection of resources between languages. With the availability of large parallel texts, statistical word alignment systems have proven to be quite successful on many language pairs. However, these systems are still faced with several challenges due to the complexity of the word alignment problem: lack of enough training data, difficulty learning statistics correctly, translation divergences, and lack of a means for incremental incorporation of linguistic knowledge. This thesis presents two new frameworks to improve existing word alignments using supervised learning techniques. In the first framework, two rule-based approaches are introduced. The first approach, Divergence Unraveling for Statistical MT (DUSTer), specifically targets translation divergences and corrects the alignment links related to them using a set of manually-crafted, linguistically-motivated rules. In the second approach, Alignment Link Projection (ALP), the rules are generated automatically by adapting transformation-based error-driven learning to the word alignment problem. By conditioning the rules on initial alignment and linguistic properties of the words, ALP manages to categorize the errors of the initial system and correct them. The second framework, Multi-Align, is an alignment combination framework based on classifier ensembles. The thesis presents a neural-network-based implementation of Multi-Align, called NeurAlign. By treating individual alignments as classifiers, NeurAlign builds an additional model to learn how to combine the input alignments effectively.
The evaluations show that the proposed techniques yield significant improvements (up to 40% relative error reduction) over existing word alignment systems on four different language pairs, even with limited manually annotated data. Moreover, all three systems allow an easy integration of linguistic knowledge into statistical models without the need for large modifications to existing systems. Finally, the improvements are analyzed using various measures, including the impact of improved word alignments in an external application, phrase-based MT." @default.
- W610424550 created "2016-06-24" @default.
- W610424550 creator A5054060679 @default.
- W610424550 creator A5062598662 @default.
- W610424550 date "2005-01-01" @default.
- W610424550 modified "2023-09-25" @default.
- W610424550 title "Combining linguistic and machine learning techniques for word alignment improvement" @default.
- W610424550 hasPublicationYear "2005" @default.
- W610424550 type Work @default.
- W610424550 sameAs 610424550 @default.
- W610424550 citedByCount "5" @default.
- W610424550 crossrefType "dissertation" @default.
- W610424550 hasAuthorship W610424550A5054060679 @default.
- W610424550 hasAuthorship W610424550A5062598662 @default.
- W610424550 hasConcept C11413529 @default.
- W610424550 hasConcept C138885662 @default.
- W610424550 hasConcept C154945302 @default.
- W610424550 hasConcept C203005215 @default.
- W610424550 hasConcept C204321447 @default.
- W610424550 hasConcept C41008148 @default.
- W610424550 hasConcept C41895202 @default.
- W610424550 hasConcept C57493831 @default.
- W610424550 hasConcept C90805587 @default.
- W610424550 hasConcept C94124525 @default.
- W610424550 hasConcept C95623464 @default.
- W610424550 hasConceptScore W610424550C11413529 @default.
- W610424550 hasConceptScore W610424550C138885662 @default.
- W610424550 hasConceptScore W610424550C154945302 @default.
- W610424550 hasConceptScore W610424550C203005215 @default.
- W610424550 hasConceptScore W610424550C204321447 @default.
- W610424550 hasConceptScore W610424550C41008148 @default.
- W610424550 hasConceptScore W610424550C41895202 @default.
- W610424550 hasConceptScore W610424550C57493831 @default.
- W610424550 hasConceptScore W610424550C90805587 @default.
- W610424550 hasConceptScore W610424550C94124525 @default.
- W610424550 hasConceptScore W610424550C95623464 @default.
- W610424550 hasLocation W6104245501 @default.
- W610424550 hasOpenAccess W610424550 @default.
- W610424550 hasPrimaryLocation W6104245501 @default.
- W610424550 hasRelatedWork W128700830 @default.
- W610424550 hasRelatedWork W1964591498 @default.
- W610424550 hasRelatedWork W2000200471 @default.
- W610424550 hasRelatedWork W2100936938 @default.
- W610424550 hasRelatedWork W2108642120 @default.
- W610424550 hasRelatedWork W2113351918 @default.
- W610424550 hasRelatedWork W2121524931 @default.
- W610424550 hasRelatedWork W2124810114 @default.
- W610424550 hasRelatedWork W21943434 @default.
- W610424550 hasRelatedWork W2904757815 @default.
- W610424550 hasRelatedWork W2970045405 @default.
- W610424550 hasRelatedWork W3030128163 @default.
- W610424550 hasRelatedWork W3097232161 @default.
- W610424550 hasRelatedWork W3142026199 @default.
- W610424550 hasRelatedWork W3170138813 @default.
- W610424550 hasRelatedWork W3177376359 @default.
- W610424550 hasRelatedWork W8671328 @default.
- W610424550 hasRelatedWork W988067430 @default.
- W610424550 hasRelatedWork W2112079216 @default.
- W610424550 hasRelatedWork W2521715639 @default.
- W610424550 isParatext "false" @default.
- W610424550 isRetracted "false" @default.
- W610424550 magId "610424550" @default.
- W610424550 workType "dissertation" @default.