Matches in SemOpenAlex for { <https://semopenalex.org/work/W3213403014> ?p ?o ?g. }
- W3213403014 abstract "Mined bitexts can contain imperfect translations that yield unreliable training signals for Neural Machine Translation (NMT). While filtering such pairs out is known to improve final model quality, we argue that it is suboptimal in low-resource conditions where even mined data can be limited. In our work, we propose instead to refine the mined bitexts via automatic editing: given a sentence in a language xf and a possibly imperfect translation of it xe, our model generates a revised version xf' or xe' that yields a more equivalent translation pair (i.e., <xf, xe'> or <xf', xe>). We use a simple editing strategy: (1) mining potentially imperfect translations for each sentence in a given bitext, and (2) learning a model to reconstruct the original translations and to translate, in a multi-task fashion. Experiments demonstrate that our approach successfully improves the quality of CCMatrix mined bitext for 5 low-resource language pairs and 10 translation directions by up to 8 BLEU points, in most cases improving upon a competitive translation-based baseline." @default.
- W3213403014 created "2021-11-22" @default.
- W3213403014 creator A5011974509 @default.
- W3213403014 creator A5059930218 @default.
- W3213403014 creator A5067919401 @default.
- W3213403014 creator A5084195779 @default.
- W3213403014 date "2022-01-01" @default.
- W3213403014 modified "2023-10-01" @default.
- W3213403014 title "BitextEdit: Automatic Bitext Editing for Improved Low-Resource Machine Translation" @default.
- W3213403014 cites W2124807415 @default.
- W3213403014 cites W2145080939 @default.
- W3213403014 cites W2149327368 @default.
- W3213403014 cites W2250342921 @default.
- W3213403014 cites W2251930319 @default.
- W3213403014 cites W2419539795 @default.
- W3213403014 cites W2788330850 @default.
- W3213403014 cites W2798389157 @default.
- W3213403014 cites W2885421725 @default.
- W3213403014 cites W2891713103 @default.
- W3213403014 cites W2902319873 @default.
- W3213403014 cites W2902643185 @default.
- W3213403014 cites W2902918014 @default.
- W3213403014 cites W2903035303 @default.
- W3213403014 cites W2903151286 @default.
- W3213403014 cites W2903297715 @default.
- W3213403014 cites W2933138175 @default.
- W3213403014 cites W2962784628 @default.
- W3213403014 cites W2963088995 @default.
- W3213403014 cites W2963261349 @default.
- W3213403014 cites W2963366552 @default.
- W3213403014 cites W2963403868 @default.
- W3213403014 cites W2963506925 @default.
- W3213403014 cites W2963829526 @default.
- W3213403014 cites W2963919854 @default.
- W3213403014 cites W2964022663 @default.
- W3213403014 cites W2970461932 @default.
- W3213403014 cites W2970686691 @default.
- W3213403014 cites W2970858854 @default.
- W3213403014 cites W2970871182 @default.
- W3213403014 cites W2977458338 @default.
- W3213403014 cites W3035016936 @default.
- W3213403014 cites W3098396250 @default.
- W3213403014 cites W3103268933 @default.
- W3213403014 cites W3104273515 @default.
- W3213403014 cites W3105378761 @default.
- W3213403014 cites W3105425516 @default.
- W3213403014 cites W3107826490 @default.
- W3213403014 cites W3119000810 @default.
- W3213403014 cites W3119872155 @default.
- W3213403014 cites W3120896619 @default.
- W3213403014 cites W3121071870 @default.
- W3213403014 cites W3152788712 @default.
- W3213403014 cites W3175301726 @default.
- W3213403014 cites W3198189804 @default.
- W3213403014 doi "https://doi.org/10.18653/v1/2022.findings-naacl.110" @default.
- W3213403014 hasPublicationYear "2022" @default.
- W3213403014 type Work @default.
- W3213403014 sameAs 3213403014 @default.
- W3213403014 citedByCount "0" @default.
- W3213403014 crossrefType "proceedings-article" @default.
- W3213403014 hasAuthorship W3213403014A5011974509 @default.
- W3213403014 hasAuthorship W3213403014A5059930218 @default.
- W3213403014 hasAuthorship W3213403014A5067919401 @default.
- W3213403014 hasAuthorship W3213403014A5084195779 @default.
- W3213403014 hasBestOaLocation W32134030141 @default.
- W3213403014 hasConcept C104317684 @default.
- W3213403014 hasConcept C105580179 @default.
- W3213403014 hasConcept C111472728 @default.
- W3213403014 hasConcept C138885662 @default.
- W3213403014 hasConcept C149364088 @default.
- W3213403014 hasConcept C154945302 @default.
- W3213403014 hasConcept C162324750 @default.
- W3213403014 hasConcept C185592680 @default.
- W3213403014 hasConcept C187736073 @default.
- W3213403014 hasConcept C203005215 @default.
- W3213403014 hasConcept C204321447 @default.
- W3213403014 hasConcept C2777530160 @default.
- W3213403014 hasConcept C2779530757 @default.
- W3213403014 hasConcept C2780310539 @default.
- W3213403014 hasConcept C2780451532 @default.
- W3213403014 hasConcept C41008148 @default.
- W3213403014 hasConcept C41895202 @default.
- W3213403014 hasConcept C55493867 @default.
- W3213403014 hasConceptScore W3213403014C104317684 @default.
- W3213403014 hasConceptScore W3213403014C105580179 @default.
- W3213403014 hasConceptScore W3213403014C111472728 @default.
- W3213403014 hasConceptScore W3213403014C138885662 @default.
- W3213403014 hasConceptScore W3213403014C149364088 @default.
- W3213403014 hasConceptScore W3213403014C154945302 @default.
- W3213403014 hasConceptScore W3213403014C162324750 @default.
- W3213403014 hasConceptScore W3213403014C185592680 @default.
- W3213403014 hasConceptScore W3213403014C187736073 @default.
- W3213403014 hasConceptScore W3213403014C203005215 @default.
- W3213403014 hasConceptScore W3213403014C204321447 @default.
- W3213403014 hasConceptScore W3213403014C2777530160 @default.
- W3213403014 hasConceptScore W3213403014C2779530757 @default.
- W3213403014 hasConceptScore W3213403014C2780310539 @default.
- W3213403014 hasConceptScore W3213403014C2780451532 @default.
- W3213403014 hasConceptScore W3213403014C41008148 @default.
- W3213403014 hasConceptScore W3213403014C41895202 @default.