Matches in SemOpenAlex for { <https://semopenalex.org/work/W3169932073> ?p ?o ?g. }
- W3169932073 abstract "Document-level MT models are still far from satisfactory. Existing work extends the translation unit from a single sentence to multiple sentences. However, studies show that when the translation unit is further enlarged to a whole document, supervised training of Transformer can fail. In this paper, we find that such failure is not caused by overfitting, but by sticking around local minima during training. Our analysis shows that the increased complexity of target-to-source attention is a reason for the failure. As a solution, we propose G-Transformer, which introduces a locality assumption as an inductive bias into Transformer, reducing the hypothesis space of the attention from target to source. Experiments show that G-Transformer converges faster and more stably than Transformer, achieving new state-of-the-art BLEU scores for both non-pretraining and pre-training settings on three benchmark datasets." @default.
- W3169932073 created "2021-06-22" @default.
- W3169932073 creator A5010399428 @default.
- W3169932073 creator A5010776860 @default.
- W3169932073 creator A5012543243 @default.
- W3169932073 creator A5019715118 @default.
- W3169932073 creator A5085736941 @default.
- W3169932073 date "2021-05-31" @default.
- W3169932073 modified "2023-09-27" @default.
- W3169932073 title "G-Transformer for Document-level Machine Translation" @default.
- W3169932073 cites W1753482797 @default.
- W3169932073 cites W2006969979 @default.
- W3169932073 cites W2038698865 @default.
- W3169932073 cites W203948990 @default.
- W3169932073 cites W2152263452 @default.
- W3169932073 cites W2153653739 @default.
- W3169932073 cites W2162245945 @default.
- W3169932073 cites W2514722822 @default.
- W3169932073 cites W2626778328 @default.
- W3169932073 cites W2806987872 @default.
- W3169932073 cites W2888159079 @default.
- W3169932073 cites W2891534142 @default.
- W3169932073 cites W2949888546 @default.
- W3169932073 cites W2952446148 @default.
- W3169932073 cites W2962712961 @default.
- W3169932073 cites W2962784628 @default.
- W3169932073 cites W2962802109 @default.
- W3169932073 cites W2962943802 @default.
- W3169932073 cites W2963506925 @default.
- W3169932073 cites W2964308564 @default.
- W3169932073 cites W2970529093 @default.
- W3169932073 cites W2971347700 @default.
- W3169932073 cites W2982399380 @default.
- W3169932073 cites W3007795830 @default.
- W3169932073 cites W3015468748 @default.
- W3169932073 cites W3042199843 @default.
- W3169932073 cites W3102507836 @default.
- W3169932073 cites W3103878009 @default.
- W3169932073 cites W3127320390 @default.
- W3169932073 doi "https://doi.org/10.48550/arxiv.2105.14761" @default.
- W3169932073 hasPublicationYear "2021" @default.
- W3169932073 type Work @default.
- W3169932073 sameAs 3169932073 @default.
- W3169932073 citedByCount "0" @default.
- W3169932073 crossrefType "posted-content" @default.
- W3169932073 hasAuthorship W3169932073A5010399428 @default.
- W3169932073 hasAuthorship W3169932073A5010776860 @default.
- W3169932073 hasAuthorship W3169932073A5012543243 @default.
- W3169932073 hasAuthorship W3169932073A5019715118 @default.
- W3169932073 hasAuthorship W3169932073A5085736941 @default.
- W3169932073 hasBestOaLocation W31699320731 @default.
- W3169932073 hasConcept C119599485 @default.
- W3169932073 hasConcept C119857082 @default.
- W3169932073 hasConcept C127413603 @default.
- W3169932073 hasConcept C134306372 @default.
- W3169932073 hasConcept C138885662 @default.
- W3169932073 hasConcept C154945302 @default.
- W3169932073 hasConcept C165801399 @default.
- W3169932073 hasConcept C186633575 @default.
- W3169932073 hasConcept C203005215 @default.
- W3169932073 hasConcept C204321447 @default.
- W3169932073 hasConcept C22019652 @default.
- W3169932073 hasConcept C2777530160 @default.
- W3169932073 hasConcept C2779808786 @default.
- W3169932073 hasConcept C28490314 @default.
- W3169932073 hasConcept C33923547 @default.
- W3169932073 hasConcept C41008148 @default.
- W3169932073 hasConcept C41895202 @default.
- W3169932073 hasConcept C50644808 @default.
- W3169932073 hasConcept C66322947 @default.
- W3169932073 hasConceptScore W3169932073C119599485 @default.
- W3169932073 hasConceptScore W3169932073C119857082 @default.
- W3169932073 hasConceptScore W3169932073C127413603 @default.
- W3169932073 hasConceptScore W3169932073C134306372 @default.
- W3169932073 hasConceptScore W3169932073C138885662 @default.
- W3169932073 hasConceptScore W3169932073C154945302 @default.
- W3169932073 hasConceptScore W3169932073C165801399 @default.
- W3169932073 hasConceptScore W3169932073C186633575 @default.
- W3169932073 hasConceptScore W3169932073C203005215 @default.
- W3169932073 hasConceptScore W3169932073C204321447 @default.
- W3169932073 hasConceptScore W3169932073C22019652 @default.
- W3169932073 hasConceptScore W3169932073C2777530160 @default.
- W3169932073 hasConceptScore W3169932073C2779808786 @default.
- W3169932073 hasConceptScore W3169932073C28490314 @default.
- W3169932073 hasConceptScore W3169932073C33923547 @default.
- W3169932073 hasConceptScore W3169932073C41008148 @default.
- W3169932073 hasConceptScore W3169932073C41895202 @default.
- W3169932073 hasConceptScore W3169932073C50644808 @default.
- W3169932073 hasConceptScore W3169932073C66322947 @default.
- W3169932073 hasLocation W31699320731 @default.
- W3169932073 hasOpenAccess W3169932073 @default.
- W3169932073 hasPrimaryLocation W31699320731 @default.
- W3169932073 hasRelatedWork W1517743118 @default.
- W3169932073 hasRelatedWork W1585034923 @default.
- W3169932073 hasRelatedWork W1978971213 @default.
- W3169932073 hasRelatedWork W2398825887 @default.
- W3169932073 hasRelatedWork W2989156240 @default.
- W3169932073 hasRelatedWork W2989932438 @default.
- W3169932073 hasRelatedWork W3099765033 @default.
- W3169932073 hasRelatedWork W3107474891 @default.
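
The pattern in the header line, { <https://semopenalex.org/work/W3169932073> ?p ?o ?g. }, can be written as an ordinary SPARQL query. The sketch below only builds the query text and contacts no endpoint. It assumes that the "@default" tag on each row means the default graph, so it drops ?g and uses a plain triple pattern. The helper name build_work_query and the LIMIT value are hypothetical and chosen for illustration only.

```python
# Minimal sketch: build the SPARQL text for the listing's pattern.
# Assumption: "@default" denotes the default graph, so ?g is omitted.
# build_work_query and the default limit are hypothetical, not part of SemOpenAlex.

WORK_IRI = "https://semopenalex.org/work/W3169932073"

# Characters that may not appear inside a SPARQL IRI reference (<...>).
_FORBIDDEN = set('<>"{}|^`\\')


def build_work_query(subject_iri: str, limit: int = 200) -> str:
    """Return a SELECT query listing every predicate/object of subject_iri."""
    if any(ch in _FORBIDDEN or ch <= " " for ch in subject_iri):
        raise ValueError(f"not a valid IRI for SPARQL: {subject_iri!r}")
    return (
        "SELECT ?p ?o WHERE {\n"
        f"  <{subject_iri}> ?p ?o .\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = build_work_query(WORK_IRI)
print(query)
```

The resulting string can be sent to any SPARQL 1.1 endpoint that hosts the data. Each returned ?p/?o pair would correspond to one row of the listing above, such as "title" or "cites".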