Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385571304> ?p ?o ?g. }
Showing items 1 to 65 of 65 with 100 items per page.
- W4385571304 abstract "Despite the huge progress in myriad generation tasks, pretrained language models (LMs) such as GPT2 still tend to generate repetitive texts with maximization-based decoding algorithms for open-ended generation. We attribute their overestimation of token-level repetition probabilities to the learning bias: LMs capture simple repetitive patterns faster with the MLE loss. We propose self-contrastive training to penalize the output of a premature checkpoint of the same model when it incorrectly predicts repetition, which is shown to mitigate repetition effectively while maintaining fluency on two datasets. Furthermore, we find that LMs use longer-range dependencies to predict repetitive tokens than non-repetitive ones, which may be the cause of sentence-level repetition loops." @default.
- W4385571304 created "2023-08-05" @default.
- W4385571304 creator A5044042138 @default.
- W4385571304 creator A5059231561 @default.
- W4385571304 date "2023-01-01" @default.
- W4385571304 modified "2023-10-18" @default.
- W4385571304 title "Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation" @default.
- W4385571304 doi "https://doi.org/10.18653/v1/2023.findings-acl.431" @default.
- W4385571304 hasPublicationYear "2023" @default.
- W4385571304 type Work @default.
- W4385571304 citedByCount "0" @default.
- W4385571304 crossrefType "proceedings-article" @default.
- W4385571304 hasAuthorship W4385571304A5044042138 @default.
- W4385571304 hasAuthorship W4385571304A5059231561 @default.
- W4385571304 hasBestOaLocation W43855713041 @default.
- W4385571304 hasConcept C11413529 @default.
- W4385571304 hasConcept C138885662 @default.
- W4385571304 hasConcept C145420912 @default.
- W4385571304 hasConcept C154945302 @default.
- W4385571304 hasConcept C15744967 @default.
- W4385571304 hasConcept C159985019 @default.
- W4385571304 hasConcept C192562407 @default.
- W4385571304 hasConcept C204323151 @default.
- W4385571304 hasConcept C2776141515 @default.
- W4385571304 hasConcept C2777413886 @default.
- W4385571304 hasConcept C2777530160 @default.
- W4385571304 hasConcept C28490314 @default.
- W4385571304 hasConcept C38652104 @default.
- W4385571304 hasConcept C41008148 @default.
- W4385571304 hasConcept C41895202 @default.
- W4385571304 hasConcept C48145219 @default.
- W4385571304 hasConcept C57273362 @default.
- W4385571304 hasConceptScore W4385571304C11413529 @default.
- W4385571304 hasConceptScore W4385571304C138885662 @default.
- W4385571304 hasConceptScore W4385571304C145420912 @default.
- W4385571304 hasConceptScore W4385571304C154945302 @default.
- W4385571304 hasConceptScore W4385571304C15744967 @default.
- W4385571304 hasConceptScore W4385571304C159985019 @default.
- W4385571304 hasConceptScore W4385571304C192562407 @default.
- W4385571304 hasConceptScore W4385571304C204323151 @default.
- W4385571304 hasConceptScore W4385571304C2776141515 @default.
- W4385571304 hasConceptScore W4385571304C2777413886 @default.
- W4385571304 hasConceptScore W4385571304C2777530160 @default.
- W4385571304 hasConceptScore W4385571304C28490314 @default.
- W4385571304 hasConceptScore W4385571304C38652104 @default.
- W4385571304 hasConceptScore W4385571304C41008148 @default.
- W4385571304 hasConceptScore W4385571304C41895202 @default.
- W4385571304 hasConceptScore W4385571304C48145219 @default.
- W4385571304 hasConceptScore W4385571304C57273362 @default.
- W4385571304 hasLocation W43855713041 @default.
- W4385571304 hasOpenAccess W4385571304 @default.
- W4385571304 hasPrimaryLocation W43855713041 @default.
- W4385571304 hasRelatedWork W2052148722 @default.
- W4385571304 hasRelatedWork W2100622387 @default.
- W4385571304 hasRelatedWork W2140368902 @default.
- W4385571304 hasRelatedWork W2170125247 @default.
- W4385571304 hasRelatedWork W2348930196 @default.
- W4385571304 hasRelatedWork W2356170410 @default.
- W4385571304 hasRelatedWork W2375389409 @default.
- W4385571304 hasRelatedWork W2776072854 @default.
- W4385571304 hasRelatedWork W4223990875 @default.
- W4385571304 hasRelatedWork W4383469013 @default.
- W4385571304 isParatext "false" @default.
- W4385571304 isRetracted "false" @default.
- W4385571304 workType "article" @default.