Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287802662> ?p ?o ?g. }
Showing items 1 to 72 of 72 with 100 items per page.
- W4287802662 abstract "Language model fine-tuning is essential for modern natural language processing, but is computationally expensive and time-consuming. Further, the effectiveness of fine-tuning is limited by the inclusion of training examples that negatively affect performance. Here we present a general fine-tuning method that we call information gain filtration for improving the overall training efficiency and final performance of language model fine-tuning. We define the information gain of an example as the improvement on a test metric after training on that example. A secondary learner is then trained to approximate this quantity. During fine-tuning, this learner selects informative examples and skips uninformative ones. We show that our method has consistent improvement across datasets, fine-tuning tasks, and language model architectures. For example, we achieve a median perplexity of 54.0 on a books dataset compared to 57.3 for standard fine-tuning. We present statistical evidence that offers insight into the improvements of our method over standard fine-tuning. The generality of our method leads us to propose a new paradigm for language model fine-tuning -- we encourage researchers to release pretrained secondary learners on common corpora to promote efficient and effective fine-tuning, thereby improving the performance and reducing the overall energy footprint of language model fine-tuning." @default.
- W4287802662 created "2022-07-26" @default.
- W4287802662 creator A5025310174 @default.
- W4287802662 creator A5038450501 @default.
- W4287802662 creator A5075525077 @default.
- W4287802662 creator A5077310477 @default.
- W4287802662 date "2020-04-30" @default.
- W4287802662 modified "2023-09-27" @default.
- W4287802662 title "Selecting Informative Contexts Improves Language Model Finetuning" @default.
- W4287802662 doi "https://doi.org/10.48550/arxiv.2005.00175" @default.
- W4287802662 hasPublicationYear "2020" @default.
- W4287802662 type Work @default.
- W4287802662 citedByCount "0" @default.
- W4287802662 crossrefType "posted-content" @default.
- W4287802662 hasAuthorship W4287802662A5025310174 @default.
- W4287802662 hasAuthorship W4287802662A5038450501 @default.
- W4287802662 hasAuthorship W4287802662A5075525077 @default.
- W4287802662 hasAuthorship W4287802662A5077310477 @default.
- W4287802662 hasBestOaLocation W42878026621 @default.
- W4287802662 hasConcept C100279451 @default.
- W4287802662 hasConcept C119857082 @default.
- W4287802662 hasConcept C121332964 @default.
- W4287802662 hasConcept C137293760 @default.
- W4287802662 hasConcept C154945302 @default.
- W4287802662 hasConcept C15744967 @default.
- W4287802662 hasConcept C157524613 @default.
- W4287802662 hasConcept C162324750 @default.
- W4287802662 hasConcept C176217482 @default.
- W4287802662 hasConcept C187736073 @default.
- W4287802662 hasConcept C204321447 @default.
- W4287802662 hasConcept C21547014 @default.
- W4287802662 hasConcept C2778915421 @default.
- W4287802662 hasConcept C2780767217 @default.
- W4287802662 hasConcept C2780898871 @default.
- W4287802662 hasConcept C41008148 @default.
- W4287802662 hasConcept C542102704 @default.
- W4287802662 hasConcept C62520636 @default.
- W4287802662 hasConceptScore W4287802662C100279451 @default.
- W4287802662 hasConceptScore W4287802662C119857082 @default.
- W4287802662 hasConceptScore W4287802662C121332964 @default.
- W4287802662 hasConceptScore W4287802662C137293760 @default.
- W4287802662 hasConceptScore W4287802662C154945302 @default.
- W4287802662 hasConceptScore W4287802662C15744967 @default.
- W4287802662 hasConceptScore W4287802662C157524613 @default.
- W4287802662 hasConceptScore W4287802662C162324750 @default.
- W4287802662 hasConceptScore W4287802662C176217482 @default.
- W4287802662 hasConceptScore W4287802662C187736073 @default.
- W4287802662 hasConceptScore W4287802662C204321447 @default.
- W4287802662 hasConceptScore W4287802662C21547014 @default.
- W4287802662 hasConceptScore W4287802662C2778915421 @default.
- W4287802662 hasConceptScore W4287802662C2780767217 @default.
- W4287802662 hasConceptScore W4287802662C2780898871 @default.
- W4287802662 hasConceptScore W4287802662C41008148 @default.
- W4287802662 hasConceptScore W4287802662C542102704 @default.
- W4287802662 hasConceptScore W4287802662C62520636 @default.
- W4287802662 hasLocation W42878026621 @default.
- W4287802662 hasLocation W42878026622 @default.
- W4287802662 hasOpenAccess W4287802662 @default.
- W4287802662 hasPrimaryLocation W42878026621 @default.
- W4287802662 hasRelatedWork W1989705153 @default.
- W4287802662 hasRelatedWork W2131111393 @default.
- W4287802662 hasRelatedWork W2410936271 @default.
- W4287802662 hasRelatedWork W2496228846 @default.
- W4287802662 hasRelatedWork W2896411932 @default.
- W4287802662 hasRelatedWork W2962966012 @default.
- W4287802662 hasRelatedWork W3023594376 @default.
- W4287802662 hasRelatedWork W4287802662 @default.
- W4287802662 hasRelatedWork W4364858084 @default.
- W4287802662 hasRelatedWork W59929963 @default.
- W4287802662 isParatext "false" @default.
- W4287802662 isRetracted "false" @default.
- W4287802662 workType "article" @default.
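
The abstract in the listing above describes information gain filtration: score each example by how much training on it improves a test metric, train a secondary learner to approximate that score, and skip examples predicted to be uninformative. The following is a minimal toy sketch of that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the one-parameter model `y = w * x`, the helper names (`sgd_step`, `held_out_loss`, `fit_secondary_learner`, `finetune`), the nearest-neighbour secondary learner, the threshold of 0, and the synthetic data.

```python
def sgd_step(w, example, lr=0.1):
    """One SGD step for the toy model y = w * x under squared error."""
    x, y = example
    return w - lr * 2.0 * (w * x - y) * x


def held_out_loss(w, held_out):
    """Mean squared error on a held-out set (lower is better)."""
    return sum((w * x - y) ** 2 for x, y in held_out) / len(held_out)


def information_gain(w, example, held_out):
    """Improvement in the held-out metric after training on one example."""
    return held_out_loss(w, held_out) - held_out_loss(sgd_step(w, example), held_out)


def fit_secondary_learner(w, sample, held_out):
    """Hypothetical secondary learner: 1-nearest-neighbour lookup over a small
    sample labelled with gains measured at the initial parameter w."""
    labelled = [(ex, information_gain(w, ex, held_out)) for ex in sample]

    def predict_gain(example):
        x, y = example
        nearest = min(
            labelled,
            key=lambda item: (item[0][0] - x) ** 2 + (item[0][1] - y) ** 2,
        )
        return nearest[1]

    return predict_gain


def finetune(w, stream, predict_gain=None, threshold=0.0):
    """Fine-tune on a stream; if a secondary learner is supplied, skip
    examples whose predicted gain does not exceed the threshold."""
    skipped = 0
    for example in stream:
        if predict_gain is not None and predict_gain(example) <= threshold:
            skipped += 1
            continue
        w = sgd_step(w, example)
    return w, skipped


# Synthetic data (assumed): the target relation is y = 2x; the corrupted
# examples follow y = -2x and stand in for harmful training examples.
held_out = [(1.0, 2.0), (2.0, 4.0)]
sample = [(1.0, 2.0), (2.0, 4.0), (1.0, -2.0), (2.0, -4.0)]
stream = [(1.5, 3.0), (1.5, -3.0), (0.5, 1.0), (0.5, -1.0), (2.0, 4.0)]

w0 = 1.0  # stands in for the pretrained model
predict_gain = fit_secondary_learner(w0, sample, held_out)

w_standard, _ = finetune(w0, stream)
w_filtered, skipped = finetune(w0, stream, predict_gain)
```

In this toy setup the filtered run skips the two corrupted examples and ends with a lower held-out loss than the unfiltered run. The gains here are measured once, at the initial parameter, which is a simplification chosen for the sketch.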