Matches in SemOpenAlex for { <https://semopenalex.org/work/W4312205996> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W4312205996 abstract "Recent work has shown that fine-tuning large pre-trained language models on a collection of tasks described via instructions, a.k.a. instruction-tuning, improves their zero and few-shot generalization to unseen tasks. However, there is a limited understanding of the performance trade-offs of different decisions made during the instruction-tuning process. These decisions include the scale and diversity of the instruction-tuning benchmark, different task sampling strategies, fine-tuning with and without demonstrations, training using specialized datasets for reasoning and dialogue, and finally, the fine-tuning objectives themselves. In this paper, we characterize the effect of instruction-tuning decisions on downstream task performance when scaling both model and benchmark sizes. To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks. Through the lens of this framework, we first present insights about instruction-tuning decisions as applied to OPT-30B and further exploit these insights to train OPT-IML 30B and 175B, which are instruction-tuned versions of OPT. OPT-IML demonstrates all three generalization abilities at both scales on four different evaluation benchmarks with diverse tasks and input formats -- PromptSource, FLAN, Super-NaturalInstructions, and UnifiedSKG. Not only does it significantly outperform OPT on all benchmarks but is also highly competitive with existing models fine-tuned on each specific benchmark. We release OPT-IML at both scales, together with the OPT-IML Bench evaluation framework." @default.
- W4312205996 created "2023-01-04" @default.
- W4312205996 creator A5005030122 @default.
- W4312205996 creator A5011979551 @default.
- W4312205996 creator A5014582200 @default.
- W4312205996 creator A5017941291 @default.
- W4312205996 creator A5023509514 @default.
- W4312205996 creator A5026227210 @default.
- W4312205996 creator A5030468199 @default.
- W4312205996 creator A5048651878 @default.
- W4312205996 creator A5051808640 @default.
- W4312205996 creator A5052408504 @default.
- W4312205996 creator A5061222029 @default.
- W4312205996 creator A5062266757 @default.
- W4312205996 creator A5066820530 @default.
- W4312205996 creator A5067919401 @default.
- W4312205996 creator A5075564427 @default.
- W4312205996 creator A5087533357 @default.
- W4312205996 creator A5088004913 @default.
- W4312205996 creator A5090954457 @default.
- W4312205996 date "2022-12-22" @default.
- W4312205996 modified "2023-10-16" @default.
- W4312205996 title "OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization" @default.
- W4312205996 doi "https://doi.org/10.48550/arxiv.2212.12017" @default.
- W4312205996 hasPublicationYear "2022" @default.
- W4312205996 type Work @default.
- W4312205996 citedByCount "0" @default.
- W4312205996 crossrefType "posted-content" @default.
- W4312205996 hasAuthorship W4312205996A5005030122 @default.
- W4312205996 hasAuthorship W4312205996A5011979551 @default.
- W4312205996 hasAuthorship W4312205996A5014582200 @default.
- W4312205996 hasAuthorship W4312205996A5017941291 @default.
- W4312205996 hasAuthorship W4312205996A5023509514 @default.
- W4312205996 hasAuthorship W4312205996A5026227210 @default.
- W4312205996 hasAuthorship W4312205996A5030468199 @default.
- W4312205996 hasAuthorship W4312205996A5048651878 @default.
- W4312205996 hasAuthorship W4312205996A5051808640 @default.
- W4312205996 hasAuthorship W4312205996A5052408504 @default.
- W4312205996 hasAuthorship W4312205996A5061222029 @default.
- W4312205996 hasAuthorship W4312205996A5062266757 @default.
- W4312205996 hasAuthorship W4312205996A5066820530 @default.
- W4312205996 hasAuthorship W4312205996A5067919401 @default.
- W4312205996 hasAuthorship W4312205996A5075564427 @default.
- W4312205996 hasAuthorship W4312205996A5087533357 @default.
- W4312205996 hasAuthorship W4312205996A5088004913 @default.
- W4312205996 hasAuthorship W4312205996A5090954457 @default.
- W4312205996 hasBestOaLocation W43122059961 @default.
- W4312205996 hasConcept C119857082 @default.
- W4312205996 hasConcept C13280743 @default.
- W4312205996 hasConcept C134306372 @default.
- W4312205996 hasConcept C137293760 @default.
- W4312205996 hasConcept C154945302 @default.
- W4312205996 hasConcept C162324750 @default.
- W4312205996 hasConcept C165696696 @default.
- W4312205996 hasConcept C177148314 @default.
- W4312205996 hasConcept C185798385 @default.
- W4312205996 hasConcept C187736073 @default.
- W4312205996 hasConcept C205649164 @default.
- W4312205996 hasConcept C2780451532 @default.
- W4312205996 hasConcept C33923547 @default.
- W4312205996 hasConcept C38652104 @default.
- W4312205996 hasConcept C41008148 @default.
- W4312205996 hasConceptScore W4312205996C119857082 @default.
- W4312205996 hasConceptScore W4312205996C13280743 @default.
- W4312205996 hasConceptScore W4312205996C134306372 @default.
- W4312205996 hasConceptScore W4312205996C137293760 @default.
- W4312205996 hasConceptScore W4312205996C154945302 @default.
- W4312205996 hasConceptScore W4312205996C162324750 @default.
- W4312205996 hasConceptScore W4312205996C165696696 @default.
- W4312205996 hasConceptScore W4312205996C177148314 @default.
- W4312205996 hasConceptScore W4312205996C185798385 @default.
- W4312205996 hasConceptScore W4312205996C187736073 @default.
- W4312205996 hasConceptScore W4312205996C205649164 @default.
- W4312205996 hasConceptScore W4312205996C2780451532 @default.
- W4312205996 hasConceptScore W4312205996C33923547 @default.
- W4312205996 hasConceptScore W4312205996C38652104 @default.
- W4312205996 hasConceptScore W4312205996C41008148 @default.
- W4312205996 hasLocation W43122059961 @default.
- W4312205996 hasOpenAccess W4312205996 @default.
- W4312205996 hasPrimaryLocation W43122059961 @default.
- W4312205996 hasRelatedWork W2951006695 @default.
- W4312205996 hasRelatedWork W2983785000 @default.
- W4312205996 hasRelatedWork W2989932438 @default.
- W4312205996 hasRelatedWork W3138953784 @default.
- W4312205996 hasRelatedWork W3190203064 @default.
- W4312205996 hasRelatedWork W3210045201 @default.
- W4312205996 hasRelatedWork W4221150964 @default.
- W4312205996 hasRelatedWork W4281566512 @default.
- W4312205996 hasRelatedWork W4282028325 @default.
- W4312205996 hasRelatedWork W4287025741 @default.
- W4312205996 isParatext "false" @default.
- W4312205996 isRetracted "false" @default.
- W4312205996 workType "article" @default.