Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385570879> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4385570879 abstract "Training a large language model in low-resource settings is challenging since they are susceptible to overfitting with limited generalization abilities. Previous work addresses this issue by approaches such as tunable parameters reduction or data augmentation. However, they either limit the trained models’ expressiveness or rely on task-independent knowledge. In this paper, we propose the Bi-level Finetuning with Task-dependent Similarity Structure framework where all parameters, including the embeddings for unseen tokens, are finetuned with task-dependent information from the training data only. In this framework, a task-dependent similarity structure is learned in a data-driven fashion, which in turn is used to compose soft embeddings from conventional embeddings to be used in training to update all parameters. In order to learn the similarity structure and model parameters, we propose a bi-level optimization algorithm with two stages—search and finetune—to ensure successful learning. Results of experiments on several classification datasets in low-resource scenarios demonstrate that models trained with our method outperform strong baselines. Ablation experiments further support the effectiveness of different components in our framework. Code is available at https://github.com/Sai-Ashish/BFTSS." @default.
- W4385570879 created "2023-08-05" @default.
- W4385570879 creator A5007529850 @default.
- W4385570879 creator A5013438739 @default.
- W4385570879 creator A5016877422 @default.
- W4385570879 creator A5020634084 @default.
- W4385570879 creator A5071533156 @default.
- W4385570879 date "2023-01-01" @default.
- W4385570879 modified "2023-09-24" @default.
- W4385570879 title "Bi-level Finetuning with Task-dependent Similarity Structure for Low-resource Training" @default.
- W4385570879 doi "https://doi.org/10.18653/v1/2023.findings-acl.544" @default.
- W4385570879 hasPublicationYear "2023" @default.
- W4385570879 type Work @default.
- W4385570879 citedByCount "0" @default.
- W4385570879 crossrefType "proceedings-article" @default.
- W4385570879 hasAuthorship W4385570879A5007529850 @default.
- W4385570879 hasAuthorship W4385570879A5013438739 @default.
- W4385570879 hasAuthorship W4385570879A5016877422 @default.
- W4385570879 hasAuthorship W4385570879A5020634084 @default.
- W4385570879 hasAuthorship W4385570879A5071533156 @default.
- W4385570879 hasBestOaLocation W43855708791 @default.
- W4385570879 hasConcept C103278499 @default.
- W4385570879 hasConcept C115961682 @default.
- W4385570879 hasConcept C119857082 @default.
- W4385570879 hasConcept C124101348 @default.
- W4385570879 hasConcept C134306372 @default.
- W4385570879 hasConcept C154945302 @default.
- W4385570879 hasConcept C162324750 @default.
- W4385570879 hasConcept C177148314 @default.
- W4385570879 hasConcept C177264268 @default.
- W4385570879 hasConcept C187736073 @default.
- W4385570879 hasConcept C199360897 @default.
- W4385570879 hasConcept C206345919 @default.
- W4385570879 hasConcept C22019652 @default.
- W4385570879 hasConcept C2776760102 @default.
- W4385570879 hasConcept C2780451532 @default.
- W4385570879 hasConcept C31258907 @default.
- W4385570879 hasConcept C33923547 @default.
- W4385570879 hasConcept C41008148 @default.
- W4385570879 hasConcept C50644808 @default.
- W4385570879 hasConceptScore W4385570879C103278499 @default.
- W4385570879 hasConceptScore W4385570879C115961682 @default.
- W4385570879 hasConceptScore W4385570879C119857082 @default.
- W4385570879 hasConceptScore W4385570879C124101348 @default.
- W4385570879 hasConceptScore W4385570879C134306372 @default.
- W4385570879 hasConceptScore W4385570879C154945302 @default.
- W4385570879 hasConceptScore W4385570879C162324750 @default.
- W4385570879 hasConceptScore W4385570879C177148314 @default.
- W4385570879 hasConceptScore W4385570879C177264268 @default.
- W4385570879 hasConceptScore W4385570879C187736073 @default.
- W4385570879 hasConceptScore W4385570879C199360897 @default.
- W4385570879 hasConceptScore W4385570879C206345919 @default.
- W4385570879 hasConceptScore W4385570879C22019652 @default.
- W4385570879 hasConceptScore W4385570879C2776760102 @default.
- W4385570879 hasConceptScore W4385570879C2780451532 @default.
- W4385570879 hasConceptScore W4385570879C31258907 @default.
- W4385570879 hasConceptScore W4385570879C33923547 @default.
- W4385570879 hasConceptScore W4385570879C41008148 @default.
- W4385570879 hasConceptScore W4385570879C50644808 @default.
- W4385570879 hasLocation W43855708791 @default.
- W4385570879 hasOpenAccess W4385570879 @default.
- W4385570879 hasPrimaryLocation W43855708791 @default.
- W4385570879 hasRelatedWork W1996541855 @default.
- W4385570879 hasRelatedWork W2346074333 @default.
- W4385570879 hasRelatedWork W2940336242 @default.
- W4385570879 hasRelatedWork W2953328427 @default.
- W4385570879 hasRelatedWork W2989932438 @default.
- W4385570879 hasRelatedWork W3099765033 @default.
- W4385570879 hasRelatedWork W4210794429 @default.
- W4385570879 hasRelatedWork W4313159793 @default.
- W4385570879 hasRelatedWork W4327988962 @default.
- W4385570879 hasRelatedWork W4362499066 @default.
- W4385570879 isParatext "false" @default.
- W4385570879 isRetracted "false" @default.
- W4385570879 workType "article" @default.
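
The statements above can be loaded into tuples for further processing. The following is a minimal sketch, assuming each line has the shape `- <subject> <predicate> <object> @<graph>.` as seen in this listing. That shape is inferred from the listing itself rather than taken from any SemOpenAlex specification, and the names `LINE_RE` and `parse_line` are hypothetical helpers introduced only for illustration.

```python
import re

# Assumed line shape (inferred from the listing, not from a SemOpenAlex spec):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Literal values are wrapped in double quotes; strip one pair if present.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subject, predicate, obj, graph


# A few lines copied from the listing above.
sample = [
    '- W4385570879 created "2023-08-05" @default.',
    '- W4385570879 creator A5007529850 @default.',
    '- W4385570879 citedByCount "0" @default.',
]
rows = [parse_line(line) for line in sample]
```

Lines that are not statements, such as the header or the pagination line, return `None` and can be filtered out before use. Literal values stay as strings here; converting dates or counts to typed values is left to the caller.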