Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385569751> ?p ?o ?g. }
Showing items 1 to 73 of 73, with 100 items per page.
- W4385569751 abstract "Pre-trained language models (PLMs) demonstrate excellent abilities to understand texts in the generic domain while struggling in a specific domain. Although continued pre-training on a large domain-specific corpus is effective, it is costly to tune all the parameters on the domain. In this paper, we investigate whether we can adapt PLMs both effectively and efficiently by only tuning a few parameters. Specifically, we decouple the feed-forward networks (FFNs) of the Transformer architecture into two parts: the original pre-trained FFNs to maintain the old-domain knowledge and our novel domain-specific adapters to inject domain-specific knowledge in parallel. Then we adopt a mixture-of-adapters gate to fuse the knowledge from different domain adapters dynamically. Our proposed Mixture-of-Domain-Adapters (MixDA) employs a two-stage adapter-tuning strategy that leverages both unlabeled data and labeled data to help the domain adaptation: i) domain-specific adapter on unlabeled data; followed by ii) the task-specific adapter on labeled data. MixDA can be seamlessly plugged into the pretraining-finetuning paradigm and our experiments demonstrate that MixDA achieves superior performance on in-domain tasks (GLUE), out-of-domain tasks (ChemProt, RCT, IMDB, Amazon), and knowledge-intensive tasks (KILT). Further analyses demonstrate the reliability, scalability, and efficiency of our method." @default.
- W4385569751 created "2023-08-05" @default.
- W4385569751 creator A5037709424 @default.
- W4385569751 creator A5044785404 @default.
- W4385569751 creator A5046621815 @default.
- W4385569751 creator A5049469328 @default.
- W4385569751 creator A5058966093 @default.
- W4385569751 date "2023-01-01" @default.
- W4385569751 modified "2023-09-24" @default.
- W4385569751 title "Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models’ Memories" @default.
- W4385569751 doi "https://doi.org/10.18653/v1/2023.acl-long.280" @default.
- W4385569751 hasPublicationYear "2023" @default.
- W4385569751 type Work @default.
- W4385569751 citedByCount "0" @default.
- W4385569751 crossrefType "proceedings-article" @default.
- W4385569751 hasAuthorship W4385569751A5037709424 @default.
- W4385569751 hasAuthorship W4385569751A5044785404 @default.
- W4385569751 hasAuthorship W4385569751A5046621815 @default.
- W4385569751 hasAuthorship W4385569751A5049469328 @default.
- W4385569751 hasAuthorship W4385569751A5058966093 @default.
- W4385569751 hasBestOaLocation W43855697511 @default.
- W4385569751 hasConcept C119857082 @default.
- W4385569751 hasConcept C121332964 @default.
- W4385569751 hasConcept C134306372 @default.
- W4385569751 hasConcept C137293760 @default.
- W4385569751 hasConcept C154945302 @default.
- W4385569751 hasConcept C165801399 @default.
- W4385569751 hasConcept C177284502 @default.
- W4385569751 hasConcept C207685749 @default.
- W4385569751 hasConcept C2776434776 @default.
- W4385569751 hasConcept C33923547 @default.
- W4385569751 hasConcept C36503486 @default.
- W4385569751 hasConcept C41008148 @default.
- W4385569751 hasConcept C48044578 @default.
- W4385569751 hasConcept C62520636 @default.
- W4385569751 hasConcept C66322947 @default.
- W4385569751 hasConcept C77088390 @default.
- W4385569751 hasConcept C9390403 @default.
- W4385569751 hasConcept C95623464 @default.
- W4385569751 hasConceptScore W4385569751C119857082 @default.
- W4385569751 hasConceptScore W4385569751C121332964 @default.
- W4385569751 hasConceptScore W4385569751C134306372 @default.
- W4385569751 hasConceptScore W4385569751C137293760 @default.
- W4385569751 hasConceptScore W4385569751C154945302 @default.
- W4385569751 hasConceptScore W4385569751C165801399 @default.
- W4385569751 hasConceptScore W4385569751C177284502 @default.
- W4385569751 hasConceptScore W4385569751C207685749 @default.
- W4385569751 hasConceptScore W4385569751C2776434776 @default.
- W4385569751 hasConceptScore W4385569751C33923547 @default.
- W4385569751 hasConceptScore W4385569751C36503486 @default.
- W4385569751 hasConceptScore W4385569751C41008148 @default.
- W4385569751 hasConceptScore W4385569751C48044578 @default.
- W4385569751 hasConceptScore W4385569751C62520636 @default.
- W4385569751 hasConceptScore W4385569751C66322947 @default.
- W4385569751 hasConceptScore W4385569751C77088390 @default.
- W4385569751 hasConceptScore W4385569751C9390403 @default.
- W4385569751 hasConceptScore W4385569751C95623464 @default.
- W4385569751 hasLocation W43855697511 @default.
- W4385569751 hasOpenAccess W4385569751 @default.
- W4385569751 hasPrimaryLocation W43855697511 @default.
- W4385569751 hasRelatedWork W3049591882 @default.
- W4385569751 hasRelatedWork W3105601216 @default.
- W4385569751 hasRelatedWork W3163341246 @default.
- W4385569751 hasRelatedWork W4287689101 @default.
- W4385569751 hasRelatedWork W4287816848 @default.
- W4385569751 hasRelatedWork W4307206004 @default.
- W4385569751 hasRelatedWork W4312091809 @default.
- W4385569751 hasRelatedWork W4318186067 @default.
- W4385569751 hasRelatedWork W4321593827 @default.
- W4385569751 hasRelatedWork W4380137047 @default.
- W4385569751 isParatext "false" @default.
- W4385569751 isRetracted "false" @default.
- W4385569751 workType "article" @default.
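A minimal sketch of how a listing like the one above could be retrieved programmatically, assuming the public SemOpenAlex SPARQL endpoint at https://semopenalex.org/sparql and the Python SPARQLWrapper package. The query mirrors the triple pattern shown in the page header; the named-graph variable ?g is omitted here for simplicity.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Assumed SemOpenAlex SPARQL endpoint (see semopenalex.org for the current address).
ENDPOINT = "https://semopenalex.org/sparql"

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery("""
    SELECT ?p ?o
    WHERE { <https://semopenalex.org/work/W4385569751> ?p ?o . }
""")
sparql.setReturnFormat(JSON)

# Execute the query and print each predicate/object pair for the work.
results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["p"]["value"], binding["o"]["value"])
```

The output corresponds to the predicate/object pairs listed above (abstract, creator, hasConcept, hasRelatedWork, and so on), one row per statement.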