Matches in SemOpenAlex for { <https://semopenalex.org/work/W4380137047> ?p ?o ?g. }
Showing items 1 to 79 of 79, with 100 items per page.
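The listing below corresponds to the graph pattern shown in the header. A minimal SPARQL sketch that reproduces it (the endpoint URL and `GRAPH` form are assumptions; the subject IRI and variable names are taken from the header):

```sparql
# Query against the SemOpenAlex SPARQL endpoint (assumed: https://semopenalex.org/sparql)
# Lists every predicate ?p, object ?o, and named graph ?g for the work W4380137047.
SELECT ?p ?o ?g
WHERE {
  GRAPH ?g {
    <https://semopenalex.org/work/W4380137047> ?p ?o .
  }
}
ORDER BY ?p
```

With the counts reported above, this should return 79 bindings, all in the `@default` graph.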
- W4380137047 abstract "Pre-trained language models (PLMs) demonstrate excellent abilities to understand texts in the generic domain while struggling in a specific domain. Although continued pre-training on a large domain-specific corpus is effective, it is costly to tune all the parameters on the domain. In this paper, we investigate whether we can adapt PLMs both effectively and efficiently by only tuning a few parameters. Specifically, we decouple the feed-forward networks (FFNs) of the Transformer architecture into two parts: the original pre-trained FFNs to maintain the old-domain knowledge and our novel domain-specific adapters to inject domain-specific knowledge in parallel. Then we adopt a mixture-of-adapters gate to fuse the knowledge from different domain adapters dynamically. Our proposed Mixture-of-Domain-Adapters (MixDA) employs a two-stage adapter-tuning strategy that leverages both unlabeled data and labeled data to help the domain adaptation: i) domain-specific adapter on unlabeled data; followed by ii) the task-specific adapter on labeled data. MixDA can be seamlessly plugged into the pretraining-finetuning paradigm and our experiments demonstrate that MixDA achieves superior performance on in-domain tasks (GLUE), out-of-domain tasks (ChemProt, RCT, IMDB, Amazon), and knowledge-intensive tasks (KILT). Further analyses demonstrate the reliability, scalability, and efficiency of our method. The code is available at https://github.com/Amano-Aki/Mixture-of-Domain-Adapters." @default.
- W4380137047 created "2023-06-10" @default.
- W4380137047 creator A5037709424 @default.
- W4380137047 creator A5044785404 @default.
- W4380137047 creator A5046621815 @default.
- W4380137047 creator A5049469328 @default.
- W4380137047 creator A5058966093 @default.
- W4380137047 date "2023-06-08" @default.
- W4380137047 modified "2023-09-24" @default.
- W4380137047 title "Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories" @default.
- W4380137047 doi "https://doi.org/10.48550/arxiv.2306.05406" @default.
- W4380137047 hasPublicationYear "2023" @default.
- W4380137047 type Work @default.
- W4380137047 citedByCount "0" @default.
- W4380137047 crossrefType "posted-content" @default.
- W4380137047 hasAuthorship W4380137047A5037709424 @default.
- W4380137047 hasAuthorship W4380137047A5044785404 @default.
- W4380137047 hasAuthorship W4380137047A5046621815 @default.
- W4380137047 hasAuthorship W4380137047A5049469328 @default.
- W4380137047 hasAuthorship W4380137047A5058966093 @default.
- W4380137047 hasBestOaLocation W43801370471 @default.
- W4380137047 hasConcept C119857082 @default.
- W4380137047 hasConcept C121332964 @default.
- W4380137047 hasConcept C127413603 @default.
- W4380137047 hasConcept C133731056 @default.
- W4380137047 hasConcept C134306372 @default.
- W4380137047 hasConcept C137293760 @default.
- W4380137047 hasConcept C154945302 @default.
- W4380137047 hasConcept C165801399 @default.
- W4380137047 hasConcept C177284502 @default.
- W4380137047 hasConcept C205606062 @default.
- W4380137047 hasConcept C207685749 @default.
- W4380137047 hasConcept C2776434776 @default.
- W4380137047 hasConcept C33923547 @default.
- W4380137047 hasConcept C36503486 @default.
- W4380137047 hasConcept C41008148 @default.
- W4380137047 hasConcept C48044578 @default.
- W4380137047 hasConcept C62520636 @default.
- W4380137047 hasConcept C66322947 @default.
- W4380137047 hasConcept C77088390 @default.
- W4380137047 hasConcept C9390403 @default.
- W4380137047 hasConcept C95623464 @default.
- W4380137047 hasConceptScore W4380137047C119857082 @default.
- W4380137047 hasConceptScore W4380137047C121332964 @default.
- W4380137047 hasConceptScore W4380137047C127413603 @default.
- W4380137047 hasConceptScore W4380137047C133731056 @default.
- W4380137047 hasConceptScore W4380137047C134306372 @default.
- W4380137047 hasConceptScore W4380137047C137293760 @default.
- W4380137047 hasConceptScore W4380137047C154945302 @default.
- W4380137047 hasConceptScore W4380137047C165801399 @default.
- W4380137047 hasConceptScore W4380137047C177284502 @default.
- W4380137047 hasConceptScore W4380137047C205606062 @default.
- W4380137047 hasConceptScore W4380137047C207685749 @default.
- W4380137047 hasConceptScore W4380137047C2776434776 @default.
- W4380137047 hasConceptScore W4380137047C33923547 @default.
- W4380137047 hasConceptScore W4380137047C36503486 @default.
- W4380137047 hasConceptScore W4380137047C41008148 @default.
- W4380137047 hasConceptScore W4380137047C48044578 @default.
- W4380137047 hasConceptScore W4380137047C62520636 @default.
- W4380137047 hasConceptScore W4380137047C66322947 @default.
- W4380137047 hasConceptScore W4380137047C77088390 @default.
- W4380137047 hasConceptScore W4380137047C9390403 @default.
- W4380137047 hasConceptScore W4380137047C95623464 @default.
- W4380137047 hasLocation W43801370471 @default.
- W4380137047 hasOpenAccess W4380137047 @default.
- W4380137047 hasPrimaryLocation W43801370471 @default.
- W4380137047 hasRelatedWork W1839843306 @default.
- W4380137047 hasRelatedWork W3049591882 @default.
- W4380137047 hasRelatedWork W3163341246 @default.
- W4380137047 hasRelatedWork W3215620590 @default.
- W4380137047 hasRelatedWork W4287689101 @default.
- W4380137047 hasRelatedWork W4287828684 @default.
- W4380137047 hasRelatedWork W4288065691 @default.
- W4380137047 hasRelatedWork W4307206004 @default.
- W4380137047 hasRelatedWork W4321593827 @default.
- W4380137047 hasRelatedWork W4375868933 @default.
- W4380137047 isParatext "false" @default.
- W4380137047 isRetracted "false" @default.
- W4380137047 workType "article" @default.