Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385571237> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4385571237 abstract "Prior work shows that it is possible to expand pretrained Masked Language Models (MLMs) to new languages by learning a new set of embeddings, while keeping the transformer body frozen. Despite learning a small subset of parameters, this approach is not compute-efficient, as training the new embeddings requires a full forward and backward pass over the entire model. We propose mini-model adaptation, a compute-efficient alternative that builds a shallow mini-model from a fraction of a large model’s parameters. New language-specific embeddings can then be efficiently trained over the mini-model and plugged into the aligned large model for rapid cross-lingual transfer. We explore two approaches to learn mini-models: MINIJOINT, which jointly pretrains the primary model and the mini-model using a single transformer with a secondary MLM head at a middle layer; and MINIPOST, where we start from a regular pretrained model, build a mini-model by extracting and freezing a few layers, and learn a small number of parameters on top. Experiments on XNLI, MLQA and PAWS-X show that mini-model adaptation matches the performance of the standard approach using up to 2.3x less compute on average." @default.
- W4385571237 created "2023-08-05" @default.
- W4385571237 creator A5000831279 @default.
- W4385571237 creator A5007937560 @default.
- W4385571237 creator A5023341622 @default.
- W4385571237 creator A5030659562 @default.
- W4385571237 date "2023-01-01" @default.
- W4385571237 modified "2023-09-24" @default.
- W4385571237 title "Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training" @default.
- W4385571237 doi "https://doi.org/10.18653/v1/2023.findings-acl.338" @default.
- W4385571237 hasPublicationYear "2023" @default.
- W4385571237 type Work @default.
- W4385571237 citedByCount "0" @default.
- W4385571237 crossrefType "proceedings-article" @default.
- W4385571237 hasAuthorship W4385571237A5000831279 @default.
- W4385571237 hasAuthorship W4385571237A5007937560 @default.
- W4385571237 hasAuthorship W4385571237A5023341622 @default.
- W4385571237 hasAuthorship W4385571237A5030659562 @default.
- W4385571237 hasBestOaLocation W43855712371 @default.
- W4385571237 hasConcept C119857082 @default.
- W4385571237 hasConcept C120665830 @default.
- W4385571237 hasConcept C121332964 @default.
- W4385571237 hasConcept C137293760 @default.
- W4385571237 hasConcept C139807058 @default.
- W4385571237 hasConcept C150899416 @default.
- W4385571237 hasConcept C154945302 @default.
- W4385571237 hasConcept C165801399 @default.
- W4385571237 hasConcept C177264268 @default.
- W4385571237 hasConcept C199360897 @default.
- W4385571237 hasConcept C2776434776 @default.
- W4385571237 hasConcept C41008148 @default.
- W4385571237 hasConcept C51632099 @default.
- W4385571237 hasConcept C62520636 @default.
- W4385571237 hasConcept C66322947 @default.
- W4385571237 hasConcept C95623464 @default.
- W4385571237 hasConceptScore W4385571237C119857082 @default.
- W4385571237 hasConceptScore W4385571237C120665830 @default.
- W4385571237 hasConceptScore W4385571237C121332964 @default.
- W4385571237 hasConceptScore W4385571237C137293760 @default.
- W4385571237 hasConceptScore W4385571237C139807058 @default.
- W4385571237 hasConceptScore W4385571237C150899416 @default.
- W4385571237 hasConceptScore W4385571237C154945302 @default.
- W4385571237 hasConceptScore W4385571237C165801399 @default.
- W4385571237 hasConceptScore W4385571237C177264268 @default.
- W4385571237 hasConceptScore W4385571237C199360897 @default.
- W4385571237 hasConceptScore W4385571237C2776434776 @default.
- W4385571237 hasConceptScore W4385571237C41008148 @default.
- W4385571237 hasConceptScore W4385571237C51632099 @default.
- W4385571237 hasConceptScore W4385571237C62520636 @default.
- W4385571237 hasConceptScore W4385571237C66322947 @default.
- W4385571237 hasConceptScore W4385571237C95623464 @default.
- W4385571237 hasLocation W43855712371 @default.
- W4385571237 hasOpenAccess W4385571237 @default.
- W4385571237 hasPrimaryLocation W43855712371 @default.
- W4385571237 hasRelatedWork W2280198878 @default.
- W4385571237 hasRelatedWork W2946016983 @default.
- W4385571237 hasRelatedWork W2960456850 @default.
- W4385571237 hasRelatedWork W3021430260 @default.
- W4385571237 hasRelatedWork W4281645081 @default.
- W4385571237 hasRelatedWork W4308262314 @default.
- W4385571237 hasRelatedWork W4312091809 @default.
- W4385571237 hasRelatedWork W4312200629 @default.
- W4385571237 hasRelatedWork W4318957922 @default.
- W4385571237 hasRelatedWork W4382286161 @default.
- W4385571237 isParatext "false" @default.
- W4385571237 isRetracted "false" @default.
- W4385571237 workType "article" @default.
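
The pattern in the header, `<https://semopenalex.org/work/W4385571237> ?p ?o ?g`, asks for every predicate and object attached to this work. As a minimal sketch, the Python below builds a comparable SPARQL `SELECT` query as a string. The function name `build_work_query` is hypothetical, and the sketch assumes (from the `@default` marker on each row) that the statements live in the default graph, so the `?g` position is dropped. It performs no network request; which endpoint to send the query to is left to the reader.

```python
def build_work_query(work_iri: str, limit: int = 100) -> str:
    """Return a SPARQL SELECT listing predicate/object pairs for one work.

    Hypothetical helper for illustration; not part of any SemOpenAlex tooling.
    """
    # Reject characters that would break out of the <...> IRI delimiters.
    if any(ch in work_iri for ch in "<> \n\t"):
        raise ValueError("work_iri must be a bare IRI without brackets or spaces")
    return (
        "SELECT ?p ?o\n"
        f"WHERE {{ <{work_iri}> ?p ?o . }}\n"
        f"LIMIT {limit}"
    )


# Build the query for the work shown in this listing.
query = build_work_query("https://semopenalex.org/work/W4385571237")
print(query)
```

The default `limit` of 100 mirrors the page size reported in the listing; it is a convenience choice, not a requirement of the query.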