Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226096528> ?p ?o ?g. }
Showing items 1 to 77 of 77, with 100 items per page.
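The result set below can also be retrieved programmatically. The following is a minimal Python sketch, assuming SemOpenAlex exposes a public SPARQL endpoint at https://semopenalex.org/sparql that returns standard SPARQL JSON results; the endpoint URL and response format are assumptions, not facts stated in this listing.

```python
# Minimal sketch: fetch all predicate/object pairs (and their named graph)
# for work W4226096528, mirroring the query pattern shown in the header.
# The endpoint URL is an assumption about the SemOpenAlex deployment.
import requests

ENDPOINT = "https://semopenalex.org/sparql"  # assumed public endpoint

QUERY = """
SELECT ?p ?o ?g WHERE {
  GRAPH ?g {
    <https://semopenalex.org/work/W4226096528> ?p ?o .
  }
}
"""

def fetch_triples() -> list[dict]:
    """Run the query and return the SPARQL JSON result bindings."""
    resp = requests.get(
        ENDPOINT,
        params={"query": QUERY},
        headers={"Accept": "application/sparql-results+json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["results"]["bindings"]

if __name__ == "__main__":
    for row in fetch_triples():
        print(row["p"]["value"], row["o"]["value"])
```

The `GRAPH ?g { ... }` block is the standard SPARQL form of the quad pattern `?p ?o ?g` shown in the header: it binds the named graph each triple comes from.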
- W4226096528 abstract "The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints like peak memory utilization and latency is non-trivial. This is exacerbated by the proliferation of hardware platforms. We leverage the somewhat surprising empirical observation that the number of decoder parameters in autoregressive Transformers has a high rank correlation with task performance, irrespective of the architecture topology. This observation organically induces a simple Neural Architecture Search (NAS) algorithm that uses decoder parameters as a proxy for perplexity without the need for any model training. The search phase of our training-free algorithm, dubbed Lightweight Transformer Search (LTS), can be run directly on target devices since it does not require GPUs. Using on-target-device measurements, LTS extracts the Pareto-frontier of perplexity versus any hardware performance cost. We evaluate LTS on diverse devices, from ARM CPUs to NVIDIA GPUs, and two popular autoregressive Transformer backbones: GPT-2 and Transformer-XL. Results show that the perplexity of the 16-layer GPT-2 and Transformer-XL can be achieved with up to 1.5x and 2.5x faster runtime, respectively, and 1.2x and 2.0x lower peak memory utilization. When evaluated in zero- and one-shot settings, LTS Pareto-frontier models achieve higher average accuracy than the 350M-parameter OPT across 14 tasks, with up to 1.6x lower latency. LTS extracts the Pareto-frontier in under 3 hours while running on a commodity laptop. We effectively remove the carbon footprint of hundreds of GPU hours of training during search, offering a strong, simple baseline for future NAS methods in autoregressive language modeling." @default.
- W4226096528 created "2022-05-05" @default.
- W4226096528 creator A5019931011 @default.
- W4226096528 creator A5020666821 @default.
- W4226096528 creator A5021842384 @default.
- W4226096528 creator A5029310132 @default.
- W4226096528 creator A5033994052 @default.
- W4226096528 creator A5040161515 @default.
- W4226096528 creator A5052706169 @default.
- W4226096528 creator A5082605114 @default.
- W4226096528 creator A5086370446 @default.
- W4226096528 date "2022-03-03" @default.
- W4226096528 modified "2023-10-16" @default.
- W4226096528 title "LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models" @default.
- W4226096528 doi "https://doi.org/10.48550/arxiv.2203.02094" @default.
- W4226096528 hasPublicationYear "2022" @default.
- W4226096528 type Work @default.
- W4226096528 citedByCount "0" @default.
- W4226096528 crossrefType "posted-content" @default.
- W4226096528 hasAuthorship W4226096528A5019931011 @default.
- W4226096528 hasAuthorship W4226096528A5020666821 @default.
- W4226096528 hasAuthorship W4226096528A5021842384 @default.
- W4226096528 hasAuthorship W4226096528A5029310132 @default.
- W4226096528 hasAuthorship W4226096528A5033994052 @default.
- W4226096528 hasAuthorship W4226096528A5040161515 @default.
- W4226096528 hasAuthorship W4226096528A5052706169 @default.
- W4226096528 hasAuthorship W4226096528A5082605114 @default.
- W4226096528 hasAuthorship W4226096528A5086370446 @default.
- W4226096528 hasBestOaLocation W42260965281 @default.
- W4226096528 hasConcept C100279451 @default.
- W4226096528 hasConcept C113775141 @default.
- W4226096528 hasConcept C119599485 @default.
- W4226096528 hasConcept C126255220 @default.
- W4226096528 hasConcept C127413603 @default.
- W4226096528 hasConcept C137293760 @default.
- W4226096528 hasConcept C137635306 @default.
- W4226096528 hasConcept C149782125 @default.
- W4226096528 hasConcept C154945302 @default.
- W4226096528 hasConcept C159877910 @default.
- W4226096528 hasConcept C165801399 @default.
- W4226096528 hasConcept C173608175 @default.
- W4226096528 hasConcept C33923547 @default.
- W4226096528 hasConcept C41008148 @default.
- W4226096528 hasConcept C66322947 @default.
- W4226096528 hasConcept C9390403 @default.
- W4226096528 hasConceptScore W4226096528C100279451 @default.
- W4226096528 hasConceptScore W4226096528C113775141 @default.
- W4226096528 hasConceptScore W4226096528C119599485 @default.
- W4226096528 hasConceptScore W4226096528C126255220 @default.
- W4226096528 hasConceptScore W4226096528C127413603 @default.
- W4226096528 hasConceptScore W4226096528C137293760 @default.
- W4226096528 hasConceptScore W4226096528C137635306 @default.
- W4226096528 hasConceptScore W4226096528C149782125 @default.
- W4226096528 hasConceptScore W4226096528C154945302 @default.
- W4226096528 hasConceptScore W4226096528C159877910 @default.
- W4226096528 hasConceptScore W4226096528C165801399 @default.
- W4226096528 hasConceptScore W4226096528C173608175 @default.
- W4226096528 hasConceptScore W4226096528C33923547 @default.
- W4226096528 hasConceptScore W4226096528C41008148 @default.
- W4226096528 hasConceptScore W4226096528C66322947 @default.
- W4226096528 hasConceptScore W4226096528C9390403 @default.
- W4226096528 hasLocation W42260965281 @default.
- W4226096528 hasOpenAccess W4226096528 @default.
- W4226096528 hasPrimaryLocation W42260965281 @default.
- W4226096528 hasRelatedWork W1989705153 @default.
- W4226096528 hasRelatedWork W2107734859 @default.
- W4226096528 hasRelatedWork W2496228846 @default.
- W4226096528 hasRelatedWork W2936497627 @default.
- W4226096528 hasRelatedWork W3013624417 @default.
- W4226096528 hasRelatedWork W3016888273 @default.
- W4226096528 hasRelatedWork W3049463507 @default.
- W4226096528 hasRelatedWork W4226261765 @default.
- W4226096528 hasRelatedWork W4226285793 @default.
- W4226096528 hasRelatedWork W4287826556 @default.
- W4226096528 isParatext "false" @default.
- W4226096528 isRetracted "false" @default.
- W4226096528 workType "article" @default.
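The abstract stored on this work describes the core of LTS: decoder parameter count serves as a training-free proxy for perplexity, on-device measurements supply the hardware cost, and the search reduces to extracting a Pareto frontier over these two cheap-to-compute quantities. The sketch below only illustrates that idea as reconstructed from the abstract; the candidate fields, the parameter-count formula, and the latency stub are hypothetical stand-ins, not taken from the LTS implementation.

```python
# Illustrative sketch (assumptions throughout): rank candidate decoder
# architectures by parameter count as a proxy for perplexity, pair that
# with a (stubbed) on-device latency measurement, and keep the Pareto-
# optimal candidates. None of these names or formulas come from LTS itself.
from dataclasses import dataclass

@dataclass
class Candidate:
    n_layers: int
    d_model: int
    d_inner: int
    n_heads: int  # unused in the simplified count below

def count_decoder_params(c: Candidate) -> int:
    """Rough decoder parameter count: attention + feed-forward blocks per layer.

    Simplified (no embeddings, biases, or layer norms); the point is only
    that the proxy is computable without any training.
    """
    attn = 4 * c.d_model * c.d_model   # Q, K, V and output projections
    ffn = 2 * c.d_model * c.d_inner    # two feed-forward projections
    return c.n_layers * (attn + ffn)

def measure_latency_ms(c: Candidate) -> float:
    """Hypothetical stand-in for an on-target-device latency measurement."""
    return 0.01 * count_decoder_params(c) / 1e6

def pareto_frontier(candidates: list[Candidate]) -> list[Candidate]:
    """Keep candidates not dominated in (higher proxy score, lower latency)."""
    scored = [(count_decoder_params(c), measure_latency_ms(c), c) for c in candidates]
    frontier = []
    for params, lat, c in scored:
        dominated = any(
            p2 >= params and l2 <= lat and (p2 > params or l2 < lat)
            for p2, l2, _ in scored
        )
        if not dominated:
            frontier.append(c)
    return frontier
```

Because both the proxy and the cost measurement avoid training, a loop like this can run on the target device itself, which is the property the abstract attributes to the LTS search phase.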