Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387010609> ?p ?o ?g. }
Showing items 1 to 76 of 76 with 100 items per page.
- W4387010609 endingPage "1" @default.
- W4387010609 startingPage "1" @default.
- W4387010609 abstract "Transformer models are continuously achieving state-of-the-art performance on a wide range of benchmarks. To meet demanding performance targets, the number of model parameters is continuously increased. As a result, state-of-the-art Transformers require substantial computational resources prohibiting their deployment on consumer-grade hardware. In literature, over-parameterized Transformers are successfully reduced in size with the help of pruning strategies. Existing works lack the ability to optimize the full architecture, without incurring significant overheads, in a fully-differentiable manner. Our work proposes a single-stage approach for training a Transformer for memory-efficient inference and various resource-constrained scenarios. Transformer blocks are extended with trainable gate parameters, which attribute importance and control information flow. Their integration into a differentiable pruning-aware training scheme allows the extraction of extremely sparse sub-networks at runtime, with minimal performance degradation. Evaluative pruning results, at the attention head and layer levels, illustrate the memory efficiency of our trained sub-networks under various memory budgets." @default.
- W4387010609 created "2023-09-26" @default.
- W4387010609 creator A5002909243 @default.
- W4387010609 creator A5006315507 @default.
- W4387010609 creator A5017470419 @default.
- W4387010609 creator A5063508488 @default.
- W4387010609 date "2023-01-01" @default.
- W4387010609 modified "2023-09-26" @default.
- W4387010609 title "Differentiable Slimming for Memory-Efficient Transformers" @default.
- W4387010609 doi "https://doi.org/10.1109/les.2023.3299638" @default.
- W4387010609 hasPublicationYear "2023" @default.
- W4387010609 type Work @default.
- W4387010609 citedByCount "0" @default.
- W4387010609 crossrefType "journal-article" @default.
- W4387010609 hasAuthorship W4387010609A5002909243 @default.
- W4387010609 hasAuthorship W4387010609A5006315507 @default.
- W4387010609 hasAuthorship W4387010609A5017470419 @default.
- W4387010609 hasAuthorship W4387010609A5063508488 @default.
- W4387010609 hasConcept C113775141 @default.
- W4387010609 hasConcept C11413529 @default.
- W4387010609 hasConcept C118524514 @default.
- W4387010609 hasConcept C120314980 @default.
- W4387010609 hasConcept C121332964 @default.
- W4387010609 hasConcept C123657996 @default.
- W4387010609 hasConcept C134306372 @default.
- W4387010609 hasConcept C142362112 @default.
- W4387010609 hasConcept C149635348 @default.
- W4387010609 hasConcept C153349607 @default.
- W4387010609 hasConcept C154945302 @default.
- W4387010609 hasConcept C165464430 @default.
- W4387010609 hasConcept C165801399 @default.
- W4387010609 hasConcept C202615002 @default.
- W4387010609 hasConcept C2776214188 @default.
- W4387010609 hasConcept C33923547 @default.
- W4387010609 hasConcept C41008148 @default.
- W4387010609 hasConcept C62520636 @default.
- W4387010609 hasConcept C66322947 @default.
- W4387010609 hasConcept C9390403 @default.
- W4387010609 hasConceptScore W4387010609C113775141 @default.
- W4387010609 hasConceptScore W4387010609C11413529 @default.
- W4387010609 hasConceptScore W4387010609C118524514 @default.
- W4387010609 hasConceptScore W4387010609C120314980 @default.
- W4387010609 hasConceptScore W4387010609C121332964 @default.
- W4387010609 hasConceptScore W4387010609C123657996 @default.
- W4387010609 hasConceptScore W4387010609C134306372 @default.
- W4387010609 hasConceptScore W4387010609C142362112 @default.
- W4387010609 hasConceptScore W4387010609C149635348 @default.
- W4387010609 hasConceptScore W4387010609C153349607 @default.
- W4387010609 hasConceptScore W4387010609C154945302 @default.
- W4387010609 hasConceptScore W4387010609C165464430 @default.
- W4387010609 hasConceptScore W4387010609C165801399 @default.
- W4387010609 hasConceptScore W4387010609C202615002 @default.
- W4387010609 hasConceptScore W4387010609C2776214188 @default.
- W4387010609 hasConceptScore W4387010609C33923547 @default.
- W4387010609 hasConceptScore W4387010609C41008148 @default.
- W4387010609 hasConceptScore W4387010609C62520636 @default.
- W4387010609 hasConceptScore W4387010609C66322947 @default.
- W4387010609 hasConceptScore W4387010609C9390403 @default.
- W4387010609 hasLocation W43870106091 @default.
- W4387010609 hasOpenAccess W4387010609 @default.
- W4387010609 hasPrimaryLocation W43870106091 @default.
- W4387010609 hasRelatedWork W1991171386 @default.
- W4387010609 hasRelatedWork W2357085366 @default.
- W4387010609 hasRelatedWork W2905887716 @default.
- W4387010609 hasRelatedWork W3130954105 @default.
- W4387010609 hasRelatedWork W3146091044 @default.
- W4387010609 hasRelatedWork W3203166921 @default.
- W4387010609 hasRelatedWork W4214588794 @default.
- W4387010609 hasRelatedWork W4286902452 @default.
- W4387010609 hasRelatedWork W4287241953 @default.
- W4387010609 hasRelatedWork W4361193654 @default.
- W4387010609 isParatext "false" @default.
- W4387010609 isRetracted "false" @default.
- W4387010609 workType "article" @default.
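
The listing above is the result of matching every predicate and object attached to one work IRI. As a rough sketch, an equivalent lookup could be phrased as a SPARQL SELECT query and sent with the standard `query` GET parameter. The endpoint URL `https://semopenalex.org/sparql` and the helper names `build_query` and `build_request_url` are assumptions made for illustration, not something the listing confirms. The sketch only builds the query and URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Work IRI taken from the pattern at the top of the listing.
WORK_IRI = "https://semopenalex.org/work/W4387010609"

# Assumption: a public SPARQL endpoint at this address (not stated in the listing).
ENDPOINT = "https://semopenalex.org/sparql"

# Characters that must not appear inside a SPARQL IRI reference (<...>).
_FORBIDDEN = set('<>" {}|\\^`')


def build_query(subject_iri: str, limit: int = 100) -> str:
    """Return a SELECT query for all predicate/object pairs of one subject.

    Hypothetical helper: it queries the default graph only, which mirrors
    the '@default' marker on each row of the listing.
    """
    if any(ch in _FORBIDDEN for ch in subject_iri):
        raise ValueError(f"not a safe IRI: {subject_iri!r}")
    return f"SELECT ?p ?o WHERE {{ <{subject_iri}> ?p ?o . }} LIMIT {limit}"


def build_request_url(subject_iri: str, endpoint: str = ENDPOINT, limit: int = 100) -> str:
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": build_query(subject_iri, limit)})


if __name__ == "__main__":
    print(build_query(WORK_IRI))
    print(build_request_url(WORK_IRI))
```

The limit of 100 mirrors the page size reported by the listing. Whether the assumed endpoint returns the same 76 rows depends on its dataset configuration.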