Matches in SemOpenAlex for { <https://semopenalex.org/work/W3208397797> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W3208397797 abstract "Pre-training and then fine-tuning large language models is commonly used to achieve state-of-the-art performance in natural language processing (NLP) tasks. However, most pre-trained models suffer from low inference speed. Deploying such large models to applications with latency constraints is challenging. In this work, we focus on accelerating the inference via conditional computations. To achieve this, we propose a novel idea, Magic Pyramid (MP), to reduce both width-wise and depth-wise computation via token pruning and early exiting for Transformer-based models, particularly BERT. The former saves computation by removing non-salient tokens, while the latter reduces computation by terminating the inference early, before reaching the final layer, if the exiting condition is met. Our empirical studies demonstrate that, compared to previous state-of-the-art methods, MP is not only able to achieve a speed-adjustable inference but also to surpass token pruning and early exiting by reducing giga floating point operations (GFLOPs) by up to 70% with less than 0.5% accuracy drop. Token pruning and early exiting show distinct preferences for sequences of different lengths. However, MP is capable of achieving an average of 8.06x speedup on two popular text classification tasks, regardless of the sizes of the inputs." @default.
- W3208397797 created "2021-11-08" @default.
- W3208397797 creator A5013670321 @default.
- W3208397797 creator A5035425640 @default.
- W3208397797 creator A5050384270 @default.
- W3208397797 creator A5060320196 @default.
- W3208397797 creator A5068895096 @default.
- W3208397797 creator A5082750097 @default.
- W3208397797 creator A5086488087 @default.
- W3208397797 date "2021-10-30" @default.
- W3208397797 modified "2023-09-25" @default.
- W3208397797 title "Magic Pyramid: Accelerating Inference with Early Exiting and Token Pruning" @default.
- W3208397797 cites W1821462560 @default.
- W3208397797 cites W2951244744 @default.
- W3208397797 cites W2962677625 @default.
- W3208397797 cites W2963012544 @default.
- W3208397797 cites W2963310665 @default.
- W3208397797 cites W2963341956 @default.
- W3208397797 cites W2963403868 @default.
- W3208397797 cites W2965373594 @default.
- W3208397797 cites W2970454332 @default.
- W3208397797 cites W2970565456 @default.
- W3208397797 cites W2978017171 @default.
- W3208397797 cites W2996159613 @default.
- W3208397797 cites W2998183051 @default.
- W3208397797 cites W3003530529 @default.
- W3208397797 cites W3034292689 @default.
- W3208397797 cites W3034457371 @default.
- W3208397797 cites W3034742519 @default.
- W3208397797 cites W3101731278 @default.
- W3208397797 cites W3105966348 @default.
- W3208397797 cites W3159727696 @default.
- W3208397797 cites W3170369042 @default.
- W3208397797 cites W3180037928 @default.
- W3208397797 doi "https://doi.org/10.48550/arxiv.2111.00230" @default.
- W3208397797 hasPublicationYear "2021" @default.
- W3208397797 type Work @default.
- W3208397797 sameAs 3208397797 @default.
- W3208397797 citedByCount "0" @default.
- W3208397797 crossrefType "posted-content" @default.
- W3208397797 hasAuthorship W3208397797A5013670321 @default.
- W3208397797 hasAuthorship W3208397797A5035425640 @default.
- W3208397797 hasAuthorship W3208397797A5050384270 @default.
- W3208397797 hasAuthorship W3208397797A5060320196 @default.
- W3208397797 hasAuthorship W3208397797A5068895096 @default.
- W3208397797 hasAuthorship W3208397797A5082750097 @default.
- W3208397797 hasAuthorship W3208397797A5086488087 @default.
- W3208397797 hasBestOaLocation W32083977971 @default.
- W3208397797 hasConcept C108010975 @default.
- W3208397797 hasConcept C11413529 @default.
- W3208397797 hasConcept C137293760 @default.
- W3208397797 hasConcept C154945302 @default.
- W3208397797 hasConcept C173608175 @default.
- W3208397797 hasConcept C2776214188 @default.
- W3208397797 hasConcept C3826847 @default.
- W3208397797 hasConcept C38652104 @default.
- W3208397797 hasConcept C41008148 @default.
- W3208397797 hasConcept C45374587 @default.
- W3208397797 hasConcept C48145219 @default.
- W3208397797 hasConcept C6557445 @default.
- W3208397797 hasConcept C68339613 @default.
- W3208397797 hasConcept C86803240 @default.
- W3208397797 hasConceptScore W3208397797C108010975 @default.
- W3208397797 hasConceptScore W3208397797C11413529 @default.
- W3208397797 hasConceptScore W3208397797C137293760 @default.
- W3208397797 hasConceptScore W3208397797C154945302 @default.
- W3208397797 hasConceptScore W3208397797C173608175 @default.
- W3208397797 hasConceptScore W3208397797C2776214188 @default.
- W3208397797 hasConceptScore W3208397797C3826847 @default.
- W3208397797 hasConceptScore W3208397797C38652104 @default.
- W3208397797 hasConceptScore W3208397797C41008148 @default.
- W3208397797 hasConceptScore W3208397797C45374587 @default.
- W3208397797 hasConceptScore W3208397797C48145219 @default.
- W3208397797 hasConceptScore W3208397797C6557445 @default.
- W3208397797 hasConceptScore W3208397797C68339613 @default.
- W3208397797 hasConceptScore W3208397797C86803240 @default.
- W3208397797 hasLocation W32083977971 @default.
- W3208397797 hasOpenAccess W3208397797 @default.
- W3208397797 hasPrimaryLocation W32083977971 @default.
- W3208397797 hasRelatedWork W156843270 @default.
- W3208397797 hasRelatedWork W2007449167 @default.
- W3208397797 hasRelatedWork W2042026112 @default.
- W3208397797 hasRelatedWork W2470589840 @default.
- W3208397797 hasRelatedWork W2777249922 @default.
- W3208397797 hasRelatedWork W2914925532 @default.
- W3208397797 hasRelatedWork W3208397797 @default.
- W3208397797 hasRelatedWork W4294734199 @default.
- W3208397797 hasRelatedWork W4313680512 @default.
- W3208397797 hasRelatedWork W4321177657 @default.
- W3208397797 isParatext "false" @default.
- W3208397797 isRetracted "false" @default.
- W3208397797 magId "3208397797" @default.
- W3208397797 workType "article" @default.