Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385570824> ?p ?o ?g. }
Showing items 1 to 51 of 51, with 100 items per page.
- W4385570824 abstract "We present the first unified study of the efficiency of self-attention-based Transformer variants spanning text, speech and vision. We identify input length thresholds (tipping points) at which efficient Transformer variants become more efficient than vanilla models, using a variety of efficiency metrics (latency, throughput, and memory). To conduct this analysis for speech, we introduce L-HuBERT, a novel local-attention variant of a self-supervised speech model. We observe that these thresholds are (a) much higher than typical dataset sequence lengths and (b) dependent on the metric and modality, showing that choosing the right model depends on modality, task type (long-form vs. typical context) and resource constraints (time vs. memory). By visualising the breakdown of the computational costs for transformer components, we also show that non-self-attention components exhibit significant computational costs. We release our profiling toolkit at https://github.com/ajd12342/profiling-transformers ." @default.
- W4385570824 created "2023-08-05" @default.
- W4385570824 creator A5004717608 @default.
- W4385570824 creator A5016426491 @default.
- W4385570824 creator A5035142405 @default.
- W4385570824 date "2023-01-01" @default.
- W4385570824 modified "2023-09-24" @default.
- W4385570824 title "When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants" @default.
- W4385570824 doi "https://doi.org/10.18653/v1/2023.acl-short.141" @default.
- W4385570824 hasPublicationYear "2023" @default.
- W4385570824 type Work @default.
- W4385570824 citedByCount "0" @default.
- W4385570824 crossrefType "proceedings-article" @default.
- W4385570824 hasAuthorship W4385570824A5004717608 @default.
- W4385570824 hasAuthorship W4385570824A5016426491 @default.
- W4385570824 hasAuthorship W4385570824A5035142405 @default.
- W4385570824 hasBestOaLocation W43855708241 @default.
- W4385570824 hasConcept C111919701 @default.
- W4385570824 hasConcept C121332964 @default.
- W4385570824 hasConcept C154945302 @default.
- W4385570824 hasConcept C165801399 @default.
- W4385570824 hasConcept C187191949 @default.
- W4385570824 hasConcept C28490314 @default.
- W4385570824 hasConcept C41008148 @default.
- W4385570824 hasConcept C62520636 @default.
- W4385570824 hasConcept C66322947 @default.
- W4385570824 hasConceptScore W4385570824C111919701 @default.
- W4385570824 hasConceptScore W4385570824C121332964 @default.
- W4385570824 hasConceptScore W4385570824C154945302 @default.
- W4385570824 hasConceptScore W4385570824C165801399 @default.
- W4385570824 hasConceptScore W4385570824C187191949 @default.
- W4385570824 hasConceptScore W4385570824C28490314 @default.
- W4385570824 hasConceptScore W4385570824C41008148 @default.
- W4385570824 hasConceptScore W4385570824C62520636 @default.
- W4385570824 hasConceptScore W4385570824C66322947 @default.
- W4385570824 hasLocation W43855708241 @default.
- W4385570824 hasOpenAccess W4385570824 @default.
- W4385570824 hasPrimaryLocation W43855708241 @default.
- W4385570824 hasRelatedWork W1496697599 @default.
- W4385570824 hasRelatedWork W1602801198 @default.
- W4385570824 hasRelatedWork W2016929877 @default.
- W4385570824 hasRelatedWork W2348361596 @default.
- W4385570824 hasRelatedWork W2352497206 @default.
- W4385570824 hasRelatedWork W2503961669 @default.
- W4385570824 hasRelatedWork W2778699561 @default.
- W4385570824 hasRelatedWork W3003264772 @default.
- W4385570824 hasRelatedWork W842142381 @default.
- W4385570824 hasRelatedWork W2521117258 @default.
- W4385570824 isParatext "false" @default.
- W4385570824 isRetracted "false" @default.
- W4385570824 workType "article" @default.
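The listing above is the result of the triple pattern `{ <https://semopenalex.org/work/W4385570824> ?p ?o ?g. }`. As a minimal sketch (assuming the public SemOpenAlex SPARQL endpoint at https://semopenalex.org/sparql and standard SPARQL JSON results), the same predicate/object pairs can be retrieved programmatically:

```python
# Sketch: fetch all triples with W4385570824 as the subject from SemOpenAlex.
# The endpoint URL and response handling are assumptions, not taken from this page.
import requests

ENDPOINT = "https://semopenalex.org/sparql"  # assumed public SPARQL endpoint
QUERY = """
SELECT ?p ?o WHERE {
  <https://semopenalex.org/work/W4385570824> ?p ?o .
}
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
response.raise_for_status()

# Print each predicate/object pair, mirroring the listing above.
for binding in response.json()["results"]["bindings"]:
    print(binding["p"]["value"], binding["o"]["value"])
```

The same query can also be pasted into any SPARQL client pointed at the endpoint; the graph variable `?g` in the original pattern is omitted here since the listing shows only the default graph.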