Matches in SemOpenAlex for { <https://semopenalex.org/work/W3213469297> ?p ?o ?g. }
Showing items 1 to 82 of 82, with 100 items per page.
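The listing below is the result of the basic graph pattern in the header. A minimal Python sketch that reproduces it is given here; it assumes the public SemOpenAlex SPARQL endpoint at https://semopenalex.org/sparql and the third-party SPARQLWrapper package, both of which should be verified. Further sketches that resolve the author and concept IDs and illustrate the abstract's long-range attention idea follow the listing.

```python
# Sketch: fetch every (?p, ?o) pair for the work, mirroring the header's
# { <work> ?p ?o ?g . } pattern (the graph variable ?g is dropped here).
# Assumes the public SemOpenAlex SPARQL endpoint and the SPARQLWrapper
# package (pip install sparqlwrapper); verify both before relying on this.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://semopenalex.org/sparql")
sparql.setQuery("""
    SELECT ?p ?o WHERE {
      <https://semopenalex.org/work/W3213469297> ?p ?o .
    }
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["p"]["value"], row["o"]["value"])
```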
- W3213469297 abstract "Transformers have achieved success in both language and vision domains. However, it is prohibitively expensive to scale them to long sequences such as long documents or high-resolution images, because self-attention mechanism has quadratic time and memory complexities with respect to the input sequence length. In this paper, we propose Long-Short Transformer (Transformer-LS), an efficient self-attention mechanism for modeling long sequences with linear complexity for both language and vision tasks. It aggregates a novel long-range attention with dynamic projection to model distant correlations and a short-term attention to capture fine-grained local correlations. We propose a dual normalization strategy to account for the scale mismatch between the two attention mechanisms. Transformer-LS can be applied to both autoregressive and bidirectional models without additional complexity. Our method outperforms the state-of-the-art models on multiple tasks in language and vision domains, including the Long Range Arena benchmark, autoregressive language modeling, and ImageNet classification. For instance, Transformer-LS achieves 0.97 test BPC on enwik8 using half the number of parameters than previous method, while being faster and is able to handle 3x as long sequences compared to its full-attention version on the same hardware. On ImageNet, it can obtain the state-of-the-art results (e.g., a moderate size of 55.8M model solely trained on 224x224 ImageNet-1K can obtain Top-1 accuracy 84.1%), while being more scalable on high-resolution images. The source code and models are released at this https URL ." @default.
- W3213469297 created "2021-11-22" @default.
- W3213469297 creator A5005843046 @default.
- W3213469297 creator A5014498545 @default.
- W3213469297 creator A5056801663 @default.
- W3213469297 creator A5060687985 @default.
- W3213469297 creator A5066242985 @default.
- W3213469297 creator A5072436307 @default.
- W3213469297 creator A5082943001 @default.
- W3213469297 date "2021-12-06" @default.
- W3213469297 modified "2023-09-25" @default.
- W3213469297 title "Long-Short Transformer: Efficient Transformers for Language and Vision" @default.
- W3213469297 hasPublicationYear "2021" @default.
- W3213469297 type Work @default.
- W3213469297 sameAs 3213469297 @default.
- W3213469297 citedByCount "0" @default.
- W3213469297 crossrefType "proceedings-article" @default.
- W3213469297 hasAuthorship W3213469297A5005843046 @default.
- W3213469297 hasAuthorship W3213469297A5014498545 @default.
- W3213469297 hasAuthorship W3213469297A5056801663 @default.
- W3213469297 hasAuthorship W3213469297A5060687985 @default.
- W3213469297 hasAuthorship W3213469297A5066242985 @default.
- W3213469297 hasAuthorship W3213469297A5072436307 @default.
- W3213469297 hasAuthorship W3213469297A5082943001 @default.
- W3213469297 hasConcept C113775141 @default.
- W3213469297 hasConcept C11413529 @default.
- W3213469297 hasConcept C119599485 @default.
- W3213469297 hasConcept C127413603 @default.
- W3213469297 hasConcept C137293760 @default.
- W3213469297 hasConcept C149782125 @default.
- W3213469297 hasConcept C153180895 @default.
- W3213469297 hasConcept C154945302 @default.
- W3213469297 hasConcept C159877910 @default.
- W3213469297 hasConcept C165801399 @default.
- W3213469297 hasConcept C33923547 @default.
- W3213469297 hasConcept C41008148 @default.
- W3213469297 hasConcept C48044578 @default.
- W3213469297 hasConcept C66322947 @default.
- W3213469297 hasConcept C77088390 @default.
- W3213469297 hasConceptScore W3213469297C113775141 @default.
- W3213469297 hasConceptScore W3213469297C11413529 @default.
- W3213469297 hasConceptScore W3213469297C119599485 @default.
- W3213469297 hasConceptScore W3213469297C127413603 @default.
- W3213469297 hasConceptScore W3213469297C137293760 @default.
- W3213469297 hasConceptScore W3213469297C149782125 @default.
- W3213469297 hasConceptScore W3213469297C153180895 @default.
- W3213469297 hasConceptScore W3213469297C154945302 @default.
- W3213469297 hasConceptScore W3213469297C159877910 @default.
- W3213469297 hasConceptScore W3213469297C165801399 @default.
- W3213469297 hasConceptScore W3213469297C33923547 @default.
- W3213469297 hasConceptScore W3213469297C41008148 @default.
- W3213469297 hasConceptScore W3213469297C48044578 @default.
- W3213469297 hasConceptScore W3213469297C66322947 @default.
- W3213469297 hasConceptScore W3213469297C77088390 @default.
- W3213469297 hasLocation W32134692971 @default.
- W3213469297 hasOpenAccess W3213469297 @default.
- W3213469297 hasPrimaryLocation W32134692971 @default.
- W3213469297 hasRelatedWork W2937843571 @default.
- W3213469297 hasRelatedWork W3015468748 @default.
- W3213469297 hasRelatedWork W3110662498 @default.
- W3213469297 hasRelatedWork W3128729103 @default.
- W3213469297 hasRelatedWork W3129603602 @default.
- W3213469297 hasRelatedWork W3131922516 @default.
- W3213469297 hasRelatedWork W3135743858 @default.
- W3213469297 hasRelatedWork W3152720259 @default.
- W3213469297 hasRelatedWork W3166658420 @default.
- W3213469297 hasRelatedWork W3168294587 @default.
- W3213469297 hasRelatedWork W3169201434 @default.
- W3213469297 hasRelatedWork W3169938586 @default.
- W3213469297 hasRelatedWork W3170642968 @default.
- W3213469297 hasRelatedWork W3175466730 @default.
- W3213469297 hasRelatedWork W3181262653 @default.
- W3213469297 hasRelatedWork W3186032668 @default.
- W3213469297 hasRelatedWork W3199613405 @default.
- W3213469297 hasRelatedWork W3202053489 @default.
- W3213469297 hasRelatedWork W3210458440 @default.
- W3213469297 hasRelatedWork W3134454711 @default.
- W3213469297 hasVolume "34" @default.
- W3213469297 isParatext "false" @default.
- W3213469297 isRetracted "false" @default.
- W3213469297 magId "3213469297" @default.
- W3213469297 workType "article" @default.
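The abstract above replaces quadratic full attention with a long-range attention whose keys and values are compressed from sequence length n down to a small rank r by a projection computed dynamically from the input, so the cost grows as O(n·r) rather than O(n²). The following is a minimal NumPy sketch of that dynamic-projection idea only, under stated assumptions: random untrained weights, a single head, and none of the paper's short-term window attention or dual normalization. It illustrates the complexity argument, not the released Transformer-LS.

```python
# Minimal single-head sketch of low-rank ("dynamic projection") attention,
# in the spirit of the long-range branch described in the abstract.
# Assumptions: random untrained weights, no short-term window attention,
# no dual normalization; illustrative only, not the released model.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_projection_attention(q, k, v, w_p):
    """q, k, v: (n, d); w_p: (d, r). Cost is O(n*r*d), linear in n."""
    # Projection weights are computed from the keys themselves ("dynamic"),
    # normalized over the sequence dimension so each column mixes positions.
    p = softmax(k @ w_p, axis=0)                  # (n, r)
    k_bar = p.T @ k                               # (r, d) compressed keys
    v_bar = p.T @ v                               # (r, d) compressed values
    scores = q @ k_bar.T / np.sqrt(k.shape[1])    # (n, r), never (n, n)
    return softmax(scores, axis=-1) @ v_bar       # (n, d)

rng = np.random.default_rng(0)
n, d, r = 1024, 64, 16                            # length, head dim, rank
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = dynamic_projection_attention(q, k, v, rng.standard_normal((d, r)))
print(out.shape)                                  # (1024, 64)
```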
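The creator, hasAuthorship, and hasConcept triples in the listing point to author and concept entities by ID only; a follow-up query can resolve them to labels. The sketch below assumes dct:creator and foaf:name for authors, and soa:hasConcept with skos:prefLabel for concepts; these match common usage in the SemOpenAlex schema but should be verified against the ontology.

```python
# Sketch: resolve the author and concept IRIs from the listing to labels.
# Assumption: authors carry foaf:name and concepts carry skos:prefLabel in
# SemOpenAlex; check the ontology if this returns no rows.
from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX dct:  <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX soa:  <https://semopenalex.org/ontology/>

SELECT ?entity ?label WHERE {
  { <https://semopenalex.org/work/W3213469297> dct:creator ?entity .
    ?entity foaf:name ?label . }
  UNION
  { <https://semopenalex.org/work/W3213469297> soa:hasConcept ?entity .
    ?entity skos:prefLabel ?label . }
}
"""

sparql = SPARQLWrapper("https://semopenalex.org/sparql")
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["entity"]["value"], "->", row["label"]["value"])
```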