Matches in SemOpenAlex for { <https://semopenalex.org/work/W4285296417> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W4285296417 abstract "Self-attention mechanism has been shown to be an effective approach for capturing global context dependencies in sequence modeling, but it suffers from quadratic complexity in time and memory usage. Due to the sparsity of the attention matrix, much computation is redundant. Therefore, in this paper, we design an efficient Transformer architecture, named Fourier Sparse Attention for Transformer (FSAT), for fast long-range sequence modeling. We provide a brand-new perspective for constructing sparse attention matrix, i.e. making the sparse attention matrix predictable. Two core sub-modules are: (1) A fast Fourier transform based hidden state cross module, which captures and pools L^2 semantic combinations in 𝒪(L log L) time complexity. (2) A sparse attention matrix estimation module, which predicts dominant elements of an attention matrix based on the output of the previous hidden state cross module. By reparameterization and gradient truncation, FSAT successfully learned the index of dominant elements. The overall complexity about the sequence length is reduced from 𝒪(L^2) to 𝒪(L log L). Extensive experiments (natural language, vision, and math) show that FSAT remarkably outperforms the standard multi-head attention and its variants in various long-sequence tasks with low computational costs, and achieves new state-of-the-art results on the Long Range Arena benchmark." @default.
- W4285296417 created "2022-07-14" @default.
- W4285296417 creator A5001130005 @default.
- W4285296417 creator A5017947172 @default.
- W4285296417 creator A5082491573 @default.
- W4285296417 date "2022-01-01" @default.
- W4285296417 modified "2023-09-26" @default.
- W4285296417 title "Long-range Sequence Modeling with Predictable Sparse Attention" @default.
- W4285296417 doi "https://doi.org/10.18653/v1/2022.acl-long.19" @default.
- W4285296417 hasPublicationYear "2022" @default.
- W4285296417 type Work @default.
- W4285296417 citedByCount "0" @default.
- W4285296417 crossrefType "proceedings-article" @default.
- W4285296417 hasAuthorship W4285296417A5001130005 @default.
- W4285296417 hasAuthorship W4285296417A5017947172 @default.
- W4285296417 hasAuthorship W4285296417A5082491573 @default.
- W4285296417 hasBestOaLocation W42852964171 @default.
- W4285296417 hasConcept C106487976 @default.
- W4285296417 hasConcept C11413529 @default.
- W4285296417 hasConcept C121332964 @default.
- W4285296417 hasConcept C129844170 @default.
- W4285296417 hasConcept C13280743 @default.
- W4285296417 hasConcept C154945302 @default.
- W4285296417 hasConcept C159985019 @default.
- W4285296417 hasConcept C163716315 @default.
- W4285296417 hasConcept C165801399 @default.
- W4285296417 hasConcept C179799912 @default.
- W4285296417 hasConcept C185798385 @default.
- W4285296417 hasConcept C192562407 @default.
- W4285296417 hasConcept C205649164 @default.
- W4285296417 hasConcept C2524010 @default.
- W4285296417 hasConcept C2778112365 @default.
- W4285296417 hasConcept C33923547 @default.
- W4285296417 hasConcept C41008148 @default.
- W4285296417 hasConcept C54355233 @default.
- W4285296417 hasConcept C56372850 @default.
- W4285296417 hasConcept C62520636 @default.
- W4285296417 hasConcept C66322947 @default.
- W4285296417 hasConcept C80444323 @default.
- W4285296417 hasConcept C86803240 @default.
- W4285296417 hasConceptScore W4285296417C106487976 @default.
- W4285296417 hasConceptScore W4285296417C11413529 @default.
- W4285296417 hasConceptScore W4285296417C121332964 @default.
- W4285296417 hasConceptScore W4285296417C129844170 @default.
- W4285296417 hasConceptScore W4285296417C13280743 @default.
- W4285296417 hasConceptScore W4285296417C154945302 @default.
- W4285296417 hasConceptScore W4285296417C159985019 @default.
- W4285296417 hasConceptScore W4285296417C163716315 @default.
- W4285296417 hasConceptScore W4285296417C165801399 @default.
- W4285296417 hasConceptScore W4285296417C179799912 @default.
- W4285296417 hasConceptScore W4285296417C185798385 @default.
- W4285296417 hasConceptScore W4285296417C192562407 @default.
- W4285296417 hasConceptScore W4285296417C205649164 @default.
- W4285296417 hasConceptScore W4285296417C2524010 @default.
- W4285296417 hasConceptScore W4285296417C2778112365 @default.
- W4285296417 hasConceptScore W4285296417C33923547 @default.
- W4285296417 hasConceptScore W4285296417C41008148 @default.
- W4285296417 hasConceptScore W4285296417C54355233 @default.
- W4285296417 hasConceptScore W4285296417C56372850 @default.
- W4285296417 hasConceptScore W4285296417C62520636 @default.
- W4285296417 hasConceptScore W4285296417C66322947 @default.
- W4285296417 hasConceptScore W4285296417C80444323 @default.
- W4285296417 hasConceptScore W4285296417C86803240 @default.
- W4285296417 hasLocation W42852964171 @default.
- W4285296417 hasOpenAccess W4285296417 @default.
- W4285296417 hasPrimaryLocation W42852964171 @default.
- W4285296417 hasRelatedWork W1485630101 @default.
- W4285296417 hasRelatedWork W1982884119 @default.
- W4285296417 hasRelatedWork W2165170319 @default.
- W4285296417 hasRelatedWork W2233780076 @default.
- W4285296417 hasRelatedWork W2462574632 @default.
- W4285296417 hasRelatedWork W2781872308 @default.
- W4285296417 hasRelatedWork W3201815179 @default.
- W4285296417 hasRelatedWork W3208653488 @default.
- W4285296417 hasRelatedWork W4291144638 @default.
- W4285296417 hasRelatedWork W1967605906 @default.
- W4285296417 isParatext "false" @default.
- W4285296417 isRetracted "false" @default.
- W4285296417 workType "article" @default.