Matches in SemOpenAlex for { <https://semopenalex.org/work/W3041753721> ?p ?o ?g. }
- W3041753721 abstract "Transformers have been proven a successful model for a variety of tasks in sequence modeling. However, computing the attention matrix, which is their key component, has quadratic complexity with respect to the sequence length, thus making them prohibitively expensive for large sequences. To address this, we propose clustered attention, which instead of computing the attention for every query, groups queries into clusters and computes attention just for the centroids. To further improve this approximation, we use the computed clusters to identify the keys with the highest attention per query and compute the exact key/query dot products. This results in a model with linear complexity with respect to the sequence length for a fixed number of clusters. We evaluate our approach on two automatic speech recognition datasets and show that our model consistently outperforms vanilla transformers for a given computational budget. Finally, we demonstrate that our model can approximate arbitrarily complex attention distributions with a minimal number of clusters by approximating a pretrained BERT model on GLUE and SQuAD benchmarks with only 25 clusters and no loss in performance." @default.
- W3041753721 created "2020-07-16" @default.
- W3041753721 creator A5023301142 @default.
- W3041753721 creator A5031829458 @default.
- W3041753721 creator A5076094010 @default.
- W3041753721 date "2020-07-09" @default.
- W3041753721 modified "2023-10-04" @default.
- W3041753721 title "Fast Transformers with Clustered Attention" @default.
- W3041753721 cites W1524333225 @default.
- W3041753721 cites W1577877742 @default.
- W3041753721 cites W1779452081 @default.
- W3041753721 cites W2024490156 @default.
- W3041753721 cites W2127141656 @default.
- W3041753721 cites W2166637769 @default.
- W3041753721 cites W2327501763 @default.
- W3041753721 cites W2514741789 @default.
- W3041753721 cites W2606101940 @default.
- W3041753721 cites W2724346673 @default.
- W3041753721 cites W2773781902 @default.
- W3041753721 cites W2892009249 @default.
- W3041753721 cites W2911109671 @default.
- W3041753721 cites W2940744433 @default.
- W3041753721 cites W2946567085 @default.
- W3041753721 cites W2949382160 @default.
- W3041753721 cites W2953273646 @default.
- W3041753721 cites W2963042606 @default.
- W3041753721 cites W2963056065 @default.
- W3041753721 cites W2963310665 @default.
- W3041753721 cites W2963403868 @default.
- W3041753721 cites W2964308564 @default.
- W3041753721 cites W2965373594 @default.
- W3041753721 cites W2970971581 @default.
- W3041753721 cites W2994673210 @default.
- W3041753721 cites W2994689640 @default.
- W3041753721 cites W2995428172 @default.
- W3041753721 cites W3131922516 @default.
- W3041753721 hasPublicationYear "2020" @default.
- W3041753721 type Work @default.
- W3041753721 sameAs 3041753721 @default.
- W3041753721 citedByCount "3" @default.
- W3041753721 countsByYear W30417537212020 @default.
- W3041753721 countsByYear W30417537212021 @default.
- W3041753721 crossrefType "posted-content" @default.
- W3041753721 hasAuthorship W3041753721A5023301142 @default.
- W3041753721 hasAuthorship W3041753721A5031829458 @default.
- W3041753721 hasAuthorship W3041753721A5076094010 @default.
- W3041753721 hasConcept C11413529 @default.
- W3041753721 hasConcept C121332964 @default.
- W3041753721 hasConcept C129844170 @default.
- W3041753721 hasConcept C146599234 @default.
- W3041753721 hasConcept C154945302 @default.
- W3041753721 hasConcept C165801399 @default.
- W3041753721 hasConcept C179799912 @default.
- W3041753721 hasConcept C2524010 @default.
- W3041753721 hasConcept C26517878 @default.
- W3041753721 hasConcept C2778112365 @default.
- W3041753721 hasConcept C33923547 @default.
- W3041753721 hasConcept C38652104 @default.
- W3041753721 hasConcept C41008148 @default.
- W3041753721 hasConcept C54355233 @default.
- W3041753721 hasConcept C62520636 @default.
- W3041753721 hasConcept C66322947 @default.
- W3041753721 hasConcept C80444323 @default.
- W3041753721 hasConcept C86803240 @default.
- W3041753721 hasConceptScore W3041753721C11413529 @default.
- W3041753721 hasConceptScore W3041753721C121332964 @default.
- W3041753721 hasConceptScore W3041753721C129844170 @default.
- W3041753721 hasConceptScore W3041753721C146599234 @default.
- W3041753721 hasConceptScore W3041753721C154945302 @default.
- W3041753721 hasConceptScore W3041753721C165801399 @default.
- W3041753721 hasConceptScore W3041753721C179799912 @default.
- W3041753721 hasConceptScore W3041753721C2524010 @default.
- W3041753721 hasConceptScore W3041753721C26517878 @default.
- W3041753721 hasConceptScore W3041753721C2778112365 @default.
- W3041753721 hasConceptScore W3041753721C33923547 @default.
- W3041753721 hasConceptScore W3041753721C38652104 @default.
- W3041753721 hasConceptScore W3041753721C41008148 @default.
- W3041753721 hasConceptScore W3041753721C54355233 @default.
- W3041753721 hasConceptScore W3041753721C62520636 @default.
- W3041753721 hasConceptScore W3041753721C66322947 @default.
- W3041753721 hasConceptScore W3041753721C80444323 @default.
- W3041753721 hasConceptScore W3041753721C86803240 @default.
- W3041753721 hasLocation W30417537211 @default.
- W3041753721 hasOpenAccess W3041753721 @default.
- W3041753721 hasPrimaryLocation W30417537211 @default.
- W3041753721 hasRelatedWork W1539714449 @default.
- W3041753721 hasRelatedWork W1840557586 @default.
- W3041753721 hasRelatedWork W1988944003 @default.
- W3041753721 hasRelatedWork W1999149368 @default.
- W3041753721 hasRelatedWork W2010416066 @default.
- W3041753721 hasRelatedWork W2364517059 @default.
- W3041753721 hasRelatedWork W2560596997 @default.
- W3041753721 hasRelatedWork W2590402293 @default.
- W3041753721 hasRelatedWork W2611719484 @default.
- W3041753721 hasRelatedWork W2611940423 @default.
- W3041753721 hasRelatedWork W2619410666 @default.
- W3041753721 hasRelatedWork W2752555493 @default.
- W3041753721 hasRelatedWork W2903672378 @default.
- W3041753721 hasRelatedWork W2963403868 @default.
- W3041753721 hasRelatedWork W2963722442 @default.
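
The abstract above describes clustered attention: queries are grouped into clusters, attention is computed once per centroid, and each query reuses the result of its cluster. The following is a minimal NumPy sketch of that basic idea only, not the authors' implementation. The clustering method is an assumption (plain k-means, since the abstract does not say how the clusters are formed), the function names `kmeans` and `clustered_attention` are hypothetical, and the refinement step that recomputes exact dot products for the highest-attention keys is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def full_attention(Q, K, V):
    # Standard softmax attention: cost grows with N * N for N queries and N keys.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ V


def kmeans(points, n_clusters, n_iters=10, seed=0):
    # Plain Lloyd's k-means (an assumption; used here only for illustration).
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=n_clusters, replace=False)
    centroids = points[start].copy()
    for _ in range(n_iters):
        # Squared distance from every point to every centroid: shape (N, C).
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for c in range(n_clusters):
            members = points[labels == c]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[c] = members.mean(axis=0)
    return labels, centroids


def clustered_attention(Q, K, V, n_clusters):
    # 1. Group the N queries into C clusters.
    labels, centroids = kmeans(Q, n_clusters)
    # 2. Compute attention only for the C centroids: shape (C, N_keys).
    scores = centroids @ K.T / np.sqrt(Q.shape[-1])
    centroid_values = softmax(scores) @ V  # shape (C, D_v)
    # 3. Every query reuses the output of its cluster's centroid.
    return centroid_values[labels]


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    N, D = 64, 16
    Q, K, V = (rng.normal(size=(N, D)) for _ in range(3))
    approx = clustered_attention(Q, K, V, n_clusters=8)
    exact = full_attention(Q, K, V)
    print("output shape:", approx.shape)
    print("mean abs error vs. full attention:", np.abs(approx - exact).mean())
```

For a fixed number of clusters C, the attention step in this sketch touches C rows of scores instead of N, which is the source of the linear dependence on sequence length that the abstract mentions. When C equals N and all queries are distinct, each query forms its own cluster and the result matches full attention.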