Matches in SemOpenAlex for { <https://semopenalex.org/work/W3207548174> ?p ?o ?g. }
- W3207548174 abstract "Multi-head, key-value attention is the backbone of the widely successful Transformer model and its variants. This attention mechanism uses multiple parallel key-value attention blocks (called heads), each performing two fundamental computations: (1) search - selection of a relevant entity from a set via query-key interactions, and (2) retrieval - extraction of relevant features from the selected entity via a value matrix. Importantly, standard attention heads learn a rigid mapping between search and retrieval. In this work, we first highlight how this static nature of the pairing can potentially: (a) lead to learning of redundant parameters in certain tasks, and (b) hinder generalization. To alleviate this problem, we propose a novel attention mechanism, called Compositional Attention, that replaces the standard head structure. The proposed mechanism disentangles search and retrieval and composes them in a dynamic, flexible and context-dependent manner through an additional soft competition stage between the query-key combination and value pairing. Through a series of numerical experiments, we show that it outperforms standard multi-head attention on a variety of tasks, including some out-of-distribution settings. Through our qualitative analysis, we demonstrate that Compositional Attention leads to dynamic specialization based on the type of retrieval needed. Our proposed mechanism generalizes multi-head attention, allows independent scaling of search and retrieval, and can easily be implemented in lieu of standard attention heads in any network architecture." @default.
- W3207548174 created "2021-10-25" @default.
- W3207548174 creator A5020059055 @default.
- W3207548174 creator A5040523178 @default.
- W3207548174 creator A5043037494 @default.
- W3207548174 creator A5055430458 @default.
- W3207548174 creator A5086198262 @default.
- W3207548174 date "2021-10-18" @default.
- W3207548174 modified "2023-10-04" @default.
- W3207548174 title "Compositional Attention: Disentangling Search and Retrieval" @default.
- W3207548174 cites W152394834 @default.
- W3207548174 cites W1902237438 @default.
- W3207548174 cites W2525332836 @default.
- W3207548174 cites W2624614404 @default.
- W3207548174 cites W2626778328 @default.
- W3207548174 cites W2759710969 @default.
- W3207548174 cites W2765831433 @default.
- W3207548174 cites W2786209943 @default.
- W3207548174 cites W2788751659 @default.
- W3207548174 cites W2866343820 @default.
- W3207548174 cites W2896457183 @default.
- W3207548174 cites W2898942365 @default.
- W3207548174 cites W2914607694 @default.
- W3207548174 cites W2940744433 @default.
- W3207548174 cites W2950527759 @default.
- W3207548174 cites W2963267799 @default.
- W3207548174 cites W2964308564 @default.
- W3207548174 cites W2976023236 @default.
- W3207548174 cites W2979065840 @default.
- W3207548174 cites W2996132992 @default.
- W3207548174 cites W3000514857 @default.
- W3207548174 cites W3017374003 @default.
- W3207548174 cites W3033529678 @default.
- W3207548174 cites W3037655549 @default.
- W3207548174 cites W3037784242 @default.
- W3207548174 cites W3085139254 @default.
- W3207548174 cites W3091156754 @default.
- W3207548174 cites W3093356033 @default.
- W3207548174 cites W3094502228 @default.
- W3207548174 cites W3108981297 @default.
- W3207548174 cites W3111739346 @default.
- W3207548174 cites W3113828129 @default.
- W3207548174 cites W3123673616 @default.
- W3207548174 cites W3127742036 @default.
- W3207548174 cites W3133029875 @default.
- W3207548174 cites W3134412209 @default.
- W3207548174 cites W3135728268 @default.
- W3207548174 cites W3163310412 @default.
- W3207548174 cites W3049326502 @default.
- W3207548174 cites W3197009789 @default.
- W3207548174 hasPublicationYear "2021" @default.
- W3207548174 type Work @default.
- W3207548174 sameAs 3207548174 @default.
- W3207548174 citedByCount "0" @default.
- W3207548174 crossrefType "posted-content" @default.
- W3207548174 hasAuthorship W3207548174A5020059055 @default.
- W3207548174 hasAuthorship W3207548174A5040523178 @default.
- W3207548174 hasAuthorship W3207548174A5043037494 @default.
- W3207548174 hasAuthorship W3207548174A5055430458 @default.
- W3207548174 hasAuthorship W3207548174A5086198262 @default.
- W3207548174 hasConcept C119857082 @default.
- W3207548174 hasConcept C124101348 @default.
- W3207548174 hasConcept C134306372 @default.
- W3207548174 hasConcept C151730666 @default.
- W3207548174 hasConcept C154945302 @default.
- W3207548174 hasConcept C177148314 @default.
- W3207548174 hasConcept C23123220 @default.
- W3207548174 hasConcept C26517878 @default.
- W3207548174 hasConcept C2779343474 @default.
- W3207548174 hasConcept C33923547 @default.
- W3207548174 hasConcept C38652104 @default.
- W3207548174 hasConcept C41008148 @default.
- W3207548174 hasConcept C80444323 @default.
- W3207548174 hasConcept C86803240 @default.
- W3207548174 hasConceptScore W3207548174C119857082 @default.
- W3207548174 hasConceptScore W3207548174C124101348 @default.
- W3207548174 hasConceptScore W3207548174C134306372 @default.
- W3207548174 hasConceptScore W3207548174C151730666 @default.
- W3207548174 hasConceptScore W3207548174C154945302 @default.
- W3207548174 hasConceptScore W3207548174C177148314 @default.
- W3207548174 hasConceptScore W3207548174C23123220 @default.
- W3207548174 hasConceptScore W3207548174C26517878 @default.
- W3207548174 hasConceptScore W3207548174C2779343474 @default.
- W3207548174 hasConceptScore W3207548174C33923547 @default.
- W3207548174 hasConceptScore W3207548174C38652104 @default.
- W3207548174 hasConceptScore W3207548174C41008148 @default.
- W3207548174 hasConceptScore W3207548174C80444323 @default.
- W3207548174 hasConceptScore W3207548174C86803240 @default.
- W3207548174 hasLocation W32075481741 @default.
- W3207548174 hasOpenAccess W3207548174 @default.
- W3207548174 hasPrimaryLocation W32075481741 @default.
- W3207548174 hasRelatedWork W1522129951 @default.
- W3207548174 hasRelatedWork W2742935102 @default.
- W3207548174 hasRelatedWork W2798456655 @default.
- W3207548174 hasRelatedWork W2933442455 @default.
- W3207548174 hasRelatedWork W2948587015 @default.
- W3207548174 hasRelatedWork W2951815760 @default.
- W3207548174 hasRelatedWork W2963937185 @default.
- W3207548174 hasRelatedWork W2968853227 @default.
- W3207548174 hasRelatedWork W2973325524 @default.
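
The abstract in this record describes a mechanism in which searches (query-key attention) and retrievals (value projections) are decoupled and then recombined through a soft competition. The following is a minimal NumPy sketch of that idea, not the authors' implementation. All parameter names (`Wq`, `Wk`, `Wv`, `Wrq`, `Wrk`, `Wo`), the shapes, and the scaled dot-product form of the competition step are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(rng, S, R, d_model, d_k, d_v, d_r):
    # Hypothetical parameter layout: S searches, R retrievals.
    def w(*shape):
        return 0.1 * rng.normal(size=shape)

    return {
        "Wq": w(S, d_model, d_k),   # search queries, one per search
        "Wk": w(S, d_model, d_k),   # search keys, one per search
        "Wv": w(R, d_model, d_v),   # value projections, one per retrieval
        "Wrq": w(S, d_model, d_r),  # retrieval queries, one per search
        "Wrk": w(d_v, d_r),         # retrieval keys, shared (assumption)
        "Wo": w(S * d_v, d_model),  # output projection
    }


def compositional_attention(x, p):
    """x: (T, d_model). Returns (output (T, d_model), weights (S, T, R))."""
    d_k = p["Wq"].shape[-1]
    d_r = p["Wrk"].shape[-1]

    # (1) Search: S independent query-key attention maps, each (T, T).
    q = np.einsum("td,sdk->stk", x, p["Wq"])
    k = np.einsum("td,sdk->stk", x, p["Wk"])
    attn = softmax(np.einsum("stk,suk->stu", q, k) / np.sqrt(d_k), axis=-1)

    # (2) Retrieval: apply every value projection to every search result.
    v = np.einsum("td,rdv->rtv", x, p["Wv"])       # (R, T, d_v)
    o = np.einsum("stu,ruv->srtv", attn, v)        # (S, R, T, d_v)

    # (3) Soft competition: each search picks a mixture over the R retrievals.
    rq = np.einsum("td,sde->ste", x, p["Wrq"])     # (S, T, d_r)
    rk = np.einsum("srtv,ve->srte", o, p["Wrk"])   # (S, R, T, d_r)
    scores = np.einsum("ste,srte->str", rq, rk) / np.sqrt(d_r)
    weights = softmax(scores, axis=-1)             # (S, T, R), sums to 1 over R
    mixed = np.einsum("str,srtv->stv", weights, o)  # (S, T, d_v)

    # Concatenate the S search outputs per token and project back.
    T = x.shape[0]
    concat = mixed.transpose(1, 0, 2).reshape(T, -1)
    return concat @ p["Wo"], weights


rng = np.random.default_rng(0)
S, R, T, d_model = 4, 3, 6, 16
params = init_params(rng, S, R, d_model, d_k=8, d_v=8, d_r=4)
x = rng.normal(size=(T, d_model))
out, weights = compositional_attention(x, params)
```

In this sketch the number of searches `S` and retrievals `R` are set independently, which corresponds to the independent scaling mentioned in the abstract. If `S == R` and `weights` were fixed so that search `i` always selects retrieval `i`, the computation would reduce to standard multi-head attention.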