Matches in SemOpenAlex for { <https://semopenalex.org/work/W3170209909> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W3170209909 endingPage "11181" @default.
- W3170209909 startingPage "11170" @default.
- W3170209909 abstract "After their successful debut in natural language processing, Transformer architectures are now becoming the de-facto standard in many domains. An obstacle for their deployment over new modalities is the architectural configuration: the optimal depth-to-width ratio has been shown to dramatically vary across data types (e.g., $10$x larger over images than over language). We theoretically predict the existence of an embedding rank bottleneck that limits the contribution of self-attention width to the Transformer expressivity. We thus directly tie the input vocabulary size and rank to the optimal depth-to-width ratio, since a small vocabulary size or rank dictates an added advantage of depth over width. We empirically demonstrate the existence of this bottleneck and its implications on the depth-to-width interplay of Transformer architectures, linking the architecture variability across domains to the often glossed-over usage of different vocabulary sizes or embedding ranks in different domains. As an additional benefit, our rank bottlenecking framework allows us to identify size redundancies of $25\%$-$50\%$ in leading NLP models such as ALBERT and T5." @default.
- W3170209909 created "2021-06-22" @default.
- W3170209909 creator A5005177735 @default.
- W3170209909 creator A5009170119 @default.
- W3170209909 creator A5075004917 @default.
- W3170209909 creator A5081397200 @default.
- W3170209909 date "2021-07-18" @default.
- W3170209909 modified "2023-09-26" @default.
- W3170209909 title "Which transformer architecture fits my data? A vocabulary bottleneck in self-attention" @default.
- W3170209909 hasPublicationYear "2021" @default.
- W3170209909 type Work @default.
- W3170209909 sameAs 3170209909 @default.
- W3170209909 citedByCount "0" @default.
- W3170209909 crossrefType "proceedings-article" @default.
- W3170209909 hasAuthorship W3170209909A5005177735 @default.
- W3170209909 hasAuthorship W3170209909A5009170119 @default.
- W3170209909 hasAuthorship W3170209909A5075004917 @default.
- W3170209909 hasAuthorship W3170209909A5081397200 @default.
- W3170209909 hasConcept C119599485 @default.
- W3170209909 hasConcept C123657996 @default.
- W3170209909 hasConcept C127413603 @default.
- W3170209909 hasConcept C138885662 @default.
- W3170209909 hasConcept C142362112 @default.
- W3170209909 hasConcept C149635348 @default.
- W3170209909 hasConcept C153349607 @default.
- W3170209909 hasConcept C154945302 @default.
- W3170209909 hasConcept C165801399 @default.
- W3170209909 hasConcept C2777601683 @default.
- W3170209909 hasConcept C2780513914 @default.
- W3170209909 hasConcept C41008148 @default.
- W3170209909 hasConcept C41608201 @default.
- W3170209909 hasConcept C41895202 @default.
- W3170209909 hasConcept C66322947 @default.
- W3170209909 hasConceptScore W3170209909C119599485 @default.
- W3170209909 hasConceptScore W3170209909C123657996 @default.
- W3170209909 hasConceptScore W3170209909C127413603 @default.
- W3170209909 hasConceptScore W3170209909C138885662 @default.
- W3170209909 hasConceptScore W3170209909C142362112 @default.
- W3170209909 hasConceptScore W3170209909C149635348 @default.
- W3170209909 hasConceptScore W3170209909C153349607 @default.
- W3170209909 hasConceptScore W3170209909C154945302 @default.
- W3170209909 hasConceptScore W3170209909C165801399 @default.
- W3170209909 hasConceptScore W3170209909C2777601683 @default.
- W3170209909 hasConceptScore W3170209909C2780513914 @default.
- W3170209909 hasConceptScore W3170209909C41008148 @default.
- W3170209909 hasConceptScore W3170209909C41608201 @default.
- W3170209909 hasConceptScore W3170209909C41895202 @default.
- W3170209909 hasConceptScore W3170209909C66322947 @default.
- W3170209909 hasLocation W31702099091 @default.
- W3170209909 hasOpenAccess W3170209909 @default.
- W3170209909 hasPrimaryLocation W31702099091 @default.
- W3170209909 hasRelatedWork W1984630650 @default.
- W3170209909 hasRelatedWork W2306077825 @default.
- W3170209909 hasRelatedWork W2308586727 @default.
- W3170209909 hasRelatedWork W2531009629 @default.
- W3170209909 hasRelatedWork W2753465033 @default.
- W3170209909 hasRelatedWork W2788103932 @default.
- W3170209909 hasRelatedWork W2962716426 @default.
- W3170209909 hasRelatedWork W2963025229 @default.
- W3170209909 hasRelatedWork W2963616634 @default.
- W3170209909 hasRelatedWork W2995548952 @default.
- W3170209909 hasRelatedWork W2997884810 @default.
- W3170209909 hasRelatedWork W3100501425 @default.
- W3170209909 hasRelatedWork W3120960909 @default.
- W3170209909 hasRelatedWork W3127775241 @default.
- W3170209909 hasRelatedWork W3136121174 @default.
- W3170209909 hasRelatedWork W3156692528 @default.
- W3170209909 hasRelatedWork W3201882754 @default.
- W3170209909 hasRelatedWork W3207704415 @default.
- W3170209909 hasRelatedWork W4913547 @default.
- W3170209909 hasRelatedWork W637153065 @default.
- W3170209909 isParatext "false" @default.
- W3170209909 isRetracted "false" @default.
- W3170209909 magId "3170209909" @default.
- W3170209909 workType "article" @default.
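
The listing above follows a regular "- subject predicate object graph." layout, so it can be loaded into a program for filtering or counting. The following is a minimal sketch that splits such lines into four fields. The layout is inferred from this listing only, not from any SemOpenAlex specification, and the names `LINE_RE` and `parse_listing_line` are hypothetical.

```python
import re

# Assumption: each line looks like "- SUBJECT PREDICATE OBJECT GRAPH." where
# OBJECT is either a double-quoted literal or a bare identifier. This layout
# is inferred from the listing above, not from a published format.
LINE_RE = re.compile(r'^- (\S+) (\S+) (".*"|\S+) (@\S+?)\.$')


def parse_listing_line(line):
    """Split one listing line into (subject, predicate, object, graph)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized listing line: {line!r}")
    subject, predicate, obj, graph = match.groups()
    # Quoted objects are literals; drop the surrounding quotes.
    if obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subject, predicate, obj, graph


# Two lines copied from the listing above.
sample = [
    '- W3170209909 endingPage "11181" @default.',
    '- W3170209909 creator A5005177735 @default.',
]
rows = [parse_listing_line(s) for s in sample]
for row in rows:
    print(row)
```

Grouping the resulting tuples by predicate would give, for example, all `hasRelatedWork` targets in one list. Literals that themselves contain a double quote followed by a graph marker would need a stricter parser than this regular expression.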