Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385570072> ?p ?o ?g. }
Showing items 1 to 59 of 59 with 100 items per page.
- W4385570072 abstract "The use of positional embeddings in transformer language models is widely accepted. However, recent research has called into question the necessity of such embeddings. We further extend this inquiry by demonstrating that a randomly initialized and frozen transformer language model, devoid of positional embeddings, inherently encodes strong positional information through the shrinkage of self-attention variance. To quantify this variance, we derive the underlying distribution of each step within a transformer layer. Through empirical validation using a fully pretrained model, we show that the variance shrinkage effect still persists after extensive gradient updates. Our findings serve to justify the decision to discard positional embeddings and thus facilitate more efficient pretraining of transformer language models." @default.
- W4385570072 created "2023-08-05" @default.
- W4385570072 creator A5013217019 @default.
- W4385570072 creator A5040668817 @default.
- W4385570072 creator A5041274293 @default.
- W4385570072 creator A5054133264 @default.
- W4385570072 creator A5085851733 @default.
- W4385570072 date "2023-01-01" @default.
- W4385570072 modified "2023-09-24" @default.
- W4385570072 title "Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings" @default.
- W4385570072 doi "https://doi.org/10.18653/v1/2023.acl-short.102" @default.
- W4385570072 hasPublicationYear "2023" @default.
- W4385570072 type Work @default.
- W4385570072 citedByCount "0" @default.
- W4385570072 crossrefType "proceedings-article" @default.
- W4385570072 hasAuthorship W4385570072A5013217019 @default.
- W4385570072 hasAuthorship W4385570072A5040668817 @default.
- W4385570072 hasAuthorship W4385570072A5041274293 @default.
- W4385570072 hasAuthorship W4385570072A5054133264 @default.
- W4385570072 hasAuthorship W4385570072A5085851733 @default.
- W4385570072 hasBestOaLocation W43855700721 @default.
- W4385570072 hasConcept C119599485 @default.
- W4385570072 hasConcept C121955636 @default.
- W4385570072 hasConcept C127413603 @default.
- W4385570072 hasConcept C137293760 @default.
- W4385570072 hasConcept C144133560 @default.
- W4385570072 hasConcept C154945302 @default.
- W4385570072 hasConcept C165801399 @default.
- W4385570072 hasConcept C196083921 @default.
- W4385570072 hasConcept C204321447 @default.
- W4385570072 hasConcept C41008148 @default.
- W4385570072 hasConcept C66322947 @default.
- W4385570072 hasConceptScore W4385570072C119599485 @default.
- W4385570072 hasConceptScore W4385570072C121955636 @default.
- W4385570072 hasConceptScore W4385570072C127413603 @default.
- W4385570072 hasConceptScore W4385570072C137293760 @default.
- W4385570072 hasConceptScore W4385570072C144133560 @default.
- W4385570072 hasConceptScore W4385570072C154945302 @default.
- W4385570072 hasConceptScore W4385570072C165801399 @default.
- W4385570072 hasConceptScore W4385570072C196083921 @default.
- W4385570072 hasConceptScore W4385570072C204321447 @default.
- W4385570072 hasConceptScore W4385570072C41008148 @default.
- W4385570072 hasConceptScore W4385570072C66322947 @default.
- W4385570072 hasLocation W43855700721 @default.
- W4385570072 hasOpenAccess W4385570072 @default.
- W4385570072 hasPrimaryLocation W43855700721 @default.
- W4385570072 hasRelatedWork W2359001871 @default.
- W4385570072 hasRelatedWork W3008110149 @default.
- W4385570072 hasRelatedWork W3033862527 @default.
- W4385570072 hasRelatedWork W3033942572 @default.
- W4385570072 hasRelatedWork W3097571385 @default.
- W4385570072 hasRelatedWork W3177920269 @default.
- W4385570072 hasRelatedWork W3196747313 @default.
- W4385570072 hasRelatedWork W4287761227 @default.
- W4385570072 hasRelatedWork W4316012698 @default.
- W4385570072 hasRelatedWork W4381786178 @default.
- W4385570072 isParatext "false" @default.
- W4385570072 isRetracted "false" @default.
- W4385570072 workType "article" @default.
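
The abstract listed above attributes the latent positional signal to the shrinkage of self-attention variance. The following is a toy sketch of that intuition only, not the derivation or the code from the paper. It assumes (as a simplification made here, not stated in the record) that causal attention weights are exactly uniform over the visible prefix and that value entries are independent with unit variance, so the output at position i is a mean of i values and its variance is about 1/i. The function name `uniform_causal_attention_variance` and all parameters are hypothetical.

```python
import random


def uniform_causal_attention_variance(seq_len=32, samples=4000, seed=0):
    """Estimate per-position output variance under uniform causal attention.

    Assumptions (illustrative only): attention weights are uniform over
    positions 1..i, and each value entry is an independent N(0, 1) draw.
    """
    rng = random.Random(seed)
    sums = [0.0] * seq_len      # running sum of outputs per position
    sq_sums = [0.0] * seq_len   # running sum of squared outputs per position
    for _ in range(samples):
        prefix_sum = 0.0
        for pos in range(seq_len):
            prefix_sum += rng.gauss(0.0, 1.0)   # value entry at this position
            out = prefix_sum / (pos + 1)        # uniform average over the prefix
            sums[pos] += out
            sq_sums[pos] += out * out
    # Population variance per position: E[x^2] - (E[x])^2
    return [sq_sums[p] / samples - (sums[p] / samples) ** 2
            for p in range(seq_len)]


if __name__ == "__main__":
    variances = uniform_causal_attention_variance()
    for pos in (1, 2, 4, 8, 16, 32):
        # Empirical variance next to the 1/pos value expected under the assumptions
        print(pos, round(variances[pos - 1], 3), round(1 / pos, 3))
```

Under these assumptions the estimated variance decreases with position, so the magnitude of the output alone carries information about how far into the sequence a token sits. A real transformer has learned, non-uniform attention, so this should be read only as an illustration of the mechanism named in the abstract.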