Matches in SemOpenAlex for { <https://semopenalex.org/work/W4320169808> ?p ?o ?g. }
Showing items 1 to 65 of 65 with 100 items per page.
- W4320169808 abstract "We propose a probabilistic interpretation of exponential dot product attention of transformers and contrastive learning based off of exponential families. The attention sublayer of transformers is equivalent to a gradient ascent step of the log normalizer, which is the log-sum-exp term in the Hopfield theory of attention. This ascent step induces a parallel expansion of points, which is counterbalanced by a contraction from layer normalization. We also state theoretical limitations of our theory and the Hopfield theory and suggest directions for resolution." @default.
- W4320169808 created "2023-02-13" @default.
- W4320169808 creator A5011631927 @default.
- W4320169808 date "2022-04-28" @default.
- W4320169808 modified "2023-09-23" @default.
- W4320169808 title "A Probabilistic Interpretation of Transformers" @default.
- W4320169808 doi "https://doi.org/10.48550/arxiv.2205.01080" @default.
- W4320169808 hasPublicationYear "2022" @default.
- W4320169808 type Work @default.
- W4320169808 citedByCount "0" @default.
- W4320169808 crossrefType "posted-content" @default.
- W4320169808 hasAuthorship W4320169808A5011631927 @default.
- W4320169808 hasBestOaLocation W43201698081 @default.
- W4320169808 hasConcept C11413529 @default.
- W4320169808 hasConcept C118615104 @default.
- W4320169808 hasConcept C121332964 @default.
- W4320169808 hasConcept C134306372 @default.
- W4320169808 hasConcept C136886441 @default.
- W4320169808 hasConcept C144024400 @default.
- W4320169808 hasConcept C151376022 @default.
- W4320169808 hasConcept C154945302 @default.
- W4320169808 hasConcept C165801399 @default.
- W4320169808 hasConcept C19165224 @default.
- W4320169808 hasConcept C202444582 @default.
- W4320169808 hasConcept C28826006 @default.
- W4320169808 hasConcept C33923547 @default.
- W4320169808 hasConcept C41008148 @default.
- W4320169808 hasConcept C49937458 @default.
- W4320169808 hasConcept C62520636 @default.
- W4320169808 hasConcept C66322947 @default.
- W4320169808 hasConcept C75174853 @default.
- W4320169808 hasConceptScore W4320169808C11413529 @default.
- W4320169808 hasConceptScore W4320169808C118615104 @default.
- W4320169808 hasConceptScore W4320169808C121332964 @default.
- W4320169808 hasConceptScore W4320169808C134306372 @default.
- W4320169808 hasConceptScore W4320169808C136886441 @default.
- W4320169808 hasConceptScore W4320169808C144024400 @default.
- W4320169808 hasConceptScore W4320169808C151376022 @default.
- W4320169808 hasConceptScore W4320169808C154945302 @default.
- W4320169808 hasConceptScore W4320169808C165801399 @default.
- W4320169808 hasConceptScore W4320169808C19165224 @default.
- W4320169808 hasConceptScore W4320169808C202444582 @default.
- W4320169808 hasConceptScore W4320169808C28826006 @default.
- W4320169808 hasConceptScore W4320169808C33923547 @default.
- W4320169808 hasConceptScore W4320169808C41008148 @default.
- W4320169808 hasConceptScore W4320169808C49937458 @default.
- W4320169808 hasConceptScore W4320169808C62520636 @default.
- W4320169808 hasConceptScore W4320169808C66322947 @default.
- W4320169808 hasConceptScore W4320169808C75174853 @default.
- W4320169808 hasLocation W43201698081 @default.
- W4320169808 hasOpenAccess W4320169808 @default.
- W4320169808 hasPrimaryLocation W43201698081 @default.
- W4320169808 hasRelatedWork W1481545552 @default.
- W4320169808 hasRelatedWork W190623762 @default.
- W4320169808 hasRelatedWork W1982697846 @default.
- W4320169808 hasRelatedWork W2004427275 @default.
- W4320169808 hasRelatedWork W2024298515 @default.
- W4320169808 hasRelatedWork W2066487227 @default.
- W4320169808 hasRelatedWork W2332684515 @default.
- W4320169808 hasRelatedWork W4236782761 @default.
- W4320169808 hasRelatedWork W4290792893 @default.
- W4320169808 hasRelatedWork W2108453824 @default.
- W4320169808 isParatext "false" @default.
- W4320169808 isRetracted "false" @default.
- W4320169808 workType "article" @default.
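
The abstract above states that the attention sublayer is equivalent to a gradient ascent step on the log normalizer (the log-sum-exp term). A minimal numerical sketch of that identity follows. It assumes a single query, keys that also serve as values (the Hopfield-style setting), a step size of 1, and no learned projections or scaling. These simplifications, and all function names, are illustrative assumptions and are not taken from the paper.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def log_sum_exp(q, keys):
    """Log normalizer: log of the sum over keys of exp(q . k)."""
    scores = [dot(q, k) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(s - m) for s in scores))


def softmax_attention(q, keys):
    """Exponential dot product attention with values equal to keys (assumption)."""
    scores = [dot(q, k) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(len(q))]


def numerical_gradient(q, keys, eps=1e-6):
    """Central finite-difference gradient of log_sum_exp with respect to q."""
    grad = []
    for d in range(len(q)):
        q_plus = list(q)
        q_minus = list(q)
        q_plus[d] += eps
        q_minus[d] -= eps
        diff = log_sum_exp(q_plus, keys) - log_sum_exp(q_minus, keys)
        grad.append(diff / (2 * eps))
    return grad


# Toy data, chosen only for illustration.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]
query = [0.3, -0.2]

attn = softmax_attention(query, keys)
grad = numerical_gradient(query, keys)

# Residual update: query plus attention output, i.e. one ascent step of size 1.
updated = [qi + ai for qi, ai in zip(query, attn)]

print("attention output:", attn)
print("finite-difference gradient:", grad)
print("log normalizer before and after:",
      log_sum_exp(query, keys), log_sum_exp(updated, keys))
```

In this toy setting the attention output matches the finite-difference gradient of the log normalizer, and the residual update increases the log normalizer, which follows from the convexity of log-sum-exp.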