Matches in SemOpenAlex for { <https://semopenalex.org/work/W3167142785> ?p ?o ?g. }
- W3167142785 abstract "What is the computational model behind a Transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture variants or trained models, Transformers have no such familiar parallel. In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language. We map the basic components of a transformer-encoder -- attention and feed-forward computation -- into simple primitives, around which we form a programming language: the Restricted Access Sequence Processing Language (RASP). We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer, and how a Transformer can be trained to mimic a RASP solution. In particular, we provide RASP programs for histograms, sorting, and Dyck-languages. We further use our model to relate their difficulty in terms of the number of required layers and attention heads: analyzing a RASP program implies a maximum number of heads and layers necessary to encode a task in a transformer. Finally, we see how insights gained from our abstraction might be used to explain phenomena seen in recent works." @default.
- W3167142785 created "2021-06-22" @default.
- W3167142785 creator A5028476919 @default.
- W3167142785 creator A5047339088 @default.
- W3167142785 creator A5061044883 @default.
- W3167142785 date "2021-06-13" @default.
- W3167142785 modified "2023-09-28" @default.
- W3167142785 title "Thinking Like Transformers" @default.
- W3167142785 cites W1732222442 @default.
- W3167142785 cites W1902237438 @default.
- W3167142785 cites W2053987251 @default.
- W3167142785 cites W2121553911 @default.
- W3167142785 cites W2626778328 @default.
- W3167142785 cites W2809900044 @default.
- W3167142785 cites W2866343820 @default.
- W3167142785 cites W2914557243 @default.
- W3167142785 cites W2940744433 @default.
- W3167142785 cites W2948981535 @default.
- W3167142785 cites W2962981464 @default.
- W3167142785 cites W2963059228 @default.
- W3167142785 cites W2963125544 @default.
- W3167142785 cites W2963341956 @default.
- W3167142785 cites W2963753324 @default.
- W3167142785 cites W2964308564 @default.
- W3167142785 cites W2972324944 @default.
- W3167142785 cites W2995273672 @default.
- W3167142785 cites W3015468748 @default.
- W3167142785 cites W3034801370 @default.
- W3167142785 cites W3034830866 @default.
- W3167142785 cites W3035691519 @default.
- W3167142785 cites W3085139254 @default.
- W3167142785 cites W3093248298 @default.
- W3167142785 cites W3098666169 @default.
- W3167142785 cites W3104298168 @default.
- W3167142785 cites W3104613728 @default.
- W3167142785 cites W3105238007 @default.
- W3167142785 cites W3131922516 @default.
- W3167142785 cites W3146803896 @default.
- W3167142785 cites W3164536890 @default.
- W3167142785 hasPublicationYear "2021" @default.
- W3167142785 type Work @default.
- W3167142785 sameAs 3167142785 @default.
- W3167142785 citedByCount "1" @default.
- W3167142785 countsByYear W31671427852021 @default.
- W3167142785 crossrefType "posted-content" @default.
- W3167142785 hasAuthorship W3167142785A5028476919 @default.
- W3167142785 hasAuthorship W3167142785A5047339088 @default.
- W3167142785 hasAuthorship W3167142785A5061044883 @default.
- W3167142785 hasConcept C104317684 @default.
- W3167142785 hasConcept C111919701 @default.
- W3167142785 hasConcept C118505674 @default.
- W3167142785 hasConcept C119599485 @default.
- W3167142785 hasConcept C127413603 @default.
- W3167142785 hasConcept C137293760 @default.
- W3167142785 hasConcept C154945302 @default.
- W3167142785 hasConcept C165801399 @default.
- W3167142785 hasConcept C185592680 @default.
- W3167142785 hasConcept C199360897 @default.
- W3167142785 hasConcept C2775922551 @default.
- W3167142785 hasConcept C41008148 @default.
- W3167142785 hasConcept C45374587 @default.
- W3167142785 hasConcept C55493867 @default.
- W3167142785 hasConcept C66322947 @default.
- W3167142785 hasConcept C66746571 @default.
- W3167142785 hasConcept C78519656 @default.
- W3167142785 hasConcept C80444323 @default.
- W3167142785 hasConceptScore W3167142785C104317684 @default.
- W3167142785 hasConceptScore W3167142785C111919701 @default.
- W3167142785 hasConceptScore W3167142785C118505674 @default.
- W3167142785 hasConceptScore W3167142785C119599485 @default.
- W3167142785 hasConceptScore W3167142785C127413603 @default.
- W3167142785 hasConceptScore W3167142785C137293760 @default.
- W3167142785 hasConceptScore W3167142785C154945302 @default.
- W3167142785 hasConceptScore W3167142785C165801399 @default.
- W3167142785 hasConceptScore W3167142785C185592680 @default.
- W3167142785 hasConceptScore W3167142785C199360897 @default.
- W3167142785 hasConceptScore W3167142785C2775922551 @default.
- W3167142785 hasConceptScore W3167142785C41008148 @default.
- W3167142785 hasConceptScore W3167142785C45374587 @default.
- W3167142785 hasConceptScore W3167142785C55493867 @default.
- W3167142785 hasConceptScore W3167142785C66322947 @default.
- W3167142785 hasConceptScore W3167142785C66746571 @default.
- W3167142785 hasConceptScore W3167142785C78519656 @default.
- W3167142785 hasConceptScore W3167142785C80444323 @default.
- W3167142785 hasOpenAccess W3167142785 @default.
- W3167142785 hasRelatedWork W1132947466 @default.
- W3167142785 hasRelatedWork W1489921942 @default.
- W3167142785 hasRelatedWork W1554217755 @default.
- W3167142785 hasRelatedWork W1827378754 @default.
- W3167142785 hasRelatedWork W183864610 @default.
- W3167142785 hasRelatedWork W2010945517 @default.
- W3167142785 hasRelatedWork W2089143460 @default.
- W3167142785 hasRelatedWork W2102101008 @default.
- W3167142785 hasRelatedWork W2106286781 @default.
- W3167142785 hasRelatedWork W2107392084 @default.
- W3167142785 hasRelatedWork W2108828679 @default.
- W3167142785 hasRelatedWork W2188916945 @default.
- W3167142785 hasRelatedWork W2267923570 @default.
- W3167142785 hasRelatedWork W2519009229 @default.
- W3167142785 hasRelatedWork W2544648223 @default.