Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287666013> ?p ?o ?g. }
Showing items 1 to 57 of 57 with 100 items per page.
- W4287666013 abstract "Recent work on the lottery ticket hypothesis has produced highly sparse Transformers for NMT while maintaining BLEU. However, it is unclear how such pruning techniques affect a model's learned representations. By probing Transformers with more and more low-magnitude weights pruned away, we find that complex semantic information is first to be degraded. Analysis of internal activations reveals that higher layers diverge most over the course of pruning, gradually becoming less complex than their dense counterparts. Meanwhile, early layers of sparse models begin to perform more encoding. Attention mechanisms remain remarkably consistent as sparsity increases." @default.
- W4287666013 created "2022-07-25" @default.
- W4287666013 creator A5044027575 @default.
- W4287666013 creator A5079551137 @default.
- W4287666013 date "2020-09-16" @default.
- W4287666013 modified "2023-09-29" @default.
- W4287666013 title "Dissecting Lottery Ticket Transformers: Structural and Behavioral Study of Sparse Neural Machine Translation" @default.
- W4287666013 doi "https://doi.org/10.48550/arxiv.2009.13270" @default.
- W4287666013 hasPublicationYear "2020" @default.
- W4287666013 type Work @default.
- W4287666013 citedByCount "0" @default.
- W4287666013 crossrefType "posted-content" @default.
- W4287666013 hasAuthorship W4287666013A5044027575 @default.
- W4287666013 hasAuthorship W4287666013A5079551137 @default.
- W4287666013 hasBestOaLocation W42876660131 @default.
- W4287666013 hasConcept C108010975 @default.
- W4287666013 hasConcept C119599485 @default.
- W4287666013 hasConcept C119857082 @default.
- W4287666013 hasConcept C127413603 @default.
- W4287666013 hasConcept C154945302 @default.
- W4287666013 hasConcept C165801399 @default.
- W4287666013 hasConcept C203005215 @default.
- W4287666013 hasConcept C2776540713 @default.
- W4287666013 hasConcept C38652104 @default.
- W4287666013 hasConcept C41008148 @default.
- W4287666013 hasConcept C6557445 @default.
- W4287666013 hasConcept C66322947 @default.
- W4287666013 hasConcept C86803240 @default.
- W4287666013 hasConceptScore W4287666013C108010975 @default.
- W4287666013 hasConceptScore W4287666013C119599485 @default.
- W4287666013 hasConceptScore W4287666013C119857082 @default.
- W4287666013 hasConceptScore W4287666013C127413603 @default.
- W4287666013 hasConceptScore W4287666013C154945302 @default.
- W4287666013 hasConceptScore W4287666013C165801399 @default.
- W4287666013 hasConceptScore W4287666013C203005215 @default.
- W4287666013 hasConceptScore W4287666013C2776540713 @default.
- W4287666013 hasConceptScore W4287666013C38652104 @default.
- W4287666013 hasConceptScore W4287666013C41008148 @default.
- W4287666013 hasConceptScore W4287666013C6557445 @default.
- W4287666013 hasConceptScore W4287666013C66322947 @default.
- W4287666013 hasConceptScore W4287666013C86803240 @default.
- W4287666013 hasLocation W42876660131 @default.
- W4287666013 hasOpenAccess W4287666013 @default.
- W4287666013 hasPrimaryLocation W42876660131 @default.
- W4287666013 hasRelatedWork W11589964 @default.
- W4287666013 hasRelatedWork W11890898 @default.
- W4287666013 hasRelatedWork W1619002 @default.
- W4287666013 hasRelatedWork W2155528 @default.
- W4287666013 hasRelatedWork W3939803 @default.
- W4287666013 hasRelatedWork W449952 @default.
- W4287666013 hasRelatedWork W4707553 @default.
- W4287666013 hasRelatedWork W7401400 @default.
- W4287666013 hasRelatedWork W7519655 @default.
- W4287666013 hasRelatedWork W8671328 @default.
- W4287666013 isParatext "false" @default.
- W4287666013 isRetracted "false" @default.
- W4287666013 workType "article" @default.