Matches in SemOpenAlex for { <https://semopenalex.org/work/W3202629034> ?p ?o ?g. }
- W3202629034 abstract "Pruning aims to reduce the number of parameters while maintaining performance close to the original network. This work proposes a novel *self-distillation* based pruning strategy, whereby the representational similarity between the pruned and unpruned versions of the same network is maximized. Unlike previous approaches that treat distillation and pruning separately, we use distillation to inform the pruning criteria, without requiring a separate student network as in knowledge distillation. We show that the proposed *cross-correlation objective for self-distilled pruning* implicitly encourages sparse solutions, naturally complementing magnitude-based pruning criteria. Experiments on the GLUE and XGLUE benchmarks show that self-distilled pruning increases mono- and cross-lingual language model performance. Self-distilled pruned models also outperform smaller Transformers with an equal number of parameters and are competitive against 6x larger distilled networks. We also observe that self-distillation (1) maximizes class separability, (2) increases the signal-to-noise ratio, and (3) converges faster after pruning steps, providing further insights into why self-distilled pruning improves generalization." @default.
- W3202629034 created "2021-10-11" @default.
- W3202629034 creator A5014381275 @default.
- W3202629034 creator A5023510057 @default.
- W3202629034 creator A5053023517 @default.
- W3202629034 date "2021-09-30" @default.
- W3202629034 modified "2023-09-27" @default.
- W3202629034 title "Deep Neural Compression Via Concurrent Pruning and Self-Distillation." @default.
- W3202629034 cites W1690739335 @default.
- W3202629034 cites W1821462560 @default.
- W3202629034 cites W2090824512 @default.
- W3202629034 cites W2092939357 @default.
- W3202629034 cites W2097533491 @default.
- W3202629034 cites W2109229471 @default.
- W3202629034 cites W2114766824 @default.
- W3202629034 cites W2125389748 @default.
- W3202629034 cites W2134273960 @default.
- W3202629034 cites W2138019504 @default.
- W3202629034 cites W2145085734 @default.
- W3202629034 cites W2276892413 @default.
- W3202629034 cites W2513419314 @default.
- W3202629034 cites W2515385951 @default.
- W3202629034 cites W2582745083 @default.
- W3202629034 cites W2618305643 @default.
- W3202629034 cites W2731516819 @default.
- W3202629034 cites W2786054724 @default.
- W3202629034 cites W2791091755 @default.
- W3202629034 cites W2903707108 @default.
- W3202629034 cites W2936864631 @default.
- W3202629034 cites W2945176031 @default.
- W3202629034 cites W2948818370 @default.
- W3202629034 cites W2952344559 @default.
- W3202629034 cites W2962851801 @default.
- W3202629034 cites W2963140444 @default.
- W3202629034 cites W2963310665 @default.
- W3202629034 cites W2963341956 @default.
- W3202629034 cites W2963674932 @default.
- W3202629034 cites W2963828549 @default.
- W3202629034 cites W2964125128 @default.
- W3202629034 cites W2964222566 @default.
- W3202629034 cites W2964299589 @default.
- W3202629034 cites W2975429091 @default.
- W3202629034 cites W2978017171 @default.
- W3202629034 cites W2981794819 @default.
- W3202629034 cites W2983040767 @default.
- W3202629034 cites W3005692288 @default.
- W3202629034 cites W3006051380 @default.
- W3202629034 cites W3021616083 @default.
- W3202629034 cites W3024171804 @default.
- W3202629034 cites W3037301072 @default.
- W3202629034 cites W3102483398 @default.
- W3202629034 cites W3105966348 @default.
- W3202629034 cites W3113303810 @default.
- W3202629034 cites W3121696187 @default.
- W3202629034 cites W3129870084 @default.
- W3202629034 cites W3134652006 @default.
- W3202629034 cites W3169549154 @default.
- W3202629034 cites W566555209 @default.
- W3202629034 hasPublicationYear "2021" @default.
- W3202629034 type Work @default.
- W3202629034 sameAs 3202629034 @default.
- W3202629034 citedByCount "0" @default.
- W3202629034 crossrefType "posted-content" @default.
- W3202629034 hasAuthorship W3202629034A5014381275 @default.
- W3202629034 hasAuthorship W3202629034A5023510057 @default.
- W3202629034 hasAuthorship W3202629034A5053023517 @default.
- W3202629034 hasConcept C108010975 @default.
- W3202629034 hasConcept C134306372 @default.
- W3202629034 hasConcept C154945302 @default.
- W3202629034 hasConcept C177148314 @default.
- W3202629034 hasConcept C185592680 @default.
- W3202629034 hasConcept C204030448 @default.
- W3202629034 hasConcept C33923547 @default.
- W3202629034 hasConcept C41008148 @default.
- W3202629034 hasConcept C43617362 @default.
- W3202629034 hasConcept C50644808 @default.
- W3202629034 hasConcept C59822182 @default.
- W3202629034 hasConcept C86803240 @default.
- W3202629034 hasConceptScore W3202629034C108010975 @default.
- W3202629034 hasConceptScore W3202629034C134306372 @default.
- W3202629034 hasConceptScore W3202629034C154945302 @default.
- W3202629034 hasConceptScore W3202629034C177148314 @default.
- W3202629034 hasConceptScore W3202629034C185592680 @default.
- W3202629034 hasConceptScore W3202629034C204030448 @default.
- W3202629034 hasConceptScore W3202629034C33923547 @default.
- W3202629034 hasConceptScore W3202629034C41008148 @default.
- W3202629034 hasConceptScore W3202629034C43617362 @default.
- W3202629034 hasConceptScore W3202629034C50644808 @default.
- W3202629034 hasConceptScore W3202629034C59822182 @default.
- W3202629034 hasConceptScore W3202629034C86803240 @default.
- W3202629034 hasLocation W32026290341 @default.
- W3202629034 hasOpenAccess W3202629034 @default.
- W3202629034 hasPrimaryLocation W32026290341 @default.
- W3202629034 hasRelatedWork W1568757546 @default.
- W3202629034 hasRelatedWork W2041970025 @default.
- W3202629034 hasRelatedWork W2182841986 @default.
- W3202629034 hasRelatedWork W2350224374 @default.
- W3202629034 hasRelatedWork W2913465187 @default.
- W3202629034 hasRelatedWork W2946151040 @default.
- W3202629034 hasRelatedWork W2990081069 @default.
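The matches above can be reproduced programmatically against SemOpenAlex's public SPARQL endpoint. A minimal sketch, assuming the endpoint URL `https://semopenalex.org/sparql` and the standard SPARQL-over-HTTP GET protocol (the `fetch_triples` helper is illustrative, not part of any official client):

```python
# Sketch: fetch all triples about a SemOpenAlex work, mirroring the
# { <work> ?p ?o ?g. } pattern in the listing above. Uses only the
# standard library; the endpoint URL is an assumption.
import json
import urllib.parse
import urllib.request


def build_query(work_uri: str) -> str:
    """Build a SPARQL SELECT matching every predicate/object of the work."""
    return f"SELECT ?p ?o WHERE {{ <{work_uri}> ?p ?o . }}"


def fetch_triples(work_uri: str,
                  endpoint: str = "https://semopenalex.org/sparql"):
    """Send the query via HTTP GET and return the JSON result bindings."""
    params = urllib.parse.urlencode({"query": build_query(work_uri)})
    req = urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["results"]["bindings"]


query = build_query("https://semopenalex.org/work/W3202629034")
print(query)
```

Each returned binding carries a predicate (`creator`, `cites`, `hasConcept`, ...) and its object, corresponding line-for-line to the triples listed above.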
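The abstract's *cross-correlation objective* maximizes representational similarity between the pruned and unpruned networks. A minimal NumPy sketch of one plausible form of such an objective (a Barlow-Twins-style cross-correlation loss; this is an illustration, not the authors' code, and `lambda_off` is a hypothetical weighting parameter):

```python
# Sketch of a cross-correlation objective between the representations of the
# unpruned (teacher) and pruned (student) versions of the same network.
# NOT the paper's implementation; a plausible Barlow-Twins-style variant.
import numpy as np


def cross_correlation_loss(z_unpruned, z_pruned, lambda_off=0.005):
    """Pull per-dimension correlations toward 1 (similar representations)
    while pushing cross-dimension correlations toward 0 (decorrelation)."""
    # Standardize each feature dimension over the batch.
    z1 = (z_unpruned - z_unpruned.mean(0)) / (z_unpruned.std(0) + 1e-8)
    z2 = (z_pruned - z_pruned.mean(0)) / (z_pruned.std(0) + 1e-8)
    n = z1.shape[0]
    c = z1.T @ z2 / n  # cross-correlation matrix, shape (D, D)
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()            # diagonal -> 1
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()  # off-diagonal -> 0
    return on_diag + lambda_off * off_diag


rng = np.random.default_rng(0)
z = rng.normal(size=(64, 16))
# Identical pruned/unpruned representations incur (near) zero diagonal penalty.
print(cross_correlation_loss(z, z))
```

Minimizing the on-diagonal term keeps the pruned network's representation aligned with the unpruned one, which is how distillation can inform the pruning criterion without a separate student network.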