Matches in SemOpenAlex for { <https://semopenalex.org/work/W3006051380> ?p ?o ?g. }
- W3006051380 abstract "Knowledge distillation introduced in the deep learning context is a method to transfer knowledge from one architecture to another. In particular, when the architectures are identical, this is called self-distillation. The idea is to feed in predictions of the trained model as new target values for retraining (and iterate this loop possibly a few times). It has been empirically observed that the self-distilled model often achieves higher accuracy on held out data. Why this happens, however, has been a mystery: the self-distillation dynamics does not receive any new information about the task and solely evolves by looping over training. To the best of our knowledge, there is no rigorous understanding of this phenomenon. This work provides the first theoretical analysis of self-distillation. We focus on fitting a nonlinear function to training data, where the model space is Hilbert space and fitting is subject to $\ell_2$ regularization in this function space. We show that self-distillation iterations modify regularization by progressively limiting the number of basis functions that can be used to represent the solution. This implies (as we also verify empirically) that while a few rounds of self-distillation may reduce over-fitting, further rounds may lead to under-fitting and thus worse performance." @default.
- W3006051380 created "2020-02-24" @default.
- W3006051380 creator A5003384169 @default.
- W3006051380 creator A5030261391 @default.
- W3006051380 creator A5050499655 @default.
- W3006051380 date "2020-02-13" @default.
- W3006051380 modified "2023-09-27" @default.
- W3006051380 title "Self-Distillation Amplifies Regularization in Hilbert Space" @default.
- W3006051380 cites W1540155273 @default.
- W3006051380 cites W1592410721 @default.
- W3006051380 cites W1821462560 @default.
- W3006051380 cites W1981025032 @default.
- W3006051380 cites W1998419211 @default.
- W3006051380 cites W2007154098 @default.
- W3006051380 cites W2194775991 @default.
- W3006051380 cites W2294370754 @default.
- W3006051380 cites W2579923771 @default.
- W3006051380 cites W2731516819 @default.
- W3006051380 cites W2739879705 @default.
- W3006051380 cites W2809090039 @default.
- W3006051380 cites W2903396356 @default.
- W3006051380 cites W2903707108 @default.
- W3006051380 cites W2911803042 @default.
- W3006051380 cites W2936864631 @default.
- W3006051380 cites W2945289329 @default.
- W3006051380 cites W2945528222 @default.
- W3006051380 cites W2950220847 @default.
- W3006051380 cites W2952204734 @default.
- W3006051380 cites W2962835968 @default.
- W3006051380 cites W2963140444 @default.
- W3006051380 cites W2963199420 @default.
- W3006051380 cites W2963921882 @default.
- W3006051380 cites W2964118293 @default.
- W3006051380 cites W2964220233 @default.
- W3006051380 cites W2964222566 @default.
- W3006051380 cites W2964293126 @default.
- W3006051380 cites W2978544343 @default.
- W3006051380 cites W2998044330 @default.
- W3006051380 cites W2998352186 @default.
- W3006051380 cites W3000462134 @default.
- W3006051380 cites W3010154184 @default.
- W3006051380 cites W3021931813 @default.
- W3006051380 cites W3028069987 @default.
- W3006051380 cites W3031955466 @default.
- W3006051380 cites W3101069636 @default.
- W3006051380 cites W3118608800 @default.
- W3006051380 hasPublicationYear "2020" @default.
- W3006051380 type Work @default.
- W3006051380 sameAs 3006051380 @default.
- W3006051380 citedByCount "17" @default.
- W3006051380 countsByYear W30060513802020 @default.
- W3006051380 countsByYear W30060513802021 @default.
- W3006051380 countsByYear W30060513802022 @default.
- W3006051380 crossrefType "posted-content" @default.
- W3006051380 hasAuthorship W3006051380A5003384169 @default.
- W3006051380 hasAuthorship W3006051380A5030261391 @default.
- W3006051380 hasAuthorship W3006051380A5050499655 @default.
- W3006051380 hasConcept C134306372 @default.
- W3006051380 hasConcept C154945302 @default.
- W3006051380 hasConcept C178790620 @default.
- W3006051380 hasConcept C185592680 @default.
- W3006051380 hasConcept C204030448 @default.
- W3006051380 hasConcept C2776135515 @default.
- W3006051380 hasConcept C28826006 @default.
- W3006051380 hasConcept C33923547 @default.
- W3006051380 hasConcept C41008148 @default.
- W3006051380 hasConcept C62799726 @default.
- W3006051380 hasConceptScore W3006051380C134306372 @default.
- W3006051380 hasConceptScore W3006051380C154945302 @default.
- W3006051380 hasConceptScore W3006051380C178790620 @default.
- W3006051380 hasConceptScore W3006051380C185592680 @default.
- W3006051380 hasConceptScore W3006051380C204030448 @default.
- W3006051380 hasConceptScore W3006051380C2776135515 @default.
- W3006051380 hasConceptScore W3006051380C28826006 @default.
- W3006051380 hasConceptScore W3006051380C33923547 @default.
- W3006051380 hasConceptScore W3006051380C41008148 @default.
- W3006051380 hasConceptScore W3006051380C62799726 @default.
- W3006051380 hasLocation W30060513801 @default.
- W3006051380 hasOpenAccess W3006051380 @default.
- W3006051380 hasPrimaryLocation W30060513801 @default.
- W3006051380 hasRelatedWork W1821462560 @default.
- W3006051380 hasRelatedWork W1982536735 @default.
- W3006051380 hasRelatedWork W2006224934 @default.
- W3006051380 hasRelatedWork W2194775991 @default.
- W3006051380 hasRelatedWork W2294370754 @default.
- W3006051380 hasRelatedWork W2521025609 @default.
- W3006051380 hasRelatedWork W2605181823 @default.
- W3006051380 hasRelatedWork W2619002440 @default.
- W3006051380 hasRelatedWork W2883413434 @default.
- W3006051380 hasRelatedWork W2936864631 @default.
- W3006051380 hasRelatedWork W2945289329 @default.
- W3006051380 hasRelatedWork W2964222566 @default.
- W3006051380 hasRelatedWork W2987861506 @default.
- W3006051380 hasRelatedWork W3027743283 @default.
- W3006051380 hasRelatedWork W3103361051 @default.
- W3006051380 hasRelatedWork W3104259303 @default.
- W3006051380 hasRelatedWork W3118608800 @default.
- W3006051380 hasRelatedWork W3128411871 @default.
- W3006051380 hasRelatedWork W3137898953 @default.
- W3006051380 hasRelatedWork W2605989488 @default.