Matches in SemOpenAlex for { <https://semopenalex.org/work/W3091389507> ?p ?o ?g. }
- W3091389507 abstract "End-to-end training with backpropagation is the standard method for training deep neural networks. However, as networks become deeper and bigger, end-to-end training becomes more challenging: highly non-convex models get stuck easily in local optima, gradient signals are prone to vanish or explode during backpropagation, and training requires computational resources and time. In this work, we propose to break away from the end-to-end paradigm in the context of Knowledge Distillation. Instead of distilling a model end-to-end, we propose to split it into smaller sub-networks - also called neighbourhoods - that are then trained independently. We empirically show that distilling networks in a non end-to-end fashion can be beneficial in a diverse range of use cases. First, we show that it speeds up Knowledge Distillation by exploiting parallelism and training on smaller networks. Second, we show that independently distilled neighbourhoods may be efficiently re-used for Neural Architecture Search. Finally, because smaller networks model simpler functions, we show that they are easier to train with synthetic data than their deeper counterparts." @default.
- W3091389507 created "2020-10-08" @default.
- W3091389507 creator A5040988045 @default.
- W3091389507 creator A5044999818 @default.
- W3091389507 creator A5086932158 @default.
- W3091389507 date "2021-05-04" @default.
- W3091389507 modified "2023-09-27" @default.
- W3091389507 title "Neighbourhood Distillation: On the benefits of non end-to-end distillation" @default.
- W3091389507 cites W1498436455 @default.
- W3091389507 cites W1522301498 @default.
- W3091389507 cites W1690739335 @default.
- W3091389507 cites W1821462560 @default.
- W3091389507 cites W1855112655 @default.
- W3091389507 cites W197865394 @default.
- W3091389507 cites W2069143585 @default.
- W3091389507 cites W2108598243 @default.
- W3091389507 cites W2112796928 @default.
- W3091389507 cites W2146502635 @default.
- W3091389507 cites W2147768505 @default.
- W3091389507 cites W2168231600 @default.
- W3091389507 cites W2184045248 @default.
- W3091389507 cites W2194775991 @default.
- W3091389507 cites W2284050935 @default.
- W3091389507 cites W2402302915 @default.
- W3091389507 cites W2408074187 @default.
- W3091389507 cites W2606722458 @default.
- W3091389507 cites W2613718673 @default.
- W3091389507 cites W2626373159 @default.
- W3091389507 cites W2736941579 @default.
- W3091389507 cites W2739879705 @default.
- W3091389507 cites W2765390540 @default.
- W3091389507 cites W2766966408 @default.
- W3091389507 cites W2771727678 @default.
- W3091389507 cites W2886756692 @default.
- W3091389507 cites W2907407288 @default.
- W3091389507 cites W2911586496 @default.
- W3091389507 cites W2911803042 @default.
- W3091389507 cites W2913190747 @default.
- W3091389507 cites W2937282529 @default.
- W3091389507 cites W2946830201 @default.
- W3091389507 cites W2946948417 @default.
- W3091389507 cites W2949117887 @default.
- W3091389507 cites W2949829435 @default.
- W3091389507 cites W2950635152 @default.
- W3091389507 cites W2951104886 @default.
- W3091389507 cites W2951574208 @default.
- W3091389507 cites W2951815760 @default.
- W3091389507 cites W2952344559 @default.
- W3091389507 cites W2953212265 @default.
- W3091389507 cites W2962746461 @default.
- W3091389507 cites W2962935923 @default.
- W3091389507 cites W2963037989 @default.
- W3091389507 cites W2963150697 @default.
- W3091389507 cites W2963341956 @default.
- W3091389507 cites W2963403868 @default.
- W3091389507 cites W2964508650 @default.
- W3091389507 cites W2970211912 @default.
- W3091389507 cites W2979567256 @default.
- W3091389507 cites W2991012968 @default.
- W3091389507 cites W2996569511 @default.
- W3091389507 cites W3004127093 @default.
- W3091389507 cites W3034406766 @default.
- W3091389507 cites W3034528892 @default.
- W3091389507 cites W3035460915 @default.
- W3091389507 cites W3118608800 @default.
- W3091389507 hasPublicationYear "2021" @default.
- W3091389507 type Work @default.
- W3091389507 sameAs 3091389507 @default.
- W3091389507 citedByCount "0" @default.
- W3091389507 crossrefType "journal-article" @default.
- W3091389507 hasAuthorship W3091389507A5040988045 @default.
- W3091389507 hasAuthorship W3091389507A5044999818 @default.
- W3091389507 hasAuthorship W3091389507A5086932158 @default.
- W3091389507 hasConcept C108583219 @default.
- W3091389507 hasConcept C119857082 @default.
- W3091389507 hasConcept C134306372 @default.
- W3091389507 hasConcept C151730666 @default.
- W3091389507 hasConcept C154945302 @default.
- W3091389507 hasConcept C155032097 @default.
- W3091389507 hasConcept C161677786 @default.
- W3091389507 hasConcept C178790620 @default.
- W3091389507 hasConcept C185592680 @default.
- W3091389507 hasConcept C204030448 @default.
- W3091389507 hasConcept C2779343474 @default.
- W3091389507 hasConcept C33923547 @default.
- W3091389507 hasConcept C41008148 @default.
- W3091389507 hasConcept C50644808 @default.
- W3091389507 hasConcept C74296488 @default.
- W3091389507 hasConcept C86803240 @default.
- W3091389507 hasConceptScore W3091389507C108583219 @default.
- W3091389507 hasConceptScore W3091389507C119857082 @default.
- W3091389507 hasConceptScore W3091389507C134306372 @default.
- W3091389507 hasConceptScore W3091389507C151730666 @default.
- W3091389507 hasConceptScore W3091389507C154945302 @default.
- W3091389507 hasConceptScore W3091389507C155032097 @default.
- W3091389507 hasConceptScore W3091389507C161677786 @default.
- W3091389507 hasConceptScore W3091389507C178790620 @default.
- W3091389507 hasConceptScore W3091389507C185592680 @default.
- W3091389507 hasConceptScore W3091389507C204030448 @default.
- W3091389507 hasConceptScore W3091389507C2779343474 @default.