Matches in SemOpenAlex for { <https://semopenalex.org/work/W3135804679> ?p ?o ?g. }
- W3135804679 abstract "In spite of advances in understanding lazy training, recent work attributes the practical success of deep learning to the rich regime with complex inductive bias. In this paper, we study rich regime training empirically with benchmark datasets, and find that while most parameters are lazy, there is always a small number of active parameters which change quite a bit during training. We show that re-initializing (resetting to their initial random values) the active parameters leads to worse generalization. Further, we show that most of the active parameters are in the bottom layers, close to the input, especially as the networks become wider. Based on such observations, we study static Layer-Wise Sparse (LWS) SGD, which only updates some subsets of layers. We find that only updating the top and bottom layers gives good generalization and, as expected, only updating the top layers yields a fast algorithm. Inspired by this, we investigate probabilistic LWS-SGD, which mostly updates the top layers and occasionally updates the full network. We show that probabilistic LWS-SGD matches the generalization performance of vanilla SGD and the back-propagation time can be 2-5 times more efficient." @default.
- W3135804679 created "2021-03-15" @default.
- W3135804679 creator A5014459472 @default.
- W3135804679 creator A5056579161 @default.
- W3135804679 date "2021-02-26" @default.
- W3135804679 modified "2023-09-27" @default.
- W3135804679 title "Experiments with Rich Regime Training for Deep Learning." @default.
- W3135804679 cites W104184427 @default.
- W3135804679 cites W1533861849 @default.
- W3135804679 cites W2046869671 @default.
- W3135804679 cites W2097117768 @default.
- W3135804679 cites W2108598243 @default.
- W3135804679 cites W2112796928 @default.
- W3135804679 cites W2117539524 @default.
- W3135804679 cites W2130942839 @default.
- W3135804679 cites W2193413348 @default.
- W3135804679 cites W2194775991 @default.
- W3135804679 cites W2331143823 @default.
- W3135804679 cites W2525778437 @default.
- W3135804679 cites W2618530766 @default.
- W3135804679 cites W2750384547 @default.
- W3135804679 cites W2767204723 @default.
- W3135804679 cites W2809090039 @default.
- W3135804679 cites W2886067286 @default.
- W3135804679 cites W2890924858 @default.
- W3135804679 cites W2894604724 @default.
- W3135804679 cites W2899476926 @default.
- W3135804679 cites W2899748887 @default.
- W3135804679 cites W2903327037 @default.
- W3135804679 cites W2904243021 @default.
- W3135804679 cites W2904838594 @default.
- W3135804679 cites W2905421523 @default.
- W3135804679 cites W2913190747 @default.
- W3135804679 cites W2913473169 @default.
- W3135804679 cites W2949382160 @default.
- W3135804679 cites W2952204734 @default.
- W3135804679 cites W2962698540 @default.
- W3135804679 cites W2962835968 @default.
- W3135804679 cites W2963095610 @default.
- W3135804679 cites W2963239103 @default.
- W3135804679 cites W2963403868 @default.
- W3135804679 cites W2963766684 @default.
- W3135804679 cites W2964031251 @default.
- W3135804679 cites W2964915616 @default.
- W3135804679 cites W2970217468 @default.
- W3135804679 cites W2970259623 @default.
- W3135804679 cites W2971043187 @default.
- W3135804679 cites W2990905473 @default.
- W3135804679 cites W2996168800 @default.
- W3135804679 cites W3010825589 @default.
- W3135804679 cites W3036333857 @default.
- W3135804679 cites W3046680711 @default.
- W3135804679 cites W3046864714 @default.
- W3135804679 cites W3100200769 @default.
- W3135804679 cites W3101036738 @default.
- W3135804679 cites W3101581426 @default.
- W3135804679 cites W3103424281 @default.
- W3135804679 cites W3106042364 @default.
- W3135804679 cites W3118608800 @default.
- W3135804679 cites W3122402965 @default.
- W3135804679 cites W3126575712 @default.
- W3135804679 hasPublicationYear "2021" @default.
- W3135804679 type Work @default.
- W3135804679 sameAs 3135804679 @default.
- W3135804679 citedByCount "1" @default.
- W3135804679 countsByYear W31358046792020 @default.
- W3135804679 crossrefType "posted-content" @default.
- W3135804679 hasAuthorship W3135804679A5014459472 @default.
- W3135804679 hasAuthorship W3135804679A5056579161 @default.
- W3135804679 hasConcept C108583219 @default.
- W3135804679 hasConcept C11413529 @default.
- W3135804679 hasConcept C114466953 @default.
- W3135804679 hasConcept C117765406 @default.
- W3135804679 hasConcept C119857082 @default.
- W3135804679 hasConcept C121332964 @default.
- W3135804679 hasConcept C127313418 @default.
- W3135804679 hasConcept C13280743 @default.
- W3135804679 hasConcept C134306372 @default.
- W3135804679 hasConcept C153294291 @default.
- W3135804679 hasConcept C154945302 @default.
- W3135804679 hasConcept C177148314 @default.
- W3135804679 hasConcept C178790620 @default.
- W3135804679 hasConcept C185592680 @default.
- W3135804679 hasConcept C185798385 @default.
- W3135804679 hasConcept C199360897 @default.
- W3135804679 hasConcept C2777211547 @default.
- W3135804679 hasConcept C2779227376 @default.
- W3135804679 hasConcept C33923547 @default.
- W3135804679 hasConcept C41008148 @default.
- W3135804679 hasConcept C49937458 @default.
- W3135804679 hasConcept C50644808 @default.
- W3135804679 hasConceptScore W3135804679C108583219 @default.
- W3135804679 hasConceptScore W3135804679C11413529 @default.
- W3135804679 hasConceptScore W3135804679C114466953 @default.
- W3135804679 hasConceptScore W3135804679C117765406 @default.
- W3135804679 hasConceptScore W3135804679C119857082 @default.
- W3135804679 hasConceptScore W3135804679C121332964 @default.
- W3135804679 hasConceptScore W3135804679C127313418 @default.
- W3135804679 hasConceptScore W3135804679C13280743 @default.
- W3135804679 hasConceptScore W3135804679C134306372 @default.