Matches in SemOpenAlex for { <https://semopenalex.org/work/W3097370814> ?p ?o ?g. }
- W3097370814 abstract "In suitably initialized wide networks, small learning rates transform deep neural networks (DNNs) into neural tangent kernel (NTK) machines, whose training dynamics are well approximated by a linear weight expansion of the network at initialization. Standard training, however, diverges from its linearization in ways that are poorly understood. We study the relationship between the training dynamics of nonlinear deep networks, the geometry of the loss landscape, and the time evolution of a data-dependent NTK. We do so through a large-scale phenomenological analysis of training, synthesizing diverse measures characterizing loss landscape geometry and NTK dynamics. In multiple neural architectures and datasets, we find these diverse measures evolve in a highly correlated manner, revealing a universal picture of the deep learning process. In this picture, deep network training exhibits a highly chaotic rapid initial transient that within 2 to 3 epochs determines the final linearly connected basin of low loss containing the end point of training. During this chaotic transient, the NTK changes rapidly, learning useful features from the training data that enable it to outperform the standard initial NTK by a factor of 3 in less than 3 to 4 epochs. After this rapid chaotic transient, the NTK changes at constant velocity, and its performance matches that of full network training in 15% to 45% of training time. Overall, our analysis reveals a striking correlation between a diverse set of metrics over training time, governed by a rapid chaotic-to-stable transition in the first few epochs, that together poses challenges and opportunities for the development of more accurate theories of deep learning." @default.
- W3097370814 created "2020-11-09" @default.
- W3097370814 creator A5021062973 @default.
- W3097370814 creator A5036307641 @default.
- W3097370814 creator A5042384857 @default.
- W3097370814 creator A5056551357 @default.
- W3097370814 creator A5078511923 @default.
- W3097370814 creator A5082696325 @default.
- W3097370814 date "2020-10-28" @default.
- W3097370814 modified "2023-09-26" @default.
- W3097370814 title "Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel" @default.
- W3097370814 cites W1677182931 @default.
- W3097370814 cites W2115065944 @default.
- W3097370814 cites W2187089797 @default.
- W3097370814 cites W2401231614 @default.
- W3097370814 cites W2552194003 @default.
- W3097370814 cites W2626325961 @default.
- W3097370814 cites W2768267830 @default.
- W3097370814 cites W2785533664 @default.
- W3097370814 cites W2793333878 @default.
- W3097370814 cites W2809090039 @default.
- W3097370814 cites W2899790086 @default.
- W3097370814 cites W2904142648 @default.
- W3097370814 cites W2912811302 @default.
- W3097370814 cites W2949650786 @default.
- W3097370814 cites W2962733100 @default.
- W3097370814 cites W2963384892 @default.
- W3097370814 cites W2963509076 @default.
- W3097370814 cites W2963959597 @default.
- W3097370814 cites W2964305242 @default.
- W3097370814 cites W2970249264 @default.
- W3097370814 cites W2970443625 @default.
- W3097370814 cites W2971043187 @default.
- W3097370814 cites W2971067248 @default.
- W3097370814 cites W2979646054 @default.
- W3097370814 cites W2992525328 @default.
- W3097370814 cites W2994747787 @default.
- W3097370814 cites W2994872659 @default.
- W3097370814 cites W2996141621 @default.
- W3097370814 cites W2996168800 @default.
- W3097370814 cites W3004633050 @default.
- W3097370814 cites W3008317078 @default.
- W3097370814 cites W3009686669 @default.
- W3097370814 cites W3010154184 @default.
- W3097370814 cites W3035081900 @default.
- W3097370814 cites W2991401328 @default.
- W3097370814 hasPublicationYear "2020" @default.
- W3097370814 type Work @default.
- W3097370814 sameAs 3097370814 @default.
- W3097370814 citedByCount "6" @default.
- W3097370814 countsByYear W30973708142021 @default.
- W3097370814 crossrefType "posted-content" @default.
- W3097370814 hasAuthorship W3097370814A5021062973 @default.
- W3097370814 hasAuthorship W3097370814A5036307641 @default.
- W3097370814 hasAuthorship W3097370814A5042384857 @default.
- W3097370814 hasAuthorship W3097370814A5056551357 @default.
- W3097370814 hasAuthorship W3097370814A5078511923 @default.
- W3097370814 hasAuthorship W3097370814A5082696325 @default.
- W3097370814 hasConcept C11210021 @default.
- W3097370814 hasConcept C11413529 @default.
- W3097370814 hasConcept C114466953 @default.
- W3097370814 hasConcept C118615104 @default.
- W3097370814 hasConcept C121332964 @default.
- W3097370814 hasConcept C138187205 @default.
- W3097370814 hasConcept C154945302 @default.
- W3097370814 hasConcept C158622935 @default.
- W3097370814 hasConcept C199360897 @default.
- W3097370814 hasConcept C2524010 @default.
- W3097370814 hasConcept C2777052490 @default.
- W3097370814 hasConcept C33923547 @default.
- W3097370814 hasConcept C41008148 @default.
- W3097370814 hasConcept C50644808 @default.
- W3097370814 hasConcept C62520636 @default.
- W3097370814 hasConcept C74193536 @default.
- W3097370814 hasConceptScore W3097370814C11210021 @default.
- W3097370814 hasConceptScore W3097370814C11413529 @default.
- W3097370814 hasConceptScore W3097370814C114466953 @default.
- W3097370814 hasConceptScore W3097370814C118615104 @default.
- W3097370814 hasConceptScore W3097370814C121332964 @default.
- W3097370814 hasConceptScore W3097370814C138187205 @default.
- W3097370814 hasConceptScore W3097370814C154945302 @default.
- W3097370814 hasConceptScore W3097370814C158622935 @default.
- W3097370814 hasConceptScore W3097370814C199360897 @default.
- W3097370814 hasConceptScore W3097370814C2524010 @default.
- W3097370814 hasConceptScore W3097370814C2777052490 @default.
- W3097370814 hasConceptScore W3097370814C33923547 @default.
- W3097370814 hasConceptScore W3097370814C41008148 @default.
- W3097370814 hasConceptScore W3097370814C50644808 @default.
- W3097370814 hasConceptScore W3097370814C62520636 @default.
- W3097370814 hasConceptScore W3097370814C74193536 @default.
- W3097370814 hasLocation W30973708141 @default.
- W3097370814 hasOpenAccess W3097370814 @default.
- W3097370814 hasPrimaryLocation W30973708141 @default.
- W3097370814 hasRelatedWork W2125930537 @default.
- W3097370814 hasRelatedWork W2194775991 @default.
- W3097370814 hasRelatedWork W2899748887 @default.
- W3097370814 hasRelatedWork W2946216287 @default.
- W3097370814 hasRelatedWork W2966530573 @default.
- W3097370814 hasRelatedWork W2970217468 @default.
- W3097370814 hasRelatedWork W2980674230 @default.