Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387343669> ?p ?o ?g. }
- W4387343669 abstract "We study the scaling limits of stochastic gradient descent (SGD) with constant step‐size in the high‐dimensional regime. We prove limit theorems for the trajectories of summary statistics (i.e., finite‐dimensional functions) of SGD as the dimension goes to infinity. Our approach allows one to choose the summary statistics that are tracked, the initialization, and the step‐size. It yields both ballistic (ODE) and diffusive (SDE) limits, with the limit depending dramatically on the former choices. We show a critical scaling regime for the step‐size, below which the effective ballistic dynamics matches gradient flow for the population loss, but at which, a new correction term appears which changes the phase diagram. About the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate. We demonstrate our approach on popular examples including estimation for spiked matrix and tensor models and classification via two‐layer networks for binary and XOR‐type Gaussian mixture models. These examples exhibit surprising phenomena including multimodal timescales to convergence as well as convergence to sub‐optimal solutions with probability bounded away from zero from random (e.g., Gaussian) initializations. At the same time, we demonstrate the benefit of overparametrization by showing that the latter probability goes to zero as the second layer width grows." @default.
- W4387343669 created "2023-10-05" @default.
- W4387343669 creator A5039747459 @default.
- W4387343669 creator A5051194866 @default.
- W4387343669 creator A5085479654 @default.
- W4387343669 date "2023-10-04" @default.
- W4387343669 modified "2023-10-06" @default.
- W4387343669 title "High‐dimensional limit theorems for SGD: Effective dynamics and critical scaling" @default.
- W4387343669 cites W1491706803 @default.
- W4387343669 cites W1520752838 @default.
- W4387343669 cites W1521738998 @default.
- W4387343669 cites W1568229137 @default.
- W4387343669 cites W173781251 @default.
- W4387343669 cites W1979344292 @default.
- W4387343669 cites W1983383403 @default.
- W4387343669 cites W1994616650 @default.
- W4387343669 cites W1995842804 @default.
- W4387343669 cites W2066459155 @default.
- W4387343669 cites W2099551908 @default.
- W4387343669 cites W2125812768 @default.
- W4387343669 cites W2132211083 @default.
- W4387343669 cites W2153486576 @default.
- W4387343669 cites W2156948165 @default.
- W4387343669 cites W2162287622 @default.
- W4387343669 cites W2290522125 @default.
- W4387343669 cites W2810555534 @default.
- W4387343669 cites W2962881498 @default.
- W4387343669 cites W2963095610 @default.
- W4387343669 cites W2963791871 @default.
- W4387343669 cites W2963877580 @default.
- W4387343669 cites W2968353065 @default.
- W4387343669 cites W3004687303 @default.
- W4387343669 cites W3043072433 @default.
- W4387343669 cites W3044069306 @default.
- W4387343669 cites W3046968338 @default.
- W4387343669 cites W3101714678 @default.
- W4387343669 cites W3101975788 @default.
- W4387343669 cites W3102677403 @default.
- W4387343669 cites W3103328814 @default.
- W4387343669 cites W4211030719 @default.
- W4387343669 cites W4213329537 @default.
- W4387343669 cites W4240234826 @default.
- W4387343669 cites W4250954493 @default.
- W4387343669 doi "https://doi.org/10.1002/cpa.22169" @default.
- W4387343669 hasPublicationYear "2023" @default.
- W4387343669 type Work @default.
- W4387343669 citedByCount "0" @default.
- W4387343669 crossrefType "journal-article" @default.
- W4387343669 hasAuthorship W4387343669A5039747459 @default.
- W4387343669 hasAuthorship W4387343669A5051194866 @default.
- W4387343669 hasAuthorship W4387343669A5085479654 @default.
- W4387343669 hasBestOaLocation W43873436691 @default.
- W4387343669 hasConcept C121332964 @default.
- W4387343669 hasConcept C121864883 @default.
- W4387343669 hasConcept C134306372 @default.
- W4387343669 hasConcept C151201525 @default.
- W4387343669 hasConcept C162324750 @default.
- W4387343669 hasConcept C163716315 @default.
- W4387343669 hasConcept C24084028 @default.
- W4387343669 hasConcept C2524010 @default.
- W4387343669 hasConcept C2777303404 @default.
- W4387343669 hasConcept C28826006 @default.
- W4387343669 hasConcept C33923547 @default.
- W4387343669 hasConcept C34388435 @default.
- W4387343669 hasConcept C50522688 @default.
- W4387343669 hasConcept C62520636 @default.
- W4387343669 hasConcept C99844830 @default.
- W4387343669 hasConceptScore W4387343669C121332964 @default.
- W4387343669 hasConceptScore W4387343669C121864883 @default.
- W4387343669 hasConceptScore W4387343669C134306372 @default.
- W4387343669 hasConceptScore W4387343669C151201525 @default.
- W4387343669 hasConceptScore W4387343669C162324750 @default.
- W4387343669 hasConceptScore W4387343669C163716315 @default.
- W4387343669 hasConceptScore W4387343669C24084028 @default.
- W4387343669 hasConceptScore W4387343669C2524010 @default.
- W4387343669 hasConceptScore W4387343669C2777303404 @default.
- W4387343669 hasConceptScore W4387343669C28826006 @default.
- W4387343669 hasConceptScore W4387343669C33923547 @default.
- W4387343669 hasConceptScore W4387343669C34388435 @default.
- W4387343669 hasConceptScore W4387343669C50522688 @default.
- W4387343669 hasConceptScore W4387343669C62520636 @default.
- W4387343669 hasConceptScore W4387343669C99844830 @default.
- W4387343669 hasFunder F4320306076 @default.
- W4387343669 hasFunder F4320309634 @default.
- W4387343669 hasFunder F4320320994 @default.
- W4387343669 hasFunder F4320334593 @default.
- W4387343669 hasLocation W43873436691 @default.
- W4387343669 hasOpenAccess W4387343669 @default.
- W4387343669 hasPrimaryLocation W43873436691 @default.
- W4387343669 hasRelatedWork W1565615755 @default.
- W4387343669 hasRelatedWork W2030798005 @default.
- W4387343669 hasRelatedWork W2032908587 @default.
- W4387343669 hasRelatedWork W2123172466 @default.
- W4387343669 hasRelatedWork W2624638112 @default.
- W4387343669 hasRelatedWork W2769746109 @default.
- W4387343669 hasRelatedWork W2963139859 @default.
- W4387343669 hasRelatedWork W4286420423 @default.
- W4387343669 hasRelatedWork W4297416378 @default.
- W4387343669 hasRelatedWork W4300973038 @default.
- W4387343669 isParatext "false" @default.