Matches in SemOpenAlex for { <https://semopenalex.org/work/W3135166055> ?p ?o ?g. }
- W3135166055 abstract "The Momentum Stochastic Gradient Descent (MSGD) algorithm has been widely applied to many nonconvex optimization problems in machine learning, e.g., training deep neural networks and variational Bayesian inference. Despite its empirical success, there is still a lack of theoretical understanding of the convergence properties of MSGD. To fill this gap, we propose to analyze the algorithmic behavior of MSGD by diffusion approximations for nonconvex optimization problems with strict saddle points and isolated local optima. Our study shows that the momentum helps escape from saddle points, but hurts the convergence within the neighborhood of optima (unless step size annealing or momentum annealing is applied). Our theoretical discovery partially corroborates the empirical success of MSGD in training deep neural networks." @default.
- W3135166055 created "2021-03-15" @default.
- W3135166055 creator A5016491177 @default.
- W3135166055 creator A5028930379 @default.
- W3135166055 creator A5070085914 @default.
- W3135166055 creator A5083807807 @default.
- W3135166055 date "2018-02-14" @default.
- W3135166055 modified "2023-09-27" @default.
- W3135166055 title "A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization" @default.
- W3135166055 cites W104184427 @default.
- W3135166055 cites W1499021337 @default.
- W3135166055 cites W1899249567 @default.
- W3135166055 cites W1987083649 @default.
- W3135166055 cites W1988720110 @default.
- W3135166055 cites W1994616650 @default.
- W3135166055 cites W2071983464 @default.
- W3135166055 cites W2089256099 @default.
- W3135166055 cites W2094364653 @default.
- W3135166055 cites W2183341477 @default.
- W3135166055 cites W2194775991 @default.
- W3135166055 cites W2271840356 @default.
- W3135166055 cites W2563318093 @default.
- W3135166055 cites W2594787342 @default.
- W3135166055 cites W2739461634 @default.
- W3135166055 cites W2769464037 @default.
- W3135166055 cites W2785626633 @default.
- W3135166055 cites W2798826368 @default.
- W3135166055 cites W2798986185 @default.
- W3135166055 cites W2893593889 @default.
- W3135166055 cites W2913010492 @default.
- W3135166055 cites W2913271688 @default.
- W3135166055 cites W2921395510 @default.
- W3135166055 cites W2931810883 @default.
- W3135166055 cites W2946668020 @default.
- W3135166055 cites W2963095610 @default.
- W3135166055 cites W2963251229 @default.
- W3135166055 cites W2963470657 @default.
- W3135166055 cites W2963487351 @default.
- W3135166055 cites W2963739978 @default.
- W3135166055 cites W2963959597 @default.
- W3135166055 cites W2964040701 @default.
- W3135166055 cites W2964072432 @default.
- W3135166055 cites W2964121744 @default.
- W3135166055 cites W2964156132 @default.
- W3135166055 cites W2970971581 @default.
- W3135166055 cites W3115592686 @default.
- W3135166055 cites W594357522 @default.
- W3135166055 hasPublicationYear "2018" @default.
- W3135166055 type Work @default.
- W3135166055 sameAs 3135166055 @default.
- W3135166055 citedByCount "3" @default.
- W3135166055 countsByYear W31351660552018 @default.
- W3135166055 countsByYear W31351660552020 @default.
- W3135166055 countsByYear W31351660552021 @default.
- W3135166055 crossrefType "posted-content" @default.
- W3135166055 hasAuthorship W3135166055A5016491177 @default.
- W3135166055 hasAuthorship W3135166055A5028930379 @default.
- W3135166055 hasAuthorship W3135166055A5070085914 @default.
- W3135166055 hasAuthorship W3135166055A5083807807 @default.
- W3135166055 hasConcept C10138342 @default.
- W3135166055 hasConcept C126255220 @default.
- W3135166055 hasConcept C126980161 @default.
- W3135166055 hasConcept C141934464 @default.
- W3135166055 hasConcept C153258448 @default.
- W3135166055 hasConcept C154945302 @default.
- W3135166055 hasConcept C162324750 @default.
- W3135166055 hasConcept C206688291 @default.
- W3135166055 hasConcept C2524010 @default.
- W3135166055 hasConcept C2681867 @default.
- W3135166055 hasConcept C2776214188 @default.
- W3135166055 hasConcept C2777303404 @default.
- W3135166055 hasConcept C2778049539 @default.
- W3135166055 hasConcept C28826006 @default.
- W3135166055 hasConcept C33923547 @default.
- W3135166055 hasConcept C41008148 @default.
- W3135166055 hasConcept C50522688 @default.
- W3135166055 hasConcept C50644808 @default.
- W3135166055 hasConcept C60718061 @default.
- W3135166055 hasConceptScore W3135166055C10138342 @default.
- W3135166055 hasConceptScore W3135166055C126255220 @default.
- W3135166055 hasConceptScore W3135166055C126980161 @default.
- W3135166055 hasConceptScore W3135166055C141934464 @default.
- W3135166055 hasConceptScore W3135166055C153258448 @default.
- W3135166055 hasConceptScore W3135166055C154945302 @default.
- W3135166055 hasConceptScore W3135166055C162324750 @default.
- W3135166055 hasConceptScore W3135166055C206688291 @default.
- W3135166055 hasConceptScore W3135166055C2524010 @default.
- W3135166055 hasConceptScore W3135166055C2681867 @default.
- W3135166055 hasConceptScore W3135166055C2776214188 @default.
- W3135166055 hasConceptScore W3135166055C2777303404 @default.
- W3135166055 hasConceptScore W3135166055C2778049539 @default.
- W3135166055 hasConceptScore W3135166055C28826006 @default.
- W3135166055 hasConceptScore W3135166055C33923547 @default.
- W3135166055 hasConceptScore W3135166055C41008148 @default.
- W3135166055 hasConceptScore W3135166055C50522688 @default.
- W3135166055 hasConceptScore W3135166055C50644808 @default.
- W3135166055 hasConceptScore W3135166055C60718061 @default.
- W3135166055 hasLocation W31351660551 @default.
- W3135166055 hasOpenAccess W3135166055 @default.
- W3135166055 hasPrimaryLocation W31351660551 @default.