Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287755514> ?p ?o ?g. }
Showing items 1 to 91 of 91 with 100 items per page.
- W4287755514 abstract "Stochastic Gradient Descent (SGD) is being used routinely for optimizing non-convex functions. Yet, the standard convergence theory for SGD in the smooth non-convex setting gives a slow sublinear convergence to a stationary point. In this work, we provide several convergence theorems for SGD showing convergence to a global minimum for non-convex problems satisfying some extra structural assumptions. In particular, we focus on two large classes of structured non-convex functions: (i) Quasar (Strongly) Convex functions (a generalization of convex functions) and (ii) functions satisfying the Polyak-Lojasiewicz condition (a generalization of strongly-convex functions). Our analysis relies on an Expected Residual condition which we show is a strictly weaker assumption than previously used growth conditions, expected smoothness or bounded variance assumptions. We provide theoretical guarantees for the convergence of SGD for different step-size selections including constant, decreasing and the recently proposed stochastic Polyak step-size. In addition, all of our analysis holds for the arbitrary sampling paradigm, and as such, we give insights into the complexity of minibatching and determine an optimal minibatch size. Finally, we show that for models that interpolate the training data, we can dispense of our Expected Residual condition and give state-of-the-art results in this setting." @default.
- W4287755514 created "2022-07-26" @default.
- W4287755514 creator A5017836121 @default.
- W4287755514 creator A5045050786 @default.
- W4287755514 creator A5075390126 @default.
- W4287755514 date "2020-06-18" @default.
- W4287755514 modified "2023-09-23" @default.
- W4287755514 title "SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation" @default.
- W4287755514 doi "https://doi.org/10.48550/arxiv.2006.10311" @default.
- W4287755514 hasPublicationYear "2020" @default.
- W4287755514 type Work @default.
- W4287755514 citedByCount "0" @default.
- W4287755514 crossrefType "posted-content" @default.
- W4287755514 hasAuthorship W4287755514A5017836121 @default.
- W4287755514 hasAuthorship W4287755514A5045050786 @default.
- W4287755514 hasAuthorship W4287755514A5075390126 @default.
- W4287755514 hasBestOaLocation W42877555141 @default.
- W4287755514 hasConcept C102634674 @default.
- W4287755514 hasConcept C104114177 @default.
- W4287755514 hasConcept C112680207 @default.
- W4287755514 hasConcept C11413529 @default.
- W4287755514 hasConcept C117160843 @default.
- W4287755514 hasConcept C118615104 @default.
- W4287755514 hasConcept C126255220 @default.
- W4287755514 hasConcept C127162648 @default.
- W4287755514 hasConcept C134306372 @default.
- W4287755514 hasConcept C137800194 @default.
- W4287755514 hasConcept C14036430 @default.
- W4287755514 hasConcept C145446738 @default.
- W4287755514 hasConcept C154945302 @default.
- W4287755514 hasConcept C155512373 @default.
- W4287755514 hasConcept C162324750 @default.
- W4287755514 hasConcept C177148314 @default.
- W4287755514 hasConcept C206688291 @default.
- W4287755514 hasConcept C2524010 @default.
- W4287755514 hasConcept C2777303404 @default.
- W4287755514 hasConcept C28826006 @default.
- W4287755514 hasConcept C31258907 @default.
- W4287755514 hasConcept C33923547 @default.
- W4287755514 hasConcept C34388435 @default.
- W4287755514 hasConcept C41008148 @default.
- W4287755514 hasConcept C50522688 @default.
- W4287755514 hasConcept C50644808 @default.
- W4287755514 hasConcept C57869625 @default.
- W4287755514 hasConcept C78458016 @default.
- W4287755514 hasConcept C86803240 @default.
- W4287755514 hasConceptScore W4287755514C102634674 @default.
- W4287755514 hasConceptScore W4287755514C104114177 @default.
- W4287755514 hasConceptScore W4287755514C112680207 @default.
- W4287755514 hasConceptScore W4287755514C11413529 @default.
- W4287755514 hasConceptScore W4287755514C117160843 @default.
- W4287755514 hasConceptScore W4287755514C118615104 @default.
- W4287755514 hasConceptScore W4287755514C126255220 @default.
- W4287755514 hasConceptScore W4287755514C127162648 @default.
- W4287755514 hasConceptScore W4287755514C134306372 @default.
- W4287755514 hasConceptScore W4287755514C137800194 @default.
- W4287755514 hasConceptScore W4287755514C14036430 @default.
- W4287755514 hasConceptScore W4287755514C145446738 @default.
- W4287755514 hasConceptScore W4287755514C154945302 @default.
- W4287755514 hasConceptScore W4287755514C155512373 @default.
- W4287755514 hasConceptScore W4287755514C162324750 @default.
- W4287755514 hasConceptScore W4287755514C177148314 @default.
- W4287755514 hasConceptScore W4287755514C206688291 @default.
- W4287755514 hasConceptScore W4287755514C2524010 @default.
- W4287755514 hasConceptScore W4287755514C2777303404 @default.
- W4287755514 hasConceptScore W4287755514C28826006 @default.
- W4287755514 hasConceptScore W4287755514C31258907 @default.
- W4287755514 hasConceptScore W4287755514C33923547 @default.
- W4287755514 hasConceptScore W4287755514C34388435 @default.
- W4287755514 hasConceptScore W4287755514C41008148 @default.
- W4287755514 hasConceptScore W4287755514C50522688 @default.
- W4287755514 hasConceptScore W4287755514C50644808 @default.
- W4287755514 hasConceptScore W4287755514C57869625 @default.
- W4287755514 hasConceptScore W4287755514C78458016 @default.
- W4287755514 hasConceptScore W4287755514C86803240 @default.
- W4287755514 hasLocation W42877555141 @default.
- W4287755514 hasOpenAccess W4287755514 @default.
- W4287755514 hasPrimaryLocation W42877555141 @default.
- W4287755514 hasRelatedWork W2291950750 @default.
- W4287755514 hasRelatedWork W2378841300 @default.
- W4287755514 hasRelatedWork W2953182765 @default.
- W4287755514 hasRelatedWork W2963335821 @default.
- W4287755514 hasRelatedWork W2970671394 @default.
- W4287755514 hasRelatedWork W3035289993 @default.
- W4287755514 hasRelatedWork W3129146497 @default.
- W4287755514 hasRelatedWork W3168993225 @default.
- W4287755514 hasRelatedWork W3173723779 @default.
- W4287755514 hasRelatedWork W4287343372 @default.
- W4287755514 isParatext "false" @default.
- W4287755514 isRetracted "false" @default.
- W4287755514 workType "article" @default.