Matches in SemOpenAlex for { <https://semopenalex.org/work/W4225534213> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W4225534213 abstract "As one of the most fundamental stochastic optimization algorithms, stochastic gradient descent (SGD) has been intensively developed and extensively applied in machine learning in the past decade. There have been some modified SGD-type algorithms, which outperform the SGD in many competitions and applications in terms of convergence rate and accuracy, such as momentum-based SGD (mSGD) and adaptive gradient algorithm (AdaGrad). Despite these empirical successes, the theoretical properties of these algorithms have not been well established due to technical difficulties. With this motivation, we focus on convergence analysis of mSGD and AdaGrad for any smooth (possibly non-convex) loss functions in stochastic optimization. First, we prove that the iterates of mSGD are asymptotically convergent to a connected set of stationary points with probability one, which is more general than existing works on subsequence convergence or convergence of time averages. Moreover, we prove that the loss function of mSGD decays at a certain rate faster than that of SGD. In addition, we prove the iterates of AdaGrad are asymptotically convergent to a connected set of stationary points with probability one. Also, this result extends the results from the literature on subsequence convergence and the convergence of time averages. Despite the generality of the above convergence results, we have relaxed some assumptions of gradient noises, convexity of loss functions, as well as boundedness of iterates." @default.
- W4225534213 created "2022-05-05" @default.
- W4225534213 creator A5022886989 @default.
- W4225534213 creator A5035064408 @default.
- W4225534213 creator A5078221177 @default.
- W4225534213 date "2022-01-26" @default.
- W4225534213 modified "2023-10-16" @default.
- W4225534213 title "On the Convergence of mSGD and AdaGrad for Stochastic Optimization" @default.
- W4225534213 doi "https://doi.org/10.48550/arxiv.2201.11204" @default.
- W4225534213 hasPublicationYear "2022" @default.
- W4225534213 type Work @default.
- W4225534213 citedByCount "0" @default.
- W4225534213 crossrefType "posted-content" @default.
- W4225534213 hasAuthorship W4225534213A5022886989 @default.
- W4225534213 hasAuthorship W4225534213A5035064408 @default.
- W4225534213 hasAuthorship W4225534213A5078221177 @default.
- W4225534213 hasBestOaLocation W42255342131 @default.
- W4225534213 hasConcept C106159729 @default.
- W4225534213 hasConcept C112680207 @default.
- W4225534213 hasConcept C126255220 @default.
- W4225534213 hasConcept C127162648 @default.
- W4225534213 hasConcept C134306372 @default.
- W4225534213 hasConcept C137877099 @default.
- W4225534213 hasConcept C140479938 @default.
- W4225534213 hasConcept C145446738 @default.
- W4225534213 hasConcept C154945302 @default.
- W4225534213 hasConcept C162324750 @default.
- W4225534213 hasConcept C189237950 @default.
- W4225534213 hasConcept C206688291 @default.
- W4225534213 hasConcept C2524010 @default.
- W4225534213 hasConcept C2777303404 @default.
- W4225534213 hasConcept C28826006 @default.
- W4225534213 hasConcept C31258907 @default.
- W4225534213 hasConcept C33923547 @default.
- W4225534213 hasConcept C34388435 @default.
- W4225534213 hasConcept C41008148 @default.
- W4225534213 hasConcept C50522688 @default.
- W4225534213 hasConcept C50644808 @default.
- W4225534213 hasConcept C57869625 @default.
- W4225534213 hasConcept C72134830 @default.
- W4225534213 hasConceptScore W4225534213C106159729 @default.
- W4225534213 hasConceptScore W4225534213C112680207 @default.
- W4225534213 hasConceptScore W4225534213C126255220 @default.
- W4225534213 hasConceptScore W4225534213C127162648 @default.
- W4225534213 hasConceptScore W4225534213C134306372 @default.
- W4225534213 hasConceptScore W4225534213C137877099 @default.
- W4225534213 hasConceptScore W4225534213C140479938 @default.
- W4225534213 hasConceptScore W4225534213C145446738 @default.
- W4225534213 hasConceptScore W4225534213C154945302 @default.
- W4225534213 hasConceptScore W4225534213C162324750 @default.
- W4225534213 hasConceptScore W4225534213C189237950 @default.
- W4225534213 hasConceptScore W4225534213C206688291 @default.
- W4225534213 hasConceptScore W4225534213C2524010 @default.
- W4225534213 hasConceptScore W4225534213C2777303404 @default.
- W4225534213 hasConceptScore W4225534213C28826006 @default.
- W4225534213 hasConceptScore W4225534213C31258907 @default.
- W4225534213 hasConceptScore W4225534213C33923547 @default.
- W4225534213 hasConceptScore W4225534213C34388435 @default.
- W4225534213 hasConceptScore W4225534213C41008148 @default.
- W4225534213 hasConceptScore W4225534213C50522688 @default.
- W4225534213 hasConceptScore W4225534213C50644808 @default.
- W4225534213 hasConceptScore W4225534213C57869625 @default.
- W4225534213 hasConceptScore W4225534213C72134830 @default.
- W4225534213 hasLocation W42255342131 @default.
- W4225534213 hasOpenAccess W4225534213 @default.
- W4225534213 hasPrimaryLocation W42255342131 @default.
- W4225534213 hasRelatedWork W2416347099 @default.
- W4225534213 hasRelatedWork W2952233551 @default.
- W4225534213 hasRelatedWork W3035289993 @default.
- W4225534213 hasRelatedWork W3132170983 @default.
- W4225534213 hasRelatedWork W3153183982 @default.
- W4225534213 hasRelatedWork W3201804720 @default.
- W4225534213 hasRelatedWork W4225534213 @default.
- W4225534213 hasRelatedWork W4226236775 @default.
- W4225534213 hasRelatedWork W4288322143 @default.
- W4225534213 hasRelatedWork W4298055110 @default.
- W4225534213 isParatext "false" @default.
- W4225534213 isRetracted "false" @default.
- W4225534213 workType "article" @default.