Matches in SemOpenAlex for { <https://semopenalex.org/work/W4378942360> ?p ?o ?g. }
- W4378942360 abstract "We provide a simple convergence proof for AdaGrad optimizing non-convex objectives under only affine noise variance and bounded smoothness assumptions. The proof is essentially based on a novel auxiliary function $\xi$ that helps eliminate the complexity of handling the correlation between the numerator and denominator of AdaGrad's update. Leveraging simple proofs, we are able to obtain tighter results than existing results \citep{faw2022power} and extend the analysis to several new and important cases. Specifically, for the over-parameterized regime, we show that AdaGrad needs only $\mathcal{O}(\frac{1}{\varepsilon^2})$ iterations to ensure the gradient norm is smaller than $\varepsilon$, which matches the rate of SGD and is significantly tighter than existing rates $\mathcal{O}(\frac{1}{\varepsilon^4})$ for AdaGrad. We then discard the bounded smoothness assumption and consider a realistic assumption on smoothness called the $(L_0,L_1)$-smooth condition, which allows local smoothness to grow with the gradient norm. Again based on the auxiliary function $\xi$, we prove that AdaGrad succeeds in converging under the $(L_0,L_1)$-smooth condition as long as the learning rate is lower than a threshold. Interestingly, we further show that the requirement on learning rate under the $(L_0,L_1)$-smooth condition is necessary via proof by contradiction, in contrast with the case of uniform smoothness conditions where convergence is guaranteed regardless of learning rate choices. Together, our analyses broaden the understanding of AdaGrad and demonstrate the power of the new auxiliary function in the investigations of AdaGrad." @default.
- W4378942360 created "2023-06-01" @default.
- W4378942360 creator A5017541508 @default.
- W4378942360 creator A5042018939 @default.
- W4378942360 creator A5066928490 @default.
- W4378942360 creator A5078600050 @default.
- W4378942360 date "2023-05-29" @default.
- W4378942360 modified "2023-10-01" @default.
- W4378942360 title "Convergence of AdaGrad for Non-convex Objectives: Simple Proofs and Relaxed Assumptions" @default.
- W4378942360 doi "https://doi.org/10.48550/arxiv.2305.18471" @default.
- W4378942360 hasPublicationYear "2023" @default.
- W4378942360 type Work @default.
- W4378942360 citedByCount "0" @default.
- W4378942360 crossrefType "posted-content" @default.
- W4378942360 hasAuthorship W4378942360A5017541508 @default.
- W4378942360 hasAuthorship W4378942360A5042018939 @default.
- W4378942360 hasAuthorship W4378942360A5066928490 @default.
- W4378942360 hasAuthorship W4378942360A5078600050 @default.
- W4378942360 hasBestOaLocation W43789423601 @default.
- W4378942360 hasConcept C102634674 @default.
- W4378942360 hasConcept C108710211 @default.
- W4378942360 hasConcept C111472728 @default.
- W4378942360 hasConcept C112680207 @default.
- W4378942360 hasConcept C114614502 @default.
- W4378942360 hasConcept C127162648 @default.
- W4378942360 hasConcept C134306372 @default.
- W4378942360 hasConcept C138885662 @default.
- W4378942360 hasConcept C14036430 @default.
- W4378942360 hasConcept C145446738 @default.
- W4378942360 hasConcept C165464430 @default.
- W4378942360 hasConcept C17744445 @default.
- W4378942360 hasConcept C191795146 @default.
- W4378942360 hasConcept C199539241 @default.
- W4378942360 hasConcept C2524010 @default.
- W4378942360 hasConcept C2780586882 @default.
- W4378942360 hasConcept C28826006 @default.
- W4378942360 hasConcept C31258907 @default.
- W4378942360 hasConcept C33923547 @default.
- W4378942360 hasConcept C34388435 @default.
- W4378942360 hasConcept C41008148 @default.
- W4378942360 hasConcept C57869625 @default.
- W4378942360 hasConcept C78458016 @default.
- W4378942360 hasConcept C86803240 @default.
- W4378942360 hasConceptScore W4378942360C102634674 @default.
- W4378942360 hasConceptScore W4378942360C108710211 @default.
- W4378942360 hasConceptScore W4378942360C111472728 @default.
- W4378942360 hasConceptScore W4378942360C112680207 @default.
- W4378942360 hasConceptScore W4378942360C114614502 @default.
- W4378942360 hasConceptScore W4378942360C127162648 @default.
- W4378942360 hasConceptScore W4378942360C134306372 @default.
- W4378942360 hasConceptScore W4378942360C138885662 @default.
- W4378942360 hasConceptScore W4378942360C14036430 @default.
- W4378942360 hasConceptScore W4378942360C145446738 @default.
- W4378942360 hasConceptScore W4378942360C165464430 @default.
- W4378942360 hasConceptScore W4378942360C17744445 @default.
- W4378942360 hasConceptScore W4378942360C191795146 @default.
- W4378942360 hasConceptScore W4378942360C199539241 @default.
- W4378942360 hasConceptScore W4378942360C2524010 @default.
- W4378942360 hasConceptScore W4378942360C2780586882 @default.
- W4378942360 hasConceptScore W4378942360C28826006 @default.
- W4378942360 hasConceptScore W4378942360C31258907 @default.
- W4378942360 hasConceptScore W4378942360C33923547 @default.
- W4378942360 hasConceptScore W4378942360C34388435 @default.
- W4378942360 hasConceptScore W4378942360C41008148 @default.
- W4378942360 hasConceptScore W4378942360C57869625 @default.
- W4378942360 hasConceptScore W4378942360C78458016 @default.
- W4378942360 hasConceptScore W4378942360C86803240 @default.
- W4378942360 hasLocation W43789423601 @default.
- W4378942360 hasOpenAccess W4378942360 @default.
- W4378942360 hasPrimaryLocation W43789423601 @default.
- W4378942360 hasRelatedWork W2023860049 @default.
- W4378942360 hasRelatedWork W2174650017 @default.
- W4378942360 hasRelatedWork W2400034325 @default.
- W4378942360 hasRelatedWork W2599293435 @default.
- W4378942360 hasRelatedWork W29717965 @default.
- W4378942360 hasRelatedWork W3100513551 @default.
- W4378942360 hasRelatedWork W3136792315 @default.
- W4378942360 hasRelatedWork W314650428 @default.
- W4378942360 hasRelatedWork W4221148781 @default.
- W4378942360 hasRelatedWork W4283010137 @default.
- W4378942360 isParatext "false" @default.
- W4378942360 isRetracted "false" @default.
- W4378942360 workType "article" @default.
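
The abstract in this record refers to "the correlation between the numerator and denominator of AdaGrad's update". As a rough illustration only, the sketch below shows a norm-style AdaGrad step in plain Python. It is not taken from the record or from the paper: the variant chosen (one scalar accumulator shared by all coordinates), the function names `adagrad_norm_step` and `run`, the quadratic test objective, and the default values of `lr` and `eps` are all assumptions made for the example.

```python
import math


def adagrad_norm_step(w, grad, accum, lr=0.5, eps=1e-8):
    """One norm-style AdaGrad step (illustrative sketch, hypothetical names).

    The current gradient appears twice: in the numerator (grad) and in the
    denominator (via accum). That coupling is the correlation the abstract
    mentions.
    """
    # Denominator state: running sum of squared gradient norms,
    # including the current gradient.
    accum = accum + sum(g * g for g in grad)
    # Effective step size shrinks as gradients accumulate.
    scale = lr / (math.sqrt(accum) + eps)
    new_w = [wi - scale * gi for wi, gi in zip(w, grad)]
    return new_w, accum


def run(grad_fn, w0, steps=200, lr=0.5):
    """Apply the step repeatedly, starting from w0 with an empty accumulator."""
    w, accum = list(w0), 0.0
    for _ in range(steps):
        w, accum = adagrad_norm_step(w, grad_fn(w), accum, lr=lr)
    return w, accum


# Assumed toy objective: f(w) = 0.5 * ||w||^2, whose gradient is w itself.
w_final, accum_final = run(lambda w: list(w), [1.0, -2.0])
print(math.sqrt(sum(x * x for x in w_final)))  # gradient norm at the end
```

On this smooth toy objective the effective step size `lr / sqrt(accum)` adapts on its own, which is loosely in the spirit of the abstract's remark that, under uniform smoothness, convergence does not hinge on the learning rate choice. The sketch does not model the $(L_0,L_1)$-smooth setting or the noise assumptions.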