Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387030753> ?p ?o ?g. }
- W4387030753 abstract "Recent experiments have shown that, often, when training a neural network with gradient descent (GD) with a step size $\eta$, the operator norm of the Hessian of the loss grows until it approximately reaches $2/\eta$, after which it fluctuates around this value. The quantity $2/\eta$ has been called the edge of stability based on consideration of a local quadratic approximation of the loss. We perform a similar calculation to arrive at an edge of stability for Sharpness-Aware Minimization (SAM), a variant of GD which has been shown to improve its generalization. Unlike the case for GD, the resulting SAM-edge depends on the norm of the gradient. Using three deep learning training tasks, we see empirically that SAM operates on the edge of stability identified by this analysis." @default.
- W4387030753 created "2023-09-26" @default.
- W4387030753 creator A5030261391 @default.
- W4387030753 creator A5084487318 @default.
- W4387030753 date "2023-09-21" @default.
- W4387030753 modified "2023-10-18" @default.
- W4387030753 title "Sharpness-Aware Minimization and the Edge of Stability" @default.
- W4387030753 doi "https://doi.org/10.48550/arxiv.2309.12488" @default.
- W4387030753 hasPublicationYear "2023" @default.
- W4387030753 type Work @default.
- W4387030753 citedByCount "0" @default.
- W4387030753 crossrefType "posted-content" @default.
- W4387030753 hasAuthorship W4387030753A5030261391 @default.
- W4387030753 hasAuthorship W4387030753A5084487318 @default.
- W4387030753 hasBestOaLocation W43870307531 @default.
- W4387030753 hasConcept C104317684 @default.
- W4387030753 hasConcept C112972136 @default.
- W4387030753 hasConcept C11413529 @default.
- W4387030753 hasConcept C119857082 @default.
- W4387030753 hasConcept C126255220 @default.
- W4387030753 hasConcept C129844170 @default.
- W4387030753 hasConcept C134306372 @default.
- W4387030753 hasConcept C147764199 @default.
- W4387030753 hasConcept C153258448 @default.
- W4387030753 hasConcept C154945302 @default.
- W4387030753 hasConcept C158448853 @default.
- W4387030753 hasConcept C162307627 @default.
- W4387030753 hasConcept C17020691 @default.
- W4387030753 hasConcept C177148314 @default.
- W4387030753 hasConcept C17744445 @default.
- W4387030753 hasConcept C185592680 @default.
- W4387030753 hasConcept C191795146 @default.
- W4387030753 hasConcept C199539241 @default.
- W4387030753 hasConcept C203616005 @default.
- W4387030753 hasConcept C2524010 @default.
- W4387030753 hasConcept C28826006 @default.
- W4387030753 hasConcept C33923547 @default.
- W4387030753 hasConcept C41008148 @default.
- W4387030753 hasConcept C50644808 @default.
- W4387030753 hasConcept C55493867 @default.
- W4387030753 hasConcept C86339819 @default.
- W4387030753 hasConceptScore W4387030753C104317684 @default.
- W4387030753 hasConceptScore W4387030753C112972136 @default.
- W4387030753 hasConceptScore W4387030753C11413529 @default.
- W4387030753 hasConceptScore W4387030753C119857082 @default.
- W4387030753 hasConceptScore W4387030753C126255220 @default.
- W4387030753 hasConceptScore W4387030753C129844170 @default.
- W4387030753 hasConceptScore W4387030753C134306372 @default.
- W4387030753 hasConceptScore W4387030753C147764199 @default.
- W4387030753 hasConceptScore W4387030753C153258448 @default.
- W4387030753 hasConceptScore W4387030753C154945302 @default.
- W4387030753 hasConceptScore W4387030753C158448853 @default.
- W4387030753 hasConceptScore W4387030753C162307627 @default.
- W4387030753 hasConceptScore W4387030753C17020691 @default.
- W4387030753 hasConceptScore W4387030753C177148314 @default.
- W4387030753 hasConceptScore W4387030753C17744445 @default.
- W4387030753 hasConceptScore W4387030753C185592680 @default.
- W4387030753 hasConceptScore W4387030753C191795146 @default.
- W4387030753 hasConceptScore W4387030753C199539241 @default.
- W4387030753 hasConceptScore W4387030753C203616005 @default.
- W4387030753 hasConceptScore W4387030753C2524010 @default.
- W4387030753 hasConceptScore W4387030753C28826006 @default.
- W4387030753 hasConceptScore W4387030753C33923547 @default.
- W4387030753 hasConceptScore W4387030753C41008148 @default.
- W4387030753 hasConceptScore W4387030753C50644808 @default.
- W4387030753 hasConceptScore W4387030753C55493867 @default.
- W4387030753 hasConceptScore W4387030753C86339819 @default.
- W4387030753 hasLocation W43870307531 @default.
- W4387030753 hasOpenAccess W4387030753 @default.
- W4387030753 hasPrimaryLocation W43870307531 @default.
- W4387030753 hasRelatedWork W1854082044 @default.
- W4387030753 hasRelatedWork W1983003491 @default.
- W4387030753 hasRelatedWork W1989888202 @default.
- W4387030753 hasRelatedWork W1996936972 @default.
- W4387030753 hasRelatedWork W2015677538 @default.
- W4387030753 hasRelatedWork W2355987247 @default.
- W4387030753 hasRelatedWork W3021699548 @default.
- W4387030753 hasRelatedWork W3169007055 @default.
- W4387030753 hasRelatedWork W4283017538 @default.
- W4387030753 hasRelatedWork W4297883503 @default.
- W4387030753 isParatext "false" @default.
- W4387030753 isRetracted "false" @default.
- W4387030753 workType "article" @default.
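
The abstract above refers to the edge of stability $2/\eta$ for plain gradient descent, obtained from a local quadratic approximation of the loss. The following is a minimal sketch of that GD calculation only, not code from the paper: it assumes a hypothetical one-dimensional quadratic loss f(x) = 0.5 * curvature * x**2, where each GD step multiplies x by (1 - step_size * curvature), so the iterates shrink when curvature < 2/step_size and grow when curvature > 2/step_size. The function name and all numeric values are illustrative assumptions. The SAM-edge from the paper is not reproduced here.

```python
def gd_on_quadratic(curvature, step_size, x0=1.0, num_steps=50):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2.

    Returns |x| after num_steps updates. This toy loss is an
    illustrative assumption, not a setup taken from the paper.
    """
    x = x0
    for _ in range(num_steps):
        grad = curvature * x          # f'(x) for the quadratic
        x = x - step_size * grad      # plain GD update
    return abs(x)


step_size = 0.1
edge = 2.0 / step_size                # edge of stability for GD: 2 / eta

# Curvature slightly below the edge: iterates contract toward 0.
below = gd_on_quadratic(0.9 * edge, step_size)

# Curvature slightly above the edge: iterates oscillate and blow up.
above = gd_on_quadratic(1.1 * edge, step_size)

print(f"edge = {edge}, |x| below edge = {below:.3e}, |x| above edge = {above:.3e}")
```

In this sketch the curvature plays the role of the Hessian operator norm in the abstract; the 10% margins on either side of the edge are arbitrary and chosen only to make the contrast visible within 50 steps.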