Matches in SemOpenAlex for { <https://semopenalex.org/work/W4310658169> ?p ?o ?g. }
Showing items 1 to 85 of 85 with 100 items per page.
- W4310658169 abstract "A quadratic approximation of neural network loss landscapes has been extensively used to study the optimization process of these networks. However, because it usually holds only in a very small neighborhood of the minimum, it cannot explain many phenomena observed during the optimization process. In this work, we study the structure of neural network loss functions and its implications for optimization in a region beyond the reach of a good quadratic approximation. Numerically, we observe that neural network loss functions possess a multiscale structure, manifested in two ways: (1) in a neighborhood of minima, the loss mixes a continuum of scales and grows subquadratically, and (2) in a larger region, the loss clearly shows several separate scales. Using the subquadratic growth, we are able to explain the Edge of Stability phenomenon [5] observed for the gradient descent (GD) method. Using the separate scales, we explain the working mechanism of learning rate decay by simple examples. Finally, we study the origin of the multiscale structure and propose that the non-convexity of the models and the non-uniformity of training data are among the causes. By constructing a two-layer neural network problem, we show that training data with different magnitudes give rise to different scales of the loss function, producing subquadratic growth and multiple separate scales." @default.
- W4310658169 created "2022-12-14" @default.
- W4310658169 creator A5011918131 @default.
- W4310658169 creator A5037955577 @default.
- W4310658169 creator A5048025879 @default.
- W4310658169 creator A5057640513 @default.
- W4310658169 date "2022-04-24" @default.
- W4310658169 modified "2023-10-16" @default.
- W4310658169 title "Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes" @default.
- W4310658169 doi "https://doi.org/10.48550/arxiv.2204.11326" @default.
- W4310658169 hasPublicationYear "2022" @default.
- W4310658169 type Work @default.
- W4310658169 citedByCount "1" @default.
- W4310658169 countsByYear W43106581692023 @default.
- W4310658169 crossrefType "posted-content" @default.
- W4310658169 hasAuthorship W4310658169A5011918131 @default.
- W4310658169 hasAuthorship W4310658169A5037955577 @default.
- W4310658169 hasAuthorship W4310658169A5048025879 @default.
- W4310658169 hasAuthorship W4310658169A5057640513 @default.
- W4310658169 hasBestOaLocation W43106581691 @default.
- W4310658169 hasConcept C106159729 @default.
- W4310658169 hasConcept C112972136 @default.
- W4310658169 hasConcept C11413529 @default.
- W4310658169 hasConcept C119857082 @default.
- W4310658169 hasConcept C121332964 @default.
- W4310658169 hasConcept C121864883 @default.
- W4310658169 hasConcept C126255220 @default.
- W4310658169 hasConcept C129844170 @default.
- W4310658169 hasConcept C134306372 @default.
- W4310658169 hasConcept C14036430 @default.
- W4310658169 hasConcept C153258448 @default.
- W4310658169 hasConcept C154945302 @default.
- W4310658169 hasConcept C162324750 @default.
- W4310658169 hasConcept C166437778 @default.
- W4310658169 hasConcept C186633575 @default.
- W4310658169 hasConcept C195956108 @default.
- W4310658169 hasConcept C2524010 @default.
- W4310658169 hasConcept C28826006 @default.
- W4310658169 hasConcept C33923547 @default.
- W4310658169 hasConcept C41008148 @default.
- W4310658169 hasConcept C50644808 @default.
- W4310658169 hasConcept C72134830 @default.
- W4310658169 hasConcept C78458016 @default.
- W4310658169 hasConcept C86803240 @default.
- W4310658169 hasConceptScore W4310658169C106159729 @default.
- W4310658169 hasConceptScore W4310658169C112972136 @default.
- W4310658169 hasConceptScore W4310658169C11413529 @default.
- W4310658169 hasConceptScore W4310658169C119857082 @default.
- W4310658169 hasConceptScore W4310658169C121332964 @default.
- W4310658169 hasConceptScore W4310658169C121864883 @default.
- W4310658169 hasConceptScore W4310658169C126255220 @default.
- W4310658169 hasConceptScore W4310658169C129844170 @default.
- W4310658169 hasConceptScore W4310658169C134306372 @default.
- W4310658169 hasConceptScore W4310658169C14036430 @default.
- W4310658169 hasConceptScore W4310658169C153258448 @default.
- W4310658169 hasConceptScore W4310658169C154945302 @default.
- W4310658169 hasConceptScore W4310658169C162324750 @default.
- W4310658169 hasConceptScore W4310658169C166437778 @default.
- W4310658169 hasConceptScore W4310658169C186633575 @default.
- W4310658169 hasConceptScore W4310658169C195956108 @default.
- W4310658169 hasConceptScore W4310658169C2524010 @default.
- W4310658169 hasConceptScore W4310658169C28826006 @default.
- W4310658169 hasConceptScore W4310658169C33923547 @default.
- W4310658169 hasConceptScore W4310658169C41008148 @default.
- W4310658169 hasConceptScore W4310658169C50644808 @default.
- W4310658169 hasConceptScore W4310658169C72134830 @default.
- W4310658169 hasConceptScore W4310658169C78458016 @default.
- W4310658169 hasConceptScore W4310658169C86803240 @default.
- W4310658169 hasLocation W43106581691 @default.
- W4310658169 hasLocation W43106581692 @default.
- W4310658169 hasOpenAccess W4310658169 @default.
- W4310658169 hasPrimaryLocation W43106581691 @default.
- W4310658169 hasRelatedWork W2082482750 @default.
- W4310658169 hasRelatedWork W2092244978 @default.
- W4310658169 hasRelatedWork W2108861020 @default.
- W4310658169 hasRelatedWork W2989472764 @default.
- W4310658169 hasRelatedWork W3037508544 @default.
- W4310658169 hasRelatedWork W3166091992 @default.
- W4310658169 hasRelatedWork W4287751218 @default.
- W4310658169 hasRelatedWork W4310658169 @default.
- W4310658169 hasRelatedWork W2622138135 @default.
- W4310658169 hasRelatedWork W4224930298 @default.
- W4310658169 isParatext "false" @default.
- W4310658169 isRetracted "false" @default.
- W4310658169 workType "article" @default.
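
The listing above is the result of matching the pattern `<https://semopenalex.org/work/W4310658169> ?p ?o ?g` and paging through the matches. As a minimal sketch, the same lookup could be phrased as a SPARQL query string built in Python. The function name `build_work_query` and its parameters are hypothetical, and the query assumes the statements live in the default graph (as the `@default` marker on each line suggests), so it uses a plain triple pattern rather than a `GRAPH` clause. No endpoint is contacted here; the block only constructs the query text.

```python
def build_work_query(subject_iri: str, limit: int = 100, offset: int = 0) -> str:
    """Build a SPARQL SELECT query listing every predicate/object of one subject.

    Hypothetical helper for illustration only; it is not part of any
    SemOpenAlex client library.
    """
    # Accept only absolute HTTP(S) IRIs so the value can be wrapped in <...>.
    if not subject_iri.startswith(("http://", "https://")):
        raise ValueError("subject_iri must be an absolute HTTP(S) IRI")
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in subject_iri for ch in '<>" {}|\\^`'):
        raise ValueError("subject_iri contains characters not allowed in an IRI")
    return (
        "SELECT ?p ?o WHERE {\n"
        f"  <{subject_iri}> ?p ?o .\n"
        "}\n"
        f"LIMIT {int(limit)} OFFSET {int(offset)}"
    )


# Page size of 100 mirrors the "100 items per page" setting of the listing.
query = build_work_query("https://semopenalex.org/work/W4310658169")
print(query)
```

The resulting string can be sent to any SPARQL 1.1 endpoint that serves this data; raising `offset` in steps of `limit` would walk through further pages if a subject had more than 100 statements.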