Matches in SemOpenAlex for { <https://semopenalex.org/work/W4288593358> ?p ?o ?g. }
Showing items 1 to 33 of 33 with 100 items per page.
- W4288593358 abstract "State-of-the-art models are now trained with billions of parameters, reaching hardware limits in terms of memory consumption. This has created a recent demand for memory-efficient optimizers. To this end, we investigate the limits and performance tradeoffs of memory-efficient adaptively preconditioned gradient methods. We propose extreme tensoring for high-dimensional stochastic optimization, showing that an optimizer needs very little memory to benefit from adaptive preconditioning. Our technique applies to arbitrary models (not necessarily with tensor-shaped parameters), and is accompanied by regret and convergence guarantees, which shed light on the tradeoffs between preconditioner quality and expressivity. On a large-scale NLP model, we reduce the optimizer memory overhead by three orders of magnitude, without degrading performance." @default.
- W4288593358 created "2022-07-29" @default.
- W4288593358 creator A5007875487 @default.
- W4288593358 creator A5024431603 @default.
- W4288593358 creator A5043228083 @default.
- W4288593358 creator A5044544424 @default.
- W4288593358 creator A5087275727 @default.
- W4288593358 date "2019-02-12" @default.
- W4288593358 modified "2023-09-27" @default.
- W4288593358 title "Extreme Tensoring for Low-Memory Preconditioning" @default.
- W4288593358 doi "https://doi.org/10.48550/arxiv.1902.04620" @default.
- W4288593358 hasPublicationYear "2019" @default.
- W4288593358 type Work @default.
- W4288593358 citedByCount "0" @default.
- W4288593358 crossrefType "posted-content" @default.
- W4288593358 hasAuthorship W4288593358A5007875487 @default.
- W4288593358 hasAuthorship W4288593358A5024431603 @default.
- W4288593358 hasAuthorship W4288593358A5043228083 @default.
- W4288593358 hasAuthorship W4288593358A5044544424 @default.
- W4288593358 hasAuthorship W4288593358A5087275727 @default.
- W4288593358 hasBestOaLocation W42885933581 @default.
- W4288593358 hasConcept C15744967 @default.
- W4288593358 hasConcept C180747234 @default.
- W4288593358 hasConcept C41008148 @default.
- W4288593358 hasConceptScore W4288593358C15744967 @default.
- W4288593358 hasConceptScore W4288593358C180747234 @default.
- W4288593358 hasConceptScore W4288593358C41008148 @default.
- W4288593358 hasLocation W42885933581 @default.
- W4288593358 hasOpenAccess W4288593358 @default.
- W4288593358 hasPrimaryLocation W42885933581 @default.
- W4288593358 isParatext "false" @default.
- W4288593358 isRetracted "false" @default.
- W4288593358 workType "article" @default.