Matches in SemOpenAlex for { <https://semopenalex.org/work/W4380559103> ?p ?o ?g. }
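The pattern above binds every predicate (`?p`) and object (`?o`) attached to the work IRI. As a minimal sketch, it could be expressed as a SPARQL `SELECT` query and encoded as a GET request URL. Assumptions: the endpoint URL `https://semopenalex.org/sparql` and the helper names `build_work_query` and `build_request_url` are illustrative and do not come from this listing. Every row in the listing shows `@default` for `?g`, so the sketch queries the default graph and omits `?g`. The code only builds strings and sends no request.

```python
from urllib.parse import urlencode

# Assumed endpoint URL (not stated in the listing); verify before use.
ENDPOINT = "https://semopenalex.org/sparql"


def build_work_query(work_id: str) -> str:
    """Return a SPARQL query listing every predicate/object pair for one work."""
    iri = f"https://semopenalex.org/work/{work_id}"
    return (
        "SELECT ?p ?o WHERE {\n"
        f"  <{iri}> ?p ?o .\n"
        "}"
    )


def build_request_url(work_id: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a GET URL with a percent-encoded `query` parameter."""
    return endpoint + "?" + urlencode({"query": build_work_query(work_id)})


query = build_work_query("W4380559103")
url = build_request_url("W4380559103")
print(query)
print(url)
```

Fetching `url` with any HTTP client, with an `Accept` header such as `application/sparql-results+json`, would be the next step if the assumed endpoint is correct.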
- W4380559103 abstract "Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules, or data selection schemes) could save time, save computational resources, and lead to better, more accurate, models. Unfortunately, as a community, we are currently unable to reliably identify training algorithm improvements, or even determine the state-of-the-art training algorithm. In this work, using concrete experiments, we argue that real progress in speeding up training requires new benchmarks that resolve three basic challenges faced by empirical comparisons of training algorithms: (1) how to decide when training is complete and precisely measure training time, (2) how to handle the sensitivity of measurements to exact workload details, and (3) how to fairly compare algorithms that require hyperparameter tuning. In order to address these challenges, we introduce a new, competitive, time-to-result benchmark using multiple workloads running on fixed hardware, the AlgoPerf: Training Algorithms benchmark. Our benchmark includes a set of workload variants that make it possible to detect benchmark submissions that are more robust to workload changes than current widely-used methods. Finally, we evaluate baseline submissions constructed using various optimizers that represent current practice, as well as other optimizers that have recently received attention in the literature. These baseline results collectively demonstrate the feasibility of our benchmark, show that non-trivial gaps between methods exist, and set a provisional state-of-the-art for future benchmark submissions to try and surpass." @default.
- W4380559103 created "2023-06-14" @default.
- W4380559103 creator A5007875487 @default.
- W4380559103 creator A5014035780 @default.
- W4380559103 creator A5016611202 @default.
- W4380559103 creator A5017393194 @default.
- W4380559103 creator A5021250924 @default.
- W4380559103 creator A5022872779 @default.
- W4380559103 creator A5023463902 @default.
- W4380559103 creator A5036486668 @default.
- W4380559103 creator A5037613447 @default.
- W4380559103 creator A5038526086 @default.
- W4380559103 creator A5038867895 @default.
- W4380559103 creator A5047062711 @default.
- W4380559103 creator A5048682599 @default.
- W4380559103 creator A5053488528 @default.
- W4380559103 creator A5053548074 @default.
- W4380559103 creator A5054711904 @default.
- W4380559103 creator A5056776503 @default.
- W4380559103 creator A5070181917 @default.
- W4380559103 creator A5073274411 @default.
- W4380559103 creator A5077710279 @default.
- W4380559103 creator A5078051293 @default.
- W4380559103 creator A5082238359 @default.
- W4380559103 creator A5083104292 @default.
- W4380559103 creator A5083291604 @default.
- W4380559103 creator A5091193777 @default.
- W4380559103 date "2023-06-12" @default.
- W4380559103 modified "2023-09-23" @default.
- W4380559103 title "Benchmarking Neural Network Training Algorithms" @default.
- W4380559103 doi "https://doi.org/10.48550/arxiv.2306.07179" @default.
- W4380559103 hasPublicationYear "2023" @default.
- W4380559103 type Work @default.
- W4380559103 citedByCount "0" @default.
- W4380559103 crossrefType "posted-content" @default.
- W4380559103 hasAuthorship W4380559103A5007875487 @default.
- W4380559103 hasAuthorship W4380559103A5014035780 @default.
- W4380559103 hasAuthorship W4380559103A5016611202 @default.
- W4380559103 hasAuthorship W4380559103A5017393194 @default.
- W4380559103 hasAuthorship W4380559103A5021250924 @default.
- W4380559103 hasAuthorship W4380559103A5022872779 @default.
- W4380559103 hasAuthorship W4380559103A5023463902 @default.
- W4380559103 hasAuthorship W4380559103A5036486668 @default.
- W4380559103 hasAuthorship W4380559103A5037613447 @default.
- W4380559103 hasAuthorship W4380559103A5038526086 @default.
- W4380559103 hasAuthorship W4380559103A5038867895 @default.
- W4380559103 hasAuthorship W4380559103A5047062711 @default.
- W4380559103 hasAuthorship W4380559103A5048682599 @default.
- W4380559103 hasAuthorship W4380559103A5053488528 @default.
- W4380559103 hasAuthorship W4380559103A5053548074 @default.
- W4380559103 hasAuthorship W4380559103A5054711904 @default.
- W4380559103 hasAuthorship W4380559103A5056776503 @default.
- W4380559103 hasAuthorship W4380559103A5070181917 @default.
- W4380559103 hasAuthorship W4380559103A5073274411 @default.
- W4380559103 hasAuthorship W4380559103A5077710279 @default.
- W4380559103 hasAuthorship W4380559103A5078051293 @default.
- W4380559103 hasAuthorship W4380559103A5082238359 @default.
- W4380559103 hasAuthorship W4380559103A5083104292 @default.
- W4380559103 hasAuthorship W4380559103A5083291604 @default.
- W4380559103 hasAuthorship W4380559103A5091193777 @default.
- W4380559103 hasBestOaLocation W43805591031 @default.
- W4380559103 hasConcept C111368507 @default.
- W4380559103 hasConcept C111919701 @default.
- W4380559103 hasConcept C11413529 @default.
- W4380559103 hasConcept C119857082 @default.
- W4380559103 hasConcept C12725497 @default.
- W4380559103 hasConcept C127313418 @default.
- W4380559103 hasConcept C13280743 @default.
- W4380559103 hasConcept C136197465 @default.
- W4380559103 hasConcept C144133560 @default.
- W4380559103 hasConcept C154945302 @default.
- W4380559103 hasConcept C162853370 @default.
- W4380559103 hasConcept C177264268 @default.
- W4380559103 hasConcept C185798385 @default.
- W4380559103 hasConcept C199360897 @default.
- W4380559103 hasConcept C205649164 @default.
- W4380559103 hasConcept C2778476105 @default.
- W4380559103 hasConcept C41008148 @default.
- W4380559103 hasConcept C43521106 @default.
- W4380559103 hasConcept C50644808 @default.
- W4380559103 hasConcept C86251818 @default.
- W4380559103 hasConcept C8642999 @default.
- W4380559103 hasConceptScore W4380559103C111368507 @default.
- W4380559103 hasConceptScore W4380559103C111919701 @default.
- W4380559103 hasConceptScore W4380559103C11413529 @default.
- W4380559103 hasConceptScore W4380559103C119857082 @default.
- W4380559103 hasConceptScore W4380559103C12725497 @default.
- W4380559103 hasConceptScore W4380559103C127313418 @default.
- W4380559103 hasConceptScore W4380559103C13280743 @default.
- W4380559103 hasConceptScore W4380559103C136197465 @default.
- W4380559103 hasConceptScore W4380559103C144133560 @default.
- W4380559103 hasConceptScore W4380559103C154945302 @default.
- W4380559103 hasConceptScore W4380559103C162853370 @default.
- W4380559103 hasConceptScore W4380559103C177264268 @default.
- W4380559103 hasConceptScore W4380559103C185798385 @default.
- W4380559103 hasConceptScore W4380559103C199360897 @default.
- W4380559103 hasConceptScore W4380559103C205649164 @default.
- W4380559103 hasConceptScore W4380559103C2778476105 @default.
- W4380559103 hasConceptScore W4380559103C41008148 @default.
- W4380559103 hasConceptScore W4380559103C43521106 @default.