Matches in SemOpenAlex for { <https://semopenalex.org/work/W4311551486> ?p ?o ?g. }
- W4311551486 abstract "The importance of learning rate (LR) schedules on network pruning has been observed in a few recent works. As an example, Frankle and Carbin (2019) highlighted that winning tickets (i.e., accuracy-preserving subnetworks) cannot be found without applying an LR warmup schedule, and Renda, Frankle and Carbin (2020) demonstrated that rewinding the LR to its initial state at the end of each pruning cycle improves performance. In this paper, we go one step further by first providing a theoretical justification for the surprising effect of LR schedules. Next, we propose an LR schedule for network pruning called SILO, which stands for S-shaped Improved Learning rate Optimization. The advantages of SILO over existing state-of-the-art (SOTA) LR schedules are two-fold: (i) SILO has a strong theoretical motivation and dynamically adjusts the LR during pruning to improve generalization. Specifically, SILO increases the LR upper bound (max_lr) in an S-shape. This leads to an improvement of 2% to 4% in extensive experiments with various types of networks (e.g., Vision Transformers, ResNet) on popular datasets such as ImageNet and CIFAR-10/100. (ii) In addition to the strong theoretical motivation, SILO is empirically optimal in the sense of matching an Oracle, which exhaustively searches for the optimal value of max_lr via grid search. We find that SILO is able to precisely adjust the value of max_lr to be within the Oracle-optimized interval, resulting in performance competitive with the Oracle with significantly lower complexity." @default.
- W4311551486 created "2022-12-27" @default.
- W4311551486 creator A5023038428 @default.
- W4311551486 creator A5028037531 @default.
- W4311551486 creator A5068845707 @default.
- W4311551486 creator A5069355437 @default.
- W4311551486 date "2022-12-09" @default.
- W4311551486 modified "2023-09-26" @default.
- W4311551486 title "Optimizing Learning Rate Schedules for Iterative Pruning of Deep Neural Networks" @default.
- W4311551486 doi "https://doi.org/10.48550/arxiv.2212.06144" @default.
- W4311551486 hasPublicationYear "2022" @default.
- W4311551486 type Work @default.
- W4311551486 citedByCount "0" @default.
- W4311551486 crossrefType "posted-content" @default.
- W4311551486 hasAuthorship W4311551486A5023038428 @default.
- W4311551486 hasAuthorship W4311551486A5028037531 @default.
- W4311551486 hasAuthorship W4311551486A5068845707 @default.
- W4311551486 hasAuthorship W4311551486A5069355437 @default.
- W4311551486 hasBestOaLocation W43115514861 @default.
- W4311551486 hasConcept C105795698 @default.
- W4311551486 hasConcept C108010975 @default.
- W4311551486 hasConcept C111919701 @default.
- W4311551486 hasConcept C11413529 @default.
- W4311551486 hasConcept C115903868 @default.
- W4311551486 hasConcept C119857082 @default.
- W4311551486 hasConcept C126255220 @default.
- W4311551486 hasConcept C127413603 @default.
- W4311551486 hasConcept C134306372 @default.
- W4311551486 hasConcept C154945302 @default.
- W4311551486 hasConcept C165064840 @default.
- W4311551486 hasConcept C177148314 @default.
- W4311551486 hasConcept C2776291640 @default.
- W4311551486 hasConcept C2778024958 @default.
- W4311551486 hasConcept C33923547 @default.
- W4311551486 hasConcept C41008148 @default.
- W4311551486 hasConcept C50644808 @default.
- W4311551486 hasConcept C55166926 @default.
- W4311551486 hasConcept C6557445 @default.
- W4311551486 hasConcept C68387754 @default.
- W4311551486 hasConcept C78519656 @default.
- W4311551486 hasConcept C86803240 @default.
- W4311551486 hasConceptScore W4311551486C105795698 @default.
- W4311551486 hasConceptScore W4311551486C108010975 @default.
- W4311551486 hasConceptScore W4311551486C111919701 @default.
- W4311551486 hasConceptScore W4311551486C11413529 @default.
- W4311551486 hasConceptScore W4311551486C115903868 @default.
- W4311551486 hasConceptScore W4311551486C119857082 @default.
- W4311551486 hasConceptScore W4311551486C126255220 @default.
- W4311551486 hasConceptScore W4311551486C127413603 @default.
- W4311551486 hasConceptScore W4311551486C134306372 @default.
- W4311551486 hasConceptScore W4311551486C154945302 @default.
- W4311551486 hasConceptScore W4311551486C165064840 @default.
- W4311551486 hasConceptScore W4311551486C177148314 @default.
- W4311551486 hasConceptScore W4311551486C2776291640 @default.
- W4311551486 hasConceptScore W4311551486C2778024958 @default.
- W4311551486 hasConceptScore W4311551486C33923547 @default.
- W4311551486 hasConceptScore W4311551486C41008148 @default.
- W4311551486 hasConceptScore W4311551486C50644808 @default.
- W4311551486 hasConceptScore W4311551486C55166926 @default.
- W4311551486 hasConceptScore W4311551486C6557445 @default.
- W4311551486 hasConceptScore W4311551486C68387754 @default.
- W4311551486 hasConceptScore W4311551486C78519656 @default.
- W4311551486 hasConceptScore W4311551486C86803240 @default.
- W4311551486 hasLocation W43115514861 @default.
- W4311551486 hasOpenAccess W4311551486 @default.
- W4311551486 hasPrimaryLocation W43115514861 @default.
- W4311551486 hasRelatedWork W2005759508 @default.
- W4311551486 hasRelatedWork W2368629829 @default.
- W4311551486 hasRelatedWork W2391431582 @default.
- W4311551486 hasRelatedWork W2474469336 @default.
- W4311551486 hasRelatedWork W2800185406 @default.
- W4311551486 hasRelatedWork W2989932438 @default.
- W4311551486 hasRelatedWork W3199608561 @default.
- W4311551486 hasRelatedWork W3205974354 @default.
- W4311551486 hasRelatedWork W1629725936 @default.
- W4311551486 hasRelatedWork W2125637597 @default.
- W4311551486 isParatext "false" @default.
- W4311551486 isRetracted "false" @default.
- W4311551486 workType "article" @default.
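
The abstract above says that SILO increases the LR upper bound (max_lr) in an S-shape during pruning, but this record does not give the formula. The following is only a minimal sketch of what an S-shaped max_lr schedule over pruning cycles could look like, assuming a rescaled logistic curve. The function name `s_shaped_max_lr`, the parameters `lr_low`, `lr_high` and `steepness`, and the example values are all hypothetical and are not taken from the paper.

```python
import math


def s_shaped_max_lr(cycle, num_cycles, lr_low, lr_high, steepness=10.0):
    """Return a hypothetical max_lr for a 0-indexed pruning cycle.

    Assumption: a logistic (sigmoid) curve, rescaled so that the first
    cycle gets lr_low and the last cycle gets lr_high. This is an
    illustration of "S-shaped", not the paper's actual definition.
    """
    if num_cycles < 2:
        return lr_high
    t = cycle / (num_cycles - 1)  # progress through pruning, in [0, 1]
    s = 1.0 / (1.0 + math.exp(-steepness * (t - 0.5)))
    # Rescale the sigmoid so it spans exactly [0, 1] over t in [0, 1].
    s_start = 1.0 / (1.0 + math.exp(steepness * 0.5))
    s_end = 1.0 / (1.0 + math.exp(-steepness * 0.5))
    s = (s - s_start) / (s_end - s_start)
    return lr_low + (lr_high - lr_low) * s


# Example: 10 pruning cycles, max_lr rising from 0.01 to 0.04 (made-up values).
schedule = [s_shaped_max_lr(c, 10, 0.01, 0.04) for c in range(10)]
```

In such a sketch, each entry of `schedule` would serve as the peak learning rate for the retraining phase of the corresponding pruning cycle: max_lr grows slowly at first, fastest in the middle cycles, and levels off toward the end.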