Matches in SemOpenAlex for { <https://semopenalex.org/work/W4366999617> ?p ?o ?g. }
Showing items 1 to 89 of 89 with 100 items per page.
- W4366999617 endingPage "12" @default.
- W4366999617 startingPage "1" @default.
- W4366999617 abstract "Despite simplicity, stochastic gradient descent (SGD)-like algorithms are successful in training deep neural networks (DNNs). Among various attempts to improve SGD, weight averaging (WA), which averages the weights of multiple models, has recently received much attention in the literature. Broadly, WA falls into two categories: 1) online WA, which averages the weights of multiple models trained in parallel, is designed for reducing the gradient communication overhead of parallel mini-batch SGD and 2) offline WA, which averages the weights of one model at different checkpoints, is typically used to improve the generalization ability of DNNs. Though online and offline WA are similar in form, they are seldom associated with each other. Besides, these methods typically perform either offline parameter averaging or online parameter averaging, but not both. In this work, we first attempt to incorporate online and offline WA into a general training framework termed hierarchical WA (HWA). By leveraging both the online and offline averaging manners, HWA is able to achieve both faster convergence speed and superior generalization performance without any fancy learning rate adjustment. Besides, we also analyze the issues faced by the existing WA methods, and how our HWA addresses them, empirically. Finally, extensive experiments verify that HWA outperforms the state-of-the-art methods significantly." @default.
- W4366999617 created "2023-04-27" @default.
- W4366999617 creator A5003608795 @default.
- W4366999617 creator A5003771677 @default.
- W4366999617 creator A5005498479 @default.
- W4366999617 creator A5009164482 @default.
- W4366999617 creator A5010444377 @default.
- W4366999617 creator A5015727828 @default.
- W4366999617 creator A5033982342 @default.
- W4366999617 date "2023-01-01" @default.
- W4366999617 modified "2023-10-12" @default.
- W4366999617 title "Hierarchical Weight Averaging for Deep Neural Networks" @default.
- W4366999617 doi "https://doi.org/10.1109/tnnls.2023.3255540" @default.
- W4366999617 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/37104112" @default.
- W4366999617 hasPublicationYear "2023" @default.
- W4366999617 type Work @default.
- W4366999617 citedByCount "1" @default.
- W4366999617 crossrefType "journal-article" @default.
- W4366999617 hasAuthorship W4366999617A5003608795 @default.
- W4366999617 hasAuthorship W4366999617A5003771677 @default.
- W4366999617 hasAuthorship W4366999617A5005498479 @default.
- W4366999617 hasAuthorship W4366999617A5009164482 @default.
- W4366999617 hasAuthorship W4366999617A5010444377 @default.
- W4366999617 hasAuthorship W4366999617A5015727828 @default.
- W4366999617 hasAuthorship W4366999617A5033982342 @default.
- W4366999617 hasBestOaLocation W43669996172 @default.
- W4366999617 hasConcept C111919701 @default.
- W4366999617 hasConcept C11413529 @default.
- W4366999617 hasConcept C119857082 @default.
- W4366999617 hasConcept C134306372 @default.
- W4366999617 hasConcept C153258448 @default.
- W4366999617 hasConcept C154945302 @default.
- W4366999617 hasConcept C162324750 @default.
- W4366999617 hasConcept C177148314 @default.
- W4366999617 hasConcept C206688291 @default.
- W4366999617 hasConcept C26517878 @default.
- W4366999617 hasConcept C2777303404 @default.
- W4366999617 hasConcept C2779960059 @default.
- W4366999617 hasConcept C2780102126 @default.
- W4366999617 hasConcept C2984842247 @default.
- W4366999617 hasConcept C33923547 @default.
- W4366999617 hasConcept C38652104 @default.
- W4366999617 hasConcept C41008148 @default.
- W4366999617 hasConcept C50522688 @default.
- W4366999617 hasConcept C50644808 @default.
- W4366999617 hasConcept C57869625 @default.
- W4366999617 hasConceptScore W4366999617C111919701 @default.
- W4366999617 hasConceptScore W4366999617C11413529 @default.
- W4366999617 hasConceptScore W4366999617C119857082 @default.
- W4366999617 hasConceptScore W4366999617C134306372 @default.
- W4366999617 hasConceptScore W4366999617C153258448 @default.
- W4366999617 hasConceptScore W4366999617C154945302 @default.
- W4366999617 hasConceptScore W4366999617C162324750 @default.
- W4366999617 hasConceptScore W4366999617C177148314 @default.
- W4366999617 hasConceptScore W4366999617C206688291 @default.
- W4366999617 hasConceptScore W4366999617C26517878 @default.
- W4366999617 hasConceptScore W4366999617C2777303404 @default.
- W4366999617 hasConceptScore W4366999617C2779960059 @default.
- W4366999617 hasConceptScore W4366999617C2780102126 @default.
- W4366999617 hasConceptScore W4366999617C2984842247 @default.
- W4366999617 hasConceptScore W4366999617C33923547 @default.
- W4366999617 hasConceptScore W4366999617C38652104 @default.
- W4366999617 hasConceptScore W4366999617C41008148 @default.
- W4366999617 hasConceptScore W4366999617C50522688 @default.
- W4366999617 hasConceptScore W4366999617C50644808 @default.
- W4366999617 hasConceptScore W4366999617C57869625 @default.
- W4366999617 hasFunder F4320321001 @default.
- W4366999617 hasFunder F4320335777 @default.
- W4366999617 hasLocation W43669996171 @default.
- W4366999617 hasLocation W43669996172 @default.
- W4366999617 hasLocation W43669996173 @default.
- W4366999617 hasLocation W43669996174 @default.
- W4366999617 hasOpenAccess W4366999617 @default.
- W4366999617 hasPrimaryLocation W43669996171 @default.
- W4366999617 hasRelatedWork W2742801844 @default.
- W4366999617 hasRelatedWork W2979647660 @default.
- W4366999617 hasRelatedWork W2980973679 @default.
- W4366999617 hasRelatedWork W2989932438 @default.
- W4366999617 hasRelatedWork W2994635323 @default.
- W4366999617 hasRelatedWork W3000893881 @default.
- W4366999617 hasRelatedWork W3007279825 @default.
- W4366999617 hasRelatedWork W3128307059 @default.
- W4366999617 hasRelatedWork W3198535509 @default.
- W4366999617 hasRelatedWork W4224998919 @default.
- W4366999617 isParatext "false" @default.
- W4366999617 isRetracted "false" @default.
- W4366999617 workType "article" @default.
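
The abstract recorded above describes two forms of weight averaging: online WA, which averages the weights of several models trained in parallel, and offline WA, which averages the weights of one model at different checkpoints. The sketch below shows one way the two levels could be composed. It is not the authors' implementation: the abstract does not specify the synchronization schedule, the number of parallel models, or the checkpoint window, so the flat-list weight representation and the names `average_weights`, `online_average`, `offline_average`, and `hierarchical_average` are all assumptions made for illustration.

```python
from typing import List

# Assumption: a model's weights are flattened into one list of floats.
Weights = List[float]


def average_weights(models: List[Weights]) -> Weights:
    """Return the element-wise mean of several equally sized weight vectors."""
    if not models:
        raise ValueError("need at least one weight vector")
    if len({len(w) for w in models}) != 1:
        raise ValueError("all weight vectors must have the same length")
    n = len(models)
    return [sum(column) / n for column in zip(*models)]


def online_average(parallel_models: List[Weights]) -> Weights:
    """Online WA: average models trained in parallel at one synchronization step."""
    return average_weights(parallel_models)


def offline_average(checkpoints: List[Weights]) -> Weights:
    """Offline WA: average the weights saved at different checkpoints."""
    return average_weights(checkpoints)


def hierarchical_average(rounds: List[List[Weights]]) -> Weights:
    """Two-level averaging (hypothetical composition).

    rounds[t] holds the weights of the parallel models at synchronization
    step t. Each round is first averaged online; the resulting per-round
    averages are then treated as checkpoints and averaged offline.
    """
    checkpoints = [online_average(models) for models in rounds]
    return offline_average(checkpoints)


if __name__ == "__main__":
    # Two synchronization steps, two parallel models, two weights per model.
    rounds = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[5.0, 6.0], [7.0, 8.0]],
    ]
    print(hierarchical_average(rounds))  # [4.0, 5.0]
```

In this toy run the online step yields `[2.0, 3.0]` and `[6.0, 7.0]`, and the offline step averages those two checkpoints. A real training loop would interleave gradient updates between the averaging steps, which this sketch leaves out.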