Matches in SemOpenAlex for { <https://semopenalex.org/work/W4322503680> ?p ?o ?g. }
Showing items 1 to 84 of 84 with 100 items per page.
- W4322503680 endingPage "2935" @default.
- W4322503680 startingPage "2935" @default.
- W4322503680 abstract "Averaging neural network weights sampled by a backbone stochastic gradient descent (SGD) is a simple-yet-effective approach to assist the backbone SGD in finding better optima, in terms of generalization. From a statistical perspective, weight-averaging contributes to variance reduction. Recently, a well-established stochastic weight-averaging (SWA) method was proposed, which featured the application of a cyclical or high-constant (CHC) learning-rate schedule for generating weight samples for weight-averaging. Then, a new insight on weight-averaging was introduced, which stated that weight average assisted in discovering wider optima and resulted in better generalization. We conducted extensive experimental studies concerning SWA, involving 12 modern deep neural network model architectures and 12 open-source image, graph, and text datasets as benchmarks. We disentangled the contributions of the weight-averaging operation and the CHC learning-rate schedule for SWA, showing that the weight-averaging operation in SWA still contributed to variance reduction, and the CHC learning-rate schedule assisted in exploring the parameter space more widely than the backbone SGD, which could be under-fitted due to a lack of training budget. We then presented an algorithm termed periodic SWA (PSWA) that comprised a series of weight-averaging operations to exploit such wide parameter space structures as explored by the CHC learning-rate schedule, and we empirically demonstrated that PSWA outperformed its backbone SGD remarkably." @default.
- W4322503680 created "2023-02-28" @default.
- W4322503680 creator A5049206805 @default.
- W4322503680 creator A5081451935 @default.
- W4322503680 creator A5090731741 @default.
- W4322503680 date "2023-02-24" @default.
- W4322503680 modified "2023-09-26" @default.
- W4322503680 title "Stochastic Weight Averaging Revisited" @default.
- W4322503680 cites W2086161653 @default.
- W4322503680 cites W2108598243 @default.
- W4322503680 cites W2117539524 @default.
- W4322503680 cites W2302255633 @default.
- W4322503680 cites W2964054038 @default.
- W4322503680 cites W2964137095 @default.
- W4322503680 cites W3107512664 @default.
- W4322503680 cites W3138582970 @default.
- W4322503680 cites W3194779779 @default.
- W4322503680 doi "https://doi.org/10.3390/app13052935" @default.
- W4322503680 hasPublicationYear "2023" @default.
- W4322503680 type Work @default.
- W4322503680 citedByCount "3" @default.
- W4322503680 countsByYear W43225036802023 @default.
- W4322503680 crossrefType "journal-article" @default.
- W4322503680 hasAuthorship W4322503680A5049206805 @default.
- W4322503680 hasAuthorship W4322503680A5081451935 @default.
- W4322503680 hasAuthorship W4322503680A5090731741 @default.
- W4322503680 hasBestOaLocation W43225036801 @default.
- W4322503680 hasConcept C105795698 @default.
- W4322503680 hasConcept C111919701 @default.
- W4322503680 hasConcept C11413529 @default.
- W4322503680 hasConcept C119857082 @default.
- W4322503680 hasConcept C121955636 @default.
- W4322503680 hasConcept C126255220 @default.
- W4322503680 hasConcept C134306372 @default.
- W4322503680 hasConcept C144133560 @default.
- W4322503680 hasConcept C154945302 @default.
- W4322503680 hasConcept C177148314 @default.
- W4322503680 hasConcept C19499675 @default.
- W4322503680 hasConcept C196083921 @default.
- W4322503680 hasConcept C206688291 @default.
- W4322503680 hasConcept C33923547 @default.
- W4322503680 hasConcept C41008148 @default.
- W4322503680 hasConcept C50644808 @default.
- W4322503680 hasConcept C62644790 @default.
- W4322503680 hasConcept C68387754 @default.
- W4322503680 hasConceptScore W4322503680C105795698 @default.
- W4322503680 hasConceptScore W4322503680C111919701 @default.
- W4322503680 hasConceptScore W4322503680C11413529 @default.
- W4322503680 hasConceptScore W4322503680C119857082 @default.
- W4322503680 hasConceptScore W4322503680C121955636 @default.
- W4322503680 hasConceptScore W4322503680C126255220 @default.
- W4322503680 hasConceptScore W4322503680C134306372 @default.
- W4322503680 hasConceptScore W4322503680C144133560 @default.
- W4322503680 hasConceptScore W4322503680C154945302 @default.
- W4322503680 hasConceptScore W4322503680C177148314 @default.
- W4322503680 hasConceptScore W4322503680C19499675 @default.
- W4322503680 hasConceptScore W4322503680C196083921 @default.
- W4322503680 hasConceptScore W4322503680C206688291 @default.
- W4322503680 hasConceptScore W4322503680C33923547 @default.
- W4322503680 hasConceptScore W4322503680C41008148 @default.
- W4322503680 hasConceptScore W4322503680C50644808 @default.
- W4322503680 hasConceptScore W4322503680C62644790 @default.
- W4322503680 hasConceptScore W4322503680C68387754 @default.
- W4322503680 hasIssue "5" @default.
- W4322503680 hasLocation W43225036801 @default.
- W4322503680 hasLocation W43225036802 @default.
- W4322503680 hasOpenAccess W4322503680 @default.
- W4322503680 hasPrimaryLocation W43225036801 @default.
- W4322503680 hasRelatedWork W2193091921 @default.
- W4322503680 hasRelatedWork W2193565203 @default.
- W4322503680 hasRelatedWork W2205410708 @default.
- W4322503680 hasRelatedWork W2303021954 @default.
- W4322503680 hasRelatedWork W2378654701 @default.
- W4322503680 hasRelatedWork W2529853123 @default.
- W4322503680 hasRelatedWork W2787191226 @default.
- W4322503680 hasRelatedWork W2792987183 @default.
- W4322503680 hasRelatedWork W2989932438 @default.
- W4322503680 hasRelatedWork W3114508735 @default.
- W4322503680 hasVolume "13" @default.
- W4322503680 isParatext "false" @default.
- W4322503680 isRetracted "false" @default.
- W4322503680 workType "article" @default.