Matches in SemOpenAlex for { <https://semopenalex.org/work/W3147263414> ?p ?o ?g. }
- W3147263414 abstract "It is well-known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essentially important for both optimization and generalization of deep networks. Some works attempted to artificially simulate SGN by injecting random noise to improve deep learning. However, it turned out that the injected simple random noise cannot work as well as SGN, which is anisotropic and parameter-dependent. For simulating SGN at low computational costs and without changing the learning rate or batch size, we propose the Positive-Negative Momentum (PNM) approach that is a powerful alternative to conventional Momentum in classic optimizers. The introduced PNM method maintains two approximately independent momentum terms. Then, we can control the magnitude of SGN explicitly by adjusting the momentum difference. We theoretically prove the convergence guarantee and the generalization advantage of PNM over Stochastic Gradient Descent (SGD). By incorporating PNM into the two conventional optimizers, SGD with Momentum and Adam, our extensive experiments empirically verified the significant advantage of the PNM-based variants over the corresponding conventional Momentum-based optimizers." @default.
- W3147263414 created "2021-04-13" @default.
- W3147263414 creator A5016886735 @default.
- W3147263414 creator A5045305860 @default.
- W3147263414 creator A5066773635 @default.
- W3147263414 creator A5072744508 @default.
- W3147263414 date "2021-03-31" @default.
- W3147263414 modified "2023-09-25" @default.
- W3147263414 title "Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization" @default.
- W3147263414 cites W1591801644 @default.
- W3147263414 cites W1632114991 @default.
- W3147263414 cites W1686810756 @default.
- W3147263414 cites W2013918710 @default.
- W3147263414 cites W2014384147 @default.
- W3147263414 cites W2029029543 @default.
- W3147263414 cites W2064675550 @default.
- W3147263414 cites W2097117768 @default.
- W3147263414 cites W2108598243 @default.
- W3147263414 cites W2108677974 @default.
- W3147263414 cites W2124136621 @default.
- W3147263414 cites W2143163787 @default.
- W3147263414 cites W2167433878 @default.
- W3147263414 cites W2170413022 @default.
- W3147263414 cites W2194775991 @default.
- W3147263414 cites W2263490141 @default.
- W3147263414 cites W2768267830 @default.
- W3147263414 cites W2785523195 @default.
- W3147263414 cites W2796146910 @default.
- W3147263414 cites W2799042347 @default.
- W3147263414 cites W2804386825 @default.
- W3147263414 cites W2808042107 @default.
- W3147263414 cites W2808402975 @default.
- W3147263414 cites W2891952073 @default.
- W3147263414 cites W2907225497 @default.
- W3147263414 cites W2908510526 @default.
- W3147263414 cites W2912323147 @default.
- W3147263414 cites W2912811302 @default.
- W3147263414 cites W2919115771 @default.
- W3147263414 cites W2946668020 @default.
- W3147263414 cites W2962781506 @default.
- W3147263414 cites W2962915600 @default.
- W3147263414 cites W2962932339 @default.
- W3147263414 cites W2963092340 @default.
- W3147263414 cites W2963446712 @default.
- W3147263414 cites W2963655672 @default.
- W3147263414 cites W2963702144 @default.
- W3147263414 cites W2963735582 @default.
- W3147263414 cites W2963739978 @default.
- W3147263414 cites W2963794891 @default.
- W3147263414 cites W2963959597 @default.
- W3147263414 cites W2964072432 @default.
- W3147263414 cites W2964121744 @default.
- W3147263414 cites W2970388773 @default.
- W3147263414 cites W2970490659 @default.
- W3147263414 cites W2970550417 @default.
- W3147263414 cites W2970971581 @default.
- W3147263414 cites W2994689640 @default.
- W3147263414 cites W3034731342 @default.
- W3147263414 cites W3037273417 @default.
- W3147263414 cites W3037278715 @default.
- W3147263414 cites W3109394096 @default.
- W3147263414 cites W3118431319 @default.
- W3147263414 cites W3118608800 @default.
- W3147263414 cites W3132471231 @default.
- W3147263414 cites W3137695714 @default.
- W3147263414 cites W3170349722 @default.
- W3147263414 cites W3200038001 @default.
- W3147263414 doi "https://doi.org/10.48550/arxiv.2103.17182" @default.
- W3147263414 hasPublicationYear "2021" @default.
- W3147263414 type Work @default.
- W3147263414 sameAs 3147263414 @default.
- W3147263414 citedByCount "0" @default.
- W3147263414 crossrefType "posted-content" @default.
- W3147263414 hasAuthorship W3147263414A5016886735 @default.
- W3147263414 hasAuthorship W3147263414A5045305860 @default.
- W3147263414 hasAuthorship W3147263414A5066773635 @default.
- W3147263414 hasAuthorship W3147263414A5072744508 @default.
- W3147263414 hasBestOaLocation W31472634141 @default.
- W3147263414 hasConcept C10138342 @default.
- W3147263414 hasConcept C11413529 @default.
- W3147263414 hasConcept C115961682 @default.
- W3147263414 hasConcept C126255220 @default.
- W3147263414 hasConcept C134306372 @default.
- W3147263414 hasConcept C153258448 @default.
- W3147263414 hasConcept C154945302 @default.
- W3147263414 hasConcept C162324750 @default.
- W3147263414 hasConcept C177148314 @default.
- W3147263414 hasConcept C206688291 @default.
- W3147263414 hasConcept C26517878 @default.
- W3147263414 hasConcept C2776135515 @default.
- W3147263414 hasConcept C2777303404 @default.
- W3147263414 hasConcept C28826006 @default.
- W3147263414 hasConcept C2986577269 @default.
- W3147263414 hasConcept C33923547 @default.
- W3147263414 hasConcept C38652104 @default.
- W3147263414 hasConcept C41008148 @default.
- W3147263414 hasConcept C50522688 @default.
- W3147263414 hasConcept C50644808 @default.
- W3147263414 hasConcept C57869625 @default.
- W3147263414 hasConcept C60718061 @default.