Matches in SemOpenAlex for { <https://semopenalex.org/work/W2891160178> ?p ?o ?g. }
Showing items 1 to 90 of 90 with 100 items per page.
- W2891160178 abstract "In Deep Learning, Stochastic Gradient Descent (SGD) is usually selected as a training method because of its efficiency; however, recently, a problem in SGD gains research interest: sharp minima in Deep Neural Networks (DNNs) have poor generalization; especially, large-batch SGD tends to converge to sharp minima. It becomes an open question whether escaping sharp minima can improve the generalization. To answer this question, we propose SmoothOut framework to smooth out sharp minima in DNNs and thereby improve generalization. In a nutshell, SmoothOut perturbs multiple copies of the DNN by noise injection and averages these copies. Injecting noises to SGD is widely used in the literature, but SmoothOut differs in lots of ways: (1) a de-noising process is applied before parameter updating; (2) noise strength is adapted to filter norm; (3) an alternative interpretation on the advantage of noise injection, from the perspective of sharpness and generalization; (4) usage of uniform noise instead of Gaussian noise. We prove that SmoothOut can eliminate sharp minima. Training multiple DNN copies is inefficient, we further propose an unbiased stochastic SmoothOut which only introduces the overhead of noise injecting and de-noising per batch. An adaptive variant of SmoothOut, AdaSmoothOut, is also proposed to improve generalization. In a variety of experiments, SmoothOut and AdaSmoothOut consistently improve generalization in both small-batch and large-batch training on the top of state-of-the-art solutions." @default.
- W2891160178 created "2018-09-27" @default.
- W2891160178 creator A5001629112 @default.
- W2891160178 creator A5013792176 @default.
- W2891160178 creator A5018958225 @default.
- W2891160178 creator A5048629905 @default.
- W2891160178 creator A5058073627 @default.
- W2891160178 creator A5069190300 @default.
- W2891160178 creator A5089078063 @default.
- W2891160178 date "2018-05-21" @default.
- W2891160178 modified "2023-09-24" @default.
- W2891160178 title "SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning" @default.
- W2891160178 cites W2062517214 @default.
- W2891160178 cites W2474388053 @default.
- W2891160178 cites W2617242334 @default.
- W2891160178 cites W2622263826 @default.
- W2891160178 cites W2757910899 @default.
- W2891160178 cites W2963702144 @default.
- W2891160178 cites W2963959597 @default.
- W2891160178 hasPublicationYear "2018" @default.
- W2891160178 type Work @default.
- W2891160178 sameAs 2891160178 @default.
- W2891160178 citedByCount "13" @default.
- W2891160178 countsByYear W28911601782018 @default.
- W2891160178 countsByYear W28911601782020 @default.
- W2891160178 countsByYear W28911601782021 @default.
- W2891160178 countsByYear W28911601782023 @default.
- W2891160178 crossrefType "posted-content" @default.
- W2891160178 hasAuthorship W2891160178A5001629112 @default.
- W2891160178 hasAuthorship W2891160178A5013792176 @default.
- W2891160178 hasAuthorship W2891160178A5018958225 @default.
- W2891160178 hasAuthorship W2891160178A5048629905 @default.
- W2891160178 hasAuthorship W2891160178A5058073627 @default.
- W2891160178 hasAuthorship W2891160178A5069190300 @default.
- W2891160178 hasAuthorship W2891160178A5089078063 @default.
- W2891160178 hasConcept C108583219 @default.
- W2891160178 hasConcept C11413529 @default.
- W2891160178 hasConcept C115961682 @default.
- W2891160178 hasConcept C126255220 @default.
- W2891160178 hasConcept C134306372 @default.
- W2891160178 hasConcept C154945302 @default.
- W2891160178 hasConcept C177148314 @default.
- W2891160178 hasConcept C186633575 @default.
- W2891160178 hasConcept C206688291 @default.
- W2891160178 hasConcept C2984842247 @default.
- W2891160178 hasConcept C33923547 @default.
- W2891160178 hasConcept C41008148 @default.
- W2891160178 hasConcept C50644808 @default.
- W2891160178 hasConcept C99498987 @default.
- W2891160178 hasConceptScore W2891160178C108583219 @default.
- W2891160178 hasConceptScore W2891160178C11413529 @default.
- W2891160178 hasConceptScore W2891160178C115961682 @default.
- W2891160178 hasConceptScore W2891160178C126255220 @default.
- W2891160178 hasConceptScore W2891160178C134306372 @default.
- W2891160178 hasConceptScore W2891160178C154945302 @default.
- W2891160178 hasConceptScore W2891160178C177148314 @default.
- W2891160178 hasConceptScore W2891160178C186633575 @default.
- W2891160178 hasConceptScore W2891160178C206688291 @default.
- W2891160178 hasConceptScore W2891160178C2984842247 @default.
- W2891160178 hasConceptScore W2891160178C33923547 @default.
- W2891160178 hasConceptScore W2891160178C41008148 @default.
- W2891160178 hasConceptScore W2891160178C50644808 @default.
- W2891160178 hasConceptScore W2891160178C99498987 @default.
- W2891160178 hasLocation W28911601781 @default.
- W2891160178 hasOpenAccess W2891160178 @default.
- W2891160178 hasPrimaryLocation W28911601781 @default.
- W2891160178 hasRelatedWork W1486662790 @default.
- W2891160178 hasRelatedWork W2112796928 @default.
- W2891160178 hasRelatedWork W2117539524 @default.
- W2891160178 hasRelatedWork W2194775991 @default.
- W2891160178 hasRelatedWork W2605372163 @default.
- W2891160178 hasRelatedWork W2804271728 @default.
- W2891160178 hasRelatedWork W2806440755 @default.
- W2891160178 hasRelatedWork W2891872827 @default.
- W2891160178 hasRelatedWork W2912811302 @default.
- W2891160178 hasRelatedWork W2921013684 @default.
- W2891160178 hasRelatedWork W2962933129 @default.
- W2891160178 hasRelatedWork W2963739978 @default.
- W2891160178 hasRelatedWork W2963959597 @default.
- W2891160178 hasRelatedWork W2964121744 @default.
- W2891160178 hasRelatedWork W2970028551 @default.
- W2891160178 hasRelatedWork W3049749470 @default.
- W2891160178 hasRelatedWork W3088899301 @default.
- W2891160178 hasRelatedWork W3118608800 @default.
- W2891160178 hasRelatedWork W3132373599 @default.
- W2891160178 hasRelatedWork W3212480959 @default.
- W2891160178 isParatext "false" @default.
- W2891160178 isRetracted "false" @default.
- W2891160178 magId "2891160178" @default.
- W2891160178 workType "article" @default.
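
The abstract recorded above describes a stochastic SmoothOut step only in prose: inject uniform noise into the parameters, evaluate the batch gradient, de-noise the parameters, then update. The following is a minimal sketch of that reading, not the authors' implementation. The function names (`smoothout_step`, `quadratic_grad`), the toy quadratic loss, and the `radius` and `lr` values are all assumptions made for illustration, and the filter-norm scaling of the noise mentioned in the abstract is left out.

```python
import random


def quadratic_grad(weights):
    """Gradient of the toy loss f(w) = sum(w_i ** 2); a stand-in for a batch gradient."""
    return [2.0 * w for w in weights]


def smoothout_step(weights, grad_fn, radius, lr, rng):
    """One hypothetical stochastic SmoothOut-style update on a flat list of weights."""
    # 1. Noise injection: uniform noise in [-radius, radius], not Gaussian.
    noise = [rng.uniform(-radius, radius) for _ in weights]
    perturbed = [w + n for w, n in zip(weights, noise)]

    # 2. Evaluate the gradient at the perturbed parameters.
    grads = grad_fn(perturbed)

    # 3. De-noising: subtract the same noise before the parameter update.
    restored = [p - n for p, n in zip(perturbed, noise)]

    # 4. Plain SGD update using the gradient taken at the perturbed point.
    return [w - lr * g for w, g in zip(restored, grads)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    weights = [3.0, -2.0]
    for _ in range(200):
        weights = smoothout_step(weights, quadratic_grad, radius=0.1, lr=0.1, rng=rng)
    print(weights)
```

On this toy loss each step reduces to w ← 0.8·w − 0.2·n with |n| ≤ 0.1, so the iterates settle into a small band around the minimum at zero rather than converging exactly. The only per-step overhead beyond plain SGD is drawing the noise and subtracting it again, which matches the cost the abstract attributes to the stochastic variant.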