Matches in SemOpenAlex for { <https://semopenalex.org/work/W2950928354> ?p ?o ?g. }
- W2950928354 abstract "Despite their overwhelming capacity to overfit, deep learning architectures tend to generalize relatively well to unseen data, allowing them to be deployed in practice. However, explaining why this is the case is still an open area of research. One standing hypothesis that is gaining popularity, e.g. Hochreiter & Schmidhuber (1997); Keskar et al. (2017), is that the flatness of minima of the loss function found by stochastic gradient based methods results in good generalization. This paper argues that most notions of flatness are problematic for deep models and cannot be directly applied to explain generalization. Specifically, when focusing on deep networks with rectifier units, we can exploit the particular geometry of parameter space induced by the inherent symmetries that these architectures exhibit to build equivalent models corresponding to arbitrarily sharper minima. Furthermore, if we allow reparametrization of a function, the geometry of its parameters can change drastically without affecting its generalization properties." @default.
- W2950928354 created "2019-06-27" @default.
- W2950928354 creator A5017529415 @default.
- W2950928354 creator A5043910056 @default.
- W2950928354 creator A5053621545 @default.
- W2950928354 creator A5086198262 @default.
- W2950928354 date "2017-03-15" @default.
- W2950928354 modified "2023-10-02" @default.
- W2950928354 title "Sharp Minima Can Generalize For Deep Nets" @default.
- W2950928354 cites W114517082 @default.
- W2950928354 cites W1583912456 @default.
- W2950928354 cites W1673923490 @default.
- W2950928354 cites W1677182931 @default.
- W2950928354 cites W1899249567 @default.
- W2950928354 cites W1915968771 @default.
- W2950928354 cites W1922655562 @default.
- W2950928354 cites W2012762214 @default.
- W2950928354 cites W2028654463 @default.
- W2950928354 cites W2047229728 @default.
- W2950928354 cites W2097117768 @default.
- W2950928354 cites W2113651538 @default.
- W2950928354 cites W2125930537 @default.
- W2950928354 cites W2139338362 @default.
- W2950928354 cites W2194775991 @default.
- W2950928354 cites W2202109488 @default.
- W2950928354 cites W2228459002 @default.
- W2950928354 cites W2284050935 @default.
- W2950928354 cites W2292729293 @default.
- W2950928354 cites W2294059674 @default.
- W2950928354 cites W2327501763 @default.
- W2950928354 cites W2409550820 @default.
- W2950928354 cites W2432004435 @default.
- W2950928354 cites W2433379750 @default.
- W2950928354 cites W2436219157 @default.
- W2950928354 cites W2520160253 @default.
- W2950928354 cites W2523060838 @default.
- W2950928354 cites W2525778437 @default.
- W2950928354 cites W2546302380 @default.
- W2950928354 cites W2549189808 @default.
- W2950928354 cites W2564807118 @default.
- W2950928354 cites W2578732588 @default.
- W2950928354 cites W2752366553 @default.
- W2950928354 cites W2912811302 @default.
- W2950928354 cites W2949117887 @default.
- W2950928354 cites W2949888546 @default.
- W2950928354 cites W2950635152 @default.
- W2950928354 cites W2950855294 @default.
- W2950928354 cites W2951603627 @default.
- W2950928354 cites W2953022181 @default.
- W2950928354 cites W2962835968 @default.
- W2950928354 cites W2963207607 @default.
- W2950928354 cites W2963586744 @default.
- W2950928354 cites W2963794891 @default.
- W2950928354 cites W2963857374 @default.
- W2950928354 cites W2964308564 @default.
- W2950928354 cites W2964309400 @default.
- W2950928354 cites W3093329015 @default.
- W2950928354 cites W3137695714 @default.
- W2950928354 cites W577198184 @default.
- W2950928354 hasPublicationYear "2017" @default.
- W2950928354 type Work @default.
- W2950928354 sameAs 2950928354 @default.
- W2950928354 citedByCount "129" @default.
- W2950928354 countsByYear W29509283542016 @default.
- W2950928354 countsByYear W29509283542017 @default.
- W2950928354 countsByYear W29509283542018 @default.
- W2950928354 countsByYear W29509283542019 @default.
- W2950928354 countsByYear W29509283542020 @default.
- W2950928354 countsByYear W29509283542021 @default.
- W2950928354 crossrefType "posted-content" @default.
- W2950928354 hasAuthorship W2950928354A5017529415 @default.
- W2950928354 hasAuthorship W2950928354A5043910056 @default.
- W2950928354 hasAuthorship W2950928354A5053621545 @default.
- W2950928354 hasAuthorship W2950928354A5086198262 @default.
- W2950928354 hasConcept C108583219 @default.
- W2950928354 hasConcept C11413529 @default.
- W2950928354 hasConcept C121332964 @default.
- W2950928354 hasConcept C134306372 @default.
- W2950928354 hasConcept C14036430 @default.
- W2950928354 hasConcept C154945302 @default.
- W2950928354 hasConcept C165696696 @default.
- W2950928354 hasConcept C177148314 @default.
- W2950928354 hasConcept C186633575 @default.
- W2950928354 hasConcept C206688291 @default.
- W2950928354 hasConcept C22019652 @default.
- W2950928354 hasConcept C2524010 @default.
- W2950928354 hasConcept C26405456 @default.
- W2950928354 hasConcept C2778530986 @default.
- W2950928354 hasConcept C28826006 @default.
- W2950928354 hasConcept C33923547 @default.
- W2950928354 hasConcept C38652104 @default.
- W2950928354 hasConcept C41008148 @default.
- W2950928354 hasConcept C50644808 @default.
- W2950928354 hasConcept C62520636 @default.
- W2950928354 hasConcept C73586568 @default.
- W2950928354 hasConcept C78458016 @default.
- W2950928354 hasConcept C86803240 @default.
- W2950928354 hasConcept C96469262 @default.
- W2950928354 hasConceptScore W2950928354C108583219 @default.
- W2950928354 hasConceptScore W2950928354C11413529 @default.