Matches in SemOpenAlex for { <https://semopenalex.org/work/W4306793909> ?p ?o ?g. }
Showing items 1 to 92 of 92 with 100 items per page.
- W4306793909 endingPage "1223" @default.
- W4306793909 startingPage "1203" @default.
- W4306793909 abstract "The classical statistical learning theory implies that fitting too many parameters leads to overfitting and poor performance. That modern deep neural networks generalize well despite a large number of parameters contradicts this finding and constitutes a major unsolved problem towards explaining the success of deep learning. While previous work focuses on the implicit regularization induced by stochastic gradient descent (SGD), we study here how the local geometry of the energy landscape around local minima affects the statistical properties of SGD with Gaussian gradient noise. We argue that under reasonable assumptions, the local geometry forces SGD to stay close to a low dimensional subspace and that this induces another form of implicit regularization and results in tighter bounds on the generalization error for deep neural networks. To derive generalization error bounds for neural networks, we first introduce a notion of stagnation sets around the local minima and impose a local essential convexity property of the population risk. Under these conditions, lower bounds for SGD to remain in these stagnation sets are derived. If stagnation occurs, we derive a bound on the generalization error of deep neural networks involving the spectral norms of the weight matrices but not the number of network parameters. Technically, our proofs are based on controlling the change of parameter values in the SGD iterates and local uniform convergence of the empirical loss functions based on the entropy of suitable neighborhoods around local minima." @default.
- W4306793909 created "2022-10-20" @default.
- W4306793909 creator A5002981992 @default.
- W4306793909 creator A5066308746 @default.
- W4306793909 date "2023-02-01" @default.
- W4306793909 modified "2023-10-01" @default.
- W4306793909 title "On Generalization Bounds for Deep Networks Based on Loss Surface Implicit Regularization" @default.
- W4306793909 cites W2010353172 @default.
- W4306793909 cites W2052044664 @default.
- W4306793909 cites W2083731191 @default.
- W4306793909 cites W2194775991 @default.
- W4306793909 cites W2919115771 @default.
- W4306793909 cites W2962702650 @default.
- W4306793909 cites W2963094815 @default.
- W4306793909 cites W2963122491 @default.
- W4306793909 cites W2963248893 @default.
- W4306793909 cites W2963518130 @default.
- W4306793909 cites W2964047251 @default.
- W4306793909 cites W3018252856 @default.
- W4306793909 cites W3049059782 @default.
- W4306793909 cites W3100231902 @default.
- W4306793909 cites W3102511045 @default.
- W4306793909 cites W3104969455 @default.
- W4306793909 cites W3172995164 @default.
- W4306793909 cites W3191067499 @default.
- W4306793909 cites W3203457432 @default.
- W4306793909 cites W4226038297 @default.
- W4306793909 cites W4249716558 @default.
- W4306793909 doi "https://doi.org/10.1109/tit.2022.3215088" @default.
- W4306793909 hasPublicationYear "2023" @default.
- W4306793909 type Work @default.
- W4306793909 citedByCount "1" @default.
- W4306793909 countsByYear W43067939092023 @default.
- W4306793909 crossrefType "journal-article" @default.
- W4306793909 hasAuthorship W4306793909A5002981992 @default.
- W4306793909 hasAuthorship W4306793909A5066308746 @default.
- W4306793909 hasBestOaLocation W43067939091 @default.
- W4306793909 hasConcept C106159729 @default.
- W4306793909 hasConcept C108583219 @default.
- W4306793909 hasConcept C126255220 @default.
- W4306793909 hasConcept C134306372 @default.
- W4306793909 hasConcept C154945302 @default.
- W4306793909 hasConcept C162324750 @default.
- W4306793909 hasConcept C186633575 @default.
- W4306793909 hasConcept C206688291 @default.
- W4306793909 hasConcept C22019652 @default.
- W4306793909 hasConcept C2776135515 @default.
- W4306793909 hasConcept C28826006 @default.
- W4306793909 hasConcept C33923547 @default.
- W4306793909 hasConcept C41008148 @default.
- W4306793909 hasConcept C50644808 @default.
- W4306793909 hasConcept C72134830 @default.
- W4306793909 hasConceptScore W4306793909C106159729 @default.
- W4306793909 hasConceptScore W4306793909C108583219 @default.
- W4306793909 hasConceptScore W4306793909C126255220 @default.
- W4306793909 hasConceptScore W4306793909C134306372 @default.
- W4306793909 hasConceptScore W4306793909C154945302 @default.
- W4306793909 hasConceptScore W4306793909C162324750 @default.
- W4306793909 hasConceptScore W4306793909C186633575 @default.
- W4306793909 hasConceptScore W4306793909C206688291 @default.
- W4306793909 hasConceptScore W4306793909C22019652 @default.
- W4306793909 hasConceptScore W4306793909C2776135515 @default.
- W4306793909 hasConceptScore W4306793909C28826006 @default.
- W4306793909 hasConceptScore W4306793909C33923547 @default.
- W4306793909 hasConceptScore W4306793909C41008148 @default.
- W4306793909 hasConceptScore W4306793909C50644808 @default.
- W4306793909 hasConceptScore W4306793909C72134830 @default.
- W4306793909 hasFunder F4320321800 @default.
- W4306793909 hasFunder F4320334764 @default.
- W4306793909 hasFunder F4320334789 @default.
- W4306793909 hasIssue "2" @default.
- W4306793909 hasLocation W43067939091 @default.
- W4306793909 hasLocation W43067939092 @default.
- W4306793909 hasLocation W43067939093 @default.
- W4306793909 hasOpenAccess W4306793909 @default.
- W4306793909 hasPrimaryLocation W43067939091 @default.
- W4306793909 hasRelatedWork W2029932722 @default.
- W4306793909 hasRelatedWork W2092244978 @default.
- W4306793909 hasRelatedWork W2752159661 @default.
- W4306793909 hasRelatedWork W2948488743 @default.
- W4306793909 hasRelatedWork W2963334011 @default.
- W4306793909 hasRelatedWork W3042560000 @default.
- W4306793909 hasRelatedWork W3094963542 @default.
- W4306793909 hasRelatedWork W3111449556 @default.
- W4306793909 hasRelatedWork W4287625305 @default.
- W4306793909 hasRelatedWork W4287714231 @default.
- W4306793909 hasVolume "69" @default.
- W4306793909 isParatext "false" @default.
- W4306793909 isRetracted "false" @default.
- W4306793909 workType "article" @default.
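
The items above all follow the shape `- SUBJECT PREDICATE OBJECT @GRAPH.`, so they can be grouped by predicate with a few lines of Python. This is only a minimal sketch: the line pattern is an assumption inferred from this listing, not a format documented by SemOpenAlex, and `LINE_RE` and `parse_listing` are hypothetical names introduced for illustration.

```python
import re
from collections import defaultdict

# Assumed shape of one listing item: "- SUBJECT PREDICATE OBJECT @GRAPH."
# (inferred from the listing itself, not from any official specification)
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_listing(lines):
    """Group object values by predicate for every line matching LINE_RE."""
    props = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip the header and pagination lines
        _subject, predicate, obj, _graph = match.groups()
        # Literal values are quoted in the listing; drop the surrounding quotes.
        props[predicate].append(obj.strip('"'))
    return dict(props)


# A few items copied from the listing above.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W4306793909> ?p ?o ?g. }',
    '- W4306793909 startingPage "1203" @default.',
    '- W4306793909 endingPage "1223" @default.',
    '- W4306793909 cites W2010353172 @default.',
    '- W4306793909 cites W2052044664 @default.',
    '- W4306793909 hasVolume "69" @default.',
]

record = parse_listing(sample)
print(record["cites"])
print(record["startingPage"], record["endingPage"])
```

Repeated predicates such as `cites` or `hasConcept` accumulate into lists, which is why every value is stored in a list, even for single-valued predicates like `hasVolume`.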