Matches in SemOpenAlex for { <https://semopenalex.org/work/W4225854236> ?p ?o ?g. }
Showing items 1 to 57 of 57 with 100 items per page.
- W4225854236 abstract "Generalization of deep neural networks remains one of the main open problems in machine learning. Previous theoretical works focused on deriving tight bounds of model complexity, while empirical works revealed that neural networks exhibit double descent with respect to both training sample counts and the neural network size. In this paper, we empirically examined how different layers of neural networks contribute differently to the model; we found that early layers generally learn representations relevant to performance on both training data and testing data. Contrarily, deeper layers only minimize training risks and fail to generalize well with testing or mislabeled data. We further illustrate the distance of trained weights to its initial value of final layers has high correlation to generalization errors and can serve as an indicator of an overfit of model. Moreover, we show evidence to support post-training regularization by re-initializing weights of final layers. Our findings provide an efficient method to estimate the generalization capability of neural networks, and the insight of those quantitative results may inspire derivation to better generalization bounds that take the internal structure of neural networks into consideration." @default.
- W4225854236 created "2022-05-05" @default.
- W4225854236 creator A5042647314 @default.
- W4225854236 creator A5078911447 @default.
- W4225854236 date "2022-01-28" @default.
- W4225854236 modified "2023-09-25" @default.
- W4225854236 title "With Greater Distance Comes Worse Performance: On the Perspective of Layer Utilization and Model Generalization" @default.
- W4225854236 doi "https://doi.org/10.48550/arxiv.2201.11939" @default.
- W4225854236 hasPublicationYear "2022" @default.
- W4225854236 type Work @default.
- W4225854236 citedByCount "0" @default.
- W4225854236 crossrefType "posted-content" @default.
- W4225854236 hasAuthorship W4225854236A5042647314 @default.
- W4225854236 hasAuthorship W4225854236A5078911447 @default.
- W4225854236 hasBestOaLocation W42258542361 @default.
- W4225854236 hasConcept C114466953 @default.
- W4225854236 hasConcept C119857082 @default.
- W4225854236 hasConcept C12713177 @default.
- W4225854236 hasConcept C134306372 @default.
- W4225854236 hasConcept C154945302 @default.
- W4225854236 hasConcept C177148314 @default.
- W4225854236 hasConcept C199360897 @default.
- W4225854236 hasConcept C22019652 @default.
- W4225854236 hasConcept C2776135515 @default.
- W4225854236 hasConcept C33923547 @default.
- W4225854236 hasConcept C41008148 @default.
- W4225854236 hasConcept C50644808 @default.
- W4225854236 hasConcept C5465570 @default.
- W4225854236 hasConceptScore W4225854236C114466953 @default.
- W4225854236 hasConceptScore W4225854236C119857082 @default.
- W4225854236 hasConceptScore W4225854236C12713177 @default.
- W4225854236 hasConceptScore W4225854236C134306372 @default.
- W4225854236 hasConceptScore W4225854236C154945302 @default.
- W4225854236 hasConceptScore W4225854236C177148314 @default.
- W4225854236 hasConceptScore W4225854236C199360897 @default.
- W4225854236 hasConceptScore W4225854236C22019652 @default.
- W4225854236 hasConceptScore W4225854236C2776135515 @default.
- W4225854236 hasConceptScore W4225854236C33923547 @default.
- W4225854236 hasConceptScore W4225854236C41008148 @default.
- W4225854236 hasConceptScore W4225854236C50644808 @default.
- W4225854236 hasConceptScore W4225854236C5465570 @default.
- W4225854236 hasLocation W42258542361 @default.
- W4225854236 hasOpenAccess W4225854236 @default.
- W4225854236 hasPrimaryLocation W42258542361 @default.
- W4225854236 hasRelatedWork W2906967080 @default.
- W4225854236 hasRelatedWork W2989932438 @default.
- W4225854236 hasRelatedWork W3018907748 @default.
- W4225854236 hasRelatedWork W3041434171 @default.
- W4225854236 hasRelatedWork W3099765033 @default.
- W4225854236 hasRelatedWork W3216225969 @default.
- W4225854236 hasRelatedWork W4225854236 @default.
- W4225854236 hasRelatedWork W4254751698 @default.
- W4225854236 hasRelatedWork W4287725140 @default.
- W4225854236 hasRelatedWork W3040157805 @default.
- W4225854236 isParatext "false" @default.
- W4225854236 isRetracted "false" @default.
- W4225854236 workType "article" @default.