Matches in SemOpenAlex for { <https://semopenalex.org/work/W2912497910> ?p ?o ?g. }
- W2912497910 endingPage "513" @default.
- W2912497910 startingPage "505" @default.
- W2912497910 abstract "Given two or more Deep Neural Networks (DNNs) with the same or similar architectures, and trained on the same dataset, but trained with different solvers, parameters, hyper-parameters, regularization, etc., can we predict which DNN will have the best test accuracy, and can we do so without peeking at the test data? In this paper, we show how to use a new Theory of Heavy-Tailed Self-Regularization (HT-SR) to answer this. HT-SR suggests, among other things, that modern DNNs exhibit what we call Heavy-Tailed Mechanistic Universality (HT-MU), meaning that the correlations in the layer weight matrices can be fit to a power law (PL) with exponents that lie in common Universality classes from Heavy-Tailed Random Matrix Theory (HT-RMT). From this, we develop a Universal capacity control metric that is a weighted average of PL exponents. Rather than considering small toy NNs, we examine over 50 different, large-scale pre-trained DNNs, ranging over 15 different architectures, trained on ImageNet, each of which has been reported to have different test accuracies. We show that this new capacity metric correlates very well with the reported test accuracies of these DNNs, looking across each architecture (VGG16/…/VGG19, ResNet10/…/ResNet152, etc.). We also show how to approximate the metric by the more familiar Product Norm capacity measure, as the average of the log Frobenius norm of the layer weight matrices. Our approach requires no changes to the underlying DNN or its loss function; it does not require us to train a model (although it could be used to monitor training), and it does not even require access to the ImageNet data." @default.
- W2912497910 created "2019-02-21" @default.
- W2912497910 creator A5000385819 @default.
- W2912497910 creator A5033006662 @default.
- W2912497910 date "2020-01-01" @default.
- W2912497910 modified "2023-10-16" @default.
- W2912497910 title "Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks" @default.
- W2912497910 cites W1506954757 @default.
- W2912497910 cites W1535277709 @default.
- W2912497910 cites W1623941268 @default.
- W2912497910 cites W1674853821 @default.
- W2912497910 cites W1704468308 @default.
- W2912497910 cites W1780314406 @default.
- W2912497910 cites W1835900096 @default.
- W2912497910 cites W1904555434 @default.
- W2912497910 cites W1960588427 @default.
- W2912497910 cites W1978806035 @default.
- W2912497910 cites W2005790663 @default.
- W2912497910 cites W2008732718 @default.
- W2912497910 cites W2046351805 @default.
- W2912497910 cites W2053347059 @default.
- W2912497910 cites W2088856850 @default.
- W2912497910 cites W2128882956 @default.
- W2912497910 cites W2169413403 @default.
- W2912497910 cites W2237587256 @default.
- W2912497910 cites W2317010637 @default.
- W2912497910 cites W2326812357 @default.
- W2912497910 cites W2618017917 @default.
- W2912497910 cites W2709553318 @default.
- W2912497910 cites W2732724430 @default.
- W2912497910 cites W2766196653 @default.
- W2912497910 cites W2766346350 @default.
- W2912497910 cites W2785749378 @default.
- W2912497910 cites W2788800397 @default.
- W2912497910 cites W2804066051 @default.
- W2912497910 cites W2810862998 @default.
- W2912497910 cites W2883570905 @default.
- W2912497910 cites W2895616758 @default.
- W2912497910 cites W2911742574 @default.
- W2912497910 cites W2950627632 @default.
- W2912497910 cites W2959995783 @default.
- W2912497910 cites W2962857907 @default.
- W2912497910 cites W2963285844 @default.
- W2912497910 cites W2963695615 @default.
- W2912497910 cites W2964130005 @default.
- W2912497910 cites W3030286842 @default.
- W2912497910 cites W3100156752 @default.
- W2912497910 cites W3101347671 @default.
- W2912497910 cites W602946569 @default.
- W2912497910 cites W3141350557 @default.
- W2912497910 doi "https://doi.org/10.1137/1.9781611976236.57" @default.
- W2912497910 hasPublicationYear "2020" @default.
- W2912497910 type Work @default.
- W2912497910 sameAs 2912497910 @default.
- W2912497910 citedByCount "18" @default.
- W2912497910 countsByYear W29124979102018 @default.
- W2912497910 countsByYear W29124979102019 @default.
- W2912497910 countsByYear W29124979102020 @default.
- W2912497910 countsByYear W29124979102021 @default.
- W2912497910 countsByYear W29124979102022 @default.
- W2912497910 countsByYear W29124979102023 @default.
- W2912497910 crossrefType "book-chapter" @default.
- W2912497910 hasAuthorship W2912497910A5000385819 @default.
- W2912497910 hasAuthorship W2912497910A5033006662 @default.
- W2912497910 hasBestOaLocation W29124979101 @default.
- W2912497910 hasConcept C121332964 @default.
- W2912497910 hasConcept C153180895 @default.
- W2912497910 hasConcept C154945302 @default.
- W2912497910 hasConcept C158693339 @default.
- W2912497910 hasConcept C17744445 @default.
- W2912497910 hasConcept C183992945 @default.
- W2912497910 hasConcept C191795146 @default.
- W2912497910 hasConcept C199539241 @default.
- W2912497910 hasConcept C2776135515 @default.
- W2912497910 hasConcept C2984842247 @default.
- W2912497910 hasConcept C33923547 @default.
- W2912497910 hasConcept C41008148 @default.
- W2912497910 hasConcept C50644808 @default.
- W2912497910 hasConcept C62520636 @default.
- W2912497910 hasConcept C92207270 @default.
- W2912497910 hasConceptScore W2912497910C121332964 @default.
- W2912497910 hasConceptScore W2912497910C153180895 @default.
- W2912497910 hasConceptScore W2912497910C154945302 @default.
- W2912497910 hasConceptScore W2912497910C158693339 @default.
- W2912497910 hasConceptScore W2912497910C17744445 @default.
- W2912497910 hasConceptScore W2912497910C183992945 @default.
- W2912497910 hasConceptScore W2912497910C191795146 @default.
- W2912497910 hasConceptScore W2912497910C199539241 @default.
- W2912497910 hasConceptScore W2912497910C2776135515 @default.
- W2912497910 hasConceptScore W2912497910C2984842247 @default.
- W2912497910 hasConceptScore W2912497910C33923547 @default.
- W2912497910 hasConceptScore W2912497910C41008148 @default.
- W2912497910 hasConceptScore W2912497910C50644808 @default.
- W2912497910 hasConceptScore W2912497910C62520636 @default.
- W2912497910 hasConceptScore W2912497910C92207270 @default.
- W2912497910 hasLocation W29124979101 @default.
- W2912497910 hasLocation W29124979102 @default.
- W2912497910 hasOpenAccess W2912497910 @default.
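
The abstract above describes an approximation to the capacity metric as "the average of the log Frobenius norm of the layer weight matrices". The following is a minimal sketch of that one quantity only, not the authors' implementation. It assumes a base-10 logarithm and an unweighted mean over layers, neither of which the abstract specifies, and the function names and toy matrices are hypothetical.

```python
import math


def frobenius_norm(matrix):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def average_log_frobenius_norm(weight_matrices):
    """Unweighted mean of log10(||W||_F) over the given matrices.

    Assumption: base-10 log and equal weight per layer; the abstract
    does not state either choice.
    """
    logs = [math.log10(frobenius_norm(w)) for w in weight_matrices]
    return sum(logs) / len(logs)


# Hypothetical stand-in "layers", not weights from any real model.
toy_layers = [
    [[1.0, 0.0], [0.0, 1.0]],  # 2x2 identity
    [[3.0, 4.0]],              # 1x2 row vector
]
print(average_log_frobenius_norm(toy_layers))
```

In practice the matrices would come from a pre-trained model's layers; how convolutional layers are turned into matrices is not covered by the abstract and is left out here.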