Matches in SemOpenAlex for { <https://semopenalex.org/work/W3034491857> ?p ?o ?g. }
- W3034491857 abstract "Hessian-based measures of flatness, such as the trace, Frobenius and spectral norms, have been argued, used and shown to relate to generalisation. In this paper we demonstrate that, for feed-forward neural networks under the cross-entropy loss, low-loss solutions with large neural network weights have small Hessian-based measures of flatness. This implies that solutions obtained without L2 regularisation should be less sharp than those obtained with it, despite generalising worse. We show this to be true for logistic regression, multi-layer perceptrons, simple convolutional, pre-activated and wide residual networks on the MNIST and CIFAR-100 datasets. Furthermore, we show that adaptive optimisation algorithms using iterate averaging, on the VGG-16 network and CIFAR-100 dataset, achieve superior generalisation to SGD but are 30× sharper. These theoretical and experimental results further advocate the need to use flatness in conjunction with the weight scale to measure generalisation (Neyshabur et al., 2017; Dziugaite and Roy, 2017)." @default.
- W3034491857 created "2020-06-19" @default.
- W3034491857 creator A5038607215 @default.
- W3034491857 date "2021-05-04" @default.
- W3034491857 modified "2023-09-27" @default.
- W3034491857 title "Flatness is a False Friend" @default.
- W3034491857 cites W112019301 @default.
- W3034491857 cites W1686810756 @default.
- W3034491857 cites W1990457366 @default.
- W3034491857 cites W2006903949 @default.
- W3034491857 cites W2099111195 @default.
- W3034491857 cites W2117670920 @default.
- W3034491857 cites W2125426869 @default.
- W3034491857 cites W2144513243 @default.
- W3034491857 cites W2145615204 @default.
- W3034491857 cites W2605372163 @default.
- W3034491857 cites W2626325961 @default.
- W3034491857 cites W2731468224 @default.
- W3034491857 cites W2752366553 @default.
- W3034491857 cites W2776855315 @default.
- W3034491857 cites W2808042107 @default.
- W3034491857 cites W2900726521 @default.
- W3034491857 cites W2909468424 @default.
- W3034491857 cites W2911853241 @default.
- W3034491857 cites W2947654433 @default.
- W3034491857 cites W2950990113 @default.
- W3034491857 cites W2963069632 @default.
- W3034491857 cites W2963173418 @default.
- W3034491857 cites W2963208657 @default.
- W3034491857 cites W2963739978 @default.
- W3034491857 cites W2963959597 @default.
- W3034491857 cites W2964121744 @default.
- W3034491857 cites W2964346549 @default.
- W3034491857 cites W2971130081 @default.
- W3034491857 cites W2971353280 @default.
- W3034491857 cites W2992043355 @default.
- W3034491857 cites W2996642285 @default.
- W3034491857 cites W2997401498 @default.
- W3034491857 cites W3010051915 @default.
- W3034491857 cites W3093329015 @default.
- W3034491857 cites W3137695714 @default.
- W3034491857 hasPublicationYear "2021" @default.
- W3034491857 type Work @default.
- W3034491857 sameAs 3034491857 @default.
- W3034491857 citedByCount "2" @default.
- W3034491857 countsByYear W30344918572020 @default.
- W3034491857 countsByYear W30344918572021 @default.
- W3034491857 crossrefType "journal-article" @default.
- W3034491857 hasAuthorship W3034491857A5038607215 @default.
- W3034491857 hasConcept C11413529 @default.
- W3034491857 hasConcept C121332964 @default.
- W3034491857 hasConcept C154945302 @default.
- W3034491857 hasConcept C155512373 @default.
- W3034491857 hasConcept C190502265 @default.
- W3034491857 hasConcept C203616005 @default.
- W3034491857 hasConcept C26405456 @default.
- W3034491857 hasConcept C2778530986 @default.
- W3034491857 hasConcept C28826006 @default.
- W3034491857 hasConcept C33923547 @default.
- W3034491857 hasConcept C41008148 @default.
- W3034491857 hasConcept C50644808 @default.
- W3034491857 hasConcept C60908668 @default.
- W3034491857 hasConcept C62520636 @default.
- W3034491857 hasConcept C81363708 @default.
- W3034491857 hasConceptScore W3034491857C11413529 @default.
- W3034491857 hasConceptScore W3034491857C121332964 @default.
- W3034491857 hasConceptScore W3034491857C154945302 @default.
- W3034491857 hasConceptScore W3034491857C155512373 @default.
- W3034491857 hasConceptScore W3034491857C190502265 @default.
- W3034491857 hasConceptScore W3034491857C203616005 @default.
- W3034491857 hasConceptScore W3034491857C26405456 @default.
- W3034491857 hasConceptScore W3034491857C2778530986 @default.
- W3034491857 hasConceptScore W3034491857C28826006 @default.
- W3034491857 hasConceptScore W3034491857C33923547 @default.
- W3034491857 hasConceptScore W3034491857C41008148 @default.
- W3034491857 hasConceptScore W3034491857C50644808 @default.
- W3034491857 hasConceptScore W3034491857C60908668 @default.
- W3034491857 hasConceptScore W3034491857C62520636 @default.
- W3034491857 hasConceptScore W3034491857C81363708 @default.
- W3034491857 hasLocation W30344918571 @default.
- W3034491857 hasOpenAccess W3034491857 @default.
- W3034491857 hasPrimaryLocation W30344918571 @default.
- W3034491857 hasRelatedWork W2413204168 @default.
- W3034491857 hasRelatedWork W2619516334 @default.
- W3034491857 hasRelatedWork W2786807178 @default.
- W3034491857 hasRelatedWork W2803921058 @default.
- W3034491857 hasRelatedWork W2901831961 @default.
- W3034491857 hasRelatedWork W2913126955 @default.
- W3034491857 hasRelatedWork W2977038613 @default.
- W3034491857 hasRelatedWork W2978039092 @default.
- W3034491857 hasRelatedWork W3000593479 @default.
- W3034491857 hasRelatedWork W3005981462 @default.
- W3034491857 hasRelatedWork W3007027445 @default.
- W3034491857 hasRelatedWork W3022893496 @default.
- W3034491857 hasRelatedWork W3082600623 @default.
- W3034491857 hasRelatedWork W3092670332 @default.
- W3034491857 hasRelatedWork W3104854837 @default.
- W3034491857 hasRelatedWork W3127805089 @default.
- W3034491857 hasRelatedWork W3128741403 @default.
- W3034491857 hasRelatedWork W3133641077 @default.