Matches in SemOpenAlex for { <https://semopenalex.org/work/W2896801757> ?p ?o ?g. }
- W2896801757 abstract "Recently mean field theory has been successfully used to analyze properties of wide, random neural networks. It gave rise to a prescriptive theory for initializing feed-forward neural networks with orthogonal weights, which ensures that both the forward propagated activations and the backpropagated gradients are near $\ell_2$ isometries and as a consequence training is orders of magnitude faster. Despite strong empirical performance, the mechanisms by which critical initializations confer an advantage in the optimization of deep neural networks are poorly understood. Here we show a novel connection between the maximum curvature of the optimization landscape (gradient smoothness) as measured by the Fisher information matrix (FIM) and the spectral radius of the input-output Jacobian, which partially explains why more isometric networks can train much faster. Furthermore, given that orthogonal weights are necessary to ensure that gradient norms are approximately preserved at initialization, we experimentally investigate the benefits of maintaining orthogonality throughout training, from which we conclude that manifold optimization of weights performs well regardless of the smoothness of the gradients. Moreover, motivated by experimental results we show that a low condition number of the FIM is not predictive of faster learning." @default.
- W2896801757 created "2018-10-26" @default.
- W2896801757 creator A5051989657 @default.
- W2896801757 creator A5057494211 @default.
- W2896801757 date "2018-10-09" @default.
- W2896801757 modified "2023-09-27" @default.
- W2896801757 title "Information Geometry of Orthogonal Initializations and Training" @default.
- W2896801757 cites W1522301498 @default.
- W2896801757 cites W1567512734 @default.
- W2896801757 cites W1804110266 @default.
- W2896801757 cites W1813758485 @default.
- W2896801757 cites W2045512849 @default.
- W2896801757 cites W2125930537 @default.
- W2896801757 cites W2155894447 @default.
- W2896801757 cites W2175402905 @default.
- W2896801757 cites W2194775991 @default.
- W2896801757 cites W2278108219 @default.
- W2896801757 cites W2436219157 @default.
- W2896801757 cites W2474920236 @default.
- W2896801757 cites W2485135680 @default.
- W2896801757 cites W2546257475 @default.
- W2896801757 cites W2551557006 @default.
- W2896801757 cites W2556364298 @default.
- W2896801757 cites W2587753047 @default.
- W2896801757 cites W2610190180 @default.
- W2896801757 cites W2618381130 @default.
- W2896801757 cites W2766678531 @default.
- W2896801757 cites W2785626633 @default.
- W2896801757 cites W2789210533 @default.
- W2896801757 cites W2806504075 @default.
- W2896801757 cites W2806970737 @default.
- W2896801757 cites W2809090039 @default.
- W2896801757 cites W2895118302 @default.
- W2896801757 cites W2898211994 @default.
- W2896801757 cites W2949117887 @default.
- W2896801757 cites W2951605425 @default.
- W2896801757 cites W2952861622 @default.
- W2896801757 cites W2953324412 @default.
- W2896801757 cites W2962781217 @default.
- W2896801757 cites W2962836826 @default.
- W2896801757 cites W2963148870 @default.
- W2896801757 cites W2963570896 @default.
- W2896801757 cites W2963679562 @default.
- W2896801757 cites W2963685250 @default.
- W2896801757 cites W2964065616 @default.
- W2896801757 cites W570492555 @default.
- W2896801757 hasPublicationYear "2018" @default.
- W2896801757 type Work @default.
- W2896801757 sameAs 2896801757 @default.
- W2896801757 citedByCount "3" @default.
- W2896801757 countsByYear W28968017572018 @default.
- W2896801757 countsByYear W28968017572020 @default.
- W2896801757 countsByYear W28968017572021 @default.
- W2896801757 crossrefType "posted-content" @default.
- W2896801757 hasAuthorship W2896801757A5051989657 @default.
- W2896801757 hasAuthorship W2896801757A5057494211 @default.
- W2896801757 hasConcept C102634674 @default.
- W2896801757 hasConcept C109546454 @default.
- W2896801757 hasConcept C11413529 @default.
- W2896801757 hasConcept C114466953 @default.
- W2896801757 hasConcept C12520029 @default.
- W2896801757 hasConcept C126255220 @default.
- W2896801757 hasConcept C127413603 @default.
- W2896801757 hasConcept C134306372 @default.
- W2896801757 hasConcept C154945302 @default.
- W2896801757 hasConcept C17137986 @default.
- W2896801757 hasConcept C195065555 @default.
- W2896801757 hasConcept C199360897 @default.
- W2896801757 hasConcept C200331156 @default.
- W2896801757 hasConcept C2524010 @default.
- W2896801757 hasConcept C28826006 @default.
- W2896801757 hasConcept C33923547 @default.
- W2896801757 hasConcept C41008148 @default.
- W2896801757 hasConcept C50644808 @default.
- W2896801757 hasConcept C529865628 @default.
- W2896801757 hasConcept C78519656 @default.
- W2896801757 hasConceptScore W2896801757C102634674 @default.
- W2896801757 hasConceptScore W2896801757C109546454 @default.
- W2896801757 hasConceptScore W2896801757C11413529 @default.
- W2896801757 hasConceptScore W2896801757C114466953 @default.
- W2896801757 hasConceptScore W2896801757C12520029 @default.
- W2896801757 hasConceptScore W2896801757C126255220 @default.
- W2896801757 hasConceptScore W2896801757C127413603 @default.
- W2896801757 hasConceptScore W2896801757C134306372 @default.
- W2896801757 hasConceptScore W2896801757C154945302 @default.
- W2896801757 hasConceptScore W2896801757C17137986 @default.
- W2896801757 hasConceptScore W2896801757C195065555 @default.
- W2896801757 hasConceptScore W2896801757C199360897 @default.
- W2896801757 hasConceptScore W2896801757C200331156 @default.
- W2896801757 hasConceptScore W2896801757C2524010 @default.
- W2896801757 hasConceptScore W2896801757C28826006 @default.
- W2896801757 hasConceptScore W2896801757C33923547 @default.
- W2896801757 hasConceptScore W2896801757C41008148 @default.
- W2896801757 hasConceptScore W2896801757C50644808 @default.
- W2896801757 hasConceptScore W2896801757C529865628 @default.
- W2896801757 hasConceptScore W2896801757C78519656 @default.
- W2896801757 hasLocation W28968017571 @default.
- W2896801757 hasOpenAccess W2896801757 @default.
- W2896801757 hasPrimaryLocation W28968017571 @default.
- W2896801757 hasRelatedWork W2802031777 @default.