Matches in SemOpenAlex for { <https://semopenalex.org/work/W2767392521> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W2767392521 endingPage "4796" @default.
- W2767392521 startingPage "4786" @default.
- W2767392521 abstract "There is evidence that a well-conditioned singular value distribution of the input/output Jacobian can lead to substantial improvements in training performance for deep neural networks. For deep linear networks there is conclusive evidence that initializing using orthogonal random matrices can lead to dramatic improvements to the training. However, the benefit of such initialization strategies has proven much less obvious for more realistic nonlinear networks. We use random matrix theory to study the conditioning of the Jacobian for nonlinear neural networks after random initialization. We show that the singular value distribution of the Jacobian is sensitive not only to the distribution of weights but also to the nonlinearity. Surprisingly we find that the benefit of orthogonal initialization is negligible for rectified linear networks but substantial for tanh networks. We provide a rule of thumb for initializing tanh networks such that they display dynamical isometry over their full depth. Finally, we perform experiments on MNIST and CIFAR10 using a wide array of optimizers. We show conclusively that the singular value distribution of the Jacobian is intimately related to learning dynamics. Finally, we show that the spectral density of the Jacobian evolves relatively slowly during training so good initialization affects learning dynamics far from the initial setting of the weights." @default.
- W2767392521 created "2017-11-17" @default.
- W2767392521 creator A5010225522 @default.
- W2767392521 creator A5056551357 @default.
- W2767392521 creator A5070645440 @default.
- W2767392521 date "2017-01-01" @default.
- W2767392521 modified "2023-09-24" @default.
- W2767392521 title "Investigating the learning dynamics of deep neural networks using random matrix theory" @default.
- W2767392521 hasPublicationYear "2017" @default.
- W2767392521 type Work @default.
- W2767392521 sameAs 2767392521 @default.
- W2767392521 citedByCount "0" @default.
- W2767392521 crossrefType "proceedings-article" @default.
- W2767392521 hasAuthorship W2767392521A5010225522 @default.
- W2767392521 hasAuthorship W2767392521A5056551357 @default.
- W2767392521 hasAuthorship W2767392521A5070645440 @default.
- W2767392521 hasConcept C108583219 @default.
- W2767392521 hasConcept C109282560 @default.
- W2767392521 hasConcept C11413529 @default.
- W2767392521 hasConcept C114466953 @default.
- W2767392521 hasConcept C121332964 @default.
- W2767392521 hasConcept C154945302 @default.
- W2767392521 hasConcept C158622935 @default.
- W2767392521 hasConcept C158693339 @default.
- W2767392521 hasConcept C190502265 @default.
- W2767392521 hasConcept C199360897 @default.
- W2767392521 hasConcept C200331156 @default.
- W2767392521 hasConcept C22789450 @default.
- W2767392521 hasConcept C28826006 @default.
- W2767392521 hasConcept C33923547 @default.
- W2767392521 hasConcept C41008148 @default.
- W2767392521 hasConcept C50644808 @default.
- W2767392521 hasConcept C62520636 @default.
- W2767392521 hasConcept C64812099 @default.
- W2767392521 hasConceptScore W2767392521C108583219 @default.
- W2767392521 hasConceptScore W2767392521C109282560 @default.
- W2767392521 hasConceptScore W2767392521C11413529 @default.
- W2767392521 hasConceptScore W2767392521C114466953 @default.
- W2767392521 hasConceptScore W2767392521C121332964 @default.
- W2767392521 hasConceptScore W2767392521C154945302 @default.
- W2767392521 hasConceptScore W2767392521C158622935 @default.
- W2767392521 hasConceptScore W2767392521C158693339 @default.
- W2767392521 hasConceptScore W2767392521C190502265 @default.
- W2767392521 hasConceptScore W2767392521C199360897 @default.
- W2767392521 hasConceptScore W2767392521C200331156 @default.
- W2767392521 hasConceptScore W2767392521C22789450 @default.
- W2767392521 hasConceptScore W2767392521C28826006 @default.
- W2767392521 hasConceptScore W2767392521C33923547 @default.
- W2767392521 hasConceptScore W2767392521C41008148 @default.
- W2767392521 hasConceptScore W2767392521C50644808 @default.
- W2767392521 hasConceptScore W2767392521C62520636 @default.
- W2767392521 hasConceptScore W2767392521C64812099 @default.
- W2767392521 hasLocation W27673925211 @default.
- W2767392521 hasOpenAccess W2767392521 @default.
- W2767392521 hasPrimaryLocation W27673925211 @default.
- W2767392521 hasRelatedWork W1987299193 @default.
- W2767392521 hasRelatedWork W2753358588 @default.
- W2767392521 hasRelatedWork W2775748041 @default.
- W2767392521 hasRelatedWork W2785885194 @default.
- W2767392521 hasRelatedWork W2808465607 @default.
- W2767392521 hasRelatedWork W2809468328 @default.
- W2767392521 hasRelatedWork W2889560103 @default.
- W2767392521 hasRelatedWork W2889737445 @default.
- W2767392521 hasRelatedWork W2971095480 @default.
- W2767392521 hasRelatedWork W2994716630 @default.
- W2767392521 hasRelatedWork W2996067004 @default.
- W2767392521 hasRelatedWork W2998636660 @default.
- W2767392521 hasRelatedWork W3034848862 @default.
- W2767392521 hasRelatedWork W3037559325 @default.
- W2767392521 hasRelatedWork W3094965745 @default.
- W2767392521 hasRelatedWork W3124853521 @default.
- W2767392521 hasRelatedWork W3167133407 @default.
- W2767392521 hasRelatedWork W3174558417 @default.
- W2767392521 hasRelatedWork W3200039607 @default.
- W2767392521 hasRelatedWork W3213639461 @default.
- W2767392521 isParatext "false" @default.
- W2767392521 isRetracted "false" @default.
- W2767392521 magId "2767392521" @default.
- W2767392521 workType "article" @default.