Matches in SemOpenAlex for { <https://semopenalex.org/work/W3091079659> ?p ?o ?g. }
- W3091079659 abstract "Natural Gradient Descent (NGD) helps to accelerate the convergence of gradient descent dynamics, but it requires approximations in large-scale deep neural networks because of its high computational cost. Empirical studies have confirmed that some NGD methods with approximate Fisher information converge sufficiently fast in practice. Nevertheless, it remains unclear from the theoretical perspective why and under what conditions such heuristic approximations work well. In this work, we reveal that, under specific conditions, NGD with approximate Fisher information achieves the same fast convergence to global minima as exact NGD. We consider deep neural networks in the infinite-width limit, and analyze the asymptotic training dynamics of NGD in function space via the neural tangent kernel. In the function space, the training dynamics with the approximate Fisher information are identical to those with the exact Fisher information, and they converge quickly. The fast convergence holds in layer-wise approximations; for instance, in block diagonal approximation where each block corresponds to a layer as well as in block tri-diagonal and K-FAC approximations. We also find that a unit-wise approximation achieves the same fast convergence under some assumptions. All of these different approximations have an isotropic gradient in the function space, and this plays a fundamental role in achieving the same convergence properties in training. Thus, the current study gives a novel and unified theoretical foundation with which to understand NGD methods in deep learning." @default.
- W3091079659 created "2020-10-08" @default.
- W3091079659 creator A5037944527 @default.
- W3091079659 creator A5076237237 @default.
- W3091079659 date "2020-10-02" @default.
- W3091079659 modified "2023-09-27" @default.
- W3091079659 title "Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks" @default.
- W3091079659 cites W1871231525 @default.
- W3091079659 cites W1940321820 @default.
- W3091079659 cites W1969810729 @default.
- W3091079659 cites W1970789124 @default.
- W3091079659 cites W1973297078 @default.
- W3091079659 cites W1999590492 @default.
- W3091079659 cites W2020107577 @default.
- W3091079659 cites W2047962774 @default.
- W3091079659 cites W2144962567 @default.
- W3091079659 cites W2595142274 @default.
- W3091079659 cites W2809090039 @default.
- W3091079659 cites W2888674590 @default.
- W3091079659 cites W2892218381 @default.
- W3091079659 cites W2914484425 @default.
- W3091079659 cites W2945554113 @default.
- W3091079659 cites W2947461788 @default.
- W3091079659 cites W2948508833 @default.
- W3091079659 cites W2962913334 @default.
- W3091079659 cites W2963063862 @default.
- W3091079659 cites W2963241285 @default.
- W3091079659 cites W2963470399 @default.
- W3091079659 cites W2964065616 @default.
- W3091079659 cites W2964125128 @default.
- W3091079659 cites W2964309400 @default.
- W3091079659 cites W2970154823 @default.
- W3091079659 cites W2971043187 @default.
- W3091079659 cites W2971055146 @default.
- W3091079659 cites W3038074040 @default.
- W3091079659 cites W3086499488 @default.
- W3091079659 hasPublicationYear "2020" @default.
- W3091079659 type Work @default.
- W3091079659 sameAs 3091079659 @default.
- W3091079659 citedByCount "0" @default.
- W3091079659 crossrefType "posted-content" @default.
- W3091079659 hasAuthorship W3091079659A5037944527 @default.
- W3091079659 hasAuthorship W3091079659A5076237237 @default.
- W3091079659 hasConcept C11413529 @default.
- W3091079659 hasConcept C119857082 @default.
- W3091079659 hasConcept C126255220 @default.
- W3091079659 hasConcept C134306372 @default.
- W3091079659 hasConcept C153258448 @default.
- W3091079659 hasConcept C154945302 @default.
- W3091079659 hasConcept C162324750 @default.
- W3091079659 hasConcept C186633575 @default.
- W3091079659 hasConcept C206688291 @default.
- W3091079659 hasConcept C2777303404 @default.
- W3091079659 hasConcept C28826006 @default.
- W3091079659 hasConcept C29406490 @default.
- W3091079659 hasConcept C33923547 @default.
- W3091079659 hasConcept C41008148 @default.
- W3091079659 hasConcept C50522688 @default.
- W3091079659 hasConcept C50644808 @default.
- W3091079659 hasConcept C91873725 @default.
- W3091079659 hasConceptScore W3091079659C11413529 @default.
- W3091079659 hasConceptScore W3091079659C119857082 @default.
- W3091079659 hasConceptScore W3091079659C126255220 @default.
- W3091079659 hasConceptScore W3091079659C134306372 @default.
- W3091079659 hasConceptScore W3091079659C153258448 @default.
- W3091079659 hasConceptScore W3091079659C154945302 @default.
- W3091079659 hasConceptScore W3091079659C162324750 @default.
- W3091079659 hasConceptScore W3091079659C186633575 @default.
- W3091079659 hasConceptScore W3091079659C206688291 @default.
- W3091079659 hasConceptScore W3091079659C2777303404 @default.
- W3091079659 hasConceptScore W3091079659C28826006 @default.
- W3091079659 hasConceptScore W3091079659C29406490 @default.
- W3091079659 hasConceptScore W3091079659C33923547 @default.
- W3091079659 hasConceptScore W3091079659C41008148 @default.
- W3091079659 hasConceptScore W3091079659C50522688 @default.
- W3091079659 hasConceptScore W3091079659C50644808 @default.
- W3091079659 hasConceptScore W3091079659C91873725 @default.
- W3091079659 hasLocation W30910796591 @default.
- W3091079659 hasOpenAccess W3091079659 @default.
- W3091079659 hasPrimaryLocation W30910796591 @default.
- W3091079659 hasRelatedWork W1893284378 @default.
- W3091079659 hasRelatedWork W2001596878 @default.
- W3091079659 hasRelatedWork W2112487920 @default.
- W3091079659 hasRelatedWork W2734411596 @default.
- W3091079659 hasRelatedWork W2781206762 @default.
- W3091079659 hasRelatedWork W2890321584 @default.
- W3091079659 hasRelatedWork W2903327037 @default.
- W3091079659 hasRelatedWork W2952307578 @default.
- W3091079659 hasRelatedWork W2990020588 @default.
- W3091079659 hasRelatedWork W3098534458 @default.
- W3091079659 hasRelatedWork W3098901429 @default.
- W3091079659 hasRelatedWork W3105651651 @default.
- W3091079659 hasRelatedWork W3125857478 @default.
- W3091079659 hasRelatedWork W3135891919 @default.
- W3091079659 hasRelatedWork W3137474564 @default.
- W3091079659 hasRelatedWork W3158528581 @default.
- W3091079659 hasRelatedWork W3167191501 @default.
- W3091079659 hasRelatedWork W3174075406 @default.
- W3091079659 hasRelatedWork W3175955759 @default.
- W3091079659 hasRelatedWork W3211344042 @default.