Matches in SemOpenAlex for { <https://semopenalex.org/work/W2894972989> ?p ?o ?g. }
- W2894972989 abstract "We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x \mapsto W_N W_{N-1} \cdots W_1 x$) by minimizing the $\ell_2$ loss over whitened data. Convergence at a linear rate is guaranteed when the following hold: (i) dimensions of hidden layers are at least the minimum of the input and output dimensions; (ii) weight matrices at initialization are approximately balanced; and (iii) the initial loss is smaller than the loss of any rank-deficient solution. The assumptions on initialization (conditions (ii) and (iii)) are necessary, in the sense that violating any one of them may lead to convergence failure. Moreover, in the important case of output dimension 1, i.e., scalar regression, they are met, and thus convergence to global optimum holds, with constant probability under a random initialization scheme. Our results significantly extend previous analyses, e.g., of deep linear residual networks (Bartlett et al., 2018)." @default.
- W2894972989 created "2018-10-12" @default.
- W2894972989 creator A5060414926 @default.
- W2894972989 creator A5062378128 @default.
- W2894972989 creator A5079951047 @default.
- W2894972989 creator A5089252105 @default.
- W2894972989 date "2018-10-04" @default.
- W2894972989 modified "2023-09-28" @default.
- W2894972989 title "A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks" @default.
- W2894972989 cites W104184427 @default.
- W2894972989 cites W1560153690 @default.
- W2894972989 cites W1665214252 @default.
- W2894972989 cites W1899249567 @default.
- W2894972989 cites W2022851810 @default.
- W2894972989 cites W2061521613 @default.
- W2894972989 cites W2078134499 @default.
- W2894972989 cites W2078626246 @default.
- W2894972989 cites W21261423 @default.
- W2894972989 cites W2194775991 @default.
- W2894972989 cites W2399994860 @default.
- W2894972989 cites W2402144811 @default.
- W2894972989 cites W2474090883 @default.
- W2894972989 cites W2565538933 @default.
- W2894972989 cites W2591714514 @default.
- W2894972989 cites W2593709294 @default.
- W2894972989 cites W2610857016 @default.
- W2894972989 cites W2746420172 @default.
- W2894972989 cites W2766873928 @default.
- W2894972989 cites W2798707604 @default.
- W2894972989 cites W2806265408 @default.
- W2894972989 cites W2886685759 @default.
- W2894972989 cites W2899771611 @default.
- W2894972989 cites W2962930448 @default.
- W2894972989 cites W2963100491 @default.
- W2894972989 cites W2963188610 @default.
- W2894972989 cites W2963376662 @default.
- W2894972989 cites W2963383839 @default.
- W2894972989 cites W2963427613 @default.
- W2894972989 cites W2963446085 @default.
- W2894972989 cites W2963504252 @default.
- W2894972989 cites W2963519230 @default.
- W2894972989 cites W2963569411 @default.
- W2894972989 cites W2964031251 @default.
- W2894972989 cites W2964072429 @default.
- W2894972989 cites W2964106499 @default.
- W2894972989 cites W2964156132 @default.
- W2894972989 cites W2964204240 @default.
- W2894972989 cites W2966228138 @default.
- W2894972989 hasPublicationYear "2018" @default.
- W2894972989 type Work @default.
- W2894972989 sameAs 2894972989 @default.
- W2894972989 citedByCount "61" @default.
- W2894972989 countsByYear W28949729892018 @default.
- W2894972989 countsByYear W28949729892019 @default.
- W2894972989 countsByYear W28949729892020 @default.
- W2894972989 countsByYear W28949729892021 @default.
- W2894972989 crossrefType "posted-content" @default.
- W2894972989 hasAuthorship W2894972989A5060414926 @default.
- W2894972989 hasAuthorship W2894972989A5062378128 @default.
- W2894972989 hasAuthorship W2894972989A5079951047 @default.
- W2894972989 hasAuthorship W2894972989A5089252105 @default.
- W2894972989 hasConcept C11413529 @default.
- W2894972989 hasConcept C114466953 @default.
- W2894972989 hasConcept C114614502 @default.
- W2894972989 hasConcept C126255220 @default.
- W2894972989 hasConcept C127162648 @default.
- W2894972989 hasConcept C153258448 @default.
- W2894972989 hasConcept C154945302 @default.
- W2894972989 hasConcept C155512373 @default.
- W2894972989 hasConcept C162324750 @default.
- W2894972989 hasConcept C164226766 @default.
- W2894972989 hasConcept C165464430 @default.
- W2894972989 hasConcept C199360897 @default.
- W2894972989 hasConcept C2524010 @default.
- W2894972989 hasConcept C2777027219 @default.
- W2894972989 hasConcept C2777303404 @default.
- W2894972989 hasConcept C28826006 @default.
- W2894972989 hasConcept C33676613 @default.
- W2894972989 hasConcept C33923547 @default.
- W2894972989 hasConcept C41008148 @default.
- W2894972989 hasConcept C50522688 @default.
- W2894972989 hasConcept C50644808 @default.
- W2894972989 hasConcept C57691317 @default.
- W2894972989 hasConcept C57869625 @default.
- W2894972989 hasConcept C76155785 @default.
- W2894972989 hasConceptScore W2894972989C11413529 @default.
- W2894972989 hasConceptScore W2894972989C114466953 @default.
- W2894972989 hasConceptScore W2894972989C114614502 @default.
- W2894972989 hasConceptScore W2894972989C126255220 @default.
- W2894972989 hasConceptScore W2894972989C127162648 @default.
- W2894972989 hasConceptScore W2894972989C153258448 @default.
- W2894972989 hasConceptScore W2894972989C154945302 @default.
- W2894972989 hasConceptScore W2894972989C155512373 @default.
- W2894972989 hasConceptScore W2894972989C162324750 @default.
- W2894972989 hasConceptScore W2894972989C164226766 @default.
- W2894972989 hasConceptScore W2894972989C165464430 @default.
- W2894972989 hasConceptScore W2894972989C199360897 @default.
- W2894972989 hasConceptScore W2894972989C2524010 @default.
- W2894972989 hasConceptScore W2894972989C2777027219 @default.
- W2894972989 hasConceptScore W2894972989C2777303404 @default.