Matches in SemOpenAlex for { <https://semopenalex.org/work/W2894604724> ?p ?o ?g. }
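The header above shows the quad pattern behind this listing. Below is a minimal sketch of how such a listing can be retrieved programmatically over the standard SPARQL 1.1 HTTP protocol; the endpoint address `https://semopenalex.org/sparql` and the named-graph layout are assumptions based on common SemOpenAlex usage and should be checked against the project's documentation.

```python
# Minimal sketch: fetch all (?p, ?o, ?g) quads for one SemOpenAlex work.
# The endpoint URL below is an assumption; verify it before relying on it.
import requests

ENDPOINT = "https://semopenalex.org/sparql"  # assumed public endpoint
QUERY = """
SELECT ?p ?o ?g WHERE {
  GRAPH ?g { <https://semopenalex.org/work/W2894604724> ?p ?o . }
}
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
response.raise_for_status()

# Print one predicate / object / graph triple per line, as in the dump below.
for b in response.json()["results"]["bindings"]:
    print(b["p"]["value"], b["o"]["value"], b.get("g", {}).get("value", ""))
```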
- W2894604724 abstract "One of the mysteries in the success of neural networks is that randomly initialized first order methods like gradient descent can achieve zero training loss even though the objective function is non-convex and non-smooth. This paper demystifies this surprising phenomenon for two-layer fully connected ReLU activated neural networks. For an $m$ hidden node shallow neural network with ReLU activation and $n$ training data, we show that as long as $m$ is large enough and no two inputs are parallel, randomly initialized gradient descent converges to a globally optimal solution at a linear convergence rate for the quadratic loss function. Our analysis relies on the following observation: over-parameterization and random initialization jointly restrict every weight vector to be close to its initialization for all iterations, which allows us to exploit a strong convexity-like property to show that gradient descent converges at a global linear rate to the global optimum. We believe these insights are also useful in analyzing deep models and other first order methods." @default.
- W2894604724 created "2018-10-12" @default.
- W2894604724 creator A5013695358 @default.
- W2894604724 creator A5033061754 @default.
- W2894604724 creator A5068907670 @default.
- W2894604724 creator A5077874981 @default.
- W2894604724 date "2018-10-04" @default.
- W2894604724 modified "2023-10-09" @default.
- W2894604724 title "Gradient Descent Provably Optimizes Over-parameterized Neural Networks" @default.
- W2894604724 cites W2113517874 @default.
- W2894604724 cites W2399994860 @default.
- W2894604724 cites W2593709294 @default.
- W2894604724 cites W2608609325 @default.
- W2894604724 cites W2613481513 @default.
- W2894604724 cites W2614119628 @default.
- W2894604724 cites W2617691536 @default.
- W2894604724 cites W2758053331 @default.
- W2894604724 cites W2765428107 @default.
- W2894604724 cites W2766371994 @default.
- W2894604724 cites W2769458173 @default.
- W2894604724 cites W2788800397 @default.
- W2894604724 cites W2798818710 @default.
- W2894604724 cites W2804822090 @default.
- W2894604724 cites W2806265408 @default.
- W2894604724 cites W2949804919 @default.
- W2894604724 cites W2950220847 @default.
- W2894604724 cites W2951207584 @default.
- W2894604724 cites W2951934643 @default.
- W2894604724 cites W2952104325 @default.
- W2894604724 cites W2952318479 @default.
- W2894604724 cites W2952469083 @default.
- W2894604724 cites W2952574409 @default.
- W2894604724 cites W2962930448 @default.
- W2894604724 cites W2963383839 @default.
- W2894604724 cites W2963417959 @default.
- W2894604724 cites W2963446085 @default.
- W2894604724 cites W2964106499 @default.
- W2894604724 cites W813605148 @default.
- W2894604724 hasPublicationYear "2018" @default.
- W2894604724 type Work @default.
- W2894604724 sameAs 2894604724 @default.
- W2894604724 citedByCount "332" @default.
- W2894604724 countsByYear W28946047242018 @default.
- W2894604724 countsByYear W28946047242019 @default.
- W2894604724 countsByYear W28946047242020 @default.
- W2894604724 countsByYear W28946047242021 @default.
- W2894604724 countsByYear W28946047242022 @default.
- W2894604724 crossrefType "posted-content" @default.
- W2894604724 hasAuthorship W2894604724A5013695358 @default.
- W2894604724 hasAuthorship W2894604724A5033061754 @default.
- W2894604724 hasAuthorship W2894604724A5068907670 @default.
- W2894604724 hasAuthorship W2894604724A5077874981 @default.
- W2894604724 hasConcept C106159729 @default.
- W2894604724 hasConcept C112680207 @default.
- W2894604724 hasConcept C11413529 @default.
- W2894604724 hasConcept C114466953 @default.
- W2894604724 hasConcept C126255220 @default.
- W2894604724 hasConcept C127162648 @default.
- W2894604724 hasConcept C127413603 @default.
- W2894604724 hasConcept C14036430 @default.
- W2894604724 hasConcept C145446738 @default.
- W2894604724 hasConcept C153258448 @default.
- W2894604724 hasConcept C154945302 @default.
- W2894604724 hasConcept C162324750 @default.
- W2894604724 hasConcept C165464430 @default.
- W2894604724 hasConcept C199360897 @default.
- W2894604724 hasConcept C2524010 @default.
- W2894604724 hasConcept C2777303404 @default.
- W2894604724 hasConcept C28826006 @default.
- W2894604724 hasConcept C31258907 @default.
- W2894604724 hasConcept C33923547 @default.
- W2894604724 hasConcept C38365724 @default.
- W2894604724 hasConcept C41008148 @default.
- W2894604724 hasConcept C50522688 @default.
- W2894604724 hasConcept C50644808 @default.
- W2894604724 hasConcept C57869625 @default.
- W2894604724 hasConcept C62611344 @default.
- W2894604724 hasConcept C66938386 @default.
- W2894604724 hasConcept C72134830 @default.
- W2894604724 hasConcept C78458016 @default.
- W2894604724 hasConcept C86803240 @default.
- W2894604724 hasConceptScore W2894604724C106159729 @default.
- W2894604724 hasConceptScore W2894604724C112680207 @default.
- W2894604724 hasConceptScore W2894604724C11413529 @default.
- W2894604724 hasConceptScore W2894604724C114466953 @default.
- W2894604724 hasConceptScore W2894604724C126255220 @default.
- W2894604724 hasConceptScore W2894604724C127162648 @default.
- W2894604724 hasConceptScore W2894604724C127413603 @default.
- W2894604724 hasConceptScore W2894604724C14036430 @default.
- W2894604724 hasConceptScore W2894604724C145446738 @default.
- W2894604724 hasConceptScore W2894604724C153258448 @default.
- W2894604724 hasConceptScore W2894604724C154945302 @default.
- W2894604724 hasConceptScore W2894604724C162324750 @default.
- W2894604724 hasConceptScore W2894604724C165464430 @default.
- W2894604724 hasConceptScore W2894604724C199360897 @default.
- W2894604724 hasConceptScore W2894604724C2524010 @default.
- W2894604724 hasConceptScore W2894604724C2777303404 @default.
- W2894604724 hasConceptScore W2894604724C28826006 @default.
- W2894604724 hasConceptScore W2894604724C31258907 @default.
- W2894604724 hasConceptScore W2894604724C33923547 @default.
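The abstract recorded near the top of this listing describes a concrete, checkable setting: a randomly initialized two-layer ReLU network, trained on a quadratic loss by plain gradient descent, reaches near-zero training loss once the hidden width $m$ is large relative to the number of samples $n$. The self-contained NumPy sketch below illustrates that phenomenon. It mirrors the paper's setting only loosely (only the first-layer weights are trained; the random sign output weights stay fixed), and every dimension and the learning rate are illustrative assumptions rather than constants from the paper.

```python
# Sketch of the over-parameterization phenomenon from the abstract:
# gradient descent on a wide, randomly initialized two-layer ReLU net
# drives the quadratic training loss toward zero. All hyperparameters
# here are illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 5, 2000          # samples, input dim, hidden width (m >> n)
lr, steps = 0.2, 1000

# Data: unit-norm inputs (so no two inputs are parallel, almost surely),
# with arbitrary real-valued targets.
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.normal(size=n)

# Two-layer net f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x);
# W is trained, the output signs a are fixed at initialization.
W = rng.normal(size=(m, d))
a = rng.choice([-1.0, 1.0], size=m)

for t in range(steps):
    pre = X @ W.T                                  # (n, m) pre-activations
    out = (np.maximum(pre, 0.0) @ a) / np.sqrt(m)  # (n,) predictions
    err = out - y                                  # (n,) residuals
    if t % 200 == 0:
        print(f"step {t:4d}  loss {0.5 * np.sum(err ** 2):.6f}")
    # Gradient of L(W) = 0.5 * ||err||^2 with respect to W.
    grad = ((err[:, None] * (pre > 0.0) * a).T @ X) / np.sqrt(m)
    W -= lr * grad

out = (np.maximum(X @ W.T, 0.0) @ a) / np.sqrt(m)
print("final training loss:", 0.5 * np.sum((out - y) ** 2))
```

With the widths above ($m = 2000$ hidden units for $n = 20$ samples) the printed loss should become very small; shrinking $m$ toward $n$ typically slows or stalls convergence, which is the over-parameterization effect the abstract describes.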