Matches in SemOpenAlex for { <https://semopenalex.org/work/W2952126211> ?p ?o ?g. }
- W2952126211 abstract "In this paper, we study the implicit regularization of the gradient descent algorithm in homogeneous neural networks, including fully-connected and convolutional neural networks with ReLU or LeakyReLU activations. In particular, we study the gradient descent or gradient flow (i.e., gradient descent with infinitesimal step size) optimizing the logistic loss or cross-entropy loss of any homogeneous model (possibly non-smooth), and show that if the training loss decreases below a certain threshold, then we can define a smoothed version of the normalized margin which increases over time. We also formulate a natural constrained optimization problem related to margin maximization, and prove that both the normalized margin and its smoothed version converge to the objective value at a KKT point of the optimization problem. Our results generalize the previous results for logistic regression with one-layer or multi-layer linear networks, and provide more quantitative convergence results with weaker assumptions than previous results for homogeneous smooth neural networks. We conduct several experiments to justify our theoretical finding on MNIST and CIFAR-10 datasets. Finally, as margin is closely related to robustness, we discuss potential benefits of training longer for improving the robustness of the model." @default.
- W2952126211 created "2019-06-27" @default.
- W2952126211 creator A5030891201 @default.
- W2952126211 creator A5052883326 @default.
- W2952126211 date "2019-06-13" @default.
- W2952126211 modified "2023-09-23" @default.
- W2952126211 title "Gradient Descent Maximizes the Margin of Homogeneous Neural Networks" @default.
- W2952126211 cites W1551360398 @default.
- W2952126211 cites W1677182931 @default.
- W2952126211 cites W1918179283 @default.
- W2952126211 cites W1926592634 @default.
- W2952126211 cites W1975846642 @default.
- W2952126211 cites W1996195522 @default.
- W2952126211 cites W2000985550 @default.
- W2952126211 cites W2017413125 @default.
- W2952126211 cites W2041103407 @default.
- W2952126211 cites W2069936846 @default.
- W2952126211 cites W2077612096 @default.
- W2952126211 cites W2079484126 @default.
- W2952126211 cites W2155858901 @default.
- W2952126211 cites W2168885649 @default.
- W2952126211 cites W2291661959 @default.
- W2952126211 cites W2529714286 @default.
- W2952126211 cites W2567576169 @default.
- W2952126211 cites W26023427 @default.
- W2952126211 cites W2790253170 @default.
- W2952126211 cites W2803636134 @default.
- W2952126211 cites W2806252860 @default.
- W2952126211 cites W2808157231 @default.
- W2952126211 cites W2809090039 @default.
- W2952126211 cites W2889575872 @default.
- W2952126211 cites W2896721680 @default.
- W2952126211 cites W2896834587 @default.
- W2952126211 cites W2899748887 @default.
- W2952126211 cites W2900959181 @default.
- W2952126211 cites W2911742574 @default.
- W2952126211 cites W2922277331 @default.
- W2952126211 cites W2940107683 @default.
- W2952126211 cites W2949247311 @default.
- W2952126211 cites W2949559870 @default.
- W2952126211 cites W2955590385 @default.
- W2952126211 cites W2962698540 @default.
- W2952126211 cites W2962761235 @default.
- W2952126211 cites W2962767131 @default.
- W2952126211 cites W2962851953 @default.
- W2952126211 cites W2962930448 @default.
- W2952126211 cites W2963143631 @default.
- W2952126211 cites W2963208657 @default.
- W2952126211 cites W2963285844 @default.
- W2952126211 cites W2963664410 @default.
- W2952126211 cites W2963695615 @default.
- W2952126211 cites W2963798163 @default.
- W2952126211 cites W2963826371 @default.
- W2952126211 cites W2963837241 @default.
- W2952126211 cites W2963857521 @default.
- W2952126211 cites W2964031251 @default.
- W2952126211 cites W2964072686 @default.
- W2952126211 cites W2964084001 @default.
- W2952126211 cites W2964153729 @default.
- W2952126211 cites W2964210434 @default.
- W2952126211 cites W2964220724 @default.
- W2952126211 cites W2964294232 @default.
- W2952126211 cites W2965772785 @default.
- W2952126211 cites W2970166047 @default.
- W2952126211 cites W2970176397 @default.
- W2952126211 cites W2970259623 @default.
- W2952126211 cites W2971043187 @default.
- W2952126211 cites W2996578286 @default.
- W2952126211 cites W3096064157 @default.
- W2952126211 cites W3105498960 @default.
- W2952126211 cites W3119586787 @default.
- W2952126211 cites W3137695714 @default.
- W2952126211 cites W577198184 @default.
- W2952126211 cites W9657784 @default.
- W2952126211 doi "https://doi.org/10.48550/arxiv.1906.05890" @default.
- W2952126211 hasPublicationYear "2019" @default.
- W2952126211 type Work @default.
- W2952126211 sameAs 2952126211 @default.
- W2952126211 citedByCount "29" @default.
- W2952126211 countsByYear W29521262112019 @default.
- W2952126211 countsByYear W29521262112020 @default.
- W2952126211 countsByYear W29521262112021 @default.
- W2952126211 crossrefType "posted-content" @default.
- W2952126211 hasAuthorship W2952126211A5030891201 @default.
- W2952126211 hasAuthorship W2952126211A5052883326 @default.
- W2952126211 hasBestOaLocation W29521262111 @default.
- W2952126211 hasConcept C104317684 @default.
- W2952126211 hasConcept C119857082 @default.
- W2952126211 hasConcept C126255220 @default.
- W2952126211 hasConcept C153258448 @default.
- W2952126211 hasConcept C154945302 @default.
- W2952126211 hasConcept C185592680 @default.
- W2952126211 hasConcept C2776135515 @default.
- W2952126211 hasConcept C2776330181 @default.
- W2952126211 hasConcept C28826006 @default.
- W2952126211 hasConcept C33923547 @default.
- W2952126211 hasConcept C41008148 @default.
- W2952126211 hasConcept C50644808 @default.
- W2952126211 hasConcept C55493867 @default.
- W2952126211 hasConcept C63479239 @default.