Matches in SemOpenAlex for { <https://semopenalex.org/work/W4328120229> ?p ?o ?g. }
Showing items 1 to 72 of 72 with 100 items per page.
- W4328120229 endingPage "3961" @default.
- W4328120229 startingPage "3961" @default.
- W4328120229 abstract "Previous literature on deep learning theory has focused on implicit bias with small learning rates. In this work, we explore the impact of data separability on the implicit bias of deep learning algorithms under the large learning rate. Using deep linear networks for binary classification with the logistic loss under the large learning rate regime, we characterize the implicit bias effect with data separability on training dynamics. From a data analytics perspective, we claim that depending on the separation conditions of data, the gradient descent iterates will converge to a flatter minimum in the large learning rate phase, which results in improved generalization. Our theory is rigorously proven under the assumption of degenerate data by overcoming the difficulty of the non-constant Hessian of logistic loss and confirmed by experiments on both experimental and non-degenerated datasets. Our results highlight the importance of data separability in training dynamics and the benefits of learning rate annealing schemes using an initial large learning rate." @default.
- W4328120229 created "2023-03-22" @default.
- W4328120229 creator A5005288941 @default.
- W4328120229 creator A5073709711 @default.
- W4328120229 creator A5076730801 @default.
- W4328120229 date "2023-03-20" @default.
- W4328120229 modified "2023-09-26" @default.
- W4328120229 title "Implicit Bias of Deep Learning in the Large Learning Rate Phase: A Data Separability Perspective" @default.
- W4328120229 cites W2125877832 @default.
- W4328120229 cites W2734358244 @default.
- W4328120229 cites W2964137095 @default.
- W4328120229 cites W2964155802 @default.
- W4328120229 cites W2964292098 @default.
- W4328120229 cites W2970217468 @default.
- W4328120229 cites W2973136425 @default.
- W4328120229 cites W2981407587 @default.
- W4328120229 cites W3132451580 @default.
- W4328120229 cites W3189987536 @default.
- W4328120229 cites W4230471307 @default.
- W4328120229 cites W4312856860 @default.
- W4328120229 doi "https://doi.org/10.3390/app13063961" @default.
- W4328120229 hasPublicationYear "2023" @default.
- W4328120229 type Work @default.
- W4328120229 citedByCount "1" @default.
- W4328120229 countsByYear W43281202292023 @default.
- W4328120229 crossrefType "journal-article" @default.
- W4328120229 hasAuthorship W4328120229A5005288941 @default.
- W4328120229 hasAuthorship W4328120229A5073709711 @default.
- W4328120229 hasAuthorship W4328120229A5076730801 @default.
- W4328120229 hasBestOaLocation W43281202291 @default.
- W4328120229 hasConcept C108583219 @default.
- W4328120229 hasConcept C119857082 @default.
- W4328120229 hasConcept C12713177 @default.
- W4328120229 hasConcept C134306372 @default.
- W4328120229 hasConcept C149782125 @default.
- W4328120229 hasConcept C154945302 @default.
- W4328120229 hasConcept C177148314 @default.
- W4328120229 hasConcept C203616005 @default.
- W4328120229 hasConcept C28826006 @default.
- W4328120229 hasConcept C33923547 @default.
- W4328120229 hasConcept C41008148 @default.
- W4328120229 hasConceptScore W4328120229C108583219 @default.
- W4328120229 hasConceptScore W4328120229C119857082 @default.
- W4328120229 hasConceptScore W4328120229C12713177 @default.
- W4328120229 hasConceptScore W4328120229C134306372 @default.
- W4328120229 hasConceptScore W4328120229C149782125 @default.
- W4328120229 hasConceptScore W4328120229C154945302 @default.
- W4328120229 hasConceptScore W4328120229C177148314 @default.
- W4328120229 hasConceptScore W4328120229C203616005 @default.
- W4328120229 hasConceptScore W4328120229C28826006 @default.
- W4328120229 hasConceptScore W4328120229C33923547 @default.
- W4328120229 hasConceptScore W4328120229C41008148 @default.
- W4328120229 hasIssue "6" @default.
- W4328120229 hasLocation W43281202291 @default.
- W4328120229 hasOpenAccess W4328120229 @default.
- W4328120229 hasPrimaryLocation W43281202291 @default.
- W4328120229 hasRelatedWork W2795261237 @default.
- W4328120229 hasRelatedWork W3014300295 @default.
- W4328120229 hasRelatedWork W3164822677 @default.
- W4328120229 hasRelatedWork W4223943233 @default.
- W4328120229 hasRelatedWork W4225161397 @default.
- W4328120229 hasRelatedWork W4312200629 @default.
- W4328120229 hasRelatedWork W4360585206 @default.
- W4328120229 hasRelatedWork W4364306694 @default.
- W4328120229 hasRelatedWork W4380075502 @default.
- W4328120229 hasRelatedWork W4380086463 @default.
- W4328120229 hasVolume "13" @default.
- W4328120229 isParatext "false" @default.
- W4328120229 isRetracted "false" @default.
- W4328120229 workType "article" @default.
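
The listing above follows a regular shape: a subject, a predicate, an object (quoted when it is a literal) and a graph name. As a minimal sketch, assuming that shape holds for every line, the entries could be read into tuples as follows. The names `LINE_RE`, `parse_line` and `group_by_predicate` are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex or any library.

```python
import re

# Assumed line shape: "- SUBJECT PREDICATE OBJECT @GRAPH."
# The greedy object group makes the match use the last " @" on the line.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subj, pred, obj, graph = m.groups()
    # Quoted objects are treated as literals; drop the surrounding quotes.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subj, pred, obj, graph


def group_by_predicate(lines):
    """Collect objects under their predicate, skipping lines that do not parse."""
    grouped = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, pred, obj, _ = parsed
        grouped.setdefault(pred, []).append(obj)
    return grouped


# A few lines copied from the listing above.
sample = [
    '- W4328120229 hasVolume "13" @default.',
    '- W4328120229 hasIssue "6" @default.',
    '- W4328120229 cites W2125877832 @default.',
    '- W4328120229 cites W2734358244 @default.',
]

if __name__ == "__main__":
    for predicate, objects in group_by_predicate(sample).items():
        print(predicate, objects)
```

Grouping by predicate keeps multi-valued properties such as `cites` or `hasRelatedWork` together, while single-valued ones such as `hasVolume` end up as one-element lists.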