Matches in SemOpenAlex for { <https://semopenalex.org/work/W4281687132> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4281687132 abstract "The problem of optimization on Stiefel manifold, i.e., minimizing functions of (not necessarily square) matrices that satisfy orthogonality constraints, has been extensively studied. Yet, a new approach is proposed based on, for the first time, an interplay between thoughtfully designed continuous and discrete dynamics. It leads to a gradient-based optimizer with intrinsically added momentum. This method exactly preserves the manifold structure but does not require additional operation to keep momentum in the changing (co)tangent space, and thus has low computational cost and pleasant accuracy. Its generalization to adaptive learning rates is also demonstrated. Notable performances are observed in practical tasks. For instance, we found that placing orthogonal constraints on attention heads of trained-from-scratch Vision Transformer [Dosovitskiy et al. 2022] could markedly improve its performance, when our optimizer is used, and it is better that each head is made orthogonal within itself but not necessarily to other heads. This optimizer also makes the useful notion of Projection Robust Wasserstein Distance [Paty & Cuturi 2019; Lin et al. 2020] for high-dim. optimal transport even more effective." @default.
- W4281687132 created "2022-06-13" @default.
- W4281687132 creator A5073666656 @default.
- W4281687132 creator A5090499084 @default.
- W4281687132 creator A5090655358 @default.
- W4281687132 date "2022-05-27" @default.
- W4281687132 modified "2023-09-24" @default.
- W4281687132 title "Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport" @default.
- W4281687132 doi "https://doi.org/10.48550/arxiv.2205.14173" @default.
- W4281687132 hasPublicationYear "2022" @default.
- W4281687132 type Work @default.
- W4281687132 citedByCount "0" @default.
- W4281687132 crossrefType "posted-content" @default.
- W4281687132 hasAuthorship W4281687132A5073666656 @default.
- W4281687132 hasAuthorship W4281687132A5090499084 @default.
- W4281687132 hasAuthorship W4281687132A5090655358 @default.
- W4281687132 hasBestOaLocation W42816871321 @default.
- W4281687132 hasConcept C11413529 @default.
- W4281687132 hasConcept C126255220 @default.
- W4281687132 hasConcept C127413603 @default.
- W4281687132 hasConcept C134306372 @default.
- W4281687132 hasConcept C138187205 @default.
- W4281687132 hasConcept C157157409 @default.
- W4281687132 hasConcept C17137986 @default.
- W4281687132 hasConcept C177148314 @default.
- W4281687132 hasConcept C202444582 @default.
- W4281687132 hasConcept C2524010 @default.
- W4281687132 hasConcept C33923547 @default.
- W4281687132 hasConcept C41008148 @default.
- W4281687132 hasConcept C529865628 @default.
- W4281687132 hasConcept C612670 @default.
- W4281687132 hasConcept C78519656 @default.
- W4281687132 hasConceptScore W4281687132C11413529 @default.
- W4281687132 hasConceptScore W4281687132C126255220 @default.
- W4281687132 hasConceptScore W4281687132C127413603 @default.
- W4281687132 hasConceptScore W4281687132C134306372 @default.
- W4281687132 hasConceptScore W4281687132C138187205 @default.
- W4281687132 hasConceptScore W4281687132C157157409 @default.
- W4281687132 hasConceptScore W4281687132C17137986 @default.
- W4281687132 hasConceptScore W4281687132C177148314 @default.
- W4281687132 hasConceptScore W4281687132C202444582 @default.
- W4281687132 hasConceptScore W4281687132C2524010 @default.
- W4281687132 hasConceptScore W4281687132C33923547 @default.
- W4281687132 hasConceptScore W4281687132C41008148 @default.
- W4281687132 hasConceptScore W4281687132C529865628 @default.
- W4281687132 hasConceptScore W4281687132C612670 @default.
- W4281687132 hasConceptScore W4281687132C78519656 @default.
- W4281687132 hasLocation W42816871321 @default.
- W4281687132 hasOpenAccess W4281687132 @default.
- W4281687132 hasPrimaryLocation W42816871321 @default.
- W4281687132 hasRelatedWork W1516714362 @default.
- W4281687132 hasRelatedWork W2118338279 @default.
- W4281687132 hasRelatedWork W2126146080 @default.
- W4281687132 hasRelatedWork W2141523690 @default.
- W4281687132 hasRelatedWork W3108034558 @default.
- W4281687132 hasRelatedWork W3163047668 @default.
- W4281687132 hasRelatedWork W3198590018 @default.
- W4281687132 hasRelatedWork W4225279846 @default.
- W4281687132 hasRelatedWork W4280643089 @default.
- W4281687132 hasRelatedWork W4301042573 @default.
- W4281687132 isParatext "false" @default.
- W4281687132 isRetracted "false" @default.
- W4281687132 workType "article" @default.