Matches in SemOpenAlex for { <https://semopenalex.org/work/W3167374089> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W3167374089 endingPage "2466" @default.
- W3167374089 startingPage "2456" @default.
- W3167374089 abstract "Attention based neural networks are state of the art in a large range of applications. However, their performance tends to degrade when the number of layers increases. In this work, we show that enforcing Lipschitz continuity by normalizing the attention scores can significantly improve the performance of deep attention models. First, we show that, for deep graph attention networks (GAT), gradient explosion appears during training, leading to poor performance of gradient-based training algorithms. To address this issue, we derive a theoretical analysis of the Lipschitz continuity of attention modules and introduce LipschitzNorm, a simple and parameter-free normalization for self-attention mechanisms that enforces the model to be Lipschitz continuous. We then apply LipschitzNorm to GAT and Graph Transformers and show that their performance is substantially improved in the deep setting (10 to 30 layers). More specifically, we show that a deep GAT model with LipschitzNorm achieves state of the art results for node label prediction tasks that exhibit long-range dependencies, while showing consistent improvements over their unnormalized counterparts in benchmark node classification tasks." @default.
- W3167374089 created "2021-06-22" @default.
- W3167374089 creator A5056747950 @default.
- W3167374089 creator A5060660251 @default.
- W3167374089 creator A5089687907 @default.
- W3167374089 date "2021-07-18" @default.
- W3167374089 modified "2023-09-26" @default.
- W3167374089 title "Lipschitz normalization for self-attention layers with application to graph neural networks" @default.
- W3167374089 hasPublicationYear "2021" @default.
- W3167374089 type Work @default.
- W3167374089 sameAs 3167374089 @default.
- W3167374089 citedByCount "0" @default.
- W3167374089 crossrefType "proceedings-article" @default.
- W3167374089 hasAuthorship W3167374089A5056747950 @default.
- W3167374089 hasAuthorship W3167374089A5060660251 @default.
- W3167374089 hasAuthorship W3167374089A5089687907 @default.
- W3167374089 hasConcept C11413529 @default.
- W3167374089 hasConcept C132525143 @default.
- W3167374089 hasConcept C13280743 @default.
- W3167374089 hasConcept C134306372 @default.
- W3167374089 hasConcept C136886441 @default.
- W3167374089 hasConcept C144024400 @default.
- W3167374089 hasConcept C154945302 @default.
- W3167374089 hasConcept C185798385 @default.
- W3167374089 hasConcept C19165224 @default.
- W3167374089 hasConcept C205649164 @default.
- W3167374089 hasConcept C22324862 @default.
- W3167374089 hasConcept C2984842247 @default.
- W3167374089 hasConcept C33923547 @default.
- W3167374089 hasConcept C41008148 @default.
- W3167374089 hasConcept C50644808 @default.
- W3167374089 hasConcept C80444323 @default.
- W3167374089 hasConceptScore W3167374089C11413529 @default.
- W3167374089 hasConceptScore W3167374089C132525143 @default.
- W3167374089 hasConceptScore W3167374089C13280743 @default.
- W3167374089 hasConceptScore W3167374089C134306372 @default.
- W3167374089 hasConceptScore W3167374089C136886441 @default.
- W3167374089 hasConceptScore W3167374089C144024400 @default.
- W3167374089 hasConceptScore W3167374089C154945302 @default.
- W3167374089 hasConceptScore W3167374089C185798385 @default.
- W3167374089 hasConceptScore W3167374089C19165224 @default.
- W3167374089 hasConceptScore W3167374089C205649164 @default.
- W3167374089 hasConceptScore W3167374089C22324862 @default.
- W3167374089 hasConceptScore W3167374089C2984842247 @default.
- W3167374089 hasConceptScore W3167374089C33923547 @default.
- W3167374089 hasConceptScore W3167374089C41008148 @default.
- W3167374089 hasConceptScore W3167374089C50644808 @default.
- W3167374089 hasConceptScore W3167374089C80444323 @default.
- W3167374089 hasOpenAccess W3167374089 @default.
- W3167374089 hasRelatedWork W2599546125 @default.
- W3167374089 hasRelatedWork W2785780125 @default.
- W3167374089 hasRelatedWork W2786622092 @default.
- W3167374089 hasRelatedWork W2904838594 @default.
- W3167374089 hasRelatedWork W2927655437 @default.
- W3167374089 hasRelatedWork W2945979290 @default.
- W3167374089 hasRelatedWork W2962851953 @default.
- W3167374089 hasRelatedWork W2963564348 @default.
- W3167374089 hasRelatedWork W2971069898 @default.
- W3167374089 hasRelatedWork W2978073653 @default.
- W3167374089 hasRelatedWork W2997072009 @default.
- W3167374089 hasRelatedWork W3035527820 @default.
- W3167374089 hasRelatedWork W3127805089 @default.
- W3167374089 hasRelatedWork W3134922363 @default.
- W3167374089 hasRelatedWork W3154580345 @default.
- W3167374089 hasRelatedWork W3154883250 @default.
- W3167374089 hasRelatedWork W3187298454 @default.
- W3167374089 hasRelatedWork W3198742493 @default.
- W3167374089 hasRelatedWork W3214116688 @default.
- W3167374089 hasRelatedWork W3108940806 @default.
- W3167374089 isParatext "false" @default.
- W3167374089 isRetracted "false" @default.
- W3167374089 magId "3167374089" @default.
- W3167374089 workType "article" @default.