Matches in SemOpenAlex for { <https://semopenalex.org/work/W4366342854> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4366342854 abstract "Hypernetworks, neural networks that predict the parameters of another neural network, are powerful models that have been successfully used in diverse applications from image generation to multi-task learning. Unfortunately, existing hypernetworks are often challenging to train. Training typically converges far more slowly than for non-hypernetwork models, and the rate of convergence can be very sensitive to hyperparameter choices. In this work, we identify a fundamental and previously unidentified problem that contributes to the challenge of training hypernetworks: a magnitude proportionality between the inputs and outputs of the hypernetwork. We demonstrate both analytically and empirically that this can lead to unstable optimization, thereby slowing down convergence, and sometimes even preventing any learning. We present a simple solution to this problem using a revised hypernetwork formulation that we call Magnitude Invariant Parametrizations (MIP). We demonstrate the proposed solution on several hypernetwork tasks, where it consistently stabilizes training and achieves faster convergence. Furthermore, we perform a comprehensive ablation study including choices of activation function, normalization strategies, input dimensionality, and hypernetwork architecture; and find that MIP improves training in all scenarios. We provide easy-to-use code that can turn existing networks into MIP-based hypernetworks." @default.
- W4366342854 created "2023-04-20" @default.
- W4366342854 creator A5007282049 @default.
- W4366342854 creator A5053581915 @default.
- W4366342854 creator A5091409910 @default.
- W4366342854 date "2023-04-15" @default.
- W4366342854 modified "2023-09-28" @default.
- W4366342854 title "Magnitude Invariant Parametrizations Improve Hypernetwork Learning" @default.
- W4366342854 doi "https://doi.org/10.48550/arxiv.2304.07645" @default.
- W4366342854 hasPublicationYear "2023" @default.
- W4366342854 type Work @default.
- W4366342854 citedByCount "0" @default.
- W4366342854 crossrefType "posted-content" @default.
- W4366342854 hasAuthorship W4366342854A5007282049 @default.
- W4366342854 hasAuthorship W4366342854A5053581915 @default.
- W4366342854 hasAuthorship W4366342854A5091409910 @default.
- W4366342854 hasBestOaLocation W43663428541 @default.
- W4366342854 hasConcept C111030470 @default.
- W4366342854 hasConcept C119857082 @default.
- W4366342854 hasConcept C136886441 @default.
- W4366342854 hasConcept C144024400 @default.
- W4366342854 hasConcept C154945302 @default.
- W4366342854 hasConcept C162324750 @default.
- W4366342854 hasConcept C190470478 @default.
- W4366342854 hasConcept C19165224 @default.
- W4366342854 hasConcept C2777303404 @default.
- W4366342854 hasConcept C33923547 @default.
- W4366342854 hasConcept C37914503 @default.
- W4366342854 hasConcept C41008148 @default.
- W4366342854 hasConcept C50522688 @default.
- W4366342854 hasConcept C50644808 @default.
- W4366342854 hasConcept C8642999 @default.
- W4366342854 hasConceptScore W4366342854C111030470 @default.
- W4366342854 hasConceptScore W4366342854C119857082 @default.
- W4366342854 hasConceptScore W4366342854C136886441 @default.
- W4366342854 hasConceptScore W4366342854C144024400 @default.
- W4366342854 hasConceptScore W4366342854C154945302 @default.
- W4366342854 hasConceptScore W4366342854C162324750 @default.
- W4366342854 hasConceptScore W4366342854C190470478 @default.
- W4366342854 hasConceptScore W4366342854C19165224 @default.
- W4366342854 hasConceptScore W4366342854C2777303404 @default.
- W4366342854 hasConceptScore W4366342854C33923547 @default.
- W4366342854 hasConceptScore W4366342854C37914503 @default.
- W4366342854 hasConceptScore W4366342854C41008148 @default.
- W4366342854 hasConceptScore W4366342854C50522688 @default.
- W4366342854 hasConceptScore W4366342854C50644808 @default.
- W4366342854 hasConceptScore W4366342854C8642999 @default.
- W4366342854 hasLocation W43663428541 @default.
- W4366342854 hasOpenAccess W4366342854 @default.
- W4366342854 hasPrimaryLocation W43663428541 @default.
- W4366342854 hasRelatedWork W2533072256 @default.
- W4366342854 hasRelatedWork W3199608561 @default.
- W4366342854 hasRelatedWork W4210794429 @default.
- W4366342854 hasRelatedWork W4223456145 @default.
- W4366342854 hasRelatedWork W4280535922 @default.
- W4366342854 hasRelatedWork W4283697347 @default.
- W4366342854 hasRelatedWork W4295309597 @default.
- W4366342854 hasRelatedWork W4309113015 @default.
- W4366342854 hasRelatedWork W4313854490 @default.
- W4366342854 hasRelatedWork W1629725936 @default.
- W4366342854 isParatext "false" @default.
- W4366342854 isRetracted "false" @default.
- W4366342854 workType "article" @default.