Matches in SemOpenAlex for { <https://semopenalex.org/work/W3204185301> ?p ?o ?g. }
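For reference, a listing like this can be reproduced programmatically via the standard SPARQL HTTP protocol. Below is a minimal Python sketch; the endpoint URL (https://semopenalex.org/sparql) and the use of `GRAPH ?g` to surface the graph labels shown as "@default" here are assumptions, not details stated in this listing.

```python
import requests

# Assumed public SemOpenAlex SPARQL endpoint (not stated in this listing).
ENDPOINT = "https://semopenalex.org/sparql"

# Quad pattern from the header above: all (?p, ?o, ?g) for the work.
QUERY = """
SELECT ?p ?o ?g WHERE {
  GRAPH ?g { <https://semopenalex.org/work/W3204185301> ?p ?o . }
}
"""

resp = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
)
resp.raise_for_status()

# Standard SPARQL JSON results: one binding dict per row,
# mapping each variable name to {"type": ..., "value": ...}.
for row in resp.json()["results"]["bindings"]:
    graph = row.get("g", {}).get("value", "default")
    print(row["p"]["value"], row["o"]["value"], graph)
```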
- W3204185301 abstract "The training of sparse neural networks is becoming an increasingly important tool for reducing the computational footprint of models at training and evaluation, as well enabling the effective scaling up of models. Whereas much work over the years has been dedicated to specialised pruning techniques, little attention has been paid to the inherent effect of gradient based training on model sparsity. In this work, we introduce Powerpropagation, a new weight-parameterisation for neural networks that leads to inherently sparse models. Exploiting the behaviour of gradient descent, our method gives rise to weight updates exhibiting a rich get richer dynamic, leaving low-magnitude parameters largely unaffected by learning. Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely. Powerpropagation is general, intuitive, cheap and straight-forward to implement and can readily be combined with various other techniques. To highlight its versatility, we explore it in two very different settings: Firstly, following a recent line of work, we investigate its effect on sparse training for resource-constrained settings. Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark. Secondly, we advocate the use of sparsity in overcoming catastrophic forgetting, where compressed representations allow accommodating a large number of tasks at fixed model capacity. In all cases our reparameterisation considerably increases the efficacy of the off-the-shelf methods." @default.
- W3204185301 created "2021-10-11" @default.
- W3204185301 creator A5025338734 @default.
- W3204185301 creator A5043910056 @default.
- W3204185301 creator A5053701393 @default.
- W3204185301 creator A5064373793 @default.
- W3204185301 creator A5085990900 @default.
- W3204185301 date "2021-10-01" @default.
- W3204185301 modified "2023-09-27" @default.
- W3204185301 title "Powerpropagation: A sparsity inducing weight reparameterisation" @default.
- W3204185301 cites W133229983 @default.
- W3204185301 cites W1483365869 @default.
- W3204185301 cites W1522301498 @default.
- W3204185301 cites W1533861849 @default.
- W3204185301 cites W1677182931 @default.
- W3204185301 cites W1771410628 @default.
- W3204185301 cites W1836465849 @default.
- W3204185301 cites W2016384870 @default.
- W3204185301 cites W2060277733 @default.
- W3204185301 cites W2113839990 @default.
- W3204185301 cites W2114766824 @default.
- W3204185301 cites W2116522068 @default.
- W3204185301 cites W2117539524 @default.
- W3204185301 cites W2125389748 @default.
- W3204185301 cites W2134273960 @default.
- W3204185301 cites W2146502635 @default.
- W3204185301 cites W2147800946 @default.
- W3204185301 cites W2163605009 @default.
- W3204185301 cites W2164411961 @default.
- W3204185301 cites W2194775991 @default.
- W3204185301 cites W2293292859 @default.
- W3204185301 cites W2426267443 @default.
- W3204185301 cites W2473930607 @default.
- W3204185301 cites W2553665199 @default.
- W3204185301 cites W2560321925 @default.
- W3204185301 cites W2560647685 @default.
- W3204185301 cites W2582745083 @default.
- W3204185301 cites W2583761661 @default.
- W3204185301 cites W2606748186 @default.
- W3204185301 cites W2609716011 @default.
- W3204185301 cites W2618767506 @default.
- W3204185301 cites W2622263826 @default.
- W3204185301 cites W2765101016 @default.
- W3204185301 cites W2766447205 @default.
- W3204185301 cites W2766736793 @default.
- W3204185301 cites W2768501777 @default.
- W3204185301 cites W2771655537 @default.
- W3204185301 cites W2805003733 @default.
- W3204185301 cites W28170644 @default.
- W3204185301 cites W2894094671 @default.
- W3204185301 cites W2894740066 @default.
- W3204185301 cites W2899063268 @default.
- W3204185301 cites W2899476926 @default.
- W3204185301 cites W2902456977 @default.
- W3204185301 cites W2907886210 @default.
- W3204185301 cites W2912515466 @default.
- W3204185301 cites W2914484425 @default.
- W3204185301 cites W2915589364 @default.
- W3204185301 cites W2919347051 @default.
- W3204185301 cites W2924791586 @default.
- W3204185301 cites W2939137134 @default.
- W3204185301 cites W2948635472 @default.
- W3204185301 cites W2949268663 @default.
- W3204185301 cites W2949608212 @default.
- W3204185301 cites W2949615363 @default.
- W3204185301 cites W2951004968 @default.
- W3204185301 cites W2952677972 @default.
- W3204185301 cites W2953488952 @default.
- W3204185301 cites W2956434358 @default.
- W3204185301 cites W2962724315 @default.
- W3204185301 cites W2963003887 @default.
- W3204185301 cites W2963055445 @default.
- W3204185301 cites W2963072899 @default.
- W3204185301 cites W2963207607 @default.
- W3204185301 cites W2963518130 @default.
- W3204185301 cites W2963588172 @default.
- W3204185301 cites W2963813679 @default.
- W3204185301 cites W2963850662 @default.
- W3204185301 cites W2963857170 @default.
- W3204185301 cites W2964031251 @default.
- W3204185301 cites W2964307104 @default.
- W3204185301 cites W2967821093 @default.
- W3204185301 cites W2970170116 @default.
- W3204185301 cites W2981344907 @default.
- W3204185301 cites W2994081359 @default.
- W3204185301 cites W2995892679 @default.
- W3204185301 cites W3017258201 @default.
- W3204185301 cites W3030163527 @default.
- W3204185301 cites W3035180000 @default.
- W3204185301 cites W3035304835 @default.
- W3204185301 cites W3037853434 @default.
- W3204185301 cites W3042228527 @default.
- W3204185301 cites W3098372854 @default.
- W3204185301 cites W3101388328 @default.
- W3204185301 cites W3118608800 @default.
- W3204185301 cites W3138136049 @default.
- W3204185301 cites W3165631200 @default.
- W3204185301 hasPublicationYear "2021" @default.
- W3204185301 type Work @default.
- W3204185301 sameAs 3204185301 @default.
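The abstract above describes the reparameterisation only in words. As a concrete illustration, here is a minimal PyTorch sketch of the idea, with the effective weight w = theta * |theta|^(alpha - 1) for alpha >= 1. The layer name `PowerpropLinear`, the default alpha = 2, and the initialisation scheme are illustrative assumptions, not details taken from this listing.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class PowerpropLinear(nn.Module):
    """Linear layer with a Powerpropagation-style reparameterisation:
    the effective weight is w = theta * |theta|**(alpha - 1), so the
    gradient on theta carries an extra |theta|**(alpha - 1) factor and
    low-magnitude parameters are left largely unaffected by learning
    (the rich-get-richer dynamic described in the abstract)."""

    def __init__(self, in_features: int, out_features: int, alpha: float = 2.0):
        super().__init__()
        self.alpha = alpha
        # Initialise so the *effective* weights match a standard linear
        # init: push a Kaiming-initialised tensor through the inverse
        # map theta = sign(w) * |w|**(1 / alpha).
        w0 = torch.empty(out_features, in_features)
        nn.init.kaiming_uniform_(w0, a=math.sqrt(5))
        self.theta = nn.Parameter(w0.sign() * w0.abs().pow(1.0 / alpha))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def effective_weight(self) -> torch.Tensor:
        # alpha = 1 recovers a plain linear layer; alpha > 1 induces sparsity.
        return self.theta * self.theta.abs().pow(self.alpha - 1.0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.effective_weight(), self.bias)
```

Since dw/dtheta = alpha * |theta|^(alpha - 1), parameters near zero receive vanishing updates while large ones keep learning, which concentrates the effective-weight distribution at zero; after training, weights can then be pruned by the magnitude of `effective_weight()` with little accuracy loss, as the abstract claims.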