Matches in SemOpenAlex for { <https://semopenalex.org/work/W4295678827> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W4295678827 endingPage "25" @default.
- W4295678827 startingPage "1" @default.
- W4295678827 abstract "Intensive communication and synchronization cost for gradients and parameters is the well-known bottleneck of distributed deep learning training. Based on the observations that Synchronous SGD (SSGD) obtains good convergence accuracy while asynchronous SGD (ASGD) delivers a faster raw training speed, we propose Several Steps Delay SGD (SSD-SGD) to combine their merits, aiming at tackling the communication bottleneck via communication sparsification. SSD-SGD explores both global synchronous updates in the parameter servers and asynchronous local updates in the workers in each periodic iteration. The periodic and flexible synchronization makes SSD-SGD achieve good convergence accuracy and fast training speed. To the best of our knowledge, we strike the new balance between synchronization quality and communication sparsification, and improve the tradeoff between accuracy and training speed. Specifically, the core components of SSD-SGD include proper warm-up stage, steps delay stage, and the novel algorithm of global gradient for local update (GLU). GLU is critical for local update operations by using global gradient information to effectively compensate for the delayed local weights. Furthermore, we implement SSD-SGD on MXNet framework and comprehensively evaluate its performance with CIFAR-10 and ImageNet datasets. Experimental results show that SSD-SGD can accelerate distributed training speed under different experimental configurations, by up to 110% (or 2.1× of the original speed), while achieving good convergence accuracy." @default.
- W4295678827 created "2022-09-14" @default.
- W4295678827 creator A5007384264 @default.
- W4295678827 creator A5037677450 @default.
- W4295678827 creator A5042544900 @default.
- W4295678827 date "2022-12-16" @default.
- W4295678827 modified "2023-09-30" @default.
- W4295678827 title "SSD-SGD: Communication Sparsification for Distributed Deep Learning Training" @default.
- W4295678827 cites W2108598243 @default.
- W4295678827 cites W2194775991 @default.
- W4295678827 cites W2579247884 @default.
- W4295678827 cites W2580688187 @default.
- W4295678827 cites W2794670651 @default.
- W4295678827 cites W2942889782 @default.
- W4295678827 cites W2949161920 @default.
- W4295678827 cites W2963917928 @default.
- W4295678827 cites W2965658867 @default.
- W4295678827 cites W2969388332 @default.
- W4295678827 cites W2975712713 @default.
- W4295678827 cites W2982664135 @default.
- W4295678827 cites W2994144272 @default.
- W4295678827 cites W2997300524 @default.
- W4295678827 cites W3016395792 @default.
- W4295678827 cites W3037871107 @default.
- W4295678827 cites W3038104246 @default.
- W4295678827 cites W3043522163 @default.
- W4295678827 cites W3047002910 @default.
- W4295678827 cites W3090287616 @default.
- W4295678827 cites W3102816259 @default.
- W4295678827 doi "https://doi.org/10.1145/3563038" @default.
- W4295678827 hasPublicationYear "2022" @default.
- W4295678827 type Work @default.
- W4295678827 citedByCount "0" @default.
- W4295678827 crossrefType "journal-article" @default.
- W4295678827 hasAuthorship W4295678827A5007384264 @default.
- W4295678827 hasAuthorship W4295678827A5037677450 @default.
- W4295678827 hasAuthorship W4295678827A5042544900 @default.
- W4295678827 hasBestOaLocation W42956788271 @default.
- W4295678827 hasConcept C120314980 @default.
- W4295678827 hasConcept C127162648 @default.
- W4295678827 hasConcept C149635348 @default.
- W4295678827 hasConcept C151319957 @default.
- W4295678827 hasConcept C154945302 @default.
- W4295678827 hasConcept C162324750 @default.
- W4295678827 hasConcept C173608175 @default.
- W4295678827 hasConcept C2777303404 @default.
- W4295678827 hasConcept C2778562939 @default.
- W4295678827 hasConcept C2780513914 @default.
- W4295678827 hasConcept C31258907 @default.
- W4295678827 hasConcept C41008148 @default.
- W4295678827 hasConcept C50522688 @default.
- W4295678827 hasConcept C68339613 @default.
- W4295678827 hasConceptScore W4295678827C120314980 @default.
- W4295678827 hasConceptScore W4295678827C127162648 @default.
- W4295678827 hasConceptScore W4295678827C149635348 @default.
- W4295678827 hasConceptScore W4295678827C151319957 @default.
- W4295678827 hasConceptScore W4295678827C154945302 @default.
- W4295678827 hasConceptScore W4295678827C162324750 @default.
- W4295678827 hasConceptScore W4295678827C173608175 @default.
- W4295678827 hasConceptScore W4295678827C2777303404 @default.
- W4295678827 hasConceptScore W4295678827C2778562939 @default.
- W4295678827 hasConceptScore W4295678827C2780513914 @default.
- W4295678827 hasConceptScore W4295678827C31258907 @default.
- W4295678827 hasConceptScore W4295678827C41008148 @default.
- W4295678827 hasConceptScore W4295678827C50522688 @default.
- W4295678827 hasConceptScore W4295678827C68339613 @default.
- W4295678827 hasIssue "1" @default.
- W4295678827 hasLocation W42956788271 @default.
- W4295678827 hasOpenAccess W4295678827 @default.
- W4295678827 hasPrimaryLocation W42956788271 @default.
- W4295678827 hasRelatedWork W2125439111 @default.
- W4295678827 hasRelatedWork W2131630752 @default.
- W4295678827 hasRelatedWork W2133123956 @default.
- W4295678827 hasRelatedWork W2364921833 @default.
- W4295678827 hasRelatedWork W2891987081 @default.
- W4295678827 hasRelatedWork W3011751313 @default.
- W4295678827 hasRelatedWork W3128207141 @default.
- W4295678827 hasRelatedWork W4206646953 @default.
- W4295678827 hasRelatedWork W4245208434 @default.
- W4295678827 hasRelatedWork W4316252369 @default.
- W4295678827 hasVolume "20" @default.
- W4295678827 isParatext "false" @default.
- W4295678827 isRetracted "false" @default.
- W4295678827 workType "article" @default.
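
The rows above follow the quad pattern in the query at the top (`?p ?o ?g` for a fixed subject). As a minimal sketch, assuming each row has the shape `- subject predicate object @graph.` and that the object is either a double-quoted literal or a bare identifier, the listing could be loaded into Python as shown below. The names `Quad`, `parse_row`, and `group_by_predicate` are hypothetical helpers written for this illustration; they are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str          # literal text (without quotes) or a bare identifier
    graph: str        # assumed: the token after "@" names the graph
    is_literal: bool  # True when the object was double-quoted


# Assumed row shape: - <subject> <predicate> <object> @<graph>.
ROW = re.compile(
    r'^-\s+(?P<s>\S+)\s+(?P<p>\S+)\s+'
    r'(?:"(?P<lit>.*)"|(?P<ref>\S+))\s+'
    r'@(?P<g>\S+?)\.\s*$',
    re.DOTALL,
)


def parse_row(line: str) -> Optional[Quad]:
    """Parse one listing row; return None for headers or other non-row lines."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    lit = m.group("lit")
    obj = lit if lit is not None else m.group("ref")
    return Quad(m.group("s"), m.group("p"), obj, m.group("g"), lit is not None)


def group_by_predicate(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        quad = parse_row(line)
        if quad is not None:
            grouped[quad.predicate].append(quad.obj)
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        "Matches in SemOpenAlex for { ... }",  # header line, ignored
        '- W4295678827 hasVolume "20" @default.',
        "- W4295678827 cites W2108598243 @default.",
        "- W4295678827 cites W2194775991 @default.",
    ]
    for predicate, objects in group_by_predicate(sample).items():
        print(predicate, objects)
```

Grouping by predicate makes multi-valued properties such as `cites`, `hasConcept`, and `hasRelatedWork` easy to count or compare, while single-valued ones such as `doi` or `title` come back as one-element lists.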