Matches in SemOpenAlex for { <https://semopenalex.org/work/W3036710239> ?p ?o ?g. }
- W3036710239 abstract "Recently there has been a surge of research on improving the communication efficiency of distributed training. However, little work has been done to systematically understand whether the network is the bottleneck and to what extent. In this paper, we take a first-principles approach to measure and analyze the network performance of distributed training. As expected, our measurement confirms that communication is the component that blocks distributed training from linear scale-out. However, contrary to the common belief, we find that the network is running at low utilization and that if the network can be fully utilized, distributed training can achieve a scaling factor of close to one. Moreover, while many recent proposals on gradient compression advocate over 100x compression ratio, we show that under full network utilization, there is no need for gradient compression in a 100 Gbps network. On the other hand, a lower speed network like 10 Gbps requires only a 2x–5x gradient compression ratio to achieve almost linear scale-out. Compared to application-level techniques like gradient compression, network-level optimizations do not require changes to applications and do not hurt the performance of trained models. As such, we advocate that the real challenge of distributed training is for the network community to develop high-performance network transport to fully utilize the network capacity and achieve linear scale-out." @default.
- W3036710239 created "2020-06-25" @default.
- W3036710239 creator A5035186499 @default.
- W3036710239 creator A5043698075 @default.
- W3036710239 creator A5059283028 @default.
- W3036710239 creator A5062901935 @default.
- W3036710239 creator A5063457506 @default.
- W3036710239 creator A5084047386 @default.
- W3036710239 date "2020-06-17" @default.
- W3036710239 modified "2023-09-25" @default.
- W3036710239 title "Is Network the Bottleneck of Distributed Training?" @default.
- W3036710239 cites W1686810756 @default.
- W3036710239 cites W2108598243 @default.
- W3036710239 cites W2194775991 @default.
- W3036710239 cites W2407022425 @default.
- W3036710239 cites W2531786980 @default.
- W3036710239 cites W2535838896 @default.
- W3036710239 cites W2787998955 @default.
- W3036710239 cites W2789197829 @default.
- W3036710239 cites W2805997383 @default.
- W3036710239 cites W2914138317 @default.
- W3036710239 cites W2922527104 @default.
- W3036710239 cites W2952369090 @default.
- W3036710239 cites W2962747323 @default.
- W3036710239 cites W2962786385 @default.
- W3036710239 cites W2963540381 @default.
- W3036710239 cites W2963803379 @default.
- W3036710239 cites W2964004663 @default.
- W3036710239 cites W2964267428 @default.
- W3036710239 cites W2969388332 @default.
- W3036710239 cites W2970421227 @default.
- W3036710239 cites W2975712713 @default.
- W3036710239 cites W3101036738 @default.
- W3036710239 cites W3115851078 @default.
- W3036710239 doi "https://doi.org/10.48550/arxiv.2006.10103" @default.
- W3036710239 hasPublicationYear "2020" @default.
- W3036710239 type Work @default.
- W3036710239 sameAs 3036710239 @default.
- W3036710239 citedByCount "4" @default.
- W3036710239 countsByYear W30367102392020 @default.
- W3036710239 countsByYear W30367102392021 @default.
- W3036710239 crossrefType "posted-content" @default.
- W3036710239 hasAuthorship W3036710239A5035186499 @default.
- W3036710239 hasAuthorship W3036710239A5043698075 @default.
- W3036710239 hasAuthorship W3036710239A5059283028 @default.
- W3036710239 hasAuthorship W3036710239A5062901935 @default.
- W3036710239 hasAuthorship W3036710239A5063457506 @default.
- W3036710239 hasAuthorship W3036710239A5084047386 @default.
- W3036710239 hasBestOaLocation W30367102391 @default.
- W3036710239 hasConcept C120314980 @default.
- W3036710239 hasConcept C121332964 @default.
- W3036710239 hasConcept C127413603 @default.
- W3036710239 hasConcept C13280743 @default.
- W3036710239 hasConcept C149635348 @default.
- W3036710239 hasConcept C159985019 @default.
- W3036710239 hasConcept C170122806 @default.
- W3036710239 hasConcept C171146098 @default.
- W3036710239 hasConcept C180016635 @default.
- W3036710239 hasConcept C192562407 @default.
- W3036710239 hasConcept C203274722 @default.
- W3036710239 hasConcept C205649164 @default.
- W3036710239 hasConcept C2524010 @default.
- W3036710239 hasConcept C25797200 @default.
- W3036710239 hasConcept C2778755073 @default.
- W3036710239 hasConcept C2780513914 @default.
- W3036710239 hasConcept C31258907 @default.
- W3036710239 hasConcept C33923547 @default.
- W3036710239 hasConcept C41008148 @default.
- W3036710239 hasConcept C511840579 @default.
- W3036710239 hasConcept C62520636 @default.
- W3036710239 hasConcept C99844830 @default.
- W3036710239 hasConceptScore W3036710239C120314980 @default.
- W3036710239 hasConceptScore W3036710239C121332964 @default.
- W3036710239 hasConceptScore W3036710239C127413603 @default.
- W3036710239 hasConceptScore W3036710239C13280743 @default.
- W3036710239 hasConceptScore W3036710239C149635348 @default.
- W3036710239 hasConceptScore W3036710239C159985019 @default.
- W3036710239 hasConceptScore W3036710239C170122806 @default.
- W3036710239 hasConceptScore W3036710239C171146098 @default.
- W3036710239 hasConceptScore W3036710239C180016635 @default.
- W3036710239 hasConceptScore W3036710239C192562407 @default.
- W3036710239 hasConceptScore W3036710239C203274722 @default.
- W3036710239 hasConceptScore W3036710239C205649164 @default.
- W3036710239 hasConceptScore W3036710239C2524010 @default.
- W3036710239 hasConceptScore W3036710239C25797200 @default.
- W3036710239 hasConceptScore W3036710239C2778755073 @default.
- W3036710239 hasConceptScore W3036710239C2780513914 @default.
- W3036710239 hasConceptScore W3036710239C31258907 @default.
- W3036710239 hasConceptScore W3036710239C33923547 @default.
- W3036710239 hasConceptScore W3036710239C41008148 @default.
- W3036710239 hasConceptScore W3036710239C511840579 @default.
- W3036710239 hasConceptScore W3036710239C62520636 @default.
- W3036710239 hasConceptScore W3036710239C99844830 @default.
- W3036710239 hasLocation W30367102391 @default.
- W3036710239 hasLocation W30367102392 @default.
- W3036710239 hasOpenAccess W3036710239 @default.
- W3036710239 hasPrimaryLocation W30367102391 @default.
- W3036710239 hasRelatedWork W2353647904 @default.
- W3036710239 hasRelatedWork W2354251581 @default.
- W3036710239 hasRelatedWork W2357461155 @default.