Matches in SemOpenAlex for { <https://semopenalex.org/work/W2617019596> ?p ?o ?g. }
Showing items 1 to 88 of 88 with 100 items per page.
- W2617019596 abstract "Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to excellent scalability properties of this algorithm, and to its efficiency in the context of training deep neural networks. A fundamental barrier for parallelizing large-scale SGD is the fact that the cost of communicating the gradient updates between nodes can be very large. Consequently, lossy compression heuristics have been proposed, by which nodes only communicate quantized gradients. Although effective in practice, these heuristics do not always provably converge, and it is not clear whether they are optimal. In this paper, we propose Quantized SGD (QSGD), a family of compression schemes which allow the compression of gradient updates at each node, while guaranteeing convergence under standard assumptions. QSGD allows the user to trade off compression and convergence time: it can communicate a sublinear number of bits per iteration in the model dimension, and can achieve asymptotically optimal communication cost. We complement our theoretical results with empirical data, showing that QSGD can significantly reduce communication cost, while being competitive with standard uncompressed techniques on a variety of real tasks. In particular, experiments show that gradient quantization applied to training of deep neural networks for image classification and automated speech recognition can lead to significant reductions in communication cost, and end-to-end training time. For instance, on 16 GPUs, we are able to train a ResNet-152 network on ImageNet 1.8x faster to full accuracy. Of note, we show that there exist generic parameter settings under which all known network architectures preserve or slightly improve their full accuracy when using quantization." @default.
- W2617019596 created "2017-06-05" @default.
- W2617019596 creator A5006277823 @default.
- W2617019596 creator A5063706071 @default.
- W2617019596 creator A5070111907 @default.
- W2617019596 creator A5079870557 @default.
- W2617019596 creator A5083822059 @default.
- W2617019596 date "2016-10-07" @default.
- W2617019596 modified "2023-09-27" @default.
- W2617019596 title "QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks" @default.
- W2617019596 hasPublicationYear "2016" @default.
- W2617019596 type Work @default.
- W2617019596 sameAs 2617019596 @default.
- W2617019596 citedByCount "3" @default.
- W2617019596 countsByYear W26170195962017 @default.
- W2617019596 countsByYear W26170195962019 @default.
- W2617019596 countsByYear W26170195962021 @default.
- W2617019596 crossrefType "posted-content" @default.
- W2617019596 hasAuthorship W2617019596A5006277823 @default.
- W2617019596 hasAuthorship W2617019596A5063706071 @default.
- W2617019596 hasAuthorship W2617019596A5070111907 @default.
- W2617019596 hasAuthorship W2617019596A5079870557 @default.
- W2617019596 hasAuthorship W2617019596A5083822059 @default.
- W2617019596 hasConcept C111919701 @default.
- W2617019596 hasConcept C113775141 @default.
- W2617019596 hasConcept C11413529 @default.
- W2617019596 hasConcept C120314980 @default.
- W2617019596 hasConcept C126255220 @default.
- W2617019596 hasConcept C127705205 @default.
- W2617019596 hasConcept C151730666 @default.
- W2617019596 hasConcept C153258448 @default.
- W2617019596 hasConcept C154945302 @default.
- W2617019596 hasConcept C206688291 @default.
- W2617019596 hasConcept C2779343474 @default.
- W2617019596 hasConcept C28855332 @default.
- W2617019596 hasConcept C2984842247 @default.
- W2617019596 hasConcept C33923547 @default.
- W2617019596 hasConcept C41008148 @default.
- W2617019596 hasConcept C48044578 @default.
- W2617019596 hasConcept C50644808 @default.
- W2617019596 hasConcept C77088390 @default.
- W2617019596 hasConcept C86803240 @default.
- W2617019596 hasConceptScore W2617019596C111919701 @default.
- W2617019596 hasConceptScore W2617019596C113775141 @default.
- W2617019596 hasConceptScore W2617019596C11413529 @default.
- W2617019596 hasConceptScore W2617019596C120314980 @default.
- W2617019596 hasConceptScore W2617019596C126255220 @default.
- W2617019596 hasConceptScore W2617019596C127705205 @default.
- W2617019596 hasConceptScore W2617019596C151730666 @default.
- W2617019596 hasConceptScore W2617019596C153258448 @default.
- W2617019596 hasConceptScore W2617019596C154945302 @default.
- W2617019596 hasConceptScore W2617019596C206688291 @default.
- W2617019596 hasConceptScore W2617019596C2779343474 @default.
- W2617019596 hasConceptScore W2617019596C28855332 @default.
- W2617019596 hasConceptScore W2617019596C2984842247 @default.
- W2617019596 hasConceptScore W2617019596C33923547 @default.
- W2617019596 hasConceptScore W2617019596C41008148 @default.
- W2617019596 hasConceptScore W2617019596C48044578 @default.
- W2617019596 hasConceptScore W2617019596C50644808 @default.
- W2617019596 hasConceptScore W2617019596C77088390 @default.
- W2617019596 hasConceptScore W2617019596C86803240 @default.
- W2617019596 hasLocation W26170195961 @default.
- W2617019596 hasOpenAccess W2617019596 @default.
- W2617019596 hasPrimaryLocation W26170195961 @default.
- W2617019596 hasRelatedWork W2896404430 @default.
- W2617019596 hasRelatedWork W2901028208 @default.
- W2617019596 hasRelatedWork W2943847545 @default.
- W2617019596 hasRelatedWork W2950819300 @default.
- W2617019596 hasRelatedWork W2952369090 @default.
- W2617019596 hasRelatedWork W2962710991 @default.
- W2617019596 hasRelatedWork W2970020383 @default.
- W2617019596 hasRelatedWork W3087165870 @default.
- W2617019596 hasRelatedWork W3096004388 @default.
- W2617019596 hasRelatedWork W3118543361 @default.
- W2617019596 hasRelatedWork W3127479988 @default.
- W2617019596 hasRelatedWork W3130801519 @default.
- W2617019596 hasRelatedWork W3131685502 @default.
- W2617019596 hasRelatedWork W3131880614 @default.
- W2617019596 hasRelatedWork W3136347532 @default.
- W2617019596 hasRelatedWork W3174871729 @default.
- W2617019596 hasRelatedWork W3198707348 @default.
- W2617019596 hasRelatedWork W3204450046 @default.
- W2617019596 hasRelatedWork W3204507145 @default.
- W2617019596 hasRelatedWork W3205035978 @default.
- W2617019596 isParatext "false" @default.
- W2617019596 isRetracted "false" @default.
- W2617019596 magId "2617019596" @default.
- W2617019596 workType "article" @default.
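
The abstract above describes nodes communicating quantized gradients while still guaranteeing convergence. As a rough illustration of that idea, the following is a minimal sketch of unbiased stochastic quantization. It is an assumption-laden example and is not taken from this record: the function name `quantize`, the `levels` parameter, and the choice to scale by the Euclidean norm are all hypothetical, and the sketch omits the encoding step needed to actually save bits on the wire.

```python
import math
import random


def quantize(vector, levels, rng=random):
    """Stochastically round each coordinate to one of `levels` uniform steps.

    Hypothetical helper for illustration only. Each output coordinate is
    sign(x) * norm * k / levels for an integer k in [0, levels], chosen at
    random so that the expected output equals the input coordinate.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # Nothing to scale by; the zero vector quantizes to itself.
        return [0.0] * len(vector)

    out = []
    for x in vector:
        scaled = abs(x) / norm * levels  # position on the grid, in [0, levels]
        lower = math.floor(scaled)
        # Round up with probability equal to the fractional part, which
        # makes the rounded level an unbiased estimate of `scaled`.
        level = lower + (1 if rng.random() < scaled - lower else 0)
        level = min(level, levels)  # guard against floating-point overshoot
        out.append(math.copysign(norm * level / levels, x))
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    gradient = [0.3, -0.4, 0.0, 1.2]
    print(quantize(gradient, levels=4, rng=rng))
```

With fewer levels, each coordinate needs fewer bits to describe but the quantized vector has higher variance, which is one way to read the compression versus convergence-time trade-off mentioned in the abstract.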