Matches in SemOpenAlex for { <https://semopenalex.org/work/W2094233035> ?p ?o ?g. }
Showing items 1 to 73 of 73 with 100 items per page.
- W2094233035 abstract "This paper compares the theoretical efficiency of model-parallel and data-parallel distributed stochastic gradient descent training of DNNs. For a typical Switchboard DNN with 46M parameters, the results are not pretty: With modern GPUs and interconnects, model parallelism is optimal with only 3 GPUs in a single server, while data parallelism with a minibatch size of 1024 does not even scale to 2 GPUs. We further show that data-parallel training efficiency can be improved by increasing the minibatch size (through a combination of AdaGrad and automatic adjustments of learning rate and minibatch size) and data compression. We arrive at an estimated possible end-to-end speed-up of 5 times or more. We do not address issues of robustness to process failure or other issues that might occur during training, nor of speed of convergence differences between ASGD and SGD parameter update patterns." @default.
- W2094233035 created "2016-06-24" @default.
- W2094233035 creator A5012153296 @default.
- W2094233035 creator A5031351839 @default.
- W2094233035 creator A5034476404 @default.
- W2094233035 creator A5034936272 @default.
- W2094233035 creator A5072932051 @default.
- W2094233035 date "2014-05-01" @default.
- W2094233035 modified "2023-09-25" @default.
- W2094233035 title "On parallelizability of stochastic gradient descent for speech DNNs" @default.
- W2094233035 cites W1218987319 @default.
- W2094233035 cites W1498436455 @default.
- W2094233035 cites W2000200144 @default.
- W2094233035 cites W2012257340 @default.
- W2094233035 cites W2071310251 @default.
- W2094233035 cites W2087402357 @default.
- W2094233035 cites W2093794678 @default.
- W2094233035 cites W2114016253 @default.
- W2094233035 cites W2136922672 @default.
- W2094233035 cites W2160306971 @default.
- W2094233035 cites W2394932179 @default.
- W2094233035 cites W2403195671 @default.
- W2094233035 cites W4292363360 @default.
- W2094233035 doi "https://doi.org/10.1109/icassp.2014.6853593" @default.
- W2094233035 hasPublicationYear "2014" @default.
- W2094233035 type Work @default.
- W2094233035 sameAs 2094233035 @default.
- W2094233035 citedByCount "70" @default.
- W2094233035 countsByYear W20942330352014 @default.
- W2094233035 countsByYear W20942330352015 @default.
- W2094233035 countsByYear W20942330352016 @default.
- W2094233035 countsByYear W20942330352017 @default.
- W2094233035 countsByYear W20942330352018 @default.
- W2094233035 countsByYear W20942330352019 @default.
- W2094233035 countsByYear W20942330352020 @default.
- W2094233035 countsByYear W20942330352021 @default.
- W2094233035 countsByYear W20942330352022 @default.
- W2094233035 countsByYear W20942330352023 @default.
- W2094233035 crossrefType "proceedings-article" @default.
- W2094233035 hasAuthorship W2094233035A5012153296 @default.
- W2094233035 hasAuthorship W2094233035A5031351839 @default.
- W2094233035 hasAuthorship W2094233035A5034476404 @default.
- W2094233035 hasAuthorship W2094233035A5034936272 @default.
- W2094233035 hasAuthorship W2094233035A5072932051 @default.
- W2094233035 hasConcept C153258448 @default.
- W2094233035 hasConcept C154945302 @default.
- W2094233035 hasConcept C206688291 @default.
- W2094233035 hasConcept C28490314 @default.
- W2094233035 hasConcept C41008148 @default.
- W2094233035 hasConcept C50644808 @default.
- W2094233035 hasConceptScore W2094233035C153258448 @default.
- W2094233035 hasConceptScore W2094233035C154945302 @default.
- W2094233035 hasConceptScore W2094233035C206688291 @default.
- W2094233035 hasConceptScore W2094233035C28490314 @default.
- W2094233035 hasConceptScore W2094233035C41008148 @default.
- W2094233035 hasConceptScore W2094233035C50644808 @default.
- W2094233035 hasLocation W20942330351 @default.
- W2094233035 hasOpenAccess W2094233035 @default.
- W2094233035 hasPrimaryLocation W20942330351 @default.
- W2094233035 hasRelatedWork W2312116756 @default.
- W2094233035 hasRelatedWork W2368779261 @default.
- W2094233035 hasRelatedWork W2754816816 @default.
- W2094233035 hasRelatedWork W2778699561 @default.
- W2094233035 hasRelatedWork W2794438528 @default.
- W2094233035 hasRelatedWork W2893763841 @default.
- W2094233035 hasRelatedWork W2895097035 @default.
- W2094233035 hasRelatedWork W2995996972 @default.
- W2094233035 hasRelatedWork W3128571556 @default.
- W2094233035 hasRelatedWork W4206903459 @default.
- W2094233035 isParatext "false" @default.
- W2094233035 isRetracted "false" @default.
- W2094233035 magId "2094233035" @default.
- W2094233035 workType "article" @default.
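
Each item in the listing above has the shape `- subject predicate object @graph.`, matching the `?p ?o ?g` variables of the query pattern. The following is a minimal Python sketch, not part of SemOpenAlex itself, of how such lines could be parsed and grouped by predicate. The line shape is an assumption inferred from this listing only, and the names `LINE_RE`, `parse_line`, and `group_by_predicate` are hypothetical helpers introduced for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the listing above (not an official format):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\S+?)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subj, pred, obj, graph = match.groups()
    # Strip surrounding quotes from literal values; leave identifiers untouched.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subj, pred, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, preserving listing order."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, pred, obj, _ = parsed
            grouped[pred].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W2094233035 date "2014-05-01" @default.',
    '- W2094233035 cites W1218987319 @default.',
    '- W2094233035 cites W1498436455 @default.',
    '- W2094233035 type Work @default.',
]
records = group_by_predicate(sample)
```

Grouping by predicate makes multi-valued properties such as `cites` or `hasRelatedWork` available as lists, while single-valued ones such as `date` become one-element lists.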