Matches in SemOpenAlex for { <https://semopenalex.org/work/W4377157076> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W4377157076 abstract "Large-scale deep learning models contribute to significant performance improvements on a variety of downstream tasks. Current data and model parallelism approaches utilize model replication and partition techniques to support the distributed training of ultra-large models. However, directly deploying these systems often leads to sub-optimal training efficiency due to complex model architectures and strict device memory constraints. In this paper, we propose Optimal Sharded Data Parallel (OSDP), an automated parallel training system that combines the advantages of both data and model parallelism. Given the model description and the device information, OSDP trades off memory consumption against hardware utilization, automatically generating the distributed computation graph and maximizing overall system throughput. In addition, OSDP introduces operator splitting to further reduce peak memory footprint during training with negligible overhead, which enables training larger models as well as higher throughput. Extensive experiments with OSDP on several kinds of large-scale models demonstrate that the proposed strategy outperforms the state of the art in multiple regards." @default.
- W4377157076 created "2023-05-21" @default.
- W4377157076 creator A5015552951 @default.
- W4377157076 creator A5039254679 @default.
- W4377157076 creator A5059601307 @default.
- W4377157076 creator A5062357883 @default.
- W4377157076 creator A5070725905 @default.
- W4377157076 date "2023-05-17" @default.
- W4377157076 modified "2023-09-27" @default.
- W4377157076 title "OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning" @default.
- W4377157076 doi "https://doi.org/10.48550/arxiv.2305.09940" @default.
- W4377157076 hasPublicationYear "2023" @default.
- W4377157076 type Work @default.
- W4377157076 citedByCount "0" @default.
- W4377157076 crossrefType "posted-content" @default.
- W4377157076 hasAuthorship W4377157076A5015552951 @default.
- W4377157076 hasAuthorship W4377157076A5039254679 @default.
- W4377157076 hasAuthorship W4377157076A5059601307 @default.
- W4377157076 hasAuthorship W4377157076A5062357883 @default.
- W4377157076 hasAuthorship W4377157076A5070725905 @default.
- W4377157076 hasBestOaLocation W43771570761 @default.
- W4377157076 hasConcept C105795698 @default.
- W4377157076 hasConcept C108583219 @default.
- W4377157076 hasConcept C11413529 @default.
- W4377157076 hasConcept C114614502 @default.
- W4377157076 hasConcept C120314980 @default.
- W4377157076 hasConcept C121332964 @default.
- W4377157076 hasConcept C12590798 @default.
- W4377157076 hasConcept C132525143 @default.
- W4377157076 hasConcept C133875982 @default.
- W4377157076 hasConcept C154945302 @default.
- W4377157076 hasConcept C157764524 @default.
- W4377157076 hasConcept C173608175 @default.
- W4377157076 hasConcept C2778755073 @default.
- W4377157076 hasConcept C2781172179 @default.
- W4377157076 hasConcept C33923547 @default.
- W4377157076 hasConcept C41008148 @default.
- W4377157076 hasConcept C42812 @default.
- W4377157076 hasConcept C45374587 @default.
- W4377157076 hasConcept C555944384 @default.
- W4377157076 hasConcept C61483411 @default.
- W4377157076 hasConcept C62520636 @default.
- W4377157076 hasConcept C76155785 @default.
- W4377157076 hasConcept C80444323 @default.
- W4377157076 hasConcept C91481028 @default.
- W4377157076 hasConceptScore W4377157076C105795698 @default.
- W4377157076 hasConceptScore W4377157076C108583219 @default.
- W4377157076 hasConceptScore W4377157076C11413529 @default.
- W4377157076 hasConceptScore W4377157076C114614502 @default.
- W4377157076 hasConceptScore W4377157076C120314980 @default.
- W4377157076 hasConceptScore W4377157076C121332964 @default.
- W4377157076 hasConceptScore W4377157076C12590798 @default.
- W4377157076 hasConceptScore W4377157076C132525143 @default.
- W4377157076 hasConceptScore W4377157076C133875982 @default.
- W4377157076 hasConceptScore W4377157076C154945302 @default.
- W4377157076 hasConceptScore W4377157076C157764524 @default.
- W4377157076 hasConceptScore W4377157076C173608175 @default.
- W4377157076 hasConceptScore W4377157076C2778755073 @default.
- W4377157076 hasConceptScore W4377157076C2781172179 @default.
- W4377157076 hasConceptScore W4377157076C33923547 @default.
- W4377157076 hasConceptScore W4377157076C41008148 @default.
- W4377157076 hasConceptScore W4377157076C42812 @default.
- W4377157076 hasConceptScore W4377157076C45374587 @default.
- W4377157076 hasConceptScore W4377157076C555944384 @default.
- W4377157076 hasConceptScore W4377157076C61483411 @default.
- W4377157076 hasConceptScore W4377157076C62520636 @default.
- W4377157076 hasConceptScore W4377157076C76155785 @default.
- W4377157076 hasConceptScore W4377157076C80444323 @default.
- W4377157076 hasConceptScore W4377157076C91481028 @default.
- W4377157076 hasLocation W43771570761 @default.
- W4377157076 hasLocation W43771570762 @default.
- W4377157076 hasOpenAccess W4377157076 @default.
- W4377157076 hasPrimaryLocation W43771570761 @default.
- W4377157076 hasRelatedWork W1496703677 @default.
- W4377157076 hasRelatedWork W1601078274 @default.
- W4377157076 hasRelatedWork W1608806855 @default.
- W4377157076 hasRelatedWork W2023505575 @default.
- W4377157076 hasRelatedWork W2047588290 @default.
- W4377157076 hasRelatedWork W2107882594 @default.
- W4377157076 hasRelatedWork W2348711589 @default.
- W4377157076 hasRelatedWork W3095555187 @default.
- W4377157076 hasRelatedWork W4240606930 @default.
- W4377157076 hasRelatedWork W2480956401 @default.
- W4377157076 isParatext "false" @default.
- W4377157076 isRetracted "false" @default.
- W4377157076 workType "article" @default.
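
The abstract above describes OSDP as trading memory consumption against hardware utilization under a device memory limit. The following is a toy sketch of that kind of trade-off, not OSDP's actual search algorithm: each operator is either replicated (more memory, less time) or sharded (less memory, more time), and a brute-force search picks the fastest plan that fits in memory. The operator names, the cost numbers, and the `best_plan` function are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical per-operator costs, invented for illustration only:
# (name, memory if replicated, memory if sharded, time if replicated, time if sharded)
OPS = [
    ("embed", 8, 2, 10, 16),
    ("attn", 6, 2, 20, 25),
    ("mlp", 10, 3, 30, 34),
]


def best_plan(ops, memory_limit):
    """Return (total_time, total_memory, plan) for the fastest plan that fits,
    or None if even sharding every operator exceeds memory_limit.

    Brute force over all 2**len(ops) choices, so only suitable for tiny inputs.
    """
    best = None
    for choice in product((False, True), repeat=len(ops)):
        # choice[i] is True when operator i is sharded.
        total_mem = sum(op[2] if shard else op[1] for op, shard in zip(ops, choice))
        total_time = sum(op[4] if shard else op[3] for op, shard in zip(ops, choice))
        if total_mem > memory_limit:
            continue  # plan does not fit on the device
        if best is None or total_time < best[0]:
            plan = {op[0]: ("shard" if shard else "replicate")
                    for op, shard in zip(ops, choice)}
            best = (total_time, total_mem, plan)
    return best


if __name__ == "__main__":
    for limit in (24, 16, 5):
        print(limit, best_plan(OPS, limit))
```

With a generous limit the search keeps everything replicated. As the limit tightens it shards the operators whose memory savings cost the least extra time, and it reports `None` when no plan fits. A real system would replace the made-up numbers with profiled or modeled costs and would need a search that scales beyond a handful of operators.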