Matches in SemOpenAlex for { <https://semopenalex.org/work/W4366999304> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W4366999304 abstract "It is widely acknowledged that large models have the potential to deliver superior performance across a broad range of domains. Despite the remarkable progress made in the field of machine learning systems research, which has enabled the development and exploration of large models, such abilities remain confined to a small group of advanced users and industry leaders, resulting in an implicit technical barrier for the wider community to access and leverage these technologies. In this paper, we introduce PyTorch Fully Sharded Data Parallel (FSDP) as an industry-grade solution for large model training. FSDP has been closely co-designed with several key PyTorch core components including Tensor implementation, dispatcher system, and CUDA memory caching allocator, to provide non-intrusive user experiences and high training efficiency. Additionally, FSDP natively incorporates a range of techniques and settings to optimize resource utilization across a variety of hardware configurations. The experimental results demonstrate that FSDP is capable of achieving comparable performance to Distributed Data Parallel while providing support for significantly larger models with near-linear scalability in terms of TFLOPS." @default.
- W4366999304 created "2023-04-27" @default.
- W4366999304 creator A5010674973 @default.
- W4366999304 creator A5014393716 @default.
- W4366999304 creator A5027893749 @default.
- W4366999304 creator A5035010141 @default.
- W4366999304 creator A5040311207 @default.
- W4366999304 creator A5042449490 @default.
- W4366999304 creator A5042895564 @default.
- W4366999304 creator A5045285053 @default.
- W4366999304 creator A5049675042 @default.
- W4366999304 creator A5058043863 @default.
- W4366999304 creator A5059764251 @default.
- W4366999304 creator A5059973452 @default.
- W4366999304 creator A5068083482 @default.
- W4366999304 creator A5074075244 @default.
- W4366999304 creator A5076248976 @default.
- W4366999304 creator A5090022624 @default.
- W4366999304 date "2023-04-21" @default.
- W4366999304 modified "2023-09-26" @default.
- W4366999304 title "PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel" @default.
- W4366999304 doi "https://doi.org/10.48550/arxiv.2304.11277" @default.
- W4366999304 hasPublicationYear "2023" @default.
- W4366999304 type Work @default.
- W4366999304 citedByCount "0" @default.
- W4366999304 crossrefType "posted-content" @default.
- W4366999304 hasAuthorship W4366999304A5010674973 @default.
- W4366999304 hasAuthorship W4366999304A5014393716 @default.
- W4366999304 hasAuthorship W4366999304A5027893749 @default.
- W4366999304 hasAuthorship W4366999304A5035010141 @default.
- W4366999304 hasAuthorship W4366999304A5040311207 @default.
- W4366999304 hasAuthorship W4366999304A5042449490 @default.
- W4366999304 hasAuthorship W4366999304A5042895564 @default.
- W4366999304 hasAuthorship W4366999304A5045285053 @default.
- W4366999304 hasAuthorship W4366999304A5049675042 @default.
- W4366999304 hasAuthorship W4366999304A5058043863 @default.
- W4366999304 hasAuthorship W4366999304A5059764251 @default.
- W4366999304 hasAuthorship W4366999304A5059973452 @default.
- W4366999304 hasAuthorship W4366999304A5068083482 @default.
- W4366999304 hasAuthorship W4366999304A5074075244 @default.
- W4366999304 hasAuthorship W4366999304A5076248976 @default.
- W4366999304 hasAuthorship W4366999304A5090022624 @default.
- W4366999304 hasBestOaLocation W43669993041 @default.
- W4366999304 hasConcept C120314980 @default.
- W4366999304 hasConcept C153083717 @default.
- W4366999304 hasConcept C154945302 @default.
- W4366999304 hasConcept C173608175 @default.
- W4366999304 hasConcept C202444582 @default.
- W4366999304 hasConcept C2524010 @default.
- W4366999304 hasConcept C33923547 @default.
- W4366999304 hasConcept C41008148 @default.
- W4366999304 hasConcept C48044578 @default.
- W4366999304 hasConcept C77088390 @default.
- W4366999304 hasConcept C9652623 @default.
- W4366999304 hasConcept C99844830 @default.
- W4366999304 hasConceptScore W4366999304C120314980 @default.
- W4366999304 hasConceptScore W4366999304C153083717 @default.
- W4366999304 hasConceptScore W4366999304C154945302 @default.
- W4366999304 hasConceptScore W4366999304C173608175 @default.
- W4366999304 hasConceptScore W4366999304C202444582 @default.
- W4366999304 hasConceptScore W4366999304C2524010 @default.
- W4366999304 hasConceptScore W4366999304C33923547 @default.
- W4366999304 hasConceptScore W4366999304C41008148 @default.
- W4366999304 hasConceptScore W4366999304C48044578 @default.
- W4366999304 hasConceptScore W4366999304C77088390 @default.
- W4366999304 hasConceptScore W4366999304C9652623 @default.
- W4366999304 hasConceptScore W4366999304C99844830 @default.
- W4366999304 hasLocation W43669993041 @default.
- W4366999304 hasOpenAccess W4366999304 @default.
- W4366999304 hasPrimaryLocation W43669993041 @default.
- W4366999304 hasRelatedWork W1525643724 @default.
- W4366999304 hasRelatedWork W1569389315 @default.
- W4366999304 hasRelatedWork W1992741870 @default.
- W4366999304 hasRelatedWork W2067938758 @default.
- W4366999304 hasRelatedWork W2302028273 @default.
- W4366999304 hasRelatedWork W2364921833 @default.
- W4366999304 hasRelatedWork W2380023786 @default.
- W4366999304 hasRelatedWork W2385146268 @default.
- W4366999304 hasRelatedWork W2546696010 @default.
- W4366999304 hasRelatedWork W2503642292 @default.
- W4366999304 isParatext "false" @default.
- W4366999304 isRetracted "false" @default.
- W4366999304 workType "article" @default.