Matches in SemOpenAlex for { <https://semopenalex.org/work/W3197720002> ?p ?o ?g. }
- W3197720002 abstract "Deep learning recommendation models (DLRMs) have been used across many business-critical services at Meta and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper, we present Neo, a software-hardware co-designed system for high-performance distributed training of large-scale DLRMs. Neo employs a novel 4D parallelism strategy that combines table-wise, row-wise, column-wise, and data parallelism for training massive embedding operators in DLRMs. In addition, Neo enables extremely high-performance and memory-efficient embedding computations using a variety of critical systems optimizations, including hybrid kernel fusion, software-managed caching, and quality-preserving compression. Finally, Neo is paired with ZionEX, a new hardware platform co-designed with Neo's 4D parallelism for optimizing communications for large-scale DLRM training. Our evaluation on 128 GPUs using 16 ZionEX nodes shows that Neo outperforms existing systems by up to 40× for training 12-trillion-parameter DLRM models deployed in production." @default.
- W3197720002 created "2021-09-13" @default.
- W3197720002 creator A5002281059 @default.
- W3197720002 creator A5003317373 @default.
- W3197720002 creator A5004002767 @default.
- W3197720002 creator A5004459350 @default.
- W3197720002 creator A5007535563 @default.
- W3197720002 creator A5007892541 @default.
- W3197720002 creator A5011400580 @default.
- W3197720002 creator A5011441302 @default.
- W3197720002 creator A5011731495 @default.
- W3197720002 creator A5014068004 @default.
- W3197720002 creator A5019595876 @default.
- W3197720002 creator A5020355643 @default.
- W3197720002 creator A5020495890 @default.
- W3197720002 creator A5029184375 @default.
- W3197720002 creator A5032557791 @default.
- W3197720002 creator A5033659558 @default.
- W3197720002 creator A5034003919 @default.
- W3197720002 creator A5038410573 @default.
- W3197720002 creator A5038603364 @default.
- W3197720002 creator A5039412958 @default.
- W3197720002 creator A5040205022 @default.
- W3197720002 creator A5045208071 @default.
- W3197720002 creator A5046705985 @default.
- W3197720002 creator A5058119792 @default.
- W3197720002 creator A5059614356 @default.
- W3197720002 creator A5059811129 @default.
- W3197720002 creator A5060130626 @default.
- W3197720002 creator A5061248847 @default.
- W3197720002 creator A5062437461 @default.
- W3197720002 creator A5062917955 @default.
- W3197720002 creator A5063171219 @default.
- W3197720002 creator A5063193730 @default.
- W3197720002 creator A5066366329 @default.
- W3197720002 creator A5068114879 @default.
- W3197720002 creator A5069729500 @default.
- W3197720002 creator A5069883905 @default.
- W3197720002 creator A5070194020 @default.
- W3197720002 creator A5071326128 @default.
- W3197720002 creator A5071363838 @default.
- W3197720002 creator A5071392807 @default.
- W3197720002 creator A5073566291 @default.
- W3197720002 creator A5075614313 @default.
- W3197720002 creator A5076405830 @default.
- W3197720002 creator A5078483147 @default.
- W3197720002 creator A5085294519 @default.
- W3197720002 creator A5086372658 @default.
- W3197720002 creator A5086922344 @default.
- W3197720002 creator A5088572292 @default.
- W3197720002 creator A5089183792 @default.
- W3197720002 creator A5089662089 @default.
- W3197720002 creator A5090022624 @default.
- W3197720002 creator A5090712600 @default.
- W3197720002 creator A5091325862 @default.
- W3197720002 date "2022-06-11" @default.
- W3197720002 modified "2023-10-17" @default.
- W3197720002 title "Software-hardware co-design for fast and scalable training of deep learning recommendation models" @default.
- W3197720002 cites W2054141820 @default.
- W3197720002 cites W2076618162 @default.
- W3197720002 cites W2097117768 @default.
- W3197720002 cites W2151166364 @default.
- W3197720002 cites W2210543184 @default.
- W3197720002 cites W2512971201 @default.
- W3197720002 cites W2605350416 @default.
- W3197720002 cites W2614794251 @default.
- W3197720002 cites W2766447205 @default.
- W3197720002 cites W2794670651 @default.
- W3197720002 cites W2969388332 @default.
- W3197720002 cites W2975712713 @default.
- W3197720002 cites W2981587687 @default.
- W3197720002 cites W2984020950 @default.
- W3197720002 cites W2996471668 @default.
- W3197720002 cites W3016842236 @default.
- W3197720002 cites W3086105743 @default.
- W3197720002 cites W3129488589 @default.
- W3197720002 cites W3129831491 @default.
- W3197720002 cites W3130104841 @default.
- W3197720002 cites W3158146252 @default.
- W3197720002 cites W3173720884 @default.
- W3197720002 doi "https://doi.org/10.1145/3470496.3533727" @default.
- W3197720002 hasPublicationYear "2022" @default.
- W3197720002 type Work @default.
- W3197720002 sameAs 3197720002 @default.
- W3197720002 citedByCount "18" @default.
- W3197720002 countsByYear W31977200022021 @default.
- W3197720002 countsByYear W31977200022022 @default.
- W3197720002 countsByYear W31977200022023 @default.
- W3197720002 crossrefType "proceedings-article" @default.
- W3197720002 hasAuthorship W3197720002A5002281059 @default.
- W3197720002 hasAuthorship W3197720002A5003317373 @default.
- W3197720002 hasAuthorship W3197720002A5004002767 @default.
- W3197720002 hasAuthorship W3197720002A5004459350 @default.
- W3197720002 hasAuthorship W3197720002A5007535563 @default.
- W3197720002 hasAuthorship W3197720002A5007892541 @default.
- W3197720002 hasAuthorship W3197720002A5011400580 @default.
- W3197720002 hasAuthorship W3197720002A5011441302 @default.
- W3197720002 hasAuthorship W3197720002A5011731495 @default.
- W3197720002 hasAuthorship W3197720002A5014068004 @default.
- W3197720002 hasAuthorship W3197720002A5019595876 @default.