Matches in SemOpenAlex for { <https://semopenalex.org/work/W4285662664> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4285662664 abstract "Deep learning for recommendation data is one of the most pervasive and challenging AI workload in recent times. State-of-the-art recommendation models are one of the largest models matching the likes of GPT-3 and Switch Transformer. Challenges in deep learning recommendation models (DLRM) stem from learning dense embeddings for each of the categorical tokens. These embedding tables in industrial scale models can be as large as hundreds of terabytes. Such large models lead to a plethora of engineering challenges, not to mention prohibitive communication overheads, and slower training and inference times. Of these, slower inference time directly impacts user experience. Model compression for DLRM is gaining traction and the community has recently shown impressive compression results. In this paper, we present Random Offset Block Embedding Array (ROBE) as a low memory alternative to embedding tables which provide orders of magnitude reduction in memory usage while maintaining accuracy and boosting execution speed. ROBE is a simple fundamental approach in improving both cache performance and the variance of randomized hashing, which could be of independent interest in itself. We demonstrate that we can successfully train DLRM models with same accuracy while using $1000\times$ less memory. A $1000\times$ compressed model directly results in faster inference without any engineering effort. In particular, we show that we can train DLRM model using ROBE array of size 100MB on a single GPU to achieve AUC of 0.8025 or higher as required by official MLPerf CriteoTB benchmark DLRM model of 100GB while achieving about $3.1\times$ (209%) improvement in inference throughput." @default.
- W4285662664 created "2022-07-17" @default.
- W4285662664 creator A5002013975 @default.
- W4285662664 creator A5024993683 @default.
- W4285662664 creator A5074765626 @default.
- W4285662664 date "2021-08-04" @default.
- W4285662664 modified "2023-09-25" @default.
- W4285662664 title "Random Offset Block Embedding Array (ROBE) for CriteoTB Benchmark MLPerf DLRM Model : 1000$\times$ Compression and 3.1$\times$ Faster Inference" @default.
- W4285662664 doi "https://doi.org/10.48550/arxiv.2108.02191" @default.
- W4285662664 hasPublicationYear "2021" @default.
- W4285662664 type Work @default.
- W4285662664 citedByCount "0" @default.
- W4285662664 crossrefType "posted-content" @default.
- W4285662664 hasAuthorship W4285662664A5002013975 @default.
- W4285662664 hasAuthorship W4285662664A5024993683 @default.
- W4285662664 hasAuthorship W4285662664A5074765626 @default.
- W4285662664 hasBestOaLocation W42856626641 @default.
- W4285662664 hasConcept C108583219 @default.
- W4285662664 hasConcept C113775141 @default.
- W4285662664 hasConcept C115537543 @default.
- W4285662664 hasConcept C13280743 @default.
- W4285662664 hasConcept C154945302 @default.
- W4285662664 hasConcept C173608175 @default.
- W4285662664 hasConcept C175291020 @default.
- W4285662664 hasConcept C185798385 @default.
- W4285662664 hasConcept C199360897 @default.
- W4285662664 hasConcept C205649164 @default.
- W4285662664 hasConcept C2776214188 @default.
- W4285662664 hasConcept C41008148 @default.
- W4285662664 hasConcept C41608201 @default.
- W4285662664 hasConcept C43711488 @default.
- W4285662664 hasConcept C68339613 @default.
- W4285662664 hasConcept C76155785 @default.
- W4285662664 hasConcept C83283714 @default.
- W4285662664 hasConceptScore W4285662664C108583219 @default.
- W4285662664 hasConceptScore W4285662664C113775141 @default.
- W4285662664 hasConceptScore W4285662664C115537543 @default.
- W4285662664 hasConceptScore W4285662664C13280743 @default.
- W4285662664 hasConceptScore W4285662664C154945302 @default.
- W4285662664 hasConceptScore W4285662664C173608175 @default.
- W4285662664 hasConceptScore W4285662664C175291020 @default.
- W4285662664 hasConceptScore W4285662664C185798385 @default.
- W4285662664 hasConceptScore W4285662664C199360897 @default.
- W4285662664 hasConceptScore W4285662664C205649164 @default.
- W4285662664 hasConceptScore W4285662664C2776214188 @default.
- W4285662664 hasConceptScore W4285662664C41008148 @default.
- W4285662664 hasConceptScore W4285662664C41608201 @default.
- W4285662664 hasConceptScore W4285662664C43711488 @default.
- W4285662664 hasConceptScore W4285662664C68339613 @default.
- W4285662664 hasConceptScore W4285662664C76155785 @default.
- W4285662664 hasConceptScore W4285662664C83283714 @default.
- W4285662664 hasLocation W42856626641 @default.
- W4285662664 hasOpenAccess W4285662664 @default.
- W4285662664 hasPrimaryLocation W42856626641 @default.
- W4285662664 hasRelatedWork W1559572663 @default.
- W4285662664 hasRelatedWork W1784146144 @default.
- W4285662664 hasRelatedWork W1800827217 @default.
- W4285662664 hasRelatedWork W1820611261 @default.
- W4285662664 hasRelatedWork W1997967900 @default.
- W4285662664 hasRelatedWork W2582456645 @default.
- W4285662664 hasRelatedWork W3157678043 @default.
- W4285662664 hasRelatedWork W3179800311 @default.
- W4285662664 hasRelatedWork W3211672687 @default.
- W4285662664 hasRelatedWork W4307933444 @default.
- W4285662664 isParatext "false" @default.
- W4285662664 isRetracted "false" @default.
- W4285662664 workType "article" @default.