Matches in SemOpenAlex for { <https://semopenalex.org/work/W4367859812> ?p ?o ?g. }
Showing items 1 to 95 of 95 with 100 items per page.
- W4367859812 abstract "General Matrix Multiplication (GEMM) is a crucial algorithm for various applications such as machine learning and scientific computing since an efficient GEMM implementation is essential for the performance of these calculations. While researchers often strive for faster performance by using large computing platforms, the increased scale of these systems can raise concerns about hardware and software reliability. In this paper, we present a design of a high-performance GPU-based GEMM that integrates an algorithm-based fault tolerance scheme that detects and corrects silent data corruptions at computing units on-the-fly. We explore fault-tolerant designs for GEMM at the thread, warp, and threadblock levels, and also provide a baseline GEMM implementation that is competitive with or faster than the state-of-the-art, closed-source cuBLAS GEMM. We present a kernel fusion strategy to overlap and mitigate the memory latency due to fault tolerance with the original GEMM computation. To support a wide range of input matrix shapes and reduce development costs, we present a template-based approach for automatic code generation for both fault-tolerant and non-fault-tolerant GEMM implementations. We evaluate our work on NVIDIA Tesla T4 and A100 server GPUs. Our experimental results demonstrate that our baseline GEMM shows comparable or superior performance compared to the closed-source cuBLAS. Compared with the prior state-of-the-art non-fused fault-tolerant GEMM, our optimal fused strategy achieves a 39.04% speedup on average. In addition, our fault-tolerant GEMM incurs only a minimal overhead (8.89% on average) compared to cuBLAS even with hundreds of errors injected per minute. For irregularly shaped inputs, the code generator-generated kernels show remarkable speedups of 160% ~ 183.5% and 148.55% ~ 165.12% for fault-tolerant and non-fault-tolerant GEMMs, respectively, which outperforms cuBLAS by up to 41.40%." @default.
- W4367859812 created "2023-05-04" @default.
- W4367859812 creator A5022569738 @default.
- W4367859812 creator A5035339023 @default.
- W4367859812 creator A5051636078 @default.
- W4367859812 creator A5061737717 @default.
- W4367859812 creator A5071045650 @default.
- W4367859812 creator A5075806978 @default.
- W4367859812 creator A5088401238 @default.
- W4367859812 date "2023-06-21" @default.
- W4367859812 modified "2023-09-24" @default.
- W4367859812 title "Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs" @default.
- W4367859812 cites W1541483005 @default.
- W4367859812 cites W1898823215 @default.
- W4367859812 cites W1986905947 @default.
- W4367859812 cites W1988070283 @default.
- W4367859812 cites W1995746640 @default.
- W4367859812 cites W1997843200 @default.
- W4367859812 cites W2023856022 @default.
- W4367859812 cites W2030806070 @default.
- W4367859812 cites W2034593585 @default.
- W4367859812 cites W2052455844 @default.
- W4367859812 cites W2095928739 @default.
- W4367859812 cites W2105524676 @default.
- W4367859812 cites W2118832200 @default.
- W4367859812 cites W2125768532 @default.
- W4367859812 cites W2128511938 @default.
- W4367859812 cites W2130076536 @default.
- W4367859812 cites W2130189691 @default.
- W4367859812 cites W2134320686 @default.
- W4367859812 cites W2150981663 @default.
- W4367859812 cites W2151984682 @default.
- W4367859812 cites W2152211247 @default.
- W4367859812 cites W2155893237 @default.
- W4367859812 cites W2156514327 @default.
- W4367859812 cites W2169596872 @default.
- W4367859812 cites W2170196949 @default.
- W4367859812 cites W2229245554 @default.
- W4367859812 cites W2292469857 @default.
- W4367859812 cites W2296204683 @default.
- W4367859812 cites W2343351966 @default.
- W4367859812 cites W2411755313 @default.
- W4367859812 cites W2418331349 @default.
- W4367859812 cites W2647773517 @default.
- W4367859812 cites W2767260595 @default.
- W4367859812 cites W2767321582 @default.
- W4367859812 cites W2767694495 @default.
- W4367859812 cites W2962821792 @default.
- W4367859812 cites W2986161099 @default.
- W4367859812 cites W3105862567 @default.
- W4367859812 cites W3206892724 @default.
- W4367859812 cites W3014127403 @default.
- W4367859812 doi "https://doi.org/10.1145/3577193.3593715" @default.
- W4367859812 hasPublicationYear "2023" @default.
- W4367859812 type Work @default.
- W4367859812 citedByCount "0" @default.
- W4367859812 crossrefType "proceedings-article" @default.
- W4367859812 hasAuthorship W4367859812A5022569738 @default.
- W4367859812 hasAuthorship W4367859812A5035339023 @default.
- W4367859812 hasAuthorship W4367859812A5051636078 @default.
- W4367859812 hasAuthorship W4367859812A5061737717 @default.
- W4367859812 hasAuthorship W4367859812A5071045650 @default.
- W4367859812 hasAuthorship W4367859812A5075806978 @default.
- W4367859812 hasAuthorship W4367859812A5088401238 @default.
- W4367859812 hasBestOaLocation W43678598121 @default.
- W4367859812 hasConcept C120314980 @default.
- W4367859812 hasConcept C173608175 @default.
- W4367859812 hasConcept C41008148 @default.
- W4367859812 hasConcept C63540848 @default.
- W4367859812 hasConcept C68339613 @default.
- W4367859812 hasConcept C83283714 @default.
- W4367859812 hasConceptScore W4367859812C120314980 @default.
- W4367859812 hasConceptScore W4367859812C173608175 @default.
- W4367859812 hasConceptScore W4367859812C41008148 @default.
- W4367859812 hasConceptScore W4367859812C63540848 @default.
- W4367859812 hasConceptScore W4367859812C68339613 @default.
- W4367859812 hasConceptScore W4367859812C83283714 @default.
- W4367859812 hasFunder F4320306084 @default.
- W4367859812 hasLocation W43678598121 @default.
- W4367859812 hasLocation W43678598122 @default.
- W4367859812 hasOpenAccess W4367859812 @default.
- W4367859812 hasPrimaryLocation W43678598121 @default.
- W4367859812 hasRelatedWork W1509211761 @default.
- W4367859812 hasRelatedWork W1531488649 @default.
- W4367859812 hasRelatedWork W1585350690 @default.
- W4367859812 hasRelatedWork W1588481459 @default.
- W4367859812 hasRelatedWork W2011313916 @default.
- W4367859812 hasRelatedWork W2119821807 @default.
- W4367859812 hasRelatedWork W2133693067 @default.
- W4367859812 hasRelatedWork W2366027386 @default.
- W4367859812 hasRelatedWork W2391299576 @default.
- W4367859812 hasRelatedWork W3150687539 @default.
- W4367859812 isParatext "false" @default.
- W4367859812 isRetracted "false" @default.
- W4367859812 workType "article" @default.