Matches in SemOpenAlex for { <https://semopenalex.org/work/W3146688250> ?p ?o ?g. }
Showing items 1 to 57 of 57 with 100 items per page.
- W3146688250 abstract "GPGPUs are increasingly being used as performance accelerators for HPC (High Performance Computing) applications in CPU/GPU heterogeneous computing systems, including TianHe-1A, the world’s fastest supercomputer in the TOP500 list, built at NUDT (National University of Defense Technology) last year. However, despite their performance advantages, GPGPUs do not provide built-in fault-tolerant mechanisms to offer the reliability guarantees required by many HPC applications. By analyzing the SIMT (single-instruction, multiple-thread) characteristics of programs running on GPGPUs, we have developed PartialRC, a new checkpoint-based compiler-directed partial recomputing method, for achieving efficient fault recovery by leveraging the phenomenal computing power of GPGPUs. In this paper, we introduce our PartialRC method that recovers from errors detected in a code region by partially re-computing the region, describe a checkpoint-based fault-tolerance framework developed on PartialRC, and discuss an implementation on the CUDA platform. Validation using a range of representative CUDA programs on NVIDIA GPGPUs against FullRC (a traditional full-recomputing Checkpoint-Rollback-Restart fault recovery method for CPUs) shows that PartialRC significantly reduces the fault recovery overheads incurred by FullRC, by 73.5% when errors occur earlier during execution and 74.6% when errors occur later on average. In addition, PartialRC also reduces error detection overheads incurred by FullRC during fault recovery while incurring negligible performance overheads when no fault happens." @default.
- W3146688250 created "2021-04-13" @default.
- W3146688250 creator A5049243005 @default.
- W3146688250 date "2012-01-01" @default.
- W3146688250 modified "2023-09-22" @default.
- W3146688250 title "PartialRC: A Partial Recomputing Method for Efficient Fault Recovery on GPGPUs" @default.
- W3146688250 hasPublicationYear "2012" @default.
- W3146688250 type Work @default.
- W3146688250 sameAs 3146688250 @default.
- W3146688250 citedByCount "0" @default.
- W3146688250 crossrefType "journal-article" @default.
- W3146688250 hasAuthorship W3146688250A5049243005 @default.
- W3146688250 hasConcept C111919701 @default.
- W3146688250 hasConcept C169590947 @default.
- W3146688250 hasConcept C173608175 @default.
- W3146688250 hasConcept C21442007 @default.
- W3146688250 hasConcept C2778119891 @default.
- W3146688250 hasConcept C41008148 @default.
- W3146688250 hasConcept C50630238 @default.
- W3146688250 hasConcept C63540848 @default.
- W3146688250 hasConcept C83283714 @default.
- W3146688250 hasConceptScore W3146688250C111919701 @default.
- W3146688250 hasConceptScore W3146688250C169590947 @default.
- W3146688250 hasConceptScore W3146688250C173608175 @default.
- W3146688250 hasConceptScore W3146688250C21442007 @default.
- W3146688250 hasConceptScore W3146688250C2778119891 @default.
- W3146688250 hasConceptScore W3146688250C41008148 @default.
- W3146688250 hasConceptScore W3146688250C50630238 @default.
- W3146688250 hasConceptScore W3146688250C63540848 @default.
- W3146688250 hasConceptScore W3146688250C83283714 @default.
- W3146688250 hasLocation W31466882501 @default.
- W3146688250 hasOpenAccess W3146688250 @default.
- W3146688250 hasPrimaryLocation W31466882501 @default.
- W3146688250 hasRelatedWork W2004018853 @default.
- W3146688250 hasRelatedWork W2014413779 @default.
- W3146688250 hasRelatedWork W2016461387 @default.
- W3146688250 hasRelatedWork W2031260715 @default.
- W3146688250 hasRelatedWork W2036741878 @default.
- W3146688250 hasRelatedWork W2044920626 @default.
- W3146688250 hasRelatedWork W2057307872 @default.
- W3146688250 hasRelatedWork W2088552220 @default.
- W3146688250 hasRelatedWork W2109396327 @default.
- W3146688250 hasRelatedWork W2124439663 @default.
- W3146688250 hasRelatedWork W212533765 @default.
- W3146688250 hasRelatedWork W2132077583 @default.
- W3146688250 hasRelatedWork W2163891823 @default.
- W3146688250 hasRelatedWork W2382869634 @default.
- W3146688250 hasRelatedWork W2537242250 @default.
- W3146688250 hasRelatedWork W3043066357 @default.
- W3146688250 hasRelatedWork W3135754886 @default.
- W3146688250 hasRelatedWork W3183818740 @default.
- W3146688250 hasRelatedWork W3193156221 @default.
- W3146688250 hasRelatedWork W39601649 @default.
- W3146688250 isParatext "false" @default.
- W3146688250 isRetracted "false" @default.
- W3146688250 magId "3146688250" @default.
- W3146688250 workType "article" @default.