Matches in SemOpenAlex for { <https://semopenalex.org/work/W1989653583> ?p ?o ?g. }
- W1989653583 abstract "The increasing size and complexity of massively parallel systems (e.g. HPC systems) is making it increasingly likely that individual circuits will produce erroneous results. For this reason, novel fault tolerance approaches are increasingly needed. Prior fault tolerance approaches often rely on checkpoint-rollback based schemes. Unfortunately, such schemes are primarily limited to rare error event scenarios as the overheads of such schemes become prohibitive if faults are common. In this paper, we propose a novel approach for algorithmic correction of faulty application outputs. The key insight for this approach is that even under high error scenarios, even if the result of an algorithm is erroneous, most of it is correct. Instead of simply rolling back to the most recent checkpoint and repeating the entire segment of computation, our novel resilience approach uses algorithmic error localization and partial recomputation to efficiently correct the corrupted results. We evaluate our approach in the specific algorithmic scenario of linear algebra operations, focusing on matrix-vector multiplication (MVM) and iterative linear solvers. We develop a novel technique for localizing errors in MVM and show how to achieve partial recomputation within this algorithm, and demonstrate that this approach both improves the performance of the Conjugate Gradient solver in high error scenarios by 3x-4x and increases the probability that it completes successfully by up to 60% with parallel experiments up to 100 nodes." @default.
- W1989653583 created "2016-06-24" @default.
- W1989653583 creator A5062929833 @default.
- W1989653583 creator A5067547869 @default.
- W1989653583 creator A5084379116 @default.
- W1989653583 date "2013-06-01" @default.
- W1989653583 modified "2023-09-29" @default.
- W1989653583 title "An algorithmic approach to error localization and partial recomputation for low-overhead fault tolerance" @default.
- W1989653583 cites W1592673641 @default.
- W1989653583 cites W1768849904 @default.
- W1989653583 cites W1974635992 @default.
- W1989653583 cites W1984564341 @default.
- W1989653583 cites W2004675401 @default.
- W1989653583 cites W2037523067 @default.
- W1989653583 cites W2083613288 @default.
- W1989653583 cites W2118582701 @default.
- W1989653583 cites W2123475473 @default.
- W1989653583 cites W2123728588 @default.
- W1989653583 cites W2134864637 @default.
- W1989653583 cites W2142843905 @default.
- W1989653583 cites W2147078347 @default.
- W1989653583 cites W2150679822 @default.
- W1989653583 cites W2159161022 @default.
- W1989653583 cites W2169596872 @default.
- W1989653583 cites W3022806736 @default.
- W1989653583 cites W3145311561 @default.
- W1989653583 doi "https://doi.org/10.1109/dsn.2013.6575309" @default.
- W1989653583 hasPublicationYear "2013" @default.
- W1989653583 type Work @default.
- W1989653583 sameAs 1989653583 @default.
- W1989653583 citedByCount "40" @default.
- W1989653583 countsByYear W19896535832013 @default.
- W1989653583 countsByYear W19896535832014 @default.
- W1989653583 countsByYear W19896535832015 @default.
- W1989653583 countsByYear W19896535832016 @default.
- W1989653583 countsByYear W19896535832017 @default.
- W1989653583 countsByYear W19896535832018 @default.
- W1989653583 countsByYear W19896535832019 @default.
- W1989653583 countsByYear W19896535832020 @default.
- W1989653583 countsByYear W19896535832021 @default.
- W1989653583 countsByYear W19896535832022 @default.
- W1989653583 countsByYear W19896535832023 @default.
- W1989653583 crossrefType "proceedings-article" @default.
- W1989653583 hasAuthorship W1989653583A5062929833 @default.
- W1989653583 hasAuthorship W1989653583A5067547869 @default.
- W1989653583 hasAuthorship W1989653583A5084379116 @default.
- W1989653583 hasConcept C103088060 @default.
- W1989653583 hasConcept C111919701 @default.
- W1989653583 hasConcept C113775141 @default.
- W1989653583 hasConcept C11413529 @default.
- W1989653583 hasConcept C120314980 @default.
- W1989653583 hasConcept C121332964 @default.
- W1989653583 hasConcept C17349429 @default.
- W1989653583 hasConcept C173608175 @default.
- W1989653583 hasConcept C174220543 @default.
- W1989653583 hasConcept C190475519 @default.
- W1989653583 hasConcept C199360897 @default.
- W1989653583 hasConcept C2777187653 @default.
- W1989653583 hasConcept C2779585090 @default.
- W1989653583 hasConcept C2779960059 @default.
- W1989653583 hasConcept C41008148 @default.
- W1989653583 hasConcept C4822641 @default.
- W1989653583 hasConcept C62520636 @default.
- W1989653583 hasConcept C63540848 @default.
- W1989653583 hasConcept C75949130 @default.
- W1989653583 hasConcept C80444323 @default.
- W1989653583 hasConcept C84114770 @default.
- W1989653583 hasConcept C97355855 @default.
- W1989653583 hasConceptScore W1989653583C103088060 @default.
- W1989653583 hasConceptScore W1989653583C111919701 @default.
- W1989653583 hasConceptScore W1989653583C113775141 @default.
- W1989653583 hasConceptScore W1989653583C11413529 @default.
- W1989653583 hasConceptScore W1989653583C120314980 @default.
- W1989653583 hasConceptScore W1989653583C121332964 @default.
- W1989653583 hasConceptScore W1989653583C17349429 @default.
- W1989653583 hasConceptScore W1989653583C173608175 @default.
- W1989653583 hasConceptScore W1989653583C174220543 @default.
- W1989653583 hasConceptScore W1989653583C190475519 @default.
- W1989653583 hasConceptScore W1989653583C199360897 @default.
- W1989653583 hasConceptScore W1989653583C2777187653 @default.
- W1989653583 hasConceptScore W1989653583C2779585090 @default.
- W1989653583 hasConceptScore W1989653583C2779960059 @default.
- W1989653583 hasConceptScore W1989653583C41008148 @default.
- W1989653583 hasConceptScore W1989653583C4822641 @default.
- W1989653583 hasConceptScore W1989653583C62520636 @default.
- W1989653583 hasConceptScore W1989653583C63540848 @default.
- W1989653583 hasConceptScore W1989653583C75949130 @default.
- W1989653583 hasConceptScore W1989653583C80444323 @default.
- W1989653583 hasConceptScore W1989653583C84114770 @default.
- W1989653583 hasConceptScore W1989653583C97355855 @default.
- W1989653583 hasLocation W19896535831 @default.
- W1989653583 hasOpenAccess W1989653583 @default.
- W1989653583 hasPrimaryLocation W19896535831 @default.
- W1989653583 hasRelatedWork W119040397 @default.
- W1989653583 hasRelatedWork W1513155443 @default.
- W1989653583 hasRelatedWork W1543798151 @default.
- W1989653583 hasRelatedWork W1580237945 @default.
- W1989653583 hasRelatedWork W1585772192 @default.
- W1989653583 hasRelatedWork W1989653583 @default.
- W1989653583 hasRelatedWork W2097946733 @default.
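
The abstract above describes localizing errors in a matrix-vector multiplication (MVM) and recomputing only the corrupted part instead of rolling back. A minimal sketch of that general idea follows, using a generic block-checksum scheme. This is not the paper's actual localization technique, which the abstract does not spell out. All function names, the block size, and the tolerance are hypothetical choices for illustration, and the checksums themselves are assumed to be fault-free.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]
Vector = List[float]


def matvec_rows(A: Matrix, x: Vector, rows: range) -> Vector:
    """Compute the entries of A @ x for the given row indices only."""
    return [sum(a * b for a, b in zip(A[i], x)) for i in rows]


def block_checksums(A: Matrix, block_size: int) -> List[Vector]:
    """For each block of rows, precompute the column-wise sum of that block."""
    n_cols = len(A[0])
    sums = []
    for start in range(0, len(A), block_size):
        block = A[start:start + block_size]
        sums.append([sum(row[j] for row in block) for j in range(n_cols)])
    return sums


def locate_faulty_blocks(checksums: List[Vector], x: Vector, y: Vector,
                         block_size: int, tol: float = 1e-9) -> List[int]:
    """Return indices of row blocks whose output sum disagrees with checksum . x."""
    faulty = []
    for b, c in enumerate(checksums):
        start = b * block_size
        expected = sum(cj * xj for cj, xj in zip(c, x))
        observed = sum(y[start:start + block_size])
        if abs(expected - observed) > tol * max(1.0, abs(expected)):
            faulty.append(b)
    return faulty


def correct_in_place(A: Matrix, x: Vector, y: Vector, checksums: List[Vector],
                     block_size: int, tol: float = 1e-9) -> List[int]:
    """Recompute only the blocks flagged as faulty; return their indices."""
    faulty = locate_faulty_blocks(checksums, x, y, block_size, tol)
    for b in faulty:
        start = b * block_size
        rows = range(start, min(start + block_size, len(A)))
        y[start:start + block_size] = matvec_rows(A, x, rows)
    return faulty


# Usage: compute y = A x, inject one fault, then localize and repair it.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0],
     [1.0, 0.0, 1.0], [0.0, 2.0, 0.0], [3.0, 1.0, 2.0]]
x = [1.0, 2.0, 3.0]
checksums = block_checksums(A, block_size=2)

y_ref = matvec_rows(A, x, range(len(A)))
y = list(y_ref)
y[4] += 5.0  # simulated soft error in one output entry

faulty = correct_in_place(A, x, y, checksums, block_size=2)
print("recomputed blocks:", faulty)
```

In this sketch only the rows of a flagged block are recomputed, so the repair cost scales with the number of corrupted blocks rather than with the full MVM. Errors within one block that cancel in the sum would go undetected; a real scheme would need additional or weighted checksums to cover that case.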