Matches in SemOpenAlex for { <https://semopenalex.org/work/W2898046929> ?p ?o ?g. }
Showing items 1 to 91 of 91 with 100 items per page.
- W2898046929 endingPage "5230" @default.
- W2898046929 startingPage "5220" @default.
- W2898046929 abstract "Machine learning (ML) training algorithms often possess an inherent self-correcting behavior due to their iterative-convergent nature. Recent systems exploit this property to achieve adaptability and efficiency in unreliable computing environments by relaxing the consistency of execution and allowing calculation errors to be self-corrected during training. However, the behavior of such systems is only well understood for specific types of calculation errors, such as those caused by staleness, reduced precision, or asynchronicity, and for specific types of training algorithms, such as stochastic gradient descent. In this paper, we develop a general framework to quantify the effects of calculation errors on iterative-convergent algorithms and use this framework to design new strategies for checkpoint-based fault tolerance. Our framework yields a worst-case upper bound on the iteration cost of arbitrary perturbations to model parameters during training. Our system, SCAR, employs strategies which reduce the iteration cost upper bound due to perturbations incurred when recovering from checkpoints. We show that SCAR can reduce the iteration cost of partial failures by 78% - 95% when compared with traditional checkpoint-based fault tolerance across a variety of ML models and training algorithms." @default.
- W2898046929 created "2018-10-26" @default.
- W2898046929 creator A5009547049 @default.
- W2898046929 creator A5010626339 @default.
- W2898046929 creator A5047970850 @default.
- W2898046929 creator A5073820927 @default.
- W2898046929 date "2019-05-24" @default.
- W2898046929 modified "2023-10-05" @default.
- W2898046929 title "Fault Tolerance in Iterative-Convergent Machine Learning" @default.
- W2898046929 hasPublicationYear "2019" @default.
- W2898046929 type Work @default.
- W2898046929 sameAs 2898046929 @default.
- W2898046929 citedByCount "6" @default.
- W2898046929 countsByYear W28980469292020 @default.
- W2898046929 countsByYear W28980469292021 @default.
- W2898046929 crossrefType "proceedings-article" @default.
- W2898046929 hasAuthorship W2898046929A5009547049 @default.
- W2898046929 hasAuthorship W2898046929A5010626339 @default.
- W2898046929 hasAuthorship W2898046929A5047970850 @default.
- W2898046929 hasAuthorship W2898046929A5073820927 @default.
- W2898046929 hasConcept C111472728 @default.
- W2898046929 hasConcept C11413529 @default.
- W2898046929 hasConcept C120314980 @default.
- W2898046929 hasConcept C134306372 @default.
- W2898046929 hasConcept C138885662 @default.
- W2898046929 hasConcept C154945302 @default.
- W2898046929 hasConcept C157553263 @default.
- W2898046929 hasConcept C159694833 @default.
- W2898046929 hasConcept C165696696 @default.
- W2898046929 hasConcept C177606310 @default.
- W2898046929 hasConcept C18903297 @default.
- W2898046929 hasConcept C189950617 @default.
- W2898046929 hasConcept C206688291 @default.
- W2898046929 hasConcept C2776436953 @default.
- W2898046929 hasConcept C33923547 @default.
- W2898046929 hasConcept C38652104 @default.
- W2898046929 hasConcept C41008148 @default.
- W2898046929 hasConcept C50644808 @default.
- W2898046929 hasConcept C63540848 @default.
- W2898046929 hasConcept C77553402 @default.
- W2898046929 hasConcept C86803240 @default.
- W2898046929 hasConceptScore W2898046929C111472728 @default.
- W2898046929 hasConceptScore W2898046929C11413529 @default.
- W2898046929 hasConceptScore W2898046929C120314980 @default.
- W2898046929 hasConceptScore W2898046929C134306372 @default.
- W2898046929 hasConceptScore W2898046929C138885662 @default.
- W2898046929 hasConceptScore W2898046929C154945302 @default.
- W2898046929 hasConceptScore W2898046929C157553263 @default.
- W2898046929 hasConceptScore W2898046929C159694833 @default.
- W2898046929 hasConceptScore W2898046929C165696696 @default.
- W2898046929 hasConceptScore W2898046929C177606310 @default.
- W2898046929 hasConceptScore W2898046929C18903297 @default.
- W2898046929 hasConceptScore W2898046929C189950617 @default.
- W2898046929 hasConceptScore W2898046929C206688291 @default.
- W2898046929 hasConceptScore W2898046929C2776436953 @default.
- W2898046929 hasConceptScore W2898046929C33923547 @default.
- W2898046929 hasConceptScore W2898046929C38652104 @default.
- W2898046929 hasConceptScore W2898046929C41008148 @default.
- W2898046929 hasConceptScore W2898046929C50644808 @default.
- W2898046929 hasConceptScore W2898046929C63540848 @default.
- W2898046929 hasConceptScore W2898046929C77553402 @default.
- W2898046929 hasConceptScore W2898046929C86803240 @default.
- W2898046929 hasLocation W28980469291 @default.
- W2898046929 hasOpenAccess W2898046929 @default.
- W2898046929 hasPrimaryLocation W28980469291 @default.
- W2898046929 hasRelatedWork W1442374986 @default.
- W2898046929 hasRelatedWork W1485800019 @default.
- W2898046929 hasRelatedWork W1557623003 @default.
- W2898046929 hasRelatedWork W186881209 @default.
- W2898046929 hasRelatedWork W1993754891 @default.
- W2898046929 hasRelatedWork W2083842231 @default.
- W2898046929 hasRelatedWork W2574401261 @default.
- W2898046929 hasRelatedWork W2621365674 @default.
- W2898046929 hasRelatedWork W2739720758 @default.
- W2898046929 hasRelatedWork W2946208315 @default.
- W2898046929 hasRelatedWork W2974142167 @default.
- W2898046929 hasRelatedWork W2979999556 @default.
- W2898046929 hasRelatedWork W3140223115 @default.
- W2898046929 hasRelatedWork W3162026808 @default.
- W2898046929 hasRelatedWork W3170064614 @default.
- W2898046929 hasRelatedWork W3171288285 @default.
- W2898046929 hasRelatedWork W3192938859 @default.
- W2898046929 hasRelatedWork W3202086817 @default.
- W2898046929 hasRelatedWork W83651405 @default.
- W2898046929 hasRelatedWork W2182305664 @default.
- W2898046929 isParatext "false" @default.
- W2898046929 isRetracted "false" @default.
- W2898046929 magId "2898046929" @default.
- W2898046929 workType "article" @default.
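
Each item above binds the ?p, ?o and ?g variables of the query pattern for the fixed subject W2898046929. As a minimal sketch, the items could be split back into (subject, predicate, object, graph) tuples as shown below. The line shape "- subject predicate object @graph." is an assumption inferred from this listing only, and the names `LINE` and `parse_item` are hypothetical helpers, not part of any SemOpenAlex tooling.

```python
import re

# Assumed item shape, inferred from the listing above:
#   "- <subject> <predicate> <object> @<graph>."
LINE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')

def parse_item(line):
    """Split one listing line into (subject, predicate, object, graph)."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    subject, predicate, obj, graph = m.groups()
    # Quoted objects are literals; strip the surrounding quotes.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph

sample = [
    '- W2898046929 startingPage "5220" @default.',
    '- W2898046929 type Work @default.',
]
for row in map(parse_item, sample):
    print(row)
```

The sketch keeps literals and identifiers as plain strings; it does not expand the abbreviated names into full IRIs, since the listing does not show the prefixes in use.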