Matches in SemOpenAlex for { <https://semopenalex.org/work/W2890925057> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W2890925057 abstract "When an MPI program experiences a failure, the most common recovery approach is to restart all processes from a previous checkpoint and to re-queue the entire job. A disadvantage of this method is that, although the failure occurred within the main application loop, live processes must start again from the beginning of the program, along with new replacement processes; this incurs unnecessary overhead for live processes. To avoid such overheads and concomitant delays, we introduce the concept of MPI Stages. MPI Stages saves internal MPI state in a separate checkpoint in conjunction with application state. Upon failure, both MPI and application state are recovered, respectively, from their last synchronous checkpoints and continue without restarting the overall MPI job. Live processes roll back only a few iterations within the main loop instead of rolling back to the beginning of the program, while a replacement of failed process restarts and reintegrates, thereby achieving faster failure recovery. This approach integrates well with large-scale, bulk synchronous applications and checkpoint/restart." @default.
- W2890925057 created "2018-09-27" @default.
- W2890925057 creator A5014448094 @default.
- W2890925057 creator A5026440046 @default.
- W2890925057 creator A5033868370 @default.
- W2890925057 creator A5051517667 @default.
- W2890925057 creator A5077782526 @default.
- W2890925057 creator A5080122988 @default.
- W2890925057 date "2018-09-23" @default.
- W2890925057 modified "2023-09-24" @default.
- W2890925057 title "MPI Stages" @default.
- W2890925057 cites W1695243693 @default.
- W2890925057 cites W1929619420 @default.
- W2890925057 cites W1981432246 @default.
- W2890925057 cites W1984564341 @default.
- W2890925057 cites W2001495258 @default.
- W2890925057 cites W2017060126 @default.
- W2890925057 cites W2021234574 @default.
- W2890925057 cites W2036641664 @default.
- W2890925057 cites W2037208432 @default.
- W2890925057 cites W2045879521 @default.
- W2890925057 cites W2084293824 @default.
- W2890925057 cites W2105039796 @default.
- W2890925057 cites W2128577831 @default.
- W2890925057 cites W2130604611 @default.
- W2890925057 cites W2340533753 @default.
- W2890925057 doi "https://doi.org/10.1145/3236367.3236385" @default.
- W2890925057 hasPublicationYear "2018" @default.
- W2890925057 type Work @default.
- W2890925057 sameAs 2890925057 @default.
- W2890925057 citedByCount "7" @default.
- W2890925057 countsByYear W28909250572019 @default.
- W2890925057 countsByYear W28909250572020 @default.
- W2890925057 countsByYear W28909250572022 @default.
- W2890925057 countsByYear W28909250572023 @default.
- W2890925057 crossrefType "proceedings-article" @default.
- W2890925057 hasAuthorship W2890925057A5014448094 @default.
- W2890925057 hasAuthorship W2890925057A5026440046 @default.
- W2890925057 hasAuthorship W2890925057A5033868370 @default.
- W2890925057 hasAuthorship W2890925057A5051517667 @default.
- W2890925057 hasAuthorship W2890925057A5077782526 @default.
- W2890925057 hasAuthorship W2890925057A5080122988 @default.
- W2890925057 hasConcept C111919701 @default.
- W2890925057 hasConcept C120314980 @default.
- W2890925057 hasConcept C173608175 @default.
- W2890925057 hasConcept C199360897 @default.
- W2890925057 hasConcept C2779960059 @default.
- W2890925057 hasConcept C41008148 @default.
- W2890925057 hasConcept C48103436 @default.
- W2890925057 hasConcept C854659 @default.
- W2890925057 hasConcept C98045186 @default.
- W2890925057 hasConceptScore W2890925057C111919701 @default.
- W2890925057 hasConceptScore W2890925057C120314980 @default.
- W2890925057 hasConceptScore W2890925057C173608175 @default.
- W2890925057 hasConceptScore W2890925057C199360897 @default.
- W2890925057 hasConceptScore W2890925057C2779960059 @default.
- W2890925057 hasConceptScore W2890925057C41008148 @default.
- W2890925057 hasConceptScore W2890925057C48103436 @default.
- W2890925057 hasConceptScore W2890925057C854659 @default.
- W2890925057 hasConceptScore W2890925057C98045186 @default.
- W2890925057 hasLocation W28909250571 @default.
- W2890925057 hasOpenAccess W2890925057 @default.
- W2890925057 hasPrimaryLocation W28909250571 @default.
- W2890925057 hasRelatedWork W1513967072 @default.
- W2890925057 hasRelatedWork W1582405267 @default.
- W2890925057 hasRelatedWork W2092071486 @default.
- W2890925057 hasRelatedWork W2350626096 @default.
- W2890925057 hasRelatedWork W2354106728 @default.
- W2890925057 hasRelatedWork W2391167130 @default.
- W2890925057 hasRelatedWork W2573708829 @default.
- W2890925057 hasRelatedWork W3011151575 @default.
- W2890925057 hasRelatedWork W4283067488 @default.
- W2890925057 hasRelatedWork W2460246254 @default.
- W2890925057 isParatext "false" @default.
- W2890925057 isRetracted "false" @default.
- W2890925057 magId "2890925057" @default.
- W2890925057 workType "article" @default.
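
The abstract above contrasts restarting a job from the beginning with rolling back only to the last checkpoint inside the main loop. The following is a minimal sketch of that contrast in plain Python. It does not use MPI and is not the authors' implementation: the function `run`, its parameters, and the dictionary standing in for the combined application and MPI state are all hypothetical names chosen for illustration.

```python
import copy


def run(total_iters, checkpoint_every, fail_at, rollback_to_checkpoint):
    """Simulate a main loop with one injected failure.

    Returns (final_state, iterations_executed). All names are illustrative
    assumptions; the dict stands in for application state plus MPI state.
    """
    state = {"iter": 0, "value": 0}
    checkpoint = copy.deepcopy(state)  # last synchronous checkpoint
    executed = 0
    failed = False

    while state["iter"] < total_iters:
        # Inject a single failure just before iteration `fail_at` runs.
        if not failed and state["iter"] == fail_at:
            failed = True
            if rollback_to_checkpoint:
                # Roll back only to the last checkpoint in the main loop.
                state = copy.deepcopy(checkpoint)
            else:
                # Conventional recovery: start again from the beginning.
                state = {"iter": 0, "value": 0}
            continue

        # One unit of "work" in the main loop.
        state["value"] += state["iter"]
        state["iter"] += 1
        executed += 1

        # Save a checkpoint at a fixed interval.
        if state["iter"] % checkpoint_every == 0:
            checkpoint = copy.deepcopy(state)

    return state, executed


if __name__ == "__main__":
    restart_state, restart_work = run(20, 5, 13, rollback_to_checkpoint=False)
    rollback_state, rollback_work = run(20, 5, 13, rollback_to_checkpoint=True)
    print("restart from beginning:", restart_work, "iterations")
    print("rollback to checkpoint:", rollback_work, "iterations")
```

With 20 iterations, a checkpoint every 5, and a failure before iteration 13, the restart path executes 33 iterations while the rollback path executes 23, and both reach the same final state. The sketch only models the amount of repeated work; it says nothing about how a replacement process is reintegrated.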