Matches in SemOpenAlex for { <https://semopenalex.org/work/W2607600913> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W2607600913 abstract "Author(s): Wu, Panruo | Advisor(s): Chen, Zizhong | Abstract: The lack of efficient resilience solutions is expected to be a major problem for the coming exascale supercomputers, as the chance that a long-running, large-scale computation can finish without faults is diminishing quickly. In this dissertation I develop algorithmic techniques that provide fault tolerance for the commonly used matrix factorization algorithms and their high-performance implementations on distributed-memory massively parallel systems, with very low overhead and high scalability. Specifically, I design numerical error-correcting encodings of matrices and the corresponding algorithms to tolerate hardware faults during matrix factorizations. The approach has much in common with the error-correcting codes (ECC) widely used in communication and storage systems, which use codes to detect and correct errors that occurred during communication or at rest in storage cells. The salient difference is that while ECC protects invariable data, I need an ECC for a variable matrix that is under factorization. My previous and current work covers the design of such algorithmic fault tolerance techniques for the six most widely used matrix factorizations (LU, QR, Cholesky, SVD, Hessenberg reduction, and tridiagonal reduction), which comprise the core functionality of the de facto dense linear algebra package ScaLAPACK (Scalable Linear Algebra PACKage). The novel approach I use extensively is on-line ABFT, which not only designs the numerical codes but also modifies the algorithm to maintain the checksum in flight. For LU/QR/Cholesky factorizations, the on-line transformation results in vastly improved fault tolerance at a small extra cost. For SVD/Hessenberg/tridiagonal factorizations, where no ABFT existed, the on-line ABFT fills this void and produces similarly highly scalable, resilient, and efficient algorithms and implementations." @default.
- W2607600913 created "2017-05-05" @default.
- W2607600913 creator A5020822198 @default.
- W2607600913 date "2016-01-01" @default.
- W2607600913 modified "2023-09-27" @default.
- W2607600913 title "Silent Data Corruption Resilient Matrix Factorizations on Distributed Memory System" @default.
- W2607600913 hasPublicationYear "2016" @default.
- W2607600913 type Work @default.
- W2607600913 sameAs 2607600913 @default.
- W2607600913 citedByCount "0" @default.
- W2607600913 crossrefType "journal-article" @default.
- W2607600913 hasAuthorship W2607600913A5020822198 @default.
- W2607600913 hasConcept C106487976 @default.
- W2607600913 hasConcept C111335779 @default.
- W2607600913 hasConcept C11413529 @default.
- W2607600913 hasConcept C120314980 @default.
- W2607600913 hasConcept C121332964 @default.
- W2607600913 hasConcept C123213974 @default.
- W2607600913 hasConcept C139352143 @default.
- W2607600913 hasConcept C158693339 @default.
- W2607600913 hasConcept C159985019 @default.
- W2607600913 hasConcept C173608175 @default.
- W2607600913 hasConcept C188060507 @default.
- W2607600913 hasConcept C192562407 @default.
- W2607600913 hasConcept C2524010 @default.
- W2607600913 hasConcept C33923547 @default.
- W2607600913 hasConcept C34727166 @default.
- W2607600913 hasConcept C41008148 @default.
- W2607600913 hasConcept C42355184 @default.
- W2607600913 hasConcept C48044578 @default.
- W2607600913 hasConcept C62520636 @default.
- W2607600913 hasConcept C63540848 @default.
- W2607600913 hasConcept C77088390 @default.
- W2607600913 hasConceptScore W2607600913C106487976 @default.
- W2607600913 hasConceptScore W2607600913C111335779 @default.
- W2607600913 hasConceptScore W2607600913C11413529 @default.
- W2607600913 hasConceptScore W2607600913C120314980 @default.
- W2607600913 hasConceptScore W2607600913C121332964 @default.
- W2607600913 hasConceptScore W2607600913C123213974 @default.
- W2607600913 hasConceptScore W2607600913C139352143 @default.
- W2607600913 hasConceptScore W2607600913C158693339 @default.
- W2607600913 hasConceptScore W2607600913C159985019 @default.
- W2607600913 hasConceptScore W2607600913C173608175 @default.
- W2607600913 hasConceptScore W2607600913C188060507 @default.
- W2607600913 hasConceptScore W2607600913C192562407 @default.
- W2607600913 hasConceptScore W2607600913C2524010 @default.
- W2607600913 hasConceptScore W2607600913C33923547 @default.
- W2607600913 hasConceptScore W2607600913C34727166 @default.
- W2607600913 hasConceptScore W2607600913C41008148 @default.
- W2607600913 hasConceptScore W2607600913C42355184 @default.
- W2607600913 hasConceptScore W2607600913C48044578 @default.
- W2607600913 hasConceptScore W2607600913C62520636 @default.
- W2607600913 hasConceptScore W2607600913C63540848 @default.
- W2607600913 hasConceptScore W2607600913C77088390 @default.
- W2607600913 hasLocation W26076009131 @default.
- W2607600913 hasOpenAccess W2607600913 @default.
- W2607600913 hasPrimaryLocation W26076009131 @default.
- W2607600913 hasRelatedWork W2040464566 @default.
- W2607600913 hasRelatedWork W2123291824 @default.
- W2607600913 hasRelatedWork W2210161345 @default.
- W2607600913 hasRelatedWork W232494833 @default.
- W2607600913 hasRelatedWork W2405076432 @default.
- W2607600913 hasRelatedWork W2478930528 @default.
- W2607600913 hasRelatedWork W2580243656 @default.
- W2607600913 hasRelatedWork W2613275038 @default.
- W2607600913 hasRelatedWork W2729559881 @default.
- W2607600913 hasRelatedWork W2786012290 @default.
- W2607600913 hasRelatedWork W2904573237 @default.
- W2607600913 hasRelatedWork W2957455719 @default.
- W2607600913 hasRelatedWork W2990529725 @default.
- W2607600913 hasRelatedWork W3014000538 @default.
- W2607600913 hasRelatedWork W3032910758 @default.
- W2607600913 hasRelatedWork W3037775421 @default.
- W2607600913 hasRelatedWork W3046244749 @default.
- W2607600913 hasRelatedWork W312022350 @default.
- W2607600913 hasRelatedWork W3156035383 @default.
- W2607600913 hasRelatedWork W3195052161 @default.
- W2607600913 isParatext "false" @default.
- W2607600913 isRetracted "false" @default.
- W2607600913 magId "2607600913" @default.
- W2607600913 workType "article" @default.
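
The abstract above mentions maintaining a checksum "in flight" during a factorization. As a rough illustration of that general idea only (not the dissertation's actual algorithms or its ScaLAPACK implementation), the sketch below appends a row-sum checksum column to a small matrix and runs Gaussian elimination without pivoting on the encoded matrix. Because each row operation is linear, the checksum column stays consistent with the data unless an entry is corrupted. The names `encode`, `bad_rows`, `eliminate`, and the `fault` parameter are hypothetical, nonzero pivots are assumed, and exact `Fraction` arithmetic is used to avoid the rounding tolerances a floating-point version would need.

```python
from fractions import Fraction


def encode(a):
    """Append a checksum column holding each row's sum (hypothetical helper)."""
    return [[Fraction(x) for x in row] + [Fraction(sum(row))] for row in a]


def bad_rows(m):
    """Return indices of rows whose checksum no longer matches their data."""
    return [i for i, row in enumerate(m) if sum(row[:-1]) != row[-1]]


def eliminate(m, fault=None):
    """Gaussian elimination without pivoting on a checksum-encoded matrix.

    Assumes every pivot is nonzero. `fault` is an optional
    (step, row, col, delta) tuple that corrupts one entry after the given
    elimination step, to mimic a silent data corruption.

    Returns (matrix, hit): hit is None if every check passed, otherwise
    (step, bad_row_indices) for the first failed check.
    """
    m = [row[:] for row in m]
    n = len(m)
    for k in range(n - 1):
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            # The update also covers the checksum column, so the invariant
            # "last entry == sum of the other entries" survives the row op.
            m[i] = [x - factor * y for x, y in zip(m[i], m[k])]
        if fault is not None and fault[0] == k:
            _, r, c, delta = fault
            m[r][c] += delta  # injected corruption, checksum left untouched
        rows = bad_rows(m)
        if rows:
            return m, (k, rows)  # stop at the first detected inconsistency
    return m, None


a = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
u, hit = eliminate(encode(a))
print(hit)                                    # no fault injected
_, hit = eliminate(encode(a), fault=(0, 2, 2, 1))
print(hit)                                    # fault injected after step 0
```

Checking after every elimination step, rather than once at the end, is what makes the scheme "on-line" in this toy setting: a corrupted row is flagged before later steps can spread the error. A single checksum column only detects and localizes the bad row; correcting the entry would need additional checksums, which this sketch does not attempt.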