Matches in SemOpenAlex for { <https://semopenalex.org/work/W201315547> ?p ?o ?g. }
- W201315547 endingPage "109" @default.
- W201315547 startingPage "90" @default.
- W201315547 abstract "Extra memory allows parallel matrix multiplication to be done with asymptotically less communication than Cannon’s algorithm and be faster in practice. “3D” algorithms arrange the $p$ processors in a 3D array, and store redundant copies of the matrices on each of $p^{1/3}$ layers. “2D” algorithms such as Cannon’s algorithm store a single copy of the matrices on a 2D array of processors. We generalize these 2D and 3D algorithms by introducing a new class of “2.5D algorithms”. For matrix multiplication, we can take advantage of any amount of extra memory to store $c$ copies of the data, for any $c \in \{1, 2, \ldots, \lfloor p^{1/3} \rfloor\}$, to reduce the bandwidth cost of Cannon’s algorithm by a factor of $c^{1/2}$ and the latency cost by a factor of $c^{3/2}$. We also show that these costs reach the lower bounds, modulo polylog($p$) factors. We introduce a novel algorithm for 2.5D LU decomposition. To the best of our knowledge, this LU algorithm is the first to minimize communication along the critical path of execution in the 3D case. Our 2.5D LU algorithm uses communication-avoiding pivoting, a stable alternative to partial pivoting. We prove a novel lower bound on the latency cost of 2.5D and 3D LU factorization, showing that while $c$ copies of the data can also reduce the bandwidth by a factor of $c^{1/2}$, the latency must increase by a factor of $c^{1/2}$, so that the 2D LU algorithm ($c = 1$) in fact minimizes latency. We provide implementations and performance results for 2D and 2.5D versions of all the new algorithms. Our results demonstrate that 2.5D matrix multiplication and LU algorithms strongly scale more efficiently than 2D algorithms. Each of our 2.5D algorithms performs over 2X faster than the corresponding 2D algorithm for certain problem sizes on 65,536 cores of a BG/P supercomputer." @default.
- W201315547 created "2016-06-24" @default.
- W201315547 creator A5022838370 @default.
- W201315547 creator A5076825233 @default.
- W201315547 date "2011-01-01" @default.
- W201315547 modified "2023-10-18" @default.
- W201315547 title "Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms" @default.
- W201315547 cites W1575067475 @default.
- W201315547 cites W1980670496 @default.
- W201315547 cites W1997882689 @default.
- W201315547 cites W2010747199 @default.
- W201315547 cites W2012652661 @default.
- W201315547 cites W201315547 @default.
- W201315547 cites W2029342163 @default.
- W201315547 cites W2056999868 @default.
- W201315547 cites W2065231486 @default.
- W201315547 cites W2072910106 @default.
- W201315547 cites W2087674115 @default.
- W201315547 cites W3099472489 @default.
- W201315547 cites W4210861204 @default.
- W201315547 cites W4231150350 @default.
- W201315547 doi "https://doi.org/10.1007/978-3-642-23397-5_10" @default.
- W201315547 hasPublicationYear "2011" @default.
- W201315547 type Work @default.
- W201315547 sameAs 201315547 @default.
- W201315547 citedByCount "154" @default.
- W201315547 countsByYear W2013155472012 @default.
- W201315547 countsByYear W2013155472013 @default.
- W201315547 countsByYear W2013155472014 @default.
- W201315547 countsByYear W2013155472015 @default.
- W201315547 countsByYear W2013155472016 @default.
- W201315547 countsByYear W2013155472017 @default.
- W201315547 countsByYear W2013155472018 @default.
- W201315547 countsByYear W2013155472019 @default.
- W201315547 countsByYear W2013155472020 @default.
- W201315547 countsByYear W2013155472021 @default.
- W201315547 countsByYear W2013155472022 @default.
- W201315547 countsByYear W2013155472023 @default.
- W201315547 crossrefType "book-chapter" @default.
- W201315547 hasAuthorship W201315547A5022838370 @default.
- W201315547 hasAuthorship W201315547A5076825233 @default.
- W201315547 hasBestOaLocation W2013155471 @default.
- W201315547 hasConcept C106487976 @default.
- W201315547 hasConcept C11413529 @default.
- W201315547 hasConcept C114614502 @default.
- W201315547 hasConcept C121332964 @default.
- W201315547 hasConcept C123213974 @default.
- W201315547 hasConcept C158693339 @default.
- W201315547 hasConcept C159985019 @default.
- W201315547 hasConcept C17349429 @default.
- W201315547 hasConcept C173608175 @default.
- W201315547 hasConcept C187834632 @default.
- W201315547 hasConcept C192562407 @default.
- W201315547 hasConcept C199360897 @default.
- W201315547 hasConcept C201290732 @default.
- W201315547 hasConcept C2780595030 @default.
- W201315547 hasConcept C2781039887 @default.
- W201315547 hasConcept C33923547 @default.
- W201315547 hasConcept C41008148 @default.
- W201315547 hasConcept C42355184 @default.
- W201315547 hasConcept C48372109 @default.
- W201315547 hasConcept C62520636 @default.
- W201315547 hasConcept C76155785 @default.
- W201315547 hasConcept C82876162 @default.
- W201315547 hasConcept C84114770 @default.
- W201315547 hasConcept C94375191 @default.
- W201315547 hasConceptScore W201315547C106487976 @default.
- W201315547 hasConceptScore W201315547C11413529 @default.
- W201315547 hasConceptScore W201315547C114614502 @default.
- W201315547 hasConceptScore W201315547C121332964 @default.
- W201315547 hasConceptScore W201315547C123213974 @default.
- W201315547 hasConceptScore W201315547C158693339 @default.
- W201315547 hasConceptScore W201315547C159985019 @default.
- W201315547 hasConceptScore W201315547C17349429 @default.
- W201315547 hasConceptScore W201315547C173608175 @default.
- W201315547 hasConceptScore W201315547C187834632 @default.
- W201315547 hasConceptScore W201315547C192562407 @default.
- W201315547 hasConceptScore W201315547C199360897 @default.
- W201315547 hasConceptScore W201315547C201290732 @default.
- W201315547 hasConceptScore W201315547C2780595030 @default.
- W201315547 hasConceptScore W201315547C2781039887 @default.
- W201315547 hasConceptScore W201315547C33923547 @default.
- W201315547 hasConceptScore W201315547C41008148 @default.
- W201315547 hasConceptScore W201315547C42355184 @default.
- W201315547 hasConceptScore W201315547C48372109 @default.
- W201315547 hasConceptScore W201315547C62520636 @default.
- W201315547 hasConceptScore W201315547C76155785 @default.
- W201315547 hasConceptScore W201315547C82876162 @default.
- W201315547 hasConceptScore W201315547C84114770 @default.
- W201315547 hasConceptScore W201315547C94375191 @default.
- W201315547 hasLocation W2013155471 @default.
- W201315547 hasLocation W2013155472 @default.
- W201315547 hasOpenAccess W201315547 @default.
- W201315547 hasPrimaryLocation W2013155471 @default.
- W201315547 hasRelatedWork W1948183148 @default.
- W201315547 hasRelatedWork W1976954664 @default.
- W201315547 hasRelatedWork W2041251310 @default.
- W201315547 hasRelatedWork W2041380856 @default.