Matches in SemOpenAlex for { <https://semopenalex.org/work/W2115020004> ?p ?o ?g. }
- W2115020004 abstract "Scientists across a wide range of domains increasingly rely on computer simulation for their investigations. Such simulations often spend a majority of their run-times solving large systems of linear equations that require vast amounts of computational power and memory. It is hence critical to design solvers in a highly efficient and scalable manner. Hypre is a high performance, scalable software library that offers several optimized linear solver routines and pre-conditioners. In this paper, we study the characteristics of Hypre's Preconditioned Conjugate Gradient (PCG) solver algorithm. The PCG routine is known to spend a majority of its communication time in the MPI_Allreduce operation to compute a global summation during the inner product operation. The MPI_Allreduce is a blocking operation, whose latency is often a limiting factor to the overall efficiency of the PCG solver routine, and correspondingly the performance of simulations that rely on this solver. Hence, hiding the latency of the MPI_Allreduce operation is critical towards scaling the PCG solver routine and improving the performance of many simulations. The upcoming revision of MPI, MPI-3, will provide support for non-blocking collective communication to enable latency-hiding. The latest InfiniBand adapter from Mellanox, ConnectX-2, enables offloading of generalized lists of communication operations to the network interface. Such an interface can be leveraged to design non-blocking collective operations. In this paper, we design fully functional, scalable algorithms for the MPI_Iallreduce operation, based on the network offload technology. To the best of our knowledge, these network offload-based algorithms are the first to be presented for the MPI_Iallreduce operation. Our designs scale beyond 512 processes and we achieve near perfect communication/computation overlap.
We also re-design the PCG solver routine to leverage our proposed MPI_Iallreduce operation to hide the latency of the global reduction operations. We observe up to 21% improvements in the run-times of the PCG routine, when compared to the default PCG implementation in Hypre. We also note that about 16% of the overall benefits are due to overlapping the Allreduce operations." @default.
- W2115020004 created "2016-06-24" @default.
- W2115020004 creator A5010923189 @default.
- W2115020004 creator A5024879682 @default.
- W2115020004 creator A5034293705 @default.
- W2115020004 creator A5040711735 @default.
- W2115020004 creator A5058719424 @default.
- W2115020004 creator A5060821170 @default.
- W2115020004 creator A5067370556 @default.
- W2115020004 creator A5068423553 @default.
- W2115020004 creator A5078989266 @default.
- W2115020004 creator A5091761334 @default.
- W2115020004 date "2012-05-01" @default.
- W2115020004 modified "2023-09-26" @default.
- W2115020004 title "Designing Non-blocking Allreduce with Collective Offload on InfiniBand Clusters: A Case Study with Conjugate Gradient Solvers" @default.
- W2115020004 cites W1530542361 @default.
- W2115020004 cites W1569680480 @default.
- W2115020004 cites W2062011830 @default.
- W2115020004 cites W2062764889 @default.
- W2115020004 cites W2068975988 @default.
- W2115020004 cites W2082007415 @default.
- W2115020004 cites W2091780466 @default.
- W2115020004 cites W2099020156 @default.
- W2115020004 cites W2120827897 @default.
- W2115020004 cites W2127089909 @default.
- W2115020004 cites W2140973798 @default.
- W2115020004 cites W2143539682 @default.
- W2115020004 cites W2151445386 @default.
- W2115020004 cites W2152994038 @default.
- W2115020004 cites W2165102815 @default.
- W2115020004 cites W2167520203 @default.
- W2115020004 cites W2316564661 @default.
- W2115020004 cites W4301491118 @default.
- W2115020004 cites W73153831 @default.
- W2115020004 doi "https://doi.org/10.1109/ipdps.2012.106" @default.
- W2115020004 hasPublicationYear "2012" @default.
- W2115020004 type Work @default.
- W2115020004 sameAs 2115020004 @default.
- W2115020004 citedByCount "21" @default.
- W2115020004 countsByYear W21150200042012 @default.
- W2115020004 countsByYear W21150200042013 @default.
- W2115020004 countsByYear W21150200042014 @default.
- W2115020004 countsByYear W21150200042015 @default.
- W2115020004 countsByYear W21150200042016 @default.
- W2115020004 countsByYear W21150200042017 @default.
- W2115020004 countsByYear W21150200042019 @default.
- W2115020004 countsByYear W21150200042020 @default.
- W2115020004 countsByYear W21150200042021 @default.
- W2115020004 crossrefType "proceedings-article" @default.
- W2115020004 hasAuthorship W2115020004A5010923189 @default.
- W2115020004 hasAuthorship W2115020004A5024879682 @default.
- W2115020004 hasAuthorship W2115020004A5034293705 @default.
- W2115020004 hasAuthorship W2115020004A5040711735 @default.
- W2115020004 hasAuthorship W2115020004A5058719424 @default.
- W2115020004 hasAuthorship W2115020004A5060821170 @default.
- W2115020004 hasAuthorship W2115020004A5067370556 @default.
- W2115020004 hasAuthorship W2115020004A5068423553 @default.
- W2115020004 hasAuthorship W2115020004A5078989266 @default.
- W2115020004 hasAuthorship W2115020004A5091761334 @default.
- W2115020004 hasBestOaLocation W21150200042 @default.
- W2115020004 hasConcept C111919701 @default.
- W2115020004 hasConcept C11413529 @default.
- W2115020004 hasConcept C120314980 @default.
- W2115020004 hasConcept C144745244 @default.
- W2115020004 hasConcept C173608175 @default.
- W2115020004 hasConcept C177284502 @default.
- W2115020004 hasConcept C199360897 @default.
- W2115020004 hasConcept C2778770139 @default.
- W2115020004 hasConcept C2781030343 @default.
- W2115020004 hasConcept C31258907 @default.
- W2115020004 hasConcept C41008148 @default.
- W2115020004 hasConcept C48044578 @default.
- W2115020004 hasConcept C76155785 @default.
- W2115020004 hasConcept C81184566 @default.
- W2115020004 hasConcept C82876162 @default.
- W2115020004 hasConceptScore W2115020004C111919701 @default.
- W2115020004 hasConceptScore W2115020004C11413529 @default.
- W2115020004 hasConceptScore W2115020004C120314980 @default.
- W2115020004 hasConceptScore W2115020004C144745244 @default.
- W2115020004 hasConceptScore W2115020004C173608175 @default.
- W2115020004 hasConceptScore W2115020004C177284502 @default.
- W2115020004 hasConceptScore W2115020004C199360897 @default.
- W2115020004 hasConceptScore W2115020004C2778770139 @default.
- W2115020004 hasConceptScore W2115020004C2781030343 @default.
- W2115020004 hasConceptScore W2115020004C31258907 @default.
- W2115020004 hasConceptScore W2115020004C41008148 @default.
- W2115020004 hasConceptScore W2115020004C48044578 @default.
- W2115020004 hasConceptScore W2115020004C76155785 @default.
- W2115020004 hasConceptScore W2115020004C81184566 @default.
- W2115020004 hasConceptScore W2115020004C82876162 @default.
- W2115020004 hasLocation W21150200041 @default.
- W2115020004 hasLocation W21150200042 @default.
- W2115020004 hasLocation W21150200043 @default.
- W2115020004 hasOpenAccess W2115020004 @default.
- W2115020004 hasPrimaryLocation W21150200041 @default.
- W2115020004 hasRelatedWork W1531780705 @default.
- W2115020004 hasRelatedWork W1604898313 @default.
- W2115020004 hasRelatedWork W1638830944 @default.
- W2115020004 hasRelatedWork W2364921833 @default.
- W2115020004 hasRelatedWork W2385146268 @default.