Matches in SemOpenAlex for { <https://semopenalex.org/work/W2893007245> ?p ?o ?g. }
- W2893007245 endingPage "2055" @default.
- W2893007245 startingPage "2040" @default.
- W2893007245 abstract "Computational problems in engineering and scientific disciplines often rely on the solution of many instances of small systems of linear equations, which are called batched solves. In this paper, we focus on the important variants of both batch Cholesky factorization and subsequent substitution. The former requires the linear system matrices to be symmetric positive definite (SPD). We describe the implementation and automated performance engineering of these kernels that implement the factorization and the two substitutions. Our target platforms are graphics processing units (GPUs), which over the past decade have become an attractive high-performance computing (HPC) target for solvers of linear systems of equations. Due to their throughput-oriented design, GPUs exhibit the highest processing rates among the available processors. However, without careful design and coding, this speed is mostly restricted to large matrix sizes. We show an automated exploration of the implementation space as well as a new data layout for the batched class of SPD solvers. Our tests involve the solution of many thousands of linear SPD systems of exactly the same size. The primary focus of our techniques is on the individual matrices in the batch that have dimensions ranging from 5-by-5 up to 100-by-100. We compare our autotuned solvers against the state-of-the-art solvers such as those provided through NVIDIA channels and publicly available in the optimized MAGMA library. The observed performance is competitive and many times superior for many practical cases. The advantage of the presented methodology lies in achieving these results in a portable manner across matrix storage formats and GPU hardware architecture platforms." @default.
- W2893007245 created "2018-10-05" @default.
- W2893007245 creator A5023168533 @default.
- W2893007245 creator A5062373552 @default.
- W2893007245 creator A5070456582 @default.
- W2893007245 creator A5073990539 @default.
- W2893007245 creator A5075517045 @default.
- W2893007245 date "2018-11-01" @default.
- W2893007245 modified "2023-09-24" @default.
- W2893007245 title "Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators" @default.
- W2893007245 cites W1046360519 @default.
- W2893007245 cites W1582138098 @default.
- W2893007245 cites W1737435842 @default.
- W2893007245 cites W1964031104 @default.
- W2893007245 cites W1978642402 @default.
- W2893007245 cites W1997033059 @default.
- W2893007245 cites W1998492046 @default.
- W2893007245 cites W2006508316 @default.
- W2893007245 cites W2006682733 @default.
- W2893007245 cites W2015262702 @default.
- W2893007245 cites W2016279572 @default.
- W2893007245 cites W2063509315 @default.
- W2893007245 cites W2090698642 @default.
- W2893007245 cites W2099625934 @default.
- W2893007245 cites W2101409192 @default.
- W2893007245 cites W2102182691 @default.
- W2893007245 cites W2109222446 @default.
- W2893007245 cites W2127906479 @default.
- W2893007245 cites W2136952590 @default.
- W2893007245 cites W2138215414 @default.
- W2893007245 cites W2155942156 @default.
- W2893007245 cites W2162525251 @default.
- W2893007245 cites W2169150754 @default.
- W2893007245 cites W2199814763 @default.
- W2893007245 cites W2208934446 @default.
- W2893007245 cites W2410272182 @default.
- W2893007245 cites W2500595894 @default.
- W2893007245 cites W2731116490 @default.
- W2893007245 cites W4239685157 @default.
- W2893007245 doi "https://doi.org/10.1109/jproc.2018.2868961" @default.
- W2893007245 hasPublicationYear "2018" @default.
- W2893007245 type Work @default.
- W2893007245 sameAs 2893007245 @default.
- W2893007245 citedByCount "7" @default.
- W2893007245 countsByYear W28930072452019 @default.
- W2893007245 countsByYear W28930072452020 @default.
- W2893007245 countsByYear W28930072452021 @default.
- W2893007245 countsByYear W28930072452023 @default.
- W2893007245 crossrefType "journal-article" @default.
- W2893007245 hasAuthorship W2893007245A5023168533 @default.
- W2893007245 hasAuthorship W2893007245A5062373552 @default.
- W2893007245 hasAuthorship W2893007245A5070456582 @default.
- W2893007245 hasAuthorship W2893007245A5073990539 @default.
- W2893007245 hasAuthorship W2893007245A5075517045 @default.
- W2893007245 hasBestOaLocation W28930072451 @default.
- W2893007245 hasConcept C11413529 @default.
- W2893007245 hasConcept C120665830 @default.
- W2893007245 hasConcept C121332964 @default.
- W2893007245 hasConcept C13164978 @default.
- W2893007245 hasConcept C134306372 @default.
- W2893007245 hasConcept C136119220 @default.
- W2893007245 hasConcept C139352143 @default.
- W2893007245 hasConcept C163834973 @default.
- W2893007245 hasConcept C168834538 @default.
- W2893007245 hasConcept C173608175 @default.
- W2893007245 hasConcept C180048950 @default.
- W2893007245 hasConcept C202444582 @default.
- W2893007245 hasConcept C2524010 @default.
- W2893007245 hasConcept C33923547 @default.
- W2893007245 hasConcept C41008148 @default.
- W2893007245 hasConcept C42935608 @default.
- W2893007245 hasConcept C45374587 @default.
- W2893007245 hasConcept C459310 @default.
- W2893007245 hasConcept C6802819 @default.
- W2893007245 hasConcept C9390403 @default.
- W2893007245 hasConceptScore W2893007245C11413529 @default.
- W2893007245 hasConceptScore W2893007245C120665830 @default.
- W2893007245 hasConceptScore W2893007245C121332964 @default.
- W2893007245 hasConceptScore W2893007245C13164978 @default.
- W2893007245 hasConceptScore W2893007245C134306372 @default.
- W2893007245 hasConceptScore W2893007245C136119220 @default.
- W2893007245 hasConceptScore W2893007245C139352143 @default.
- W2893007245 hasConceptScore W2893007245C163834973 @default.
- W2893007245 hasConceptScore W2893007245C168834538 @default.
- W2893007245 hasConceptScore W2893007245C173608175 @default.
- W2893007245 hasConceptScore W2893007245C180048950 @default.
- W2893007245 hasConceptScore W2893007245C202444582 @default.
- W2893007245 hasConceptScore W2893007245C2524010 @default.
- W2893007245 hasConceptScore W2893007245C33923547 @default.
- W2893007245 hasConceptScore W2893007245C41008148 @default.
- W2893007245 hasConceptScore W2893007245C42935608 @default.
- W2893007245 hasConceptScore W2893007245C45374587 @default.
- W2893007245 hasConceptScore W2893007245C459310 @default.
- W2893007245 hasConceptScore W2893007245C6802819 @default.
- W2893007245 hasConceptScore W2893007245C9390403 @default.
- W2893007245 hasFunder F4320306076 @default.
- W2893007245 hasFunder F4320306084 @default.
- W2893007245 hasIssue "11" @default.