Matches in SemOpenAlex for { <https://semopenalex.org/work/W2091371833> ?p ?o ?g. }
Showing items 1 to 95 of 95 with 100 items per page.
- W2091371833 endingPage "140" @default.
- W2091371833 startingPage "133" @default.
- W2091371833 abstract "We present an interface and an implementation of the General Matrix Multiply (GEMM) routine for multiple small matrices processed simultaneously on NVIDIA graphics processing units (GPUs). We focus on matrix sizes under 16. The implementation can be easily extended to larger sizes. For single precision matrices, our implementation is 30% to 600% faster than the batched cuBLAS implementation distributed in the CUDA Toolkit 5.0 on NVIDIA Tesla K20c. For example, we obtain 104 GFlop/s and 216 GFlop/s when multiplying 100,000 independent matrix pairs of size 10 and 16, respectively. Similar improvement in performance is obtained for other sizes, in single and double precisions for real and complex types, and when the number of matrices is smaller. Apart from our implementation, our different function interface also plays an important role in the improved performance. Applications of this software include finite element computation on GPUs." @default.
- W2091371833 created "2016-06-24" @default.
- W2091371833 creator A5015945152 @default.
- W2091371833 creator A5053072193 @default.
- W2091371833 date "2015-01-01" @default.
- W2091371833 modified "2023-09-29" @default.
- W2091371833 title "A GEMM interface and implementation on NVIDIA GPUs for multiple small matrices" @default.
- W2091371833 cites W2047196981 @default.
- W2091371833 cites W2079658918 @default.
- W2091371833 cites W2084379367 @default.
- W2091371833 cites W2099021415 @default.
- W2091371833 cites W2146406121 @default.
- W2091371833 doi "https://doi.org/10.1016/j.jpdc.2014.09.003" @default.
- W2091371833 hasPublicationYear "2015" @default.
- W2091371833 type Work @default.
- W2091371833 sameAs 2091371833 @default.
- W2091371833 citedByCount "18" @default.
- W2091371833 countsByYear W20913718332013 @default.
- W2091371833 countsByYear W20913718332015 @default.
- W2091371833 countsByYear W20913718332016 @default.
- W2091371833 countsByYear W20913718332017 @default.
- W2091371833 countsByYear W20913718332018 @default.
- W2091371833 countsByYear W20913718332019 @default.
- W2091371833 countsByYear W20913718332020 @default.
- W2091371833 countsByYear W20913718332022 @default.
- W2091371833 crossrefType "journal-article" @default.
- W2091371833 hasAuthorship W2091371833A5015945152 @default.
- W2091371833 hasAuthorship W2091371833A5053072193 @default.
- W2091371833 hasBestOaLocation W20913718331 @default.
- W2091371833 hasConcept C106487976 @default.
- W2091371833 hasConcept C111919701 @default.
- W2091371833 hasConcept C113843644 @default.
- W2091371833 hasConcept C11413529 @default.
- W2091371833 hasConcept C120665830 @default.
- W2091371833 hasConcept C121332964 @default.
- W2091371833 hasConcept C121684516 @default.
- W2091371833 hasConcept C129307140 @default.
- W2091371833 hasConcept C157915830 @default.
- W2091371833 hasConcept C159985019 @default.
- W2091371833 hasConcept C173608175 @default.
- W2091371833 hasConcept C192209626 @default.
- W2091371833 hasConcept C192562407 @default.
- W2091371833 hasConcept C21442007 @default.
- W2091371833 hasConcept C2777904410 @default.
- W2091371833 hasConcept C2778119891 @default.
- W2091371833 hasConcept C2779851693 @default.
- W2091371833 hasConcept C35912277 @default.
- W2091371833 hasConcept C41008148 @default.
- W2091371833 hasConcept C45374587 @default.
- W2091371833 hasConcept C459310 @default.
- W2091371833 hasConceptScore W2091371833C106487976 @default.
- W2091371833 hasConceptScore W2091371833C111919701 @default.
- W2091371833 hasConceptScore W2091371833C113843644 @default.
- W2091371833 hasConceptScore W2091371833C11413529 @default.
- W2091371833 hasConceptScore W2091371833C120665830 @default.
- W2091371833 hasConceptScore W2091371833C121332964 @default.
- W2091371833 hasConceptScore W2091371833C121684516 @default.
- W2091371833 hasConceptScore W2091371833C129307140 @default.
- W2091371833 hasConceptScore W2091371833C157915830 @default.
- W2091371833 hasConceptScore W2091371833C159985019 @default.
- W2091371833 hasConceptScore W2091371833C173608175 @default.
- W2091371833 hasConceptScore W2091371833C192209626 @default.
- W2091371833 hasConceptScore W2091371833C192562407 @default.
- W2091371833 hasConceptScore W2091371833C21442007 @default.
- W2091371833 hasConceptScore W2091371833C2777904410 @default.
- W2091371833 hasConceptScore W2091371833C2778119891 @default.
- W2091371833 hasConceptScore W2091371833C2779851693 @default.
- W2091371833 hasConceptScore W2091371833C35912277 @default.
- W2091371833 hasConceptScore W2091371833C41008148 @default.
- W2091371833 hasConceptScore W2091371833C45374587 @default.
- W2091371833 hasConceptScore W2091371833C459310 @default.
- W2091371833 hasFunder F4320306084 @default.
- W2091371833 hasLocation W20913718331 @default.
- W2091371833 hasLocation W20913718332 @default.
- W2091371833 hasLocation W20913718333 @default.
- W2091371833 hasLocation W20913718334 @default.
- W2091371833 hasOpenAccess W2091371833 @default.
- W2091371833 hasPrimaryLocation W20913718331 @default.
- W2091371833 hasRelatedWork W1936984751 @default.
- W2091371833 hasRelatedWork W2020279179 @default.
- W2091371833 hasRelatedWork W2023954774 @default.
- W2091371833 hasRelatedWork W2029040955 @default.
- W2091371833 hasRelatedWork W2074129693 @default.
- W2091371833 hasRelatedWork W2091371833 @default.
- W2091371833 hasRelatedWork W2119534391 @default.
- W2091371833 hasRelatedWork W2146871484 @default.
- W2091371833 hasRelatedWork W2952178034 @default.
- W2091371833 hasRelatedWork W4297080925 @default.
- W2091371833 hasVolume "75" @default.
- W2091371833 isParatext "false" @default.
- W2091371833 isRetracted "false" @default.
- W2091371833 magId "2091371833" @default.
- W2091371833 workType "article" @default.
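
Each row above follows the pattern in the query at the top: a subject, a predicate (`?p`), an object (`?o`), and a graph (`?g`). The following is a minimal sketch of how such rows could be split into fields. It assumes the line layout seen in this listing (`- subject predicate object @graph.`, with literals in double quotes), which is inferred from this page and is not a documented SemOpenAlex export format. The names `Row`, `ROW_RE`, and `parse_row` are hypothetical and introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    subject: str
    predicate: str
    obj: str          # object value, with surrounding quotes removed for literals
    is_literal: bool  # True when the object was written in double quotes
    graph: str


# Assumed layout: "- <subject> <predicate> <object> @<graph>."
# The object is either a double-quoted literal (which may contain spaces)
# or a single bare token such as W2047196981.
ROW_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(".*"|\S+)\s+@(\S+)\.$')


def parse_row(line: str) -> Optional[Row]:
    """Split one listing row into its fields, or return None if it does not match."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, raw, graph = m.groups()
    is_literal = len(raw) >= 2 and raw.startswith('"') and raw.endswith('"')
    obj = raw[1:-1] if is_literal else raw
    return Row(subject, predicate, obj, is_literal, graph)


if __name__ == "__main__":
    sample = [
        '- W2091371833 hasVolume "75" @default.',
        '- W2091371833 cites W2047196981 @default.',
        '- W2091371833 title "A GEMM interface and implementation on NVIDIA GPUs for multiple small matrices" @default.',
    ]
    for line in sample:
        print(parse_row(line))
```

Keeping `is_literal` as a separate field preserves the distinction between quoted values such as `"75"` and identifiers such as `W2047196981`, which would otherwise be lost once the quotes are stripped.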