Matches in SemOpenAlex for { <https://semopenalex.org/work/W4311486595> ?p ?o ?g. }
Showing items 1 to 89 of 89 with 100 items per page.
- W4311486595 endingPage "8147" @default.
- W4311486595 startingPage "8124" @default.
- W4311486595 abstract "We provide a practical demonstration that it is possible to systematically generate a variety of high-performance micro-kernels for the general matrix multiplication (gemm) via generic templates which can be easily customized to different processor architectures and micro-kernel dimensions. These generic templates employ vector intrinsics to exploit the SIMD (single instruction, multiple data) units in current general-purpose processors and, for the particular type of gemm problems encountered in deep learning, deliver a floating-point throughput rate on par with or even higher than that obtained with conventional, carefully tuned implementations of gemm in current linear algebra libraries (e.g., BLIS, AMD AOCL, ARMPL). Our work exposes the structure of the template-based micro-kernels for ARM Neon (128-bit SIMD), ARM SVE (variable-length SIMD) and Intel AVX512 (512-bit SIMD), showing considerable performance for an NVIDIA Carmel processor (ARM Neon), a Fujitsu A64FX processor (ARM SVE) and on an AMD EPYC 7282 processor (256-bit SIMD)." @default.
- W4311486595 created "2022-12-26" @default.
- W4311486595 creator A5004018864 @default.
- W4311486595 creator A5012806004 @default.
- W4311486595 creator A5018719028 @default.
- W4311486595 creator A5033947083 @default.
- W4311486595 creator A5075880248 @default.
- W4311486595 creator A5086343654 @default.
- W4311486595 date "2022-12-14" @default.
- W4311486595 modified "2023-10-18" @default.
- W4311486595 title "Micro-kernels for portable and efficient matrix multiplication in deep learning" @default.
- W4311486595 cites W1546569454 @default.
- W4311486595 cites W1983157164 @default.
- W4311486595 cites W1984212674 @default.
- W4311486595 cites W2002257715 @default.
- W4311486595 cites W2002555321 @default.
- W4311486595 cites W2043275593 @default.
- W4311486595 cites W2073061372 @default.
- W4311486595 cites W2194775991 @default.
- W4311486595 cites W2252007067 @default.
- W4311486595 cites W4224278821 @default.
- W4311486595 cites W4281687509 @default.
- W4311486595 doi "https://doi.org/10.1007/s11227-022-05003-3" @default.
- W4311486595 hasPublicationYear "2022" @default.
- W4311486595 type Work @default.
- W4311486595 citedByCount "0" @default.
- W4311486595 crossrefType "journal-article" @default.
- W4311486595 hasAuthorship W4311486595A5004018864 @default.
- W4311486595 hasAuthorship W4311486595A5012806004 @default.
- W4311486595 hasAuthorship W4311486595A5018719028 @default.
- W4311486595 hasAuthorship W4311486595A5033947083 @default.
- W4311486595 hasAuthorship W4311486595A5075880248 @default.
- W4311486595 hasAuthorship W4311486595A5086343654 @default.
- W4311486595 hasBestOaLocation W43114865951 @default.
- W4311486595 hasConcept C11413529 @default.
- W4311486595 hasConcept C114614502 @default.
- W4311486595 hasConcept C121332964 @default.
- W4311486595 hasConcept C150552126 @default.
- W4311486595 hasConcept C17349429 @default.
- W4311486595 hasConcept C173608175 @default.
- W4311486595 hasConcept C24890656 @default.
- W4311486595 hasConcept C2780595030 @default.
- W4311486595 hasConcept C2908650547 @default.
- W4311486595 hasConcept C33923547 @default.
- W4311486595 hasConcept C41008148 @default.
- W4311486595 hasConcept C62520636 @default.
- W4311486595 hasConcept C74193536 @default.
- W4311486595 hasConcept C84114770 @default.
- W4311486595 hasConcept C84211073 @default.
- W4311486595 hasConceptScore W4311486595C11413529 @default.
- W4311486595 hasConceptScore W4311486595C114614502 @default.
- W4311486595 hasConceptScore W4311486595C121332964 @default.
- W4311486595 hasConceptScore W4311486595C150552126 @default.
- W4311486595 hasConceptScore W4311486595C17349429 @default.
- W4311486595 hasConceptScore W4311486595C173608175 @default.
- W4311486595 hasConceptScore W4311486595C24890656 @default.
- W4311486595 hasConceptScore W4311486595C2780595030 @default.
- W4311486595 hasConceptScore W4311486595C2908650547 @default.
- W4311486595 hasConceptScore W4311486595C33923547 @default.
- W4311486595 hasConceptScore W4311486595C41008148 @default.
- W4311486595 hasConceptScore W4311486595C62520636 @default.
- W4311486595 hasConceptScore W4311486595C74193536 @default.
- W4311486595 hasConceptScore W4311486595C84114770 @default.
- W4311486595 hasConceptScore W4311486595C84211073 @default.
- W4311486595 hasFunder F4320313831 @default.
- W4311486595 hasFunder F4320319605 @default.
- W4311486595 hasFunder F4320320300 @default.
- W4311486595 hasFunder F4320322930 @default.
- W4311486595 hasIssue "7" @default.
- W4311486595 hasLocation W43114865951 @default.
- W4311486595 hasLocation W43114865952 @default.
- W4311486595 hasOpenAccess W4311486595 @default.
- W4311486595 hasPrimaryLocation W43114865951 @default.
- W4311486595 hasRelatedWork W1865141138 @default.
- W4311486595 hasRelatedWork W1981985132 @default.
- W4311486595 hasRelatedWork W2001175489 @default.
- W4311486595 hasRelatedWork W2059696958 @default.
- W4311486595 hasRelatedWork W2940194197 @default.
- W4311486595 hasRelatedWork W4289638474 @default.
- W4311486595 hasRelatedWork W4304206929 @default.
- W4311486595 hasRelatedWork W4311486595 @default.
- W4311486595 hasRelatedWork W4312862090 @default.
- W4311486595 hasRelatedWork W4367155883 @default.
- W4311486595 hasVolume "79" @default.
- W4311486595 isParatext "false" @default.
- W4311486595 isRetracted "false" @default.
- W4311486595 workType "article" @default.