Matches in SemOpenAlex for { <https://semopenalex.org/work/W4316252370> ?p ?o ?g. }
Showing items 1 to 94 of 94 with 100 items per page.
- W4316252370 abstract "We present an approach to the automatic generation of efficient matrix multiplication code on the latest Sunway processor, which will be employed by the next-generation machine of Sunway TaihuLight, one of the fastest supercomputers on earth. The method allows users to write simple C code and automatically generates high-performance matrix multiplication kernels. It uses polyhedral transformations to implement rapid compute decomposition, data exchanges across memory hierarchy and memory latency hiding. An assembly routine is finally integrated into the generated kernels. While achieving up to 90.14% of the theoretical peak performance, our method surpasses a highly tuned library by 9.44%. Compared with existing techniques, our approach reduces the software development life cycle to generate efficient matrix code from months to seconds. We also take into account batched matrix multiplication and some fusion patterns for deep learning (DL), outperforming the library-based implementations by 1.30 × and 1.67 ×." @default.
- W4316252370 created "2023-01-15" @default.
- W4316252370 creator A5000546902 @default.
- W4316252370 creator A5009323865 @default.
- W4316252370 creator A5028753892 @default.
- W4316252370 creator A5033392056 @default.
- W4316252370 creator A5051642924 @default.
- W4316252370 creator A5061431311 @default.
- W4316252370 date "2022-08-29" @default.
- W4316252370 modified "2023-10-17" @default.
- W4316252370 title "Automatically Generating High-performance Matrix Multiplication Kernels on the Latest Sunway Processor" @default.
- W4316252370 cites W1558370006 @default.
- W4316252370 cites W2064872546 @default.
- W4316252370 cites W2067575922 @default.
- W4316252370 cites W2076517649 @default.
- W4316252370 cites W2168412050 @default.
- W4316252370 cites W2592969254 @default.
- W4316252370 cites W2753495321 @default.
- W4316252370 cites W2767373187 @default.
- W4316252370 cites W2962871385 @default.
- W4316252370 cites W2979365412 @default.
- W4316252370 cites W3104745751 @default.
- W4316252370 cites W3177452048 @default.
- W4316252370 cites W3210190478 @default.
- W4316252370 cites W4220850685 @default.
- W4316252370 cites W4243796884 @default.
- W4316252370 doi "https://doi.org/10.1145/3545008.3545031" @default.
- W4316252370 hasPublicationYear "2022" @default.
- W4316252370 type Work @default.
- W4316252370 citedByCount "0" @default.
- W4316252370 crossrefType "proceedings-article" @default.
- W4316252370 hasAuthorship W4316252370A5000546902 @default.
- W4316252370 hasAuthorship W4316252370A5009323865 @default.
- W4316252370 hasAuthorship W4316252370A5028753892 @default.
- W4316252370 hasAuthorship W4316252370A5033392056 @default.
- W4316252370 hasAuthorship W4316252370A5051642924 @default.
- W4316252370 hasAuthorship W4316252370A5061431311 @default.
- W4316252370 hasConcept C114614502 @default.
- W4316252370 hasConcept C115537543 @default.
- W4316252370 hasConcept C118524514 @default.
- W4316252370 hasConcept C121332964 @default.
- W4316252370 hasConcept C123213974 @default.
- W4316252370 hasConcept C158693339 @default.
- W4316252370 hasConcept C17349429 @default.
- W4316252370 hasConcept C173608175 @default.
- W4316252370 hasConcept C177264268 @default.
- W4316252370 hasConcept C199360897 @default.
- W4316252370 hasConcept C2776760102 @default.
- W4316252370 hasConcept C2778100165 @default.
- W4316252370 hasConcept C2780595030 @default.
- W4316252370 hasConcept C33923547 @default.
- W4316252370 hasConcept C41008148 @default.
- W4316252370 hasConcept C42355184 @default.
- W4316252370 hasConcept C459310 @default.
- W4316252370 hasConcept C62520636 @default.
- W4316252370 hasConcept C83283714 @default.
- W4316252370 hasConcept C84114770 @default.
- W4316252370 hasConceptScore W4316252370C114614502 @default.
- W4316252370 hasConceptScore W4316252370C115537543 @default.
- W4316252370 hasConceptScore W4316252370C118524514 @default.
- W4316252370 hasConceptScore W4316252370C121332964 @default.
- W4316252370 hasConceptScore W4316252370C123213974 @default.
- W4316252370 hasConceptScore W4316252370C158693339 @default.
- W4316252370 hasConceptScore W4316252370C17349429 @default.
- W4316252370 hasConceptScore W4316252370C173608175 @default.
- W4316252370 hasConceptScore W4316252370C177264268 @default.
- W4316252370 hasConceptScore W4316252370C199360897 @default.
- W4316252370 hasConceptScore W4316252370C2776760102 @default.
- W4316252370 hasConceptScore W4316252370C2778100165 @default.
- W4316252370 hasConceptScore W4316252370C2780595030 @default.
- W4316252370 hasConceptScore W4316252370C33923547 @default.
- W4316252370 hasConceptScore W4316252370C41008148 @default.
- W4316252370 hasConceptScore W4316252370C42355184 @default.
- W4316252370 hasConceptScore W4316252370C459310 @default.
- W4316252370 hasConceptScore W4316252370C62520636 @default.
- W4316252370 hasConceptScore W4316252370C83283714 @default.
- W4316252370 hasConceptScore W4316252370C84114770 @default.
- W4316252370 hasFunder F4320321001 @default.
- W4316252370 hasLocation W43162523701 @default.
- W4316252370 hasOpenAccess W4316252370 @default.
- W4316252370 hasPrimaryLocation W43162523701 @default.
- W4316252370 hasRelatedWork W2079736157 @default.
- W4316252370 hasRelatedWork W2099571870 @default.
- W4316252370 hasRelatedWork W2132603425 @default.
- W4316252370 hasRelatedWork W2147956657 @default.
- W4316252370 hasRelatedWork W2186439059 @default.
- W4316252370 hasRelatedWork W2523376728 @default.
- W4316252370 hasRelatedWork W2768573991 @default.
- W4316252370 hasRelatedWork W3091817928 @default.
- W4316252370 hasRelatedWork W3176814699 @default.
- W4316252370 hasRelatedWork W52302056 @default.
- W4316252370 isParatext "false" @default.
- W4316252370 isRetracted "false" @default.
- W4316252370 workType "article" @default.