Matches in SemOpenAlex for { <https://semopenalex.org/work/W3190673640> ?p ?o ?g. }
Showing items 1 to 99 of 99 with 100 items per page.
- W3190673640 abstract "The A64FX CPU is arguably the most powerful Arm-based processor design to date. Although it is a traditional cache-based multicore processor, its peak performance and memory bandwidth rival accelerator devices. A good understanding of its performance features is of paramount importance for developers who wish to leverage its full potential. We present an architectural analysis of the A64FX used in the Fujitsu FX1000 supercomputer at a level of detail that allows for the construction of Execution-Cache-Memory (ECM) performance models for steady-state loops. In the process we identify architectural peculiarities that point to viable generic optimization strategies. After validating the model using simple streaming loops we apply the insight gained to sparse matrix-vector multiplication (SpMV) and the domain wall (DW) kernel from quantum chromodynamics (QCD). For SpMV we show why the CRS matrix storage format is not a good practical choice on this architecture and how the SELL-C-sigma format can achieve bandwidth saturation. For the DW kernel we provide a cache-reuse analysis and show how an appropriate choice of data layout for complex arrays can realize memory-bandwidth saturation in this case as well. A comparison with state-of-the-art high-end Intel Cascade Lake AP and Nvidia V100 systems puts the capabilities of the A64FX into perspective. We also explore the potential for power optimizations using the tuning knobs provided by the Fugaku system, achieving energy savings of about 31% for SpMV and 18% for DW." @default.
- W3190673640 created "2021-08-16" @default.
- W3190673640 creator A5017848833 @default.
- W3190673640 creator A5031307529 @default.
- W3190673640 creator A5039978607 @default.
- W3190673640 creator A5058357790 @default.
- W3190673640 creator A5070209050 @default.
- W3190673640 creator A5082552227 @default.
- W3190673640 creator A5087470984 @default.
- W3190673640 date "2021-08-01" @default.
- W3190673640 modified "2023-10-02" @default.
- W3190673640 title "Execution‐Cache‐Memory modeling and performance tuning of sparse matrix‐vector multiplication and Lattice quantum chromodynamics on A64FX" @default.
- W3190673640 cites W1961751213 @default.
- W3190673640 cites W2035080386 @default.
- W3190673640 cites W2093379922 @default.
- W3190673640 cites W2101511474 @default.
- W3190673640 cites W2291117088 @default.
- W3190673640 cites W2532916439 @default.
- W3190673640 cites W2588108287 @default.
- W3190673640 cites W2952403176 @default.
- W3190673640 cites W2964258936 @default.
- W3190673640 cites W3045469217 @default.
- W3190673640 cites W3089579782 @default.
- W3190673640 cites W3095810911 @default.
- W3190673640 cites W3095952171 @default.
- W3190673640 cites W3096250259 @default.
- W3190673640 cites W3097283637 @default.
- W3190673640 cites W3097572649 @default.
- W3190673640 cites W3097636320 @default.
- W3190673640 cites W3100998497 @default.
- W3190673640 cites W3103041597 @default.
- W3190673640 cites W3104634154 @default.
- W3190673640 cites W3104900731 @default.
- W3190673640 cites W3106055984 @default.
- W3190673640 cites W3138973966 @default.
- W3190673640 cites W4232919122 @default.
- W3190673640 cites W4249968602 @default.
- W3190673640 doi "https://doi.org/10.1002/cpe.6512" @default.
- W3190673640 hasPublicationYear "2021" @default.
- W3190673640 type Work @default.
- W3190673640 sameAs 3190673640 @default.
- W3190673640 citedByCount "7" @default.
- W3190673640 countsByYear W31906736402022 @default.
- W3190673640 countsByYear W31906736402023 @default.
- W3190673640 crossrefType "journal-article" @default.
- W3190673640 hasAuthorship W3190673640A5017848833 @default.
- W3190673640 hasAuthorship W3190673640A5031307529 @default.
- W3190673640 hasAuthorship W3190673640A5039978607 @default.
- W3190673640 hasAuthorship W3190673640A5058357790 @default.
- W3190673640 hasAuthorship W3190673640A5070209050 @default.
- W3190673640 hasAuthorship W3190673640A5082552227 @default.
- W3190673640 hasAuthorship W3190673640A5087470984 @default.
- W3190673640 hasBestOaLocation W31906736401 @default.
- W3190673640 hasConcept C115537543 @default.
- W3190673640 hasConcept C121332964 @default.
- W3190673640 hasConcept C163716315 @default.
- W3190673640 hasConcept C17349429 @default.
- W3190673640 hasConcept C173608175 @default.
- W3190673640 hasConcept C188045654 @default.
- W3190673640 hasConcept C2776257435 @default.
- W3190673640 hasConcept C31258907 @default.
- W3190673640 hasConcept C41008148 @default.
- W3190673640 hasConcept C56372850 @default.
- W3190673640 hasConcept C62520636 @default.
- W3190673640 hasConcept C83283714 @default.
- W3190673640 hasConcept C84114770 @default.
- W3190673640 hasConceptScore W3190673640C115537543 @default.
- W3190673640 hasConceptScore W3190673640C121332964 @default.
- W3190673640 hasConceptScore W3190673640C163716315 @default.
- W3190673640 hasConceptScore W3190673640C17349429 @default.
- W3190673640 hasConceptScore W3190673640C173608175 @default.
- W3190673640 hasConceptScore W3190673640C188045654 @default.
- W3190673640 hasConceptScore W3190673640C2776257435 @default.
- W3190673640 hasConceptScore W3190673640C31258907 @default.
- W3190673640 hasConceptScore W3190673640C41008148 @default.
- W3190673640 hasConceptScore W3190673640C56372850 @default.
- W3190673640 hasConceptScore W3190673640C62520636 @default.
- W3190673640 hasConceptScore W3190673640C83283714 @default.
- W3190673640 hasConceptScore W3190673640C84114770 @default.
- W3190673640 hasIssue "20" @default.
- W3190673640 hasLocation W31906736401 @default.
- W3190673640 hasLocation W31906736402 @default.
- W3190673640 hasOpenAccess W3190673640 @default.
- W3190673640 hasPrimaryLocation W31906736401 @default.
- W3190673640 hasRelatedWork W1980282429 @default.
- W3190673640 hasRelatedWork W2040556424 @default.
- W3190673640 hasRelatedWork W2098513105 @default.
- W3190673640 hasRelatedWork W2142496304 @default.
- W3190673640 hasRelatedWork W2149529325 @default.
- W3190673640 hasRelatedWork W2768573991 @default.
- W3190673640 hasRelatedWork W3011554625 @default.
- W3190673640 hasRelatedWork W3089579782 @default.
- W3190673640 hasRelatedWork W3176814699 @default.
- W3190673640 hasRelatedWork W4223467872 @default.
- W3190673640 hasVolume "34" @default.
- W3190673640 isParatext "false" @default.
- W3190673640 isRetracted "false" @default.
- W3190673640 magId "3190673640" @default.
- W3190673640 workType "article" @default.