Matches in SemOpenAlex for { <https://semopenalex.org/work/W4311546748> ?p ?o ?g. }
- W4311546748 abstract "Integer sum reduction is a primitive operation commonly used in scientific computing. Implementing a parallel reduction on a GPU often involves concurrent memory accesses using atomic operations and synchronization of work-items in a work-group. For a better understanding of these operations, we redesigned micro-kernels in the HIP programming language to measure the time of atomic operations over global memory, the cost of barrier synchronization, and reduction within a work-group to shared local memory using one atomic addition per work-item on a compute unit in an AMD MI100 GPU. Then, we describe the implementations of the reduction kernels with vectorized memory accesses, parameterized workload sizes, and vendor's library APIs. Our experimental results show that 1) there is a performance tradeoff between the cost of barrier synchronization and the amount of parallelism from atomic operations over shared local memory when we increase the size of a work-group; 2) a reduction kernel with vectorized memory accesses and vector data types is approximately 3% faster for the large problem size than the kernels written with the vendor's library APIs; 3) the compiler needs to assist the hardware processor with data dependency resolution at the level of instruction set architecture; and 4) the power consumption of the kernel execution on the GPU fluctuates between 277 Watts and 301 Watts, and the dynamic power of other GPU activities is at most 31 Watts." @default.
- W4311546748 created "2022-12-27" @default.
- W4311546748 creator A5057257864 @default.
- W4311546748 creator A5061838490 @default.
- W4311546748 date "2022-08-29" @default.
- W4311546748 modified "2023-10-17" @default.
- W4311546748 title "A Study on Atomics-based Integer Sum Reduction in HIP on AMD GPU" @default.
- W4311546748 cites W1480958225 @default.
- W4311546748 cites W2016357834 @default.
- W4311546748 cites W2088134616 @default.
- W4311546748 cites W2399715892 @default.
- W4311546748 cites W2748044793 @default.
- W4311546748 cites W2776545425 @default.
- W4311546748 cites W2791531244 @default.
- W4311546748 cites W2954922220 @default.
- W4311546748 cites W2962721408 @default.
- W4311546748 cites W3022422003 @default.
- W4311546748 cites W3137083963 @default.
- W4311546748 cites W3163802921 @default.
- W4311546748 cites W3209449653 @default.
- W4311546748 cites W4235158490 @default.
- W4311546748 cites W4250027548 @default.
- W4311546748 doi "https://doi.org/10.1145/3547276.3548627" @default.
- W4311546748 hasPublicationYear "2022" @default.
- W4311546748 type Work @default.
- W4311546748 citedByCount "0" @default.
- W4311546748 crossrefType "proceedings-article" @default.
- W4311546748 hasAuthorship W4311546748A5057257864 @default.
- W4311546748 hasAuthorship W4311546748A5061838490 @default.
- W4311546748 hasBestOaLocation W43115467482 @default.
- W4311546748 hasConcept C111335779 @default.
- W4311546748 hasConcept C111919701 @default.
- W4311546748 hasConcept C114614502 @default.
- W4311546748 hasConcept C127162648 @default.
- W4311546748 hasConcept C169590947 @default.
- W4311546748 hasConcept C173608175 @default.
- W4311546748 hasConcept C2524010 @default.
- W4311546748 hasConcept C2778562939 @default.
- W4311546748 hasConcept C31258907 @default.
- W4311546748 hasConcept C33923547 @default.
- W4311546748 hasConcept C41008148 @default.
- W4311546748 hasConcept C74193536 @default.
- W4311546748 hasConceptScore W4311546748C111335779 @default.
- W4311546748 hasConceptScore W4311546748C111919701 @default.
- W4311546748 hasConceptScore W4311546748C114614502 @default.
- W4311546748 hasConceptScore W4311546748C127162648 @default.
- W4311546748 hasConceptScore W4311546748C169590947 @default.
- W4311546748 hasConceptScore W4311546748C173608175 @default.
- W4311546748 hasConceptScore W4311546748C2524010 @default.
- W4311546748 hasConceptScore W4311546748C2778562939 @default.
- W4311546748 hasConceptScore W4311546748C31258907 @default.
- W4311546748 hasConceptScore W4311546748C33923547 @default.
- W4311546748 hasConceptScore W4311546748C41008148 @default.
- W4311546748 hasConceptScore W4311546748C74193536 @default.
- W4311546748 hasFunder F4320306084 @default.
- W4311546748 hasLocation W43115467481 @default.
- W4311546748 hasLocation W43115467482 @default.
- W4311546748 hasOpenAccess W4311546748 @default.
- W4311546748 hasPrimaryLocation W43115467481 @default.
- W4311546748 hasRelatedWork W1508811940 @default.
- W4311546748 hasRelatedWork W1583465708 @default.
- W4311546748 hasRelatedWork W1601646354 @default.
- W4311546748 hasRelatedWork W1853049011 @default.
- W4311546748 hasRelatedWork W2074024830 @default.
- W4311546748 hasRelatedWork W2078700326 @default.
- W4311546748 hasRelatedWork W2348711589 @default.
- W4311546748 hasRelatedWork W4235959758 @default.
- W4311546748 hasRelatedWork W4245265375 @default.
- W4311546748 hasRelatedWork W2479014312 @default.
- W4311546748 isParatext "false" @default.
- W4311546748 isRetracted "false" @default.
- W4311546748 workType "article" @default.