Matches in SemOpenAlex for { <https://semopenalex.org/work/W2505199798> ?p ?o ?g. }
- W2505199798 abstract "Nested patterns are one of the most frequently occurring algorithmic themes in GPU applications where coarse-grained tasks are constituted from a number of fine-grained ones. However, efficient execution of irregular nested patterns, with coarse-grained tasks that substantially vary in size, has remained an open problem for the GPU's SIMT architecture. Existing methods rely on static task decomposition where one or a fixed number of threads inside the SIMD grouping (warp) carry out the fine-grained tasks. These approaches fail to provide portable performance across diverse irregular inputs. Moreover, due to intra-warp load imbalance, they incur warp underutilization. In this paper, we introduce a novel software technique called Collaborative Task Engagement (CTE) that, unlike previous methods, achieves sustained high warp execution efficiencies across irregular inputs and provides portable performance. CTE assigns a group of coarse-grained tasks to the warp and allows threads inside the warp to carry out the expanded list of fine-grained tasks collaboratively. In multiple rounds, all the warp threads perform the mapping portion of the fine-grained tasks and participate in a reduction phase with appropriate lanes to reduce the calculated values. This scheme avoids over-subscription or under-subscription of threads while preserving the benefits of parallel reduction. We prepared a CUDA C++ device-side template library for developers to easily express nested patterns in GPU kernels using our technique. Our experiments show that CTE delivers up to 37% warp execution efficiency improvement and gives up to 1.51x speedup over sub-warp decomposition with the best sub-warp width." @default.
- W2505199798 created "2016-08-23" @default.
- W2505199798 creator A5036836220 @default.
- W2505199798 creator A5048573356 @default.
- W2505199798 creator A5048949780 @default.
- W2505199798 creator A5068686889 @default.
- W2505199798 date "2016-05-01" @default.
- W2505199798 modified "2023-09-24" @default.
- W2505199798 title "Eliminating Intra-Warp Load Imbalance in Irregular Nested Patterns via Collaborative Task Engagement" @default.
- W2505199798 cites W1504291959 @default.
- W2505199798 cites W1919570435 @default.
- W2505199798 cites W1965830721 @default.
- W2505199798 cites W1972971542 @default.
- W2505199798 cites W1985291160 @default.
- W2505199798 cites W2013247896 @default.
- W2505199798 cites W2029940394 @default.
- W2505199798 cites W2035080386 @default.
- W2505199798 cites W2061313045 @default.
- W2505199798 cites W2123440268 @default.
- W2505199798 cites W2128329055 @default.
- W2505199798 cites W2128853364 @default.
- W2505199798 cites W2143114052 @default.
- W2505199798 cites W2144061463 @default.
- W2505199798 cites W2146591355 @default.
- W2505199798 cites W2151686327 @default.
- W2505199798 cites W2156180003 @default.
- W2505199798 cites W2167675119 @default.
- W2505199798 cites W2171399035 @default.
- W2505199798 cites W2236252626 @default.
- W2505199798 cites W2268177516 @default.
- W2505199798 cites W2295258302 @default.
- W2505199798 cites W2432978112 @default.
- W2505199798 cites W2748306984 @default.
- W2505199798 cites W3098040575 @default.
- W2505199798 cites W4214549590 @default.
- W2505199798 doi "https://doi.org/10.1109/ipdps.2016.36" @default.
- W2505199798 hasPublicationYear "2016" @default.
- W2505199798 type Work @default.
- W2505199798 sameAs 2505199798 @default.
- W2505199798 citedByCount "9" @default.
- W2505199798 countsByYear W25051997982016 @default.
- W2505199798 countsByYear W25051997982017 @default.
- W2505199798 countsByYear W25051997982018 @default.
- W2505199798 countsByYear W25051997982020 @default.
- W2505199798 countsByYear W25051997982022 @default.
- W2505199798 crossrefType "proceedings-article" @default.
- W2505199798 hasAuthorship W2505199798A5036836220 @default.
- W2505199798 hasAuthorship W2505199798A5048573356 @default.
- W2505199798 hasAuthorship W2505199798A5048949780 @default.
- W2505199798 hasAuthorship W2505199798A5068686889 @default.
- W2505199798 hasConcept C111335779 @default.
- W2505199798 hasConcept C124681953 @default.
- W2505199798 hasConcept C138959212 @default.
- W2505199798 hasConcept C150552126 @default.
- W2505199798 hasConcept C162324750 @default.
- W2505199798 hasConcept C173608175 @default.
- W2505199798 hasConcept C187691185 @default.
- W2505199798 hasConcept C187736073 @default.
- W2505199798 hasConcept C18903297 @default.
- W2505199798 hasConcept C2524010 @default.
- W2505199798 hasConcept C2778119891 @default.
- W2505199798 hasConcept C2780451532 @default.
- W2505199798 hasConcept C33923547 @default.
- W2505199798 hasConcept C41008148 @default.
- W2505199798 hasConcept C68339613 @default.
- W2505199798 hasConcept C86803240 @default.
- W2505199798 hasConceptScore W2505199798C111335779 @default.
- W2505199798 hasConceptScore W2505199798C124681953 @default.
- W2505199798 hasConceptScore W2505199798C138959212 @default.
- W2505199798 hasConceptScore W2505199798C150552126 @default.
- W2505199798 hasConceptScore W2505199798C162324750 @default.
- W2505199798 hasConceptScore W2505199798C173608175 @default.
- W2505199798 hasConceptScore W2505199798C187691185 @default.
- W2505199798 hasConceptScore W2505199798C187736073 @default.
- W2505199798 hasConceptScore W2505199798C18903297 @default.
- W2505199798 hasConceptScore W2505199798C2524010 @default.
- W2505199798 hasConceptScore W2505199798C2778119891 @default.
- W2505199798 hasConceptScore W2505199798C2780451532 @default.
- W2505199798 hasConceptScore W2505199798C33923547 @default.
- W2505199798 hasConceptScore W2505199798C41008148 @default.
- W2505199798 hasConceptScore W2505199798C68339613 @default.
- W2505199798 hasConceptScore W2505199798C86803240 @default.
- W2505199798 hasLocation W25051997981 @default.
- W2505199798 hasOpenAccess W2505199798 @default.
- W2505199798 hasPrimaryLocation W25051997981 @default.
- W2505199798 hasRelatedWork W1531488649 @default.
- W2505199798 hasRelatedWork W1585350690 @default.
- W2505199798 hasRelatedWork W1850429294 @default.
- W2505199798 hasRelatedWork W1982599317 @default.
- W2505199798 hasRelatedWork W2000051442 @default.
- W2505199798 hasRelatedWork W2014711461 @default.
- W2505199798 hasRelatedWork W2074226157 @default.
- W2505199798 hasRelatedWork W2167093312 @default.
- W2505199798 hasRelatedWork W2545696998 @default.
- W2505199798 hasRelatedWork W2613115449 @default.
- W2505199798 isParatext "false" @default.
- W2505199798 isRetracted "false" @default.
- W2505199798 magId "2505199798" @default.
- W2505199798 workType "article" @default.
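
The abstract above contrasts static decomposition (a fixed number of threads per coarse-grained task) with collaborative execution of the expanded fine-grained task list. The following is a minimal host-side sketch of that contrast, not the paper's CUDA C++ library. It assumes a 32-lane warp, one lane per coarse-grained task in the static case, and unit cost per fine-grained task. All function names (static_rounds, cte_rounds, efficiency, cte_reduce) are hypothetical, and the per-task accumulation merely stands in for the intra-warp reduction phase.

```python
from bisect import bisect_right
from itertools import accumulate

WARP_SIZE = 32  # lanes per warp on NVIDIA GPUs


def static_rounds(task_sizes):
    """Static scheme (assumed: one lane per coarse task).
    The warp stays busy until its longest coarse task finishes."""
    return max(task_sizes, default=0)


def cte_rounds(task_sizes):
    """Collaborative scheme: all lanes share the flattened list of
    fine-grained tasks, so rounds = ceil(total / WARP_SIZE)."""
    total = sum(task_sizes)
    return -(-total // WARP_SIZE)  # ceiling division


def efficiency(task_sizes, rounds):
    """Fraction of lane-slots that perform useful work."""
    if rounds == 0:
        return 1.0
    return sum(task_sizes) / (rounds * WARP_SIZE)


def cte_reduce(values_per_task):
    """Walk the flattened fine-grained values in warp-width rounds and
    accumulate one sum per coarse task (a sequential stand-in for the
    reduction phase among the appropriate lanes)."""
    sizes = [len(v) for v in values_per_task]
    ends = list(accumulate(sizes))  # inclusive prefix sum: end offset of each coarse task
    flat = [x for v in values_per_task for x in v]
    results = [0] * len(sizes)
    for base in range(0, len(flat), WARP_SIZE):  # one iteration per round
        for lane in range(min(WARP_SIZE, len(flat) - base)):
            idx = base + lane
            owner = bisect_right(ends, idx)  # coarse task owning this fine task
            results[owner] += flat[idx]
    return results


if __name__ == "__main__":
    sizes = [1] * 31 + [100]  # one oversized coarse task among 31 tiny ones
    for name, rounds in (("static", static_rounds(sizes)), ("cte", cte_rounds(sizes))):
        print(name, rounds, round(efficiency(sizes, rounds), 3))
```

Under these assumptions, a single oversized coarse task forces the static scheme to keep the whole warp resident for that task's full length, while the collaborative scheme needs only as many rounds as the total fine-grained work divided by the warp width. The numbers are illustrative only and are unrelated to the measurements reported in the abstract.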