Matches in SemOpenAlex for { <https://semopenalex.org/work/W2806880029> ?p ?o ?g. }
Showing items 1 to 100 of 100 with 100 items per page.
- W2806880029 abstract "iFKO (iterative Floating point Kernel Optimizer) is an open-source iterative empirical compilation framework which can be used to tune high-performance computing (HPC) kernels. The goal of our research is to advance iterative empirical compilation to the degree that the performance it can achieve is comparable to that delivered by painstaking hand tuning in assembly. This will allow many HPC researchers to spend precious development time on higher-level aspects of tuning such as parallelization, as well as enabling computational scientists to develop new algorithms that demand new high-performance kernels. At present, algorithms that cannot use hand-tuned performance libraries tend to lose to even inferior algorithms that can. We discuss our new autovectorization technique (speculative vectorization), which can autovectorize loops past dependent branches by speculating along frequently taken paths, even when other paths cannot be effectively vectorized. We implemented this technique in iFKO and demonstrated significant speedup for kernels that prior vectorization techniques could not optimize. We have developed an optimization for two-dimensional array indexing that is critical for allowing us to heavily unroll and jam loops without restriction from integer register pressure. We then extended the state-of-the-art single-basic-block vectorization method, SLP, to vectorize nested loops. We have also introduced optimized reductions that retain full SIMD parallelization for the entire reduction, and we apply loop specialization and unswitching as needed to address vector alignment issues and paths inside the loops that inhibit autovectorization. We have also implemented a critical transformation for optimal vectorization of mixed-type data. Combining all these techniques, we can now fully vectorize the loop nests for our most complicated kernels, allowing us to achieve performance very close to that of hand-tuned assembly." @default.
- W2806880029 created "2018-06-13" @default.
- W2806880029 creator A5029838000 @default.
- W2806880029 date "2022-06-10" @default.
- W2806880029 modified "2023-10-15" @default.
- W2806880029 title "Empirically Tuning HPC Kernels with iFKO" @default.
- W2806880029 cites W1485499095 @default.
- W2806880029 cites W1596351413 @default.
- W2806880029 cites W1631114303 @default.
- W2806880029 cites W1843198456 @default.
- W2806880029 cites W1851570257 @default.
- W2806880029 cites W1964031104 @default.
- W2806880029 cites W1966324811 @default.
- W2806880029 cites W1970008223 @default.
- W2806880029 cites W1972209410 @default.
- W2806880029 cites W1984972320 @default.
- W2806880029 cites W1988425770 @default.
- W2806880029 cites W2002257715 @default.
- W2806880029 cites W2004692581 @default.
- W2806880029 cites W2020166439 @default.
- W2806880029 cites W2034761517 @default.
- W2806880029 cites W2038469228 @default.
- W2806880029 cites W2046699259 @default.
- W2806880029 cites W2049890071 @default.
- W2806880029 cites W2055084740 @default.
- W2806880029 cites W2068810256 @default.
- W2806880029 cites W2078429521 @default.
- W2806880029 cites W2079658918 @default.
- W2806880029 cites W2083056254 @default.
- W2806880029 cites W2089363288 @default.
- W2806880029 cites W2096070062 @default.
- W2806880029 cites W2099404643 @default.
- W2806880029 cites W2112502633 @default.
- W2806880029 cites W2128249697 @default.
- W2806880029 cites W2135653967 @default.
- W2806880029 cites W2140311411 @default.
- W2806880029 cites W2147128695 @default.
- W2806880029 cites W2147423491 @default.
- W2806880029 cites W2147654959 @default.
- W2806880029 cites W2158308706 @default.
- W2806880029 cites W2164003586 @default.
- W2806880029 cites W2294933397 @default.
- W2806880029 cites W2011393414 @default.
- W2806880029 doi "https://doi.org/10.31390/gradschool_dissertations.4427" @default.
- W2806880029 hasPublicationYear "2022" @default.
- W2806880029 type Work @default.
- W2806880029 sameAs 2806880029 @default.
- W2806880029 citedByCount "0" @default.
- W2806880029 crossrefType "dissertation" @default.
- W2806880029 hasAuthorship W2806880029A5029838000 @default.
- W2806880029 hasBestOaLocation W28068800291 @default.
- W2806880029 hasConcept C11413529 @default.
- W2806880029 hasConcept C114614502 @default.
- W2806880029 hasConcept C150552126 @default.
- W2806880029 hasConcept C154945302 @default.
- W2806880029 hasConcept C161824985 @default.
- W2806880029 hasConcept C173608175 @default.
- W2806880029 hasConcept C2524010 @default.
- W2806880029 hasConcept C2777210771 @default.
- W2806880029 hasConcept C33923547 @default.
- W2806880029 hasConcept C41008148 @default.
- W2806880029 hasConcept C41681595 @default.
- W2806880029 hasConcept C68339613 @default.
- W2806880029 hasConcept C74193536 @default.
- W2806880029 hasConcept C75165309 @default.
- W2806880029 hasConcept C83283714 @default.
- W2806880029 hasConcept C84211073 @default.
- W2806880029 hasConceptScore W2806880029C11413529 @default.
- W2806880029 hasConceptScore W2806880029C114614502 @default.
- W2806880029 hasConceptScore W2806880029C150552126 @default.
- W2806880029 hasConceptScore W2806880029C154945302 @default.
- W2806880029 hasConceptScore W2806880029C161824985 @default.
- W2806880029 hasConceptScore W2806880029C173608175 @default.
- W2806880029 hasConceptScore W2806880029C2524010 @default.
- W2806880029 hasConceptScore W2806880029C2777210771 @default.
- W2806880029 hasConceptScore W2806880029C33923547 @default.
- W2806880029 hasConceptScore W2806880029C41008148 @default.
- W2806880029 hasConceptScore W2806880029C41681595 @default.
- W2806880029 hasConceptScore W2806880029C68339613 @default.
- W2806880029 hasConceptScore W2806880029C74193536 @default.
- W2806880029 hasConceptScore W2806880029C75165309 @default.
- W2806880029 hasConceptScore W2806880029C83283714 @default.
- W2806880029 hasConceptScore W2806880029C84211073 @default.
- W2806880029 hasLocation W28068800291 @default.
- W2806880029 hasOpenAccess W2806880029 @default.
- W2806880029 hasPrimaryLocation W28068800291 @default.
- W2806880029 hasRelatedWork W1663444505 @default.
- W2806880029 hasRelatedWork W2053732522 @default.
- W2806880029 hasRelatedWork W2097213213 @default.
- W2806880029 hasRelatedWork W2312429937 @default.
- W2806880029 hasRelatedWork W2735785139 @default.
- W2806880029 hasRelatedWork W2767612671 @default.
- W2806880029 hasRelatedWork W2806880029 @default.
- W2806880029 hasRelatedWork W2947212999 @default.
- W2806880029 hasRelatedWork W3092494433 @default.
- W2806880029 hasRelatedWork W3146057294 @default.
- W2806880029 isParatext "false" @default.
- W2806880029 isRetracted "false" @default.
- W2806880029 magId "2806880029" @default.
- W2806880029 workType "dissertation" @default.