Matches in SemOpenAlex for { <https://semopenalex.org/work/W2992034195> ?p ?o ?g. }
Showing items 1 to 93 of 93, with 100 items per page.
- W2992034195 abstract "In recent years, the complexity of optimizing compilers has increased significantly due to the increasing diversity of programming languages and the heterogeneity of target architectures. Even though general-purpose compilers have made a lot of progress, they have not been able to extract the peak performance provided by specialized libraries. To bridge this performance gap, domain-specific compilers have been proposed: by restricting the input to a specialized domain, they can perform the more aggressive transformations needed to achieve peak performance while remaining more flexible than standard libraries. One of the major optimizations needed to obtain high performance on modern heterogeneous architectures is loop transformation, to exploit locality and enable automatic parallelization. The polyhedral model has evolved into a highly efficient, reusable, generic framework for loop optimization, especially for regular static-control affine programs. In this thesis we explore the suitability of the polyhedral loop transformation framework for compiling image processing and deep learning pipelines. We study the challenges of adapting a generic polyhedral scheduler to DSLs, and we propose various extensions to the scheduler to find optimal schedules by modeling hardware and application characteristics.
We present a method to handle reductions in the polyhedral model. State-of-the-art polyhedral compilers had no support for reductions: a reduction loop was treated as a serial loop, which can be a major bottleneck for several applications, especially on GPUs. We propose language extensions to PENCIL to express arbitrary user-defined reductions. We encode this reduction information in the polyhedral model using reduction dependences, and we show how to use these dependences in the polyhedral scheduler to parallelize reduction loops. We also propose template-based code generation for emitting highly efficient reduction code for GPUs. We validate our approach through experiments comparing the automatically generated code with a highly tuned library.
Exploiting locality is a key factor in achieving high performance on modern processors with complex memory and computation hierarchies. The cost function used in the Pluto algorithm optimizes only temporal locality, but exploiting spatial locality is just as important and has implications for vectorization and coalesced memory accesses. We propose a new unified algorithm for optimizing parallelism and locality in loop nests that is capable of modeling the temporal and spatial effects of multiprocessors and accelerators with deep memory hierarchies and multiple levels of parallelism. It orchestrates a collection of parametrizable optimization problems for locality and parallelism objectives over a polyhedral space of semantics-preserving transformations. We discuss the rationale for this unified algorithm and validate it on a collection of representative computational kernels and benchmarks.
We study the challenges of using polyhedral compilation techniques for a complex, real-world, end-to-end image processing application, SLAMBench. SLAMBench has several non-affine kernels that are not readily amenable to polyhedral compilation. We show the usefulness of summary functions for compiling such non-affine parts of the program, thus extending the reach of polyhedral compilation. We also present the prl runtime library, which is needed to avoid redundant data transfers between device and host. We validate our high-level compilation approach through experiments comparing the performance of the generated code with a highly optimized manual version of SLAMBench.
We also study the applicability of polyhedral compilation to optimizing deep learning pipelines. Most operations in deep learning pipelines are affine and hence suitable for polyhedral compilation. Our framework is built on TVM, an end-to-end deep learning compilation framework that supports multiple front ends, such as MXNet and TensorFlow, and multiple target architectures. We extract a polyhedral representation from the TVM IR and use the polyhedral scheduler, along with performance-model-based autotuning, to automatically find schedules for TVM operators. In this context we extend the polyhedral scheduler to find optimal schedules for different tensor sizes and shapes. We model the amount of data reuse for the case when all parameter values are known, and formulate constraints in the ILP to maximize data reuse. We also present a performance-model-based autotuning technique that can cut tuning time from hours to minutes. We conduct experiments on common deep learning benchmarks, validating the effectiveness and general applicability of our technique in providing portable performance.
Finally, we summarize our work and present concluding remarks as well as future research directions. We believe the improvements proposed in this dissertation increase the effectiveness of the polyhedral framework as a loop transformation framework for compiling DSLs." @default.
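The reduction handling the abstract describes can be illustrated with a small example. The sketch below is not the thesis's PENCIL extension nor its generated GPU template code; it only shows, via a standard OpenMP clause, the serial-to-parallel transformation that modeling reduction dependences makes legal.

```c
#include <omp.h>

/* A sum reduction. Every iteration reads and writes `acc`, so a scheduler
 * that treats these accesses as ordinary flow dependences must keep the
 * loop serial. Modeling them as reduction dependences (the operator is
 * associative and commutative) allows iterations to be reordered and run
 * in parallel, combining per-thread partial sums at the end -- which is
 * exactly what the OpenMP reduction clause below does. */
float sum(const float *a, int n) {
    float acc = 0.0f;
    #pragma omp parallel for reduction(+:acc)
    for (int i = 0; i < n; i++)
        acc += a[i];
    return acc;
}
```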
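For context on the Pluto cost function the abstract refers to, the standard formulation from the published Pluto algorithm can be written out. This is background from the literature, not the thesis's extended objective, which additionally models spatial effects and data reuse.

```latex
% For every dependence (s, t) in the dependence polyhedron P_e, a legal
% per-statement affine schedule \phi must not run the target before the
% source:
\[
  \phi_T(t) - \phi_S(s) \ge 0 \quad \forall (s, t) \in P_e .
\]
% Pluto then minimizes a parametric upper bound u \cdot n + w (n being the
% vector of program parameters) on the dependence distance, which shortens
% reuse distances and thereby improves temporal locality:
\[
  \phi_T(t) - \phi_S(s) \le u \cdot n + w \quad \forall (s, t) \in P_e ,
\]
% with the ILP minimizing (u, w). The unified algorithm described in the
% abstract layers spatial-locality and parallelism objectives on top of
% constraints of this form.
```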
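A small concrete case of the temporal-plus-spatial locality trade-off: the hand-tiled matrix multiplication below illustrates the kind of schedule a unified locality objective would favor. The tile size is an arbitrary placeholder, not a value from the thesis.

```c
#define TILE 32  /* illustrative tile size, not taken from the thesis */

/* Accumulates C[i][j] += A[i][k] * B[k][j] (C assumed pre-initialized).
 * Tiling over i and k keeps tiles of A and B hot in cache (temporal
 * locality); keeping j innermost gives stride-1 accesses to B and C
 * (spatial locality), which favors vectorization and, on GPUs,
 * coalesced memory accesses. */
void matmul_tiled(int n, const float *A, const float *B, float *C) {
    for (int ii = 0; ii < n; ii += TILE)
        for (int kk = 0; kk < n; kk += TILE)
            for (int i = ii; i < ii + TILE && i < n; i++)
                for (int k = kk; k < kk + TILE && k < n; k++)
                    for (int j = 0; j < n; j++)
                        C[i * n + j] += A[i * n + k] * B[k * n + j];
}
```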
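The summary-function idea used for SLAMBench's non-affine kernels can also be sketched. In the hypothetical fragment below, the real kernel indexes memory through a data-dependent offset, which the polyhedral model cannot analyze exactly; a separate summary function over-approximates its accesses with affine ones so the enclosing loops remain transformable. The function names and the pairing mechanism are illustrative; PENCIL's actual summary syntax is not reproduced here.

```c
/* Real kernel: img[idx[i]] is a data-dependent (non-affine) access, so
 * its memory footprint cannot be described exactly in the polyhedral
 * model. */
void gather(float *out, const float *img, const int *idx, int n) {
    for (int i = 0; i < n; i++)
        out[i] = img[idx[i]];
}

/* Summary function (hypothetical name and pairing): a conservative,
 * affine over-approximation of gather's accesses -- it may read any of
 * img[0..m-1] and writes out[0..n-1]. A compiler using summaries
 * analyzes this body instead of the opaque one. */
void gather_summary(float *out, const float *img, const int *idx,
                    int n, int m) {
    volatile float sink;
    for (int i = 0; i < n; i++) out[i] = 0.0f;  /* may write out[0..n-1] */
    for (int j = 0; j < m; j++) sink = img[j];  /* may read img[0..m-1]  */
    (void)idx; (void)sink;
}
```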
- W2992034195 created "2019-12-13" @default.
- W2992034195 creator A5064495467 @default.
- W2992034195 date "2019-03-31" @default.
- W2992034195 modified "2023-09-23" @default.
- W2992034195 title "Polyhedral Compilation for Domain Specific Languages" @default.
- W2992034195 hasPublicationYear "2019" @default.
- W2992034195 type Work @default.
- W2992034195 sameAs 2992034195 @default.
- W2992034195 citedByCount "0" @default.
- W2992034195 crossrefType "dissertation" @default.
- W2992034195 hasAuthorship W2992034195A5064495467 @default.
- W2992034195 hasConcept C111335779 @default.
- W2992034195 hasConcept C113391598 @default.
- W2992034195 hasConcept C118615104 @default.
- W2992034195 hasConcept C120314980 @default.
- W2992034195 hasConcept C135257023 @default.
- W2992034195 hasConcept C138885662 @default.
- W2992034195 hasConcept C145691206 @default.
- W2992034195 hasConcept C149635348 @default.
- W2992034195 hasConcept C151730666 @default.
- W2992034195 hasConcept C162324750 @default.
- W2992034195 hasConcept C169590947 @default.
- W2992034195 hasConcept C173608175 @default.
- W2992034195 hasConcept C199360897 @default.
- W2992034195 hasConcept C200833197 @default.
- W2992034195 hasConcept C202444582 @default.
- W2992034195 hasConcept C206729178 @default.
- W2992034195 hasConcept C21547014 @default.
- W2992034195 hasConcept C2524010 @default.
- W2992034195 hasConcept C2779343474 @default.
- W2992034195 hasConcept C2779808786 @default.
- W2992034195 hasConcept C2780513914 @default.
- W2992034195 hasConcept C33923547 @default.
- W2992034195 hasConcept C41008148 @default.
- W2992034195 hasConcept C41895202 @default.
- W2992034195 hasConcept C76970557 @default.
- W2992034195 hasConcept C80444323 @default.
- W2992034195 hasConcept C86803240 @default.
- W2992034195 hasConcept C92757383 @default.
- W2992034195 hasConceptScore W2992034195C111335779 @default.
- W2992034195 hasConceptScore W2992034195C113391598 @default.
- W2992034195 hasConceptScore W2992034195C118615104 @default.
- W2992034195 hasConceptScore W2992034195C120314980 @default.
- W2992034195 hasConceptScore W2992034195C135257023 @default.
- W2992034195 hasConceptScore W2992034195C138885662 @default.
- W2992034195 hasConceptScore W2992034195C145691206 @default.
- W2992034195 hasConceptScore W2992034195C149635348 @default.
- W2992034195 hasConceptScore W2992034195C151730666 @default.
- W2992034195 hasConceptScore W2992034195C162324750 @default.
- W2992034195 hasConceptScore W2992034195C169590947 @default.
- W2992034195 hasConceptScore W2992034195C173608175 @default.
- W2992034195 hasConceptScore W2992034195C199360897 @default.
- W2992034195 hasConceptScore W2992034195C200833197 @default.
- W2992034195 hasConceptScore W2992034195C202444582 @default.
- W2992034195 hasConceptScore W2992034195C206729178 @default.
- W2992034195 hasConceptScore W2992034195C21547014 @default.
- W2992034195 hasConceptScore W2992034195C2524010 @default.
- W2992034195 hasConceptScore W2992034195C2779343474 @default.
- W2992034195 hasConceptScore W2992034195C2779808786 @default.
- W2992034195 hasConceptScore W2992034195C2780513914 @default.
- W2992034195 hasConceptScore W2992034195C33923547 @default.
- W2992034195 hasConceptScore W2992034195C41008148 @default.
- W2992034195 hasConceptScore W2992034195C41895202 @default.
- W2992034195 hasConceptScore W2992034195C76970557 @default.
- W2992034195 hasConceptScore W2992034195C80444323 @default.
- W2992034195 hasConceptScore W2992034195C86803240 @default.
- W2992034195 hasConceptScore W2992034195C92757383 @default.
- W2992034195 hasOpenAccess W2992034195 @default.
- W2992034195 hasRelatedWork W1565283590 @default.
- W2992034195 hasRelatedWork W1569608833 @default.
- W2992034195 hasRelatedWork W1984692061 @default.
- W2992034195 hasRelatedWork W1988659054 @default.
- W2992034195 hasRelatedWork W2016801240 @default.
- W2992034195 hasRelatedWork W2227848409 @default.
- W2992034195 hasRelatedWork W2252549844 @default.
- W2992034195 hasRelatedWork W2291474020 @default.
- W2992034195 hasRelatedWork W2314944927 @default.
- W2992034195 hasRelatedWork W243738321 @default.
- W2992034195 hasRelatedWork W2606167063 @default.
- W2992034195 hasRelatedWork W2625706279 @default.
- W2992034195 hasRelatedWork W2744165236 @default.
- W2992034195 hasRelatedWork W2772573827 @default.
- W2992034195 hasRelatedWork W2788464413 @default.
- W2992034195 hasRelatedWork W2963094369 @default.
- W2992034195 hasRelatedWork W2979396597 @default.
- W2992034195 hasRelatedWork W3118614023 @default.
- W2992034195 hasRelatedWork W3132664025 @default.
- W2992034195 hasRelatedWork W1569709052 @default.
- W2992034195 isParatext "false" @default.
- W2992034195 isRetracted "false" @default.
- W2992034195 magId "2992034195" @default.
- W2992034195 workType "dissertation" @default.