Matches in SemOpenAlex for { <https://semopenalex.org/work/W2914575557> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W2914575557 abstract "The need for high performance coupled with the increasing design complexity of modern processors and power and thermal constraints has led to the development of multi-core systems. Examples of such systems include IBM/Toshiba's Cell processor, Intel's Core 2 Duo processor. One of the ways to exploit the hardware parallelism of such systems is via thread-level program parallelization. Although there has been a large amount of work done in the context of multithreading, the lack of detailed application characterization on real machines makes it difficult to assess the relevance and importance of the problems addressed in prior work and also of the practicality of the solutions proposed. To alleviate this limitation, we did a thorough analysis of ordinary programs, as represented by industry-standard SPEC benchmarks, on both IA-32 and IA-64 architectures to identify real performance bottlenecks. Based on the above and given that loops account for a large percentage of the total execution time in ordinary programs, we propose techniques for extracting thread-level parallelism (TLP) from both DOALL and non-DOALL types of loops. Extraction of TLP from DOALL loops entails efficient partitioning and mapping of a DOALL loop so as to achieve load balance between the different processors. In this regard, we present a general approach for partitioning nested DOALL loops, both perfect and non-perfect, with conditionals, with rectangular and non-rectangular iteration geometries, where the expressions in a conditional are affine functions of the outer loop indices. Non-DOALL loops can be parallelized either speculatively (TLS) or via explicit synchronization. Although TLS enables parallel execution of difficult-to-analyze (at compile time) program regions, its efficacy is limited by a wide variety of factors such as high misspeculation penalty and the need for additional hardware. This necessitates an evaluation of the performance potential of TLS. Using the Intel Fortran/C++ compiler, we show that the speedup achievable via TLS, at the loop level, is minimal in ordinary programs. Therefore, we adopted explicit synchronization as the way to parallelize non-DOALL loops and proposed lightweight lock-free synchronization techniques for extracting TLP from non-DOALL loops. We show that the proposed techniques achieve better performance than the state-of-the-art on real machines." @default.
- W2914575557 created "2019-02-21" @default.
- W2914575557 creator A5042512979 @default.
- W2914575557 creator A5047988079 @default.
- W2914575557 date "2008-01-01" @default.
- W2914575557 modified "2023-09-28" @default.
- W2914575557 title "On the evaluation and extraction of thread-level parallelism in ordinary programs" @default.
- W2914575557 hasPublicationYear "2008" @default.
- W2914575557 type Work @default.
- W2914575557 sameAs 2914575557 @default.
- W2914575557 citedByCount "0" @default.
- W2914575557 crossrefType "journal-article" @default.
- W2914575557 hasAuthorship W2914575557A5042512979 @default.
- W2914575557 hasAuthorship W2914575557A5047988079 @default.
- W2914575557 hasConcept C138101251 @default.
- W2914575557 hasConcept C140763907 @default.
- W2914575557 hasConcept C165696696 @default.
- W2914575557 hasConcept C173608175 @default.
- W2914575557 hasConcept C199360897 @default.
- W2914575557 hasConcept C201410400 @default.
- W2914575557 hasConcept C2778565505 @default.
- W2914575557 hasConcept C2781172179 @default.
- W2914575557 hasConcept C38652104 @default.
- W2914575557 hasConcept C41008148 @default.
- W2914575557 hasConcept C42992933 @default.
- W2914575557 hasConcept C78766204 @default.
- W2914575557 hasConcept C85717602 @default.
- W2914575557 hasConceptScore W2914575557C138101251 @default.
- W2914575557 hasConceptScore W2914575557C140763907 @default.
- W2914575557 hasConceptScore W2914575557C165696696 @default.
- W2914575557 hasConceptScore W2914575557C173608175 @default.
- W2914575557 hasConceptScore W2914575557C199360897 @default.
- W2914575557 hasConceptScore W2914575557C201410400 @default.
- W2914575557 hasConceptScore W2914575557C2778565505 @default.
- W2914575557 hasConceptScore W2914575557C2781172179 @default.
- W2914575557 hasConceptScore W2914575557C38652104 @default.
- W2914575557 hasConceptScore W2914575557C41008148 @default.
- W2914575557 hasConceptScore W2914575557C42992933 @default.
- W2914575557 hasConceptScore W2914575557C78766204 @default.
- W2914575557 hasConceptScore W2914575557C85717602 @default.
- W2914575557 hasLocation W29145755571 @default.
- W2914575557 hasOpenAccess W2914575557 @default.
- W2914575557 hasPrimaryLocation W29145755571 @default.
- W2914575557 hasRelatedWork W1556466444 @default.
- W2914575557 hasRelatedWork W1577309527 @default.
- W2914575557 hasRelatedWork W1591453641 @default.
- W2914575557 hasRelatedWork W1604421843 @default.
- W2914575557 hasRelatedWork W1898598714 @default.
- W2914575557 hasRelatedWork W1976146408 @default.
- W2914575557 hasRelatedWork W1980385176 @default.
- W2914575557 hasRelatedWork W2023223057 @default.
- W2914575557 hasRelatedWork W2059654835 @default.
- W2914575557 hasRelatedWork W2079641645 @default.
- W2914575557 hasRelatedWork W2085309342 @default.
- W2914575557 hasRelatedWork W2088119230 @default.
- W2914575557 hasRelatedWork W2094407713 @default.
- W2914575557 hasRelatedWork W2187874404 @default.
- W2914575557 hasRelatedWork W2291083400 @default.
- W2914575557 hasRelatedWork W2513828528 @default.
- W2914575557 hasRelatedWork W2793755249 @default.
- W2914575557 hasRelatedWork W2943205854 @default.
- W2914575557 hasRelatedWork W302866077 @default.
- W2914575557 hasRelatedWork W74270536 @default.
- W2914575557 isParatext "false" @default.
- W2914575557 isRetracted "false" @default.
- W2914575557 magId "2914575557" @default.
- W2914575557 workType "article" @default.