Matches in SemOpenAlex for { <https://semopenalex.org/work/W74295851> ?p ?o ?g. }
- W74295851 abstract "With increasing demands for performance by embedded systems, especially by digital signal processing (DSP) applications, embedded processors must increase available instruction-level parallelism (ILP) within significant constraints on power consumption and chip cost. Unfortunately, supporting a large amount of ILP on a processor while maintaining a single register file increases chip cost and potentially decreases overall performance due to increased cycle time. To address this problem, some modern embedded processors partition the register file into multiple low-ported register files, each directly connected with one or more functional units. These functional unit/register file groups are called clusters. Clustered VLIW (very long instruction word) architectures need extra copy operations or delays to transfer values among clusters. To take advantage of clustered architectures, the compiler must expose parallelism for maximal functional-unit utilization, and schedule instructions to reduce intercluster communication overhead. High-level loop transformations offer an excellent opportunity to enhance the abilities of low-level optimizers to generate code for clustered architectures. This dissertation investigates the effects of three loop transformations, i.e., loop fusion, loop unrolling, and unroll-and-jam, on clustered VLIW architectures. The objective is to achieve high performance with low communication overhead. This dissertation discusses the following techniques: (1) Loop fusion. This research examines the impact of loop fusion on clustered architectures. A metric based upon communication costs for guiding loop fusion is developed and tested on DSP benchmarks. (2) Unroll-and-jam and loop unrolling. A new method that integrates a communication cost model with an integer-optimization problem is developed to determine unroll amounts for loop unrolling and unroll-and-jam automatically for a specific loop on a specific architecture. 
These techniques have been implemented and tested using DSP benchmarks on simulated, clustered VLIW architectures and a real clustered, embedded processor, the TI TMS320C64X. The results show that the new techniques achieve an average speedup of 1.72–1.89 on five different clustered architectures." @default.
- W74295851 created "2016-06-24" @default.
- W74295851 creator A5002599549 @default.
- W74295851 creator A5026633500 @default.
- W74295851 date "2020-12-10" @default.
- W74295851 modified "2023-09-23" @default.
- W74295851 title "Loop transformations for clustered VLIW architectures" @default.
- W74295851 cites W105128281 @default.
- W74295851 cites W1500590019 @default.
- W74295851 cites W152868167 @default.
- W74295851 cites W1533942910 @default.
- W74295851 cites W1536051636 @default.
- W74295851 cites W1542604854 @default.
- W74295851 cites W1560132800 @default.
- W74295851 cites W1577093684 @default.
- W74295851 cites W1585509108 @default.
- W74295851 cites W1601372155 @default.
- W74295851 cites W1966708457 @default.
- W74295851 cites W1971065013 @default.
- W74295851 cites W1986906448 @default.
- W74295851 cites W2040167141 @default.
- W74295851 cites W2043555680 @default.
- W74295851 cites W2051533028 @default.
- W74295851 cites W2057577013 @default.
- W74295851 cites W2063397164 @default.
- W74295851 cites W2073866852 @default.
- W74295851 cites W2078113878 @default.
- W74295851 cites W2089463224 @default.
- W74295851 cites W2099206054 @default.
- W74295851 cites W2100097836 @default.
- W74295851 cites W2102582914 @default.
- W74295851 cites W2108315152 @default.
- W74295851 cites W2115184416 @default.
- W74295851 cites W2119609467 @default.
- W74295851 cites W2123516374 @default.
- W74295851 cites W2129962996 @default.
- W74295851 cites W2131929304 @default.
- W74295851 cites W2139802090 @default.
- W74295851 cites W2146468199 @default.
- W74295851 cites W2148967747 @default.
- W74295851 cites W2155006335 @default.
- W74295851 cites W2158161220 @default.
- W74295851 cites W2166665311 @default.
- W74295851 cites W2172062522 @default.
- W74295851 cites W2296760900 @default.
- W74295851 cites W90084074 @default.
- W74295851 doi "https://doi.org/10.37099/mtu.dc.etds/181" @default.
- W74295851 hasPublicationYear "2020" @default.
- W74295851 type Work @default.
- W74295851 sameAs 74295851 @default.
- W74295851 citedByCount "2" @default.
- W74295851 crossrefType "dissertation" @default.
- W74295851 hasAuthorship W74295851A5002599549 @default.
- W74295851 hasAuthorship W74295851A5026633500 @default.
- W74295851 hasBestOaLocation W742958511 @default.
- W74295851 hasConcept C111919701 @default.
- W74295851 hasConcept C117280010 @default.
- W74295851 hasConcept C11799548 @default.
- W74295851 hasConcept C128916667 @default.
- W74295851 hasConcept C133162039 @default.
- W74295851 hasConcept C140763907 @default.
- W74295851 hasConcept C169590947 @default.
- W74295851 hasConcept C170595534 @default.
- W74295851 hasConcept C173608175 @default.
- W74295851 hasConcept C190902152 @default.
- W74295851 hasConcept C202491316 @default.
- W74295851 hasConcept C26517878 @default.
- W74295851 hasConcept C2779960059 @default.
- W74295851 hasConcept C2781172179 @default.
- W74295851 hasConcept C29331672 @default.
- W74295851 hasConcept C41008148 @default.
- W74295851 hasConcept C76970557 @default.
- W74295851 hasConcept C84462506 @default.
- W74295851 hasConcept C9390403 @default.
- W74295851 hasConceptScore W74295851C111919701 @default.
- W74295851 hasConceptScore W74295851C117280010 @default.
- W74295851 hasConceptScore W74295851C11799548 @default.
- W74295851 hasConceptScore W74295851C128916667 @default.
- W74295851 hasConceptScore W74295851C133162039 @default.
- W74295851 hasConceptScore W74295851C140763907 @default.
- W74295851 hasConceptScore W74295851C169590947 @default.
- W74295851 hasConceptScore W74295851C170595534 @default.
- W74295851 hasConceptScore W74295851C173608175 @default.
- W74295851 hasConceptScore W74295851C190902152 @default.
- W74295851 hasConceptScore W74295851C202491316 @default.
- W74295851 hasConceptScore W74295851C26517878 @default.
- W74295851 hasConceptScore W74295851C2779960059 @default.
- W74295851 hasConceptScore W74295851C2781172179 @default.
- W74295851 hasConceptScore W74295851C29331672 @default.
- W74295851 hasConceptScore W74295851C41008148 @default.
- W74295851 hasConceptScore W74295851C76970557 @default.
- W74295851 hasConceptScore W74295851C84462506 @default.
- W74295851 hasConceptScore W74295851C9390403 @default.
- W74295851 hasLocation W742958511 @default.
- W74295851 hasOpenAccess W74295851 @default.
- W74295851 hasPrimaryLocation W742958511 @default.
- W74295851 hasRelatedWork W1580623006 @default.
- W74295851 hasRelatedWork W1581686471 @default.
- W74295851 hasRelatedWork W1584526780 @default.
- W74295851 hasRelatedWork W1605154670 @default.
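
The abstract of W74295851 names loop fusion and unroll-and-jam as transformations studied for clustered VLIW targets. The sketch below only illustrates what those two transformations do to a loop nest. It is not the dissertation's implementation: the function names are hypothetical, the unroll factor of 2 is an arbitrary assumption (the abstract describes choosing unroll amounts with a communication cost model and an integer-optimization problem, which is not reproduced here), and Python does not model clusters or intercluster copies.

```python
def scale_then_offset(x, s, o):
    """Two separate loops: the intermediate list y is written, then re-read."""
    y = [v * s for v in x]
    return [v + o for v in y]


def scale_then_offset_fused(x, s, o):
    """Loop fusion: both statements share one loop body, so the
    intermediate value stays in a local instead of a full array."""
    z = [0] * len(x)
    for i in range(len(x)):
        t = x[i] * s      # formerly the body of the first loop
        z[i] = t + o      # formerly the body of the second loop
    return z


def matmul(a, b, n):
    """Baseline n x n integer matrix multiply (i, j, k loop nest)."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_unroll_and_jam(a, b, n):
    """Unroll the outer i loop by 2 (assumed factor) and jam the two
    copies into one inner body. The two updates are independent and
    reuse b[k][j], which exposes more parallelism per iteration."""
    c = [[0] * n for _ in range(n)]
    i = 0
    while i + 1 < n:
        for j in range(n):
            for k in range(n):
                bkj = b[k][j]                  # shared by both rows
                c[i][j] += a[i][k] * bkj
                c[i + 1][j] += a[i + 1][k] * bkj
        i += 2
    # Cleanup loop for the leftover row when n is odd.
    for r in range(i, n):
        for j in range(n):
            for k in range(n):
                c[r][j] += a[r][k] * b[k][j]
    return c
```

Both transformed versions compute the same results as their baselines; only the loop structure changes. On a clustered machine, a scheduler could in principle place the two jammed updates on different clusters, which is where the communication cost the abstract mentions would come into play.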