Matches in SemOpenAlex for { <https://semopenalex.org/work/W2920856836> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W2920856836 endingPage "869" @default.
- W2920856836 startingPage "862" @default.
- W2920856836 abstract "This article presents an optimization method of the parallelism extraction algorithm using spanning tree that automatically exploits the parallelism and determines an execution order of multiple kernel programs in a distributed environment. In stream‐based computing, efficient parallel execution requires careful scheduling of the invocation of the kernel programs. By mapping a kernel to a node and an I/O stream to an edge, the entire stream process can be treated as a spanning tree. The spanning tree, which allows feedback and feedforward edges, is effective for expressing dependencies that exist among kernels. In spanning tree, the nodes at the same depth do not have edges between them, and thus can be executed in parallel in the case parent nodes have been already executed. The series of the nodes can be executed in a pipelined manner. Thus, the proposed algorithm can extract both spatial and temporal parallelism. However, if the algorithm is applied for feedbacks as it is, because of waiting for the completion of the loop among the nodes, it causes the waste of time. To solve this problem, the parallel pattern can be optimized in the step of generating the communication pattern to increase the degree of parallelism. In addition, because of the difference in execution time among kernels, the load balancing can be considered for an optimization for the algorithm. To evaluate the effectiveness of the optimized algorithm, a k‐means application was developed and parallelized especially for the feedback processing. The results show that the parallel execution using two nodes of a graphics processing unit (GPU) cluster obtained 1.5 times speedup. With load balancing, the parallel execution using four nodes of the cluster obtained up to 3.5 times speedup in 2D‐FFT and 3.0 times speedup in LU decomposition, compared to the execution on a single GPU. © 2019 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc." @default.
- W2920856836 created "2019-03-22" @default.
- W2920856836 creator A5027866181 @default.
- W2920856836 creator A5051244009 @default.
- W2920856836 creator A5078633631 @default.
- W2920856836 date "2019-03-13" @default.
- W2920856836 modified "2023-10-16" @default.
- W2920856836 title "Optimization in the parallelism extraction algorithm with spanning tree on a multi‐GPU environment" @default.
- W2920856836 cites W2009937690 @default.
- W2920856836 cites W2068394524 @default.
- W2920856836 cites W2104223883 @default.
- W2920856836 cites W2118382442 @default.
- W2920856836 cites W2135920560 @default.
- W2920856836 cites W2271498114 @default.
- W2920856836 cites W2536427613 @default.
- W2920856836 cites W2572431316 @default.
- W2920856836 cites W2769798155 @default.
- W2920856836 cites W4245964587 @default.
- W2920856836 cites W4255334208 @default.
- W2920856836 doi "https://doi.org/10.1002/tee.22875" @default.
- W2920856836 hasPublicationYear "2019" @default.
- W2920856836 type Work @default.
- W2920856836 sameAs 2920856836 @default.
- W2920856836 citedByCount "0" @default.
- W2920856836 crossrefType "journal-article" @default.
- W2920856836 hasAuthorship W2920856836A5027866181 @default.
- W2920856836 hasAuthorship W2920856836A5051244009 @default.
- W2920856836 hasAuthorship W2920856836A5078633631 @default.
- W2920856836 hasConcept C113174947 @default.
- W2920856836 hasConcept C11413529 @default.
- W2920856836 hasConcept C114614502 @default.
- W2920856836 hasConcept C120373497 @default.
- W2920856836 hasConcept C126255220 @default.
- W2920856836 hasConcept C134306372 @default.
- W2920856836 hasConcept C13743678 @default.
- W2920856836 hasConcept C173608175 @default.
- W2920856836 hasConcept C206729178 @default.
- W2920856836 hasConcept C2781172179 @default.
- W2920856836 hasConcept C33923547 @default.
- W2920856836 hasConcept C41008148 @default.
- W2920856836 hasConcept C61483411 @default.
- W2920856836 hasConcept C64331007 @default.
- W2920856836 hasConcept C74193536 @default.
- W2920856836 hasConceptScore W2920856836C113174947 @default.
- W2920856836 hasConceptScore W2920856836C11413529 @default.
- W2920856836 hasConceptScore W2920856836C114614502 @default.
- W2920856836 hasConceptScore W2920856836C120373497 @default.
- W2920856836 hasConceptScore W2920856836C126255220 @default.
- W2920856836 hasConceptScore W2920856836C134306372 @default.
- W2920856836 hasConceptScore W2920856836C13743678 @default.
- W2920856836 hasConceptScore W2920856836C173608175 @default.
- W2920856836 hasConceptScore W2920856836C206729178 @default.
- W2920856836 hasConceptScore W2920856836C2781172179 @default.
- W2920856836 hasConceptScore W2920856836C33923547 @default.
- W2920856836 hasConceptScore W2920856836C41008148 @default.
- W2920856836 hasConceptScore W2920856836C61483411 @default.
- W2920856836 hasConceptScore W2920856836C64331007 @default.
- W2920856836 hasConceptScore W2920856836C74193536 @default.
- W2920856836 hasIssue "6" @default.
- W2920856836 hasLocation W29208568361 @default.
- W2920856836 hasOpenAccess W2920856836 @default.
- W2920856836 hasPrimaryLocation W29208568361 @default.
- W2920856836 hasRelatedWork W1534022569 @default.
- W2920856836 hasRelatedWork W1578204257 @default.
- W2920856836 hasRelatedWork W1608806855 @default.
- W2920856836 hasRelatedWork W1850053445 @default.
- W2920856836 hasRelatedWork W2023505575 @default.
- W2920856836 hasRelatedWork W2047588290 @default.
- W2920856836 hasRelatedWork W2074226157 @default.
- W2920856836 hasRelatedWork W2313503008 @default.
- W2920856836 hasRelatedWork W2366027386 @default.
- W2920856836 hasRelatedWork W2378666660 @default.
- W2920856836 hasVolume "14" @default.
- W2920856836 isParatext "false" @default.
- W2920856836 isRetracted "false" @default.
- W2920856836 magId "2920856836" @default.
- W2920856836 workType "article" @default.
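
The abstract in this record describes mapping each kernel to a node and each I/O stream to an edge, then treating nodes at the same depth of a spanning tree as candidates for parallel execution once their parents have run. The following is a minimal sketch of that depth-grouping idea only. It assumes a breadth-first traversal and a hypothetical four-kernel graph; the function name `depth_levels`, the kernel names, and the edge list are illustrative and do not come from the paper. It does not reproduce the paper's algorithm, its feedback optimization, or its load balancing.

```python
from collections import deque


def depth_levels(edges, root):
    """Group kernels by depth in a breadth-first spanning tree rooted at `root`.

    `edges` is a list of (producer, consumer) pairs, one per I/O stream.
    Edges that lead to an already visited kernel (for example a feedback
    edge) are not tree edges and are skipped here.
    """
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in children.get(node, []):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    # Collect kernels that share a depth into one stage.
    levels = {}
    for node, d in depth.items():
        levels.setdefault(d, []).append(node)
    return [levels[d] for d in sorted(levels)]


# Hypothetical stream graph: A feeds B and C, both feed D, and D feeds back to A.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "A")]
stages = depth_levels(edges, "A")
for i, stage in enumerate(stages):
    print(f"stage {i}: {stage}")
```

In this sketch each stage lists kernels that could be launched together, and consecutive stages correspond to the pipelined order the abstract mentions. The feedback edge from D to A is ignored during traversal, which is the naive handling that the abstract says wastes time and that the paper sets out to optimize.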