Matches in SemOpenAlex for { <https://semopenalex.org/work/W4311777793> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W4311777793 endingPage "108624" @default.
- W4311777793 startingPage "108624" @default.
- W4311777793 abstract "We develop efficient kernels for elemental operators of matrix-free solvers of the Helmholtz equation, which are the core operations for incompressible Navier-Stokes solvers, for use on graphics-processing units (GPUs). Our primary concern in this work is the extension of matrix-free routines to efficiently evaluate this elliptic operator on regular and curvilinear triangular elements in a tensor-product manner. We investigate two types of efficient CUDA kernels for a range of polynomial orders and thus varying arithmetic intensities: the first maps each elemental operation to a CUDA-thread for a completely vectorised kernel, whilst the second maps each element to a CUDA-block for nested parallelism. Our results show that the first option is beneficial for elements with low polynomial order, whereas the second option is beneficial for elements of higher order. The crossover point between these two schemes for the hardware used in this study corresponds to polynomial orders at around P=4−5, depending on element type. For both options, we highlight the importance of the layout of data structures, which necessitates the development of interleaved elemental data for vectorised kernels, and analyse the effect of selecting different memory spaces on the GPU. As the considered kernels are foremost memory-bandwidth bound, we develop kernels for curved elements that trade memory bandwidth against additional arithmetic operations, and demonstrate improved throughput in selected cases. We further compare our optimised CUDA kernels against optimised OpenACC kernels, to contrast the performance between a native and a portable programming model for GPUs." @default.
- W4311777793 created "2022-12-28" @default.
- W4311777793 creator A5058690383 @default.
- W4311777793 creator A5063877104 @default.
- W4311777793 creator A5084269203 @default.
- W4311777793 date "2023-03-01" @default.
- W4311777793 modified "2023-09-26" @default.
- W4311777793 title "Efficient vectorised kernels for unstructured high-order finite element fluid solvers on GPU architectures in two dimensions" @default.
- W4311777793 cites W1972119727 @default.
- W4311777793 cites W1976584147 @default.
- W4311777793 cites W1984334096 @default.
- W4311777793 cites W1985144286 @default.
- W4311777793 cites W2002555321 @default.
- W4311777793 cites W2003408966 @default.
- W4311777793 cites W2031149877 @default.
- W4311777793 cites W2078794610 @default.
- W4311777793 cites W2102076631 @default.
- W4311777793 cites W2102295719 @default.
- W4311777793 cites W2127570421 @default.
- W4311777793 cites W2140741668 @default.
- W4311777793 cites W2148897203 @default.
- W4311777793 cites W2172962898 @default.
- W4311777793 cites W2465804403 @default.
- W4311777793 cites W2502518620 @default.
- W4311777793 cites W2767520933 @default.
- W4311777793 cites W2785159087 @default.
- W4311777793 cites W2786313222 @default.
- W4311777793 cites W2963610942 @default.
- W4311777793 cites W2964086450 @default.
- W4311777793 cites W2996005172 @default.
- W4311777793 cites W3011811412 @default.
- W4311777793 cites W3018936555 @default.
- W4311777793 cites W3103023519 @default.
- W4311777793 cites W3179468891 @default.
- W4311777793 cites W3209330254 @default.
- W4311777793 doi "https://doi.org/10.1016/j.cpc.2022.108624" @default.
- W4311777793 hasPublicationYear "2023" @default.
- W4311777793 type Work @default.
- W4311777793 citedByCount "0" @default.
- W4311777793 crossrefType "journal-article" @default.
- W4311777793 hasAuthorship W4311777793A5058690383 @default.
- W4311777793 hasAuthorship W4311777793A5063877104 @default.
- W4311777793 hasAuthorship W4311777793A5084269203 @default.
- W4311777793 hasBestOaLocation W43117777931 @default.
- W4311777793 hasConcept C118615104 @default.
- W4311777793 hasConcept C173608175 @default.
- W4311777793 hasConcept C188045654 @default.
- W4311777793 hasConcept C2778119891 @default.
- W4311777793 hasConcept C2779851693 @default.
- W4311777793 hasConcept C33923547 @default.
- W4311777793 hasConcept C41008148 @default.
- W4311777793 hasConcept C459310 @default.
- W4311777793 hasConcept C74193536 @default.
- W4311777793 hasConceptScore W4311777793C118615104 @default.
- W4311777793 hasConceptScore W4311777793C173608175 @default.
- W4311777793 hasConceptScore W4311777793C188045654 @default.
- W4311777793 hasConceptScore W4311777793C2778119891 @default.
- W4311777793 hasConceptScore W4311777793C2779851693 @default.
- W4311777793 hasConceptScore W4311777793C33923547 @default.
- W4311777793 hasConceptScore W4311777793C41008148 @default.
- W4311777793 hasConceptScore W4311777793C459310 @default.
- W4311777793 hasConceptScore W4311777793C74193536 @default.
- W4311777793 hasFunder F4320320283 @default.
- W4311777793 hasFunder F4320334627 @default.
- W4311777793 hasLocation W43117777931 @default.
- W4311777793 hasOpenAccess W4311777793 @default.
- W4311777793 hasPrimaryLocation W43117777931 @default.
- W4311777793 hasRelatedWork W1965456742 @default.
- W4311777793 hasRelatedWork W1997955449 @default.
- W4311777793 hasRelatedWork W2002833390 @default.
- W4311777793 hasRelatedWork W2003609199 @default.
- W4311777793 hasRelatedWork W2011159963 @default.
- W4311777793 hasRelatedWork W2030707850 @default.
- W4311777793 hasRelatedWork W2048593763 @default.
- W4311777793 hasRelatedWork W2063888806 @default.
- W4311777793 hasRelatedWork W2364686214 @default.
- W4311777793 hasRelatedWork W2794923745 @default.
- W4311777793 hasVolume "284" @default.
- W4311777793 isParatext "false" @default.
- W4311777793 isRetracted "false" @default.
- W4311777793 workType "article" @default.
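
The abstract above notes that vectorised kernels, where each element is mapped to one CUDA thread, need interleaved elemental data. The following is a minimal sketch of that indexing idea in plain Python; it is not the authors' implementation. The function names, the element count, and the choice of 6 coefficients per element (the mode count of a triangular expansion at P=2) are assumptions made for illustration only.

```python
# Illustrative sketch only: names and sizes are hypothetical, not from the paper.

def element_major_index(elem, dof, n_dof):
    """Offset when each element's coefficients are stored contiguously."""
    return elem * n_dof + dof


def interleaved_index(elem, dof, n_elem):
    """Offset when coefficient `dof` of every element is stored contiguously."""
    return dof * n_elem + elem


def access_stride(index_fn, size, dof=0):
    """Memory distance between the entries read by neighbouring threads 0 and 1,
    assuming thread t handles element t and all threads read the same `dof`."""
    return index_fn(1, dof, size) - index_fn(0, dof, size)


n_elem, n_dof = 8, 6  # assumed sizes: 8 elements, 6 coefficients each

# Element-major: neighbouring threads are n_dof entries apart (strided access).
stride_element_major = access_stride(element_major_index, n_dof)

# Interleaved: neighbouring threads read adjacent entries (unit stride).
stride_interleaved = access_stride(interleaved_index, n_elem)

# Both layouts address the same n_elem * n_dof storage locations.
element_major_offsets = sorted(
    element_major_index(e, d, n_dof) for e in range(n_elem) for d in range(n_dof)
)
interleaved_offsets = sorted(
    interleaved_index(e, d, n_elem) for e in range(n_elem) for d in range(n_dof)
)
```

With one thread per element, the interleaved layout lets neighbouring threads read neighbouring memory locations, which is the usual motivation for such a layout on GPUs. Whether this matches the exact data structures used in the paper cannot be confirmed from the abstract alone.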