SemOpenAlex

Matches in SemOpenAlex for { <https://semopenalex.org/work/W2903669223> ?p ?o ?g. }


W2903669223 endingPage "76" @default.
W2903669223 startingPage "66" @default.
W2903669223 abstract "Deep learning (DL) is one of the key technologies in the artificial intelligence (AI) domain. Deep learning neural networks (DLNN) profit a lot from the overall exponential data growth, while on the other hand the computational effort for training and inference strongly increases. Most of the computational time in DLNN is consumed by the convolution step, which is based on a general matrix multiplication (GEMM). In order to reduce the computational time for DLNN, different highly optimized GEMM implementations for Graphics Processing Units (GPUs) have been presented in recent years [1]. Most of these approaches are GPU hardware specific implementations of the GEMM software kernel and do not incorporate the performance dependency of the training data layout. In order to achieve maximum performance, the parameters of the GEMM algorithm have to be tuned for the different GPU hardware and the specific data layout of the training task. In this paper we present a two-step autotuning approach for GPU-based GEMM algorithms. In the first step the kernel parameter search space is pruned by several performance criteria and afterwards further processed by a modified Simulated Annealing in order to find the best kernel parameter combinations with respect to the GPU hardware and the task-specific data layout. Our experiments were carried out on 160 different input problems. With the proposed approach, an average speedup of around 12 over the state-of-the-art implementation from NVIDIA (cuBLAS) can be achieved on an NVIDIA GTX 1080 Ti accelerator card." @default.
W2903669223 created "2018-12-22" @default.
W2903669223 creator A5055403360 @default.
W2903669223 creator A5077109115 @default.
W2903669223 creator A5087367545 @default.
W2903669223 date "2018-12-18" @default.
W2903669223 modified "2023-09-27" @default.
W2903669223 title "GPU GEMM-Kernel Autotuning for scalable machine learners" @default.
W2903669223 cites W1553069021 @default.
W2903669223 cites W1782174992 @default.
W2903669223 cites W1863336885 @default.
W2903669223 cites W1947563839 @default.
W2903669223 cites W1975001341 @default.
W2903669223 cites W2063186542 @default.
W2903669223 cites W2099021415 @default.
W2903669223 cites W2099625934 @default.
W2903669223 cites W2124592110 @default.
W2903669223 cites W2128539477 @default.
W2903669223 cites W2182472769 @default.
W2903669223 cites W2254715784 @default.
W2903669223 cites W2261553819 @default.
W2903669223 cites W2408618172 @default.
W2903669223 cites W2499931820 @default.
W2903669223 cites W2609311871 @default.
W2903669223 cites W2617819327 @default.
W2903669223 cites W2728600909 @default.
W2903669223 doi "https://doi.org/10.1007/978-3-662-58485-9_8" @default.
W2903669223 hasPublicationYear "2018" @default.
W2903669223 type Work @default.
W2903669223 sameAs 2903669223 @default.
W2903669223 citedByCount "1" @default.
W2903669223 countsByYear W29036692232020 @default.
W2903669223 crossrefType "book-chapter" @default.
W2903669223 hasAuthorship W2903669223A5055403360 @default.
W2903669223 hasAuthorship W2903669223A5077109115 @default.
W2903669223 hasAuthorship W2903669223A5087367545 @default.
W2903669223 hasBestOaLocation W29036692231 @default.
W2903669223 hasConcept C111919701 @default.
W2903669223 hasConcept C114614502 @default.
W2903669223 hasConcept C154945302 @default.
W2903669223 hasConcept C173608175 @default.
W2903669223 hasConcept C33923547 @default.
W2903669223 hasConcept C41008148 @default.
W2903669223 hasConcept C48044578 @default.
W2903669223 hasConcept C74193536 @default.
W2903669223 hasConceptScore W2903669223C111919701 @default.
W2903669223 hasConceptScore W2903669223C114614502 @default.
W2903669223 hasConceptScore W2903669223C154945302 @default.
W2903669223 hasConceptScore W2903669223C173608175 @default.
W2903669223 hasConceptScore W2903669223C33923547 @default.
W2903669223 hasConceptScore W2903669223C41008148 @default.
W2903669223 hasConceptScore W2903669223C48044578 @default.
W2903669223 hasConceptScore W2903669223C74193536 @default.
W2903669223 hasLocation W29036692231 @default.
W2903669223 hasLocation W29036692232 @default.
W2903669223 hasOpenAccess W2903669223 @default.
W2903669223 hasPrimaryLocation W29036692231 @default.
W2903669223 hasRelatedWork W1531780705 @default.
W2903669223 hasRelatedWork W1547595128 @default.
W2903669223 hasRelatedWork W1580730938 @default.
W2903669223 hasRelatedWork W1595151633 @default.
W2903669223 hasRelatedWork W1604898313 @default.
W2903669223 hasRelatedWork W1784521533 @default.
W2903669223 hasRelatedWork W2000058275 @default.
W2903669223 hasRelatedWork W2370911386 @default.
W2903669223 hasRelatedWork W4250047567 @default.
W2903669223 hasRelatedWork W2503642292 @default.
W2903669223 isParatext "false" @default.
W2903669223 isRetracted "false" @default.
W2903669223 magId "2903669223" @default.
W2903669223 workType "book-chapter" @default.