Matches in SemOpenAlex for { <https://semopenalex.org/work/W4379794547> ?p ?o ?g. }
Showing items 1 to 99 of 99 with 100 items per page.
- W4379794547 endingPage "57528" @default.
- W4379794547 startingPage "57514" @default.
- W4379794547 abstract "Nowadays, convolutional neural networks are among the most widely used types of deep learning networks thanks to their usefulness in many application domains. There are many efforts to find methods to increase their training and inference performance and efficiency. One of the most widely used techniques to implement convolution consists of flattening tensors into 2D matrices and carrying out the operation through a matrix-matrix multiplication routine, which has highly optimized implementations in high-performance libraries. However, this kind of approach uses extra time and memory to transform and store the tensors involved. For this reason, direct convolution is becoming increasingly popular. Direct convolution can be implemented as a series of nested loops iterating over tensor dimensions and it does not require extra memory. In this work, we evaluate on various multi-core CPUs the performance and scalability effects deriving from different parallelization strategies, loop organizations, and SIMD-vectorization approaches with different compilers in relation to architectural aspects. We discuss each parameter thoroughly and distill our findings in a set of heuristics that can be used to quickly achieve a high-performance implementation in accordance with the underlying hardware and the characteristics of the convolutional layer at hand. By adopting a per-layer approach, we increase performance up to 60-70% compared to a static implementation for all the layers. Moreover, our results are comparable to, or even better than (up to $1.67\times$ speedup), matrix-matrix multiplication-based convolution in a multi-core system." @default.
- W4379794547 created "2023-06-09" @default.
- W4379794547 creator A5001351683 @default.
- W4379794547 creator A5004350640 @default.
- W4379794547 creator A5021283515 @default.
- W4379794547 creator A5047011963 @default.
- W4379794547 date "2023-01-01" @default.
- W4379794547 modified "2023-09-25" @default.
- W4379794547 title "Analysis and Optimization of Direct Convolution Execution on Multi-Core Processors" @default.
- W4379794547 cites W1970456555 @default.
- W4379794547 cites W2078224158 @default.
- W4379794547 cites W2120615054 @default.
- W4379794547 cites W2155893237 @default.
- W4379794547 cites W2194775991 @default.
- W4379794547 cites W2540279855 @default.
- W4379794547 cites W2810819381 @default.
- W4379794547 cites W2913146885 @default.
- W4379794547 cites W2945146780 @default.
- W4379794547 cites W2983655274 @default.
- W4379794547 cites W2998600257 @default.
- W4379794547 cites W3100321043 @default.
- W4379794547 cites W3156745629 @default.
- W4379794547 cites W3184382748 @default.
- W4379794547 cites W3192477179 @default.
- W4379794547 cites W4226207184 @default.
- W4379794547 cites W4293100236 @default.
- W4379794547 cites W4293149165 @default.
- W4379794547 cites W4311543031 @default.
- W4379794547 doi "https://doi.org/10.1109/access.2023.3283312" @default.
- W4379794547 hasPublicationYear "2023" @default.
- W4379794547 type Work @default.
- W4379794547 citedByCount "0" @default.
- W4379794547 crossrefType "journal-article" @default.
- W4379794547 hasAuthorship W4379794547A5001351683 @default.
- W4379794547 hasAuthorship W4379794547A5004350640 @default.
- W4379794547 hasAuthorship W4379794547A5021283515 @default.
- W4379794547 hasAuthorship W4379794547A5047011963 @default.
- W4379794547 hasBestOaLocation W43797945471 @default.
- W4379794547 hasConcept C113775141 @default.
- W4379794547 hasConcept C114614502 @default.
- W4379794547 hasConcept C121332964 @default.
- W4379794547 hasConcept C154945302 @default.
- W4379794547 hasConcept C169590947 @default.
- W4379794547 hasConcept C17349429 @default.
- W4379794547 hasConcept C173608175 @default.
- W4379794547 hasConcept C177264268 @default.
- W4379794547 hasConcept C199360897 @default.
- W4379794547 hasConcept C33923547 @default.
- W4379794547 hasConcept C41008148 @default.
- W4379794547 hasConcept C41681595 @default.
- W4379794547 hasConcept C45347329 @default.
- W4379794547 hasConcept C48044578 @default.
- W4379794547 hasConcept C50644808 @default.
- W4379794547 hasConcept C62520636 @default.
- W4379794547 hasConcept C74193536 @default.
- W4379794547 hasConcept C77088390 @default.
- W4379794547 hasConcept C81363708 @default.
- W4379794547 hasConcept C84114770 @default.
- W4379794547 hasConceptScore W4379794547C113775141 @default.
- W4379794547 hasConceptScore W4379794547C114614502 @default.
- W4379794547 hasConceptScore W4379794547C121332964 @default.
- W4379794547 hasConceptScore W4379794547C154945302 @default.
- W4379794547 hasConceptScore W4379794547C169590947 @default.
- W4379794547 hasConceptScore W4379794547C17349429 @default.
- W4379794547 hasConceptScore W4379794547C173608175 @default.
- W4379794547 hasConceptScore W4379794547C177264268 @default.
- W4379794547 hasConceptScore W4379794547C199360897 @default.
- W4379794547 hasConceptScore W4379794547C33923547 @default.
- W4379794547 hasConceptScore W4379794547C41008148 @default.
- W4379794547 hasConceptScore W4379794547C41681595 @default.
- W4379794547 hasConceptScore W4379794547C45347329 @default.
- W4379794547 hasConceptScore W4379794547C48044578 @default.
- W4379794547 hasConceptScore W4379794547C50644808 @default.
- W4379794547 hasConceptScore W4379794547C62520636 @default.
- W4379794547 hasConceptScore W4379794547C74193536 @default.
- W4379794547 hasConceptScore W4379794547C77088390 @default.
- W4379794547 hasConceptScore W4379794547C81363708 @default.
- W4379794547 hasConceptScore W4379794547C84114770 @default.
- W4379794547 hasFunder F4320322183 @default.
- W4379794547 hasLocation W43797945471 @default.
- W4379794547 hasLocation W43797945472 @default.
- W4379794547 hasOpenAccess W4379794547 @default.
- W4379794547 hasPrimaryLocation W43797945471 @default.
- W4379794547 hasRelatedWork W1583465708 @default.
- W4379794547 hasRelatedWork W1601646354 @default.
- W4379794547 hasRelatedWork W1726221972 @default.
- W4379794547 hasRelatedWork W2265161186 @default.
- W4379794547 hasRelatedWork W2890419659 @default.
- W4379794547 hasRelatedWork W3045877795 @default.
- W4379794547 hasRelatedWork W4235959758 @default.
- W4379794547 hasRelatedWork W4245265375 @default.
- W4379794547 hasRelatedWork W4297693701 @default.
- W4379794547 hasRelatedWork W2479014312 @default.
- W4379794547 hasVolume "11" @default.
- W4379794547 isParatext "false" @default.
- W4379794547 isRetracted "false" @default.
- W4379794547 workType "article" @default.