Matches in SemOpenAlex for { <https://semopenalex.org/work/W2585774018> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W2585774018 abstract "OpenCL FPGA has recently gained great popularity with emerging needs for workload acceleration such as Convolutional Neural Network (CNN), which is the most popular deep learning architecture in the domain of computer vision. While OpenCL enhances the code portability and programmability of FPGA, it comes at the expense of performance. The key challenge is to optimize the OpenCL kernels to efficiently utilize the flexible hardware resources in FPGA. Simply optimizing the OpenCL kernel code through various compiler options turns out insufficient to achieve desirable performance for both compute-intensive and data-intensive workloads such as convolutional neural networks. In this paper, we first propose an analytical performance model and apply it to perform an in-depth analysis on the resource requirement of CNN classifier kernels and available resources on modern FPGAs. We identify that the key performance bottleneck is the on-chip memory bandwidth. We propose a new kernel design to effectively address such bandwidth limitation and to provide an optimal balance between computation, on-chip, and off-chip memory access. As a case study, we further apply these techniques to design a CNN accelerator based on the VGG model. Finally, we evaluate the performance of our CNN accelerator using an Altera Arria 10 GX1150 board. We achieve 866 Gop/s floating point performance at 370MHz working frequency and 1.79 Top/s 16-bit fixed-point performance at 385MHz. To the best of our knowledge, our implementation achieves the best power efficiency and performance density compared to existing work." @default.
- W2585774018 created "2017-02-10" @default.
- W2585774018 creator A5013651672 @default.
- W2585774018 creator A5019114599 @default.
- W2585774018 date "2017-02-22" @default.
- W2585774018 modified "2023-10-14" @default.
- W2585774018 title "Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network" @default.
- W2585774018 cites W1522241184 @default.
- W2585774018 cites W2094756095 @default.
- W2585774018 cites W2112796928 @default.
- W2585774018 cites W2276486856 @default.
- W2585774018 cites W2294282016 @default.
- W2585774018 cites W4256629673 @default.
- W2585774018 doi "https://doi.org/10.1145/3020078.3021698" @default.
- W2585774018 hasPublicationYear "2017" @default.
- W2585774018 type Work @default.
- W2585774018 sameAs 2585774018 @default.
- W2585774018 citedByCount "158" @default.
- W2585774018 countsByYear W25857740182017 @default.
- W2585774018 countsByYear W25857740182018 @default.
- W2585774018 countsByYear W25857740182019 @default.
- W2585774018 countsByYear W25857740182020 @default.
- W2585774018 countsByYear W25857740182021 @default.
- W2585774018 countsByYear W25857740182022 @default.
- W2585774018 countsByYear W25857740182023 @default.
- W2585774018 crossrefType "proceedings-article" @default.
- W2585774018 hasAuthorship W2585774018A5013651672 @default.
- W2585774018 hasAuthorship W2585774018A5019114599 @default.
- W2585774018 hasConcept C108583219 @default.
- W2585774018 hasConcept C111919701 @default.
- W2585774018 hasConcept C113775141 @default.
- W2585774018 hasConcept C114614502 @default.
- W2585774018 hasConcept C118524514 @default.
- W2585774018 hasConcept C149635348 @default.
- W2585774018 hasConcept C154945302 @default.
- W2585774018 hasConcept C169590947 @default.
- W2585774018 hasConcept C173608175 @default.
- W2585774018 hasConcept C188045654 @default.
- W2585774018 hasConcept C2780513914 @default.
- W2585774018 hasConcept C33923547 @default.
- W2585774018 hasConcept C41008148 @default.
- W2585774018 hasConcept C42935608 @default.
- W2585774018 hasConcept C48044578 @default.
- W2585774018 hasConcept C63000827 @default.
- W2585774018 hasConcept C74193536 @default.
- W2585774018 hasConcept C81363708 @default.
- W2585774018 hasConceptScore W2585774018C108583219 @default.
- W2585774018 hasConceptScore W2585774018C111919701 @default.
- W2585774018 hasConceptScore W2585774018C113775141 @default.
- W2585774018 hasConceptScore W2585774018C114614502 @default.
- W2585774018 hasConceptScore W2585774018C118524514 @default.
- W2585774018 hasConceptScore W2585774018C149635348 @default.
- W2585774018 hasConceptScore W2585774018C154945302 @default.
- W2585774018 hasConceptScore W2585774018C169590947 @default.
- W2585774018 hasConceptScore W2585774018C173608175 @default.
- W2585774018 hasConceptScore W2585774018C188045654 @default.
- W2585774018 hasConceptScore W2585774018C2780513914 @default.
- W2585774018 hasConceptScore W2585774018C33923547 @default.
- W2585774018 hasConceptScore W2585774018C41008148 @default.
- W2585774018 hasConceptScore W2585774018C42935608 @default.
- W2585774018 hasConceptScore W2585774018C48044578 @default.
- W2585774018 hasConceptScore W2585774018C63000827 @default.
- W2585774018 hasConceptScore W2585774018C74193536 @default.
- W2585774018 hasConceptScore W2585774018C81363708 @default.
- W2585774018 hasLocation W25857740181 @default.
- W2585774018 hasOpenAccess W2585774018 @default.
- W2585774018 hasPrimaryLocation W25857740181 @default.
- W2585774018 hasRelatedWork W1483783221 @default.
- W2585774018 hasRelatedWork W2112678088 @default.
- W2585774018 hasRelatedWork W2353897323 @default.
- W2585774018 hasRelatedWork W2585774018 @default.
- W2585774018 hasRelatedWork W2612090114 @default.
- W2585774018 hasRelatedWork W2765114562 @default.
- W2585774018 hasRelatedWork W2940114103 @default.
- W2585774018 hasRelatedWork W3028015399 @default.
- W2585774018 hasRelatedWork W3116132749 @default.
- W2585774018 hasRelatedWork W4213040080 @default.
- W2585774018 isParatext "false" @default.
- W2585774018 isRetracted "false" @default.
- W2585774018 magId "2585774018" @default.
- W2585774018 workType "article" @default.
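
The throughput figures quoted in the abstract above can be put on a common per-cycle footing with a little arithmetic. The sketch below is only an illustration: the function name `ops_per_cycle` and the variable names are hypothetical, and the only inputs are the four numbers stated in the abstract (866 Gop/s at 370 MHz floating point, 1.79 Top/s at 385 MHz 16-bit fixed point). It says nothing about how the authors' own performance model is defined.

```python
# Hypothetical helper: convert a reported throughput and clock frequency
# into operations completed per clock cycle.
def ops_per_cycle(ops_per_second: float, clock_hz: float) -> float:
    if clock_hz <= 0:
        raise ValueError("clock frequency must be positive")
    return ops_per_second / clock_hz


# Figures taken from the abstract; nothing else is assumed.
FLOAT_OPS_PER_SECOND = 866e9    # 866 Gop/s, floating point
FLOAT_CLOCK_HZ = 370e6          # 370 MHz
FIXED_OPS_PER_SECOND = 1.79e12  # 1.79 Top/s, 16-bit fixed point
FIXED_CLOCK_HZ = 385e6          # 385 MHz

float_per_cycle = ops_per_cycle(FLOAT_OPS_PER_SECOND, FLOAT_CLOCK_HZ)
fixed_per_cycle = ops_per_cycle(FIXED_OPS_PER_SECOND, FIXED_CLOCK_HZ)

if __name__ == "__main__":
    print(f"floating point: {float_per_cycle:.0f} ops/cycle")
    print(f"16-bit fixed:   {fixed_per_cycle:.0f} ops/cycle")
    print(f"fixed/float ratio per cycle: {fixed_per_cycle / float_per_cycle:.2f}")
```

Dividing by the clock frequency removes the small difference between the two operating frequencies, so the ratio reflects work done per cycle rather than raw throughput.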