Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386710522> ?p ?o ?g. }
Showing items 1 to 66 of 66 (100 items per page).
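The listing below can be reproduced programmatically. A minimal sketch, assuming SemOpenAlex exposes a public SPARQL endpoint at https://semopenalex.org/sparql that accepts form-encoded queries and returns SPARQL JSON results (the endpoint URL is an assumption, and the `?g` graph term from the query above is dropped for simplicity):

```python
# Hypothetical sketch: re-run the query from the header against a SPARQL endpoint.
import requests

ENDPOINT = "https://semopenalex.org/sparql"  # assumed endpoint URL

QUERY = """
SELECT ?p ?o
WHERE { <https://semopenalex.org/work/W4386710522> ?p ?o . }
"""

resp = requests.post(
    ENDPOINT,
    data={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()

# Print each predicate/object pair, mirroring the triple listing below.
for binding in resp.json()["results"]["bindings"]:
    print(binding["p"]["value"], binding["o"]["value"])
```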
- W4386710522 abstract "The k-nearest neighbors (KNN) algorithm is an essential algorithm in many applications, such as similarity search, image classification, and database queries. With the rapid growth in the dataset size and the feature dimension of each data point, processing KNN becomes more compute- and memory-hungry. Most prior studies focus on accelerating the computation of KNN using the abundant parallel resources on FPGAs. However, they often overlook the memory access optimizations on FPGA platforms and only achieve a marginal speedup over a multi-threaded CPU implementation for large datasets. In this paper, we design and implement CHIP-KNN: an HLS-based, configurable, and high-performance KNN accelerator. CHIP-KNN optimizes the off-chip memory access on modern HBM-based FPGAs such as the AMD/Xilinx Alveo U280 FPGA board. CHIP-KNN is configurable for all essential parameters used in the algorithm, including the size of the search dataset, the feature dimension and data type representation of each data point, the distance metric, and the number of nearest neighbors, K. In terms of design architecture, we explore and discuss the trade-offs between two design versions: CHIP-KNNv1 (ping-pong buffer based) and CHIP-KNNv2 (streaming-based). Moreover, we investigate the routing congestion issue in our accelerator design, implement hierarchical structures to shorten critical paths, and integrate an open-source floorplanning optimization tool called TAPA/AutoBridge to eliminate place-and-route issues. To explore the design space and balance the computation and memory access performance, we also build an analytical performance model. Given a user configuration of the KNN parameters, our tool can automatically generate TAPA HLS C code for the optimal accelerator design and the corresponding host code for the HBM-based FPGA platform. Our experimental results on the Alveo U280 show that, compared to a 48-thread CPU implementation, CHIP-KNNv2 achieves a geomean performance speedup of 15x, with a maximum speedup of 45x. Additionally, we show that CHIP-KNNv2 achieves up to a 2.1x performance speedup over CHIP-KNNv1 while increasing configurability. Compared with the state-of-the-art Facebook AI Similarity Search (FAISS) [23] GPU implementation running on an Nvidia Tesla V100 GPU, CHIP-KNNv2 achieves an average latency reduction of 30.6x while requiring 34.3% of the GPU's power consumption." @default.
- W4386710522 created "2023-09-14" @default.
- W4386710522 creator A5004089365 @default.
- W4386710522 creator A5032269879 @default.
- W4386710522 creator A5032647120 @default.
- W4386710522 creator A5065889904 @default.
- W4386710522 creator A5067662998 @default.
- W4386710522 date "2023-09-13" @default.
- W4386710522 modified "2023-10-18" @default.
- W4386710522 title "CHIP-KNNv2: A Configurable and High-Performance K-Nearest Neighbors Accelerator on HBM-based FPGAs" @default.
- W4386710522 cites W1518139028 @default.
- W4386710522 cites W2028960610 @default.
- W4386710522 cites W2047207335 @default.
- W4386710522 cites W2073137766 @default.
- W4386710522 cites W2080592089 @default.
- W4386710522 cites W2157234796 @default.
- W4386710522 cites W2427881153 @default.
- W4386710522 cites W2559192460 @default.
- W4386710522 cites W2626616508 @default.
- W4386710522 cites W2896983500 @default.
- W4386710522 cites W2913535816 @default.
- W4386710522 cites W2993580201 @default.
- W4386710522 cites W2998702515 @default.
- W4386710522 cites W3101568187 @default.
- W4386710522 cites W3179878372 @default.
- W4386710522 cites W4210859512 @default.
- W4386710522 cites W4213187929 @default.
- W4386710522 cites W4304140717 @default.
- W4386710522 doi "https://doi.org/10.1145/3616873" @default.
- W4386710522 hasPublicationYear "2023" @default.
- W4386710522 type Work @default.
- W4386710522 citedByCount "0" @default.
- W4386710522 crossrefType "journal-article" @default.
- W4386710522 hasAuthorship W4386710522A5004089365 @default.
- W4386710522 hasAuthorship W4386710522A5032269879 @default.
- W4386710522 hasAuthorship W4386710522A5032647120 @default.
- W4386710522 hasAuthorship W4386710522A5065889904 @default.
- W4386710522 hasAuthorship W4386710522A5067662998 @default.
- W4386710522 hasConcept C13164978 @default.
- W4386710522 hasConcept C173608175 @default.
- W4386710522 hasConcept C41008148 @default.
- W4386710522 hasConcept C42935608 @default.
- W4386710522 hasConcept C68339613 @default.
- W4386710522 hasConcept C9390403 @default.
- W4386710522 hasConceptScore W4386710522C13164978 @default.
- W4386710522 hasConceptScore W4386710522C173608175 @default.
- W4386710522 hasConceptScore W4386710522C41008148 @default.
- W4386710522 hasConceptScore W4386710522C42935608 @default.
- W4386710522 hasConceptScore W4386710522C68339613 @default.
- W4386710522 hasConceptScore W4386710522C9390403 @default.
- W4386710522 hasLocation W43867105221 @default.
- W4386710522 hasOpenAccess W4386710522 @default.
- W4386710522 hasPrimaryLocation W43867105221 @default.
- W4386710522 hasRelatedWork W1509211761 @default.
- W4386710522 hasRelatedWork W1531488649 @default.
- W4386710522 hasRelatedWork W1583465708 @default.
- W4386710522 hasRelatedWork W1585350690 @default.
- W4386710522 hasRelatedWork W2133693067 @default.
- W4386710522 hasRelatedWork W2366027386 @default.
- W4386710522 hasRelatedWork W2391299576 @default.
- W4386710522 hasRelatedWork W2582456645 @default.
- W4386710522 hasRelatedWork W3037767301 @default.
- W4386710522 hasRelatedWork W2479014312 @default.
- W4386710522 isParatext "false" @default.
- W4386710522 isRetracted "false" @default.
- W4386710522 workType "article" @default.
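The abstract above centers on brute-force k-nearest-neighbor search. As a point of reference for the computation the accelerator targets, here is a minimal NumPy sketch; it is a hypothetical illustration (not the authors' TAPA HLS design), with the dataset, feature dimension, distance metric, and K exposed as parameters to mirror the configurability described in the abstract.

```python
# Minimal brute-force KNN reference (hypothetical CPU sketch, not CHIP-KNN itself).
import numpy as np

def knn_search(dataset, query, k=10, metric="l2"):
    """Return indices and distances of the k nearest neighbors of `query`."""
    if metric == "l2":
        dists = np.sum((dataset - query) ** 2, axis=1)   # squared Euclidean distance
    elif metric == "manhattan":
        dists = np.sum(np.abs(dataset - query), axis=1)  # L1 distance
    else:
        raise ValueError(f"unsupported metric: {metric}")
    nearest = np.argpartition(dists, k)[:k]   # pick the k smallest, unordered
    order = np.argsort(dists[nearest])        # sort only those k candidates
    return nearest[order], dists[nearest][order]

# Example with an arbitrary configuration: 1M points, 16 features each, K = 10.
rng = np.random.default_rng(0)
data = rng.random((1_000_000, 16), dtype=np.float32)
q = rng.random(16, dtype=np.float32)
idx, d = knn_search(data, q, k=10)
print(idx, d)
```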