Matches in SemOpenAlex for { <https://semopenalex.org/work/W3161672423> ?p ?o ?g. }
- W3161672423 endingPage "25" @default.
- W3161672423 startingPage "1" @default.
- W3161672423 abstract "QR decomposition is one of the most useful factorization kernels in modern numerical linear algebra algorithms. In particular, the decomposition of tall-and-skinny matrices (TSMs) has major applications in areas including scientific computing, machine learning, image processing, wireless networks, and numerical methods. Traditionally, CPUs and GPUs have achieved better throughput on these applications by using large cache hierarchies and compute cores running at a high frequency, leading to high power consumption. With the advent of heterogeneous platforms, however, FPGAs are emerging as a promising alternative. In this work, we propose a high-throughput FPGA-based engine that has a very high computational efficiency (ratio of achieved to peak throughput) compared to similar QR solvers running on FPGAs. Although comparable QR solvers achieve an efficiency of 36%, our design exhibits an efficiency of 54%. For TSMs, our experimental results show that our design can outperform highly optimized QR solvers running on CPUs and GPUs. For TSMs with more than 50K rows, our design outperforms the Intel MKL solver running on an Intel quad-core processor by a factor of 1.5×. For TSMs containing 256 columns or fewer, our design outperforms the NVIDIA CUBLAS solver running on a K40 GPU by a factor of 3.0×. In addition to being fast, our design is energy efficient: competing platforms deliver up to 0.6 GFLOPS/Joule, whereas our design delivers more than 1.0 GFLOPS/Joule." @default.
- W3161672423 created "2021-05-24" @default.
- W3161672423 creator A5022569738 @default.
- W3161672423 creator A5033921328 @default.
- W3161672423 creator A5044893758 @default.
- W3161672423 creator A5076150563 @default.
- W3161672423 date "2021-05-10" @default.
- W3161672423 modified "2023-09-27" @default.
- W3161672423 title "Acceleration of Parallel-Blocked QR Decomposition of Tall-and-Skinny Matrices on FPGAs" @default.
- W3161672423 cites W182691100 @default.
- W3161672423 cites W1961751213 @default.
- W3161672423 cites W1973855976 @default.
- W3161672423 cites W1986431723 @default.
- W3161672423 cites W1999085092 @default.
- W3161672423 cites W2006808488 @default.
- W3161672423 cites W2012366931 @default.
- W3161672423 cites W2013417994 @default.
- W3161672423 cites W2049009664 @default.
- W3161672423 cites W2064872546 @default.
- W3161672423 cites W2073260424 @default.
- W3161672423 cites W2076804384 @default.
- W3161672423 cites W2078095679 @default.
- W3161672423 cites W2097095666 @default.
- W3161672423 cites W2111221242 @default.
- W3161672423 cites W2113755305 @default.
- W3161672423 cites W2117060831 @default.
- W3161672423 cites W2124408528 @default.
- W3161672423 cites W2133545594 @default.
- W3161672423 cites W2135120170 @default.
- W3161672423 cites W2139116943 @default.
- W3161672423 cites W2154641788 @default.
- W3161672423 cites W2157237396 @default.
- W3161672423 cites W2161761794 @default.
- W3161672423 cites W2162322364 @default.
- W3161672423 cites W2587930272 @default.
- W3161672423 cites W2789139117 @default.
- W3161672423 cites W3013247086 @default.
- W3161672423 cites W82325001 @default.
- W3161672423 doi "https://doi.org/10.1145/3447775" @default.
- W3161672423 hasPublicationYear "2021" @default.
- W3161672423 type Work @default.
- W3161672423 sameAs 3161672423 @default.
- W3161672423 citedByCount "0" @default.
- W3161672423 crossrefType "journal-article" @default.
- W3161672423 hasAuthorship W3161672423A5022569738 @default.
- W3161672423 hasAuthorship W3161672423A5033921328 @default.
- W3161672423 hasAuthorship W3161672423A5044893758 @default.
- W3161672423 hasAuthorship W3161672423A5076150563 @default.
- W3161672423 hasBestOaLocation W31616724231 @default.
- W3161672423 hasConcept C11413529 @default.
- W3161672423 hasConcept C119599485 @default.
- W3161672423 hasConcept C121332964 @default.
- W3161672423 hasConcept C127413603 @default.
- W3161672423 hasConcept C133095886 @default.
- W3161672423 hasConcept C139352143 @default.
- W3161672423 hasConcept C149635348 @default.
- W3161672423 hasConcept C157764524 @default.
- W3161672423 hasConcept C158693339 @default.
- W3161672423 hasConcept C173608175 @default.
- W3161672423 hasConcept C188060507 @default.
- W3161672423 hasConcept C199360897 @default.
- W3161672423 hasConcept C2524010 @default.
- W3161672423 hasConcept C2742236 @default.
- W3161672423 hasConcept C2778770139 @default.
- W3161672423 hasConcept C33923547 @default.
- W3161672423 hasConcept C35912277 @default.
- W3161672423 hasConcept C3826847 @default.
- W3161672423 hasConcept C41008148 @default.
- W3161672423 hasConcept C42935608 @default.
- W3161672423 hasConcept C45374587 @default.
- W3161672423 hasConcept C459310 @default.
- W3161672423 hasConcept C555944384 @default.
- W3161672423 hasConcept C62520636 @default.
- W3161672423 hasConcept C76155785 @default.
- W3161672423 hasConceptScore W3161672423C11413529 @default.
- W3161672423 hasConceptScore W3161672423C119599485 @default.
- W3161672423 hasConceptScore W3161672423C121332964 @default.
- W3161672423 hasConceptScore W3161672423C127413603 @default.
- W3161672423 hasConceptScore W3161672423C133095886 @default.
- W3161672423 hasConceptScore W3161672423C139352143 @default.
- W3161672423 hasConceptScore W3161672423C149635348 @default.
- W3161672423 hasConceptScore W3161672423C157764524 @default.
- W3161672423 hasConceptScore W3161672423C158693339 @default.
- W3161672423 hasConceptScore W3161672423C173608175 @default.
- W3161672423 hasConceptScore W3161672423C188060507 @default.
- W3161672423 hasConceptScore W3161672423C199360897 @default.
- W3161672423 hasConceptScore W3161672423C2524010 @default.
- W3161672423 hasConceptScore W3161672423C2742236 @default.
- W3161672423 hasConceptScore W3161672423C2778770139 @default.
- W3161672423 hasConceptScore W3161672423C33923547 @default.
- W3161672423 hasConceptScore W3161672423C35912277 @default.
- W3161672423 hasConceptScore W3161672423C3826847 @default.
- W3161672423 hasConceptScore W3161672423C41008148 @default.
- W3161672423 hasConceptScore W3161672423C42935608 @default.
- W3161672423 hasConceptScore W3161672423C45374587 @default.
- W3161672423 hasConceptScore W3161672423C459310 @default.
- W3161672423 hasConceptScore W3161672423C555944384 @default.
- W3161672423 hasConceptScore W3161672423C62520636 @default.