Matches in SemOpenAlex for { <https://semopenalex.org/work/W2284591528> ?p ?o ?g. }
Showing items 1 to 66 of
66
with 100 items per page.
- W2284591528 endingPage "446" @default.
- W2284591528 startingPage "446" @default.
- W2284591528 abstract "Tridiagonal diagonally dominant linear systems arise in many scientific and engineering applications. The standard Thomas algorithm for solving such systems is inherently serial, forming a bottleneck in computation. Algorithms such as cyclic reduction and SPIKE reduce a single large tridiagonal system into multiple small independent systems which are solved in parallel. We develop portable cyclic reduction and the SPIKE algorithm for Open Computing Language implementations on a range of co-processors in a heterogeneous computing environment, including field programmable gate arrays, graphics processing units and other multi-core processors. We evaluate these designs in the context of solver performance, resource efficiency and numerical accuracy. References Altera. Implementing FPGA design with the OpenCL standard. Technical report, Altera Inc., November 2013. http://www.altera.com/literature/wp/wp-01173-opencl.pdf Altera. Altera SDK for OpenCL: Best practices guide. Technical report, Altera Inc., May 2015. https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/hb/opencl-sdk/aocl_optimization_guide.pdf P. Arbenz, A. Cleary, J. Dongarra, and M. Hegland. A comparison of parallel solvers for diagonally dominant and general narrow-banded linear systems. Parallel and Distributed Computing Practices, 2:385–400, 1999. http://www.scpe.org/index.php/scpe/article/view/152 P. Arbenz, A. Cleary, J. Dongarra, and M. Hegland. A comparison of parallel solvers for diagonally dominant and general narrow-banded linear systems II. In Euro-Par'99 Parallel Processing, Berlin, vol. 1685 of Lecture Notes in Computer Science pp. 1078–1087. Springer, 1999. doi:10.1007/3-540-48311-X_151 L.-W. Chang, J. A. Stratton, H.-S. Kim, and W. W. Hwu. A scalable, numerically stable, high-performance tridiagonal solver using GPUs. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1–11. IEEE Computer Society Press, 2012. doi:10.1109/SC.2012.12 Y. Dou, Y. Lei, G. Wu, S. Guo, J. Zhuo, and L. Shen. FPGA accelerating double/quad-double high precision floating-point applications for exascale computing. In Proceedings of the 24th ACM International Conference on Supercomputing, pp. 325–336, 2010. doi:10.1145/1810085.1810129 C. R. Dun, M. Hegland, and M. R. Osborne. Parallel stable solution methods for tridiagonal linear systems of equations. In Computational Techniques and Applications: CTAC95 Proceedings of the Seventh Biennial Conference, pp. 267–274. World Scientific, 1996. doi:10.1142/9789814530651 W. Gander and G. H. Golub. Cyclic reduction–-history and applications. In Scientific Computing, Proceedings of the Workshop, 1997, pp. 73–111. Springer, 1998. http://www.springer.com/fr/book/9789813083608 Khronos OpenCL Working Group. The OpenCL specification. Technical report, Khronos Group, October 2009. http://www.khronos.org/registry/cl/specs/opencl-1.0.pdf M. Hegland. On the parallel solution of tridiagonal systems by wrap-around partitioning and incomplete LU factorization. Numer. Math., 59(1):453–472, 1991. doi:10.1007/BF01385791 D. Jensen and A. F. Rodrigues. Embedded systems and exascale computing. Comput. Sci. Eng., 12(6):20–29, 2010. doi:10.1109/MCSE.2010.95 S. Palmer. Accelerating implicit finite difference schemes using a hardware optimised implementation of the thomas algorithm for FPGAs. Technical Report, 2014. http://arxiv.org/abs/1402.5094 E. Polizzi and A. H. Sameh. A parallel hybrid banded system solver: the spike algorithm. Parallel Comput., 32:177–194, 2006. doi:10.1016/j.parco.2005.07.005 W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. Numerical Recipes in FORTRAN: The art of Scientific Computation. Cambridge University Press, Cambridge, England, 2nd edition, 1992. http://www.cambridge.org/au/academic/subjects/mathematics/numerical-recipes/numerical-recipes-fortran-77-art-scientific-computing-volume-1-2nd-edition J. Shalf, S. Dosanjh and J. Morrison. Exascale computing technology challenges. In High Performance Computing for Computational Science–-VECPAR 2010, vol. 6449 of Lecture Notes in Computer Science, pp 1–25. Springer, 2011. doi:10.1007/978-3-642-19328-6_1 S. Skalicky, S. Lopez, M. Lukowiak, J. Letendre and M. Ryan. Performance modeling of pipelined linear algebra architectures on FPGAs. In Reconfigurable Computing: Architectures, Tools and Applications, vol. 7806, Lecture Notes in Computer Science pp. 146-153. Springer, 2013. doi:10.1007/978-3-642-36812-7_14 D. J. Warne, R. F. Hayward, N. A. Kelson, J. E. Banks, and L. Mejias. Pulse-coupled neural network performance for real-time identification of vegetation during forced landing. In Engineering Mathematics and Applications Conference (EMAC2013), vol. 55 of ANZIAM J., pp. C1–C16, 2014. http://journal.austms.org.au/ojs/index.php/ANZIAMJ/article/view/7851 D. J. Warne, N. A. Kelson, and R. F. Hayward. Solving tri-diagonal linear systems using field programmable gate arrays. In Proceedings of the 4th International Conference on Computational Methods (ICCM2012), 2012. http://eprints.qut.edu.au/54894/ D. J. Warne, N. A. Kelson, and R. F. Hayward. Comparison of high level FPGA hardware design for solving tri-diagonal linear systems. Proc. Comput. Sci., 29:95–101, 2014. doi:10.1016/j.procs.2014.05.009 L. Wirbel. Xilinx SDAccel: A unified development environment for tomorrow's data center. Technical report, The Linley Group Inc., November 2014. http://www.xilinx.com/publications/prod_mktg/sdnet/sdaccel-wp.pdf G. Wu, Y. Dou, J. Sun and G. D. Peterson. A high performance and memory efficient LU decomposer on FPGAs. IEEE T. Comput., 61:366–378, 2012. doi:10.1109/TC.2010.278" @default.
- W2284591528 created "2016-06-24" @default.
- W2284591528 creator A5020949598 @default.
- W2284591528 creator A5069576646 @default.
- W2284591528 creator A5072784448 @default.
- W2284591528 creator A5084386940 @default.
- W2284591528 creator A5086404682 @default.
- W2284591528 date "2016-02-24" @default.
- W2284591528 modified "2023-09-27" @default.
- W2284591528 title "Implementation of parallel tridiagonal solvers for a heterogeneous computing environment" @default.
- W2284591528 doi "https://doi.org/10.21914/anziamj.v56i0.9371" @default.
- W2284591528 hasPublicationYear "2016" @default.
- W2284591528 type Work @default.
- W2284591528 sameAs 2284591528 @default.
- W2284591528 citedByCount "4" @default.
- W2284591528 countsByYear W22845915282017 @default.
- W2284591528 countsByYear W22845915282019 @default.
- W2284591528 countsByYear W22845915282020 @default.
- W2284591528 countsByYear W22845915282022 @default.
- W2284591528 crossrefType "journal-article" @default.
- W2284591528 hasAuthorship W2284591528A5020949598 @default.
- W2284591528 hasAuthorship W2284591528A5069576646 @default.
- W2284591528 hasAuthorship W2284591528A5072784448 @default.
- W2284591528 hasAuthorship W2284591528A5084386940 @default.
- W2284591528 hasAuthorship W2284591528A5086404682 @default.
- W2284591528 hasBestOaLocation W22845915281 @default.
- W2284591528 hasConcept C121332964 @default.
- W2284591528 hasConcept C149635348 @default.
- W2284591528 hasConcept C158693339 @default.
- W2284591528 hasConcept C173608175 @default.
- W2284591528 hasConcept C199360897 @default.
- W2284591528 hasConcept C2778770139 @default.
- W2284591528 hasConcept C41008148 @default.
- W2284591528 hasConcept C42935608 @default.
- W2284591528 hasConcept C51647924 @default.
- W2284591528 hasConcept C62520636 @default.
- W2284591528 hasConceptScore W2284591528C121332964 @default.
- W2284591528 hasConceptScore W2284591528C149635348 @default.
- W2284591528 hasConceptScore W2284591528C158693339 @default.
- W2284591528 hasConceptScore W2284591528C173608175 @default.
- W2284591528 hasConceptScore W2284591528C199360897 @default.
- W2284591528 hasConceptScore W2284591528C2778770139 @default.
- W2284591528 hasConceptScore W2284591528C41008148 @default.
- W2284591528 hasConceptScore W2284591528C42935608 @default.
- W2284591528 hasConceptScore W2284591528C51647924 @default.
- W2284591528 hasConceptScore W2284591528C62520636 @default.
- W2284591528 hasLocation W22845915281 @default.
- W2284591528 hasOpenAccess W2284591528 @default.
- W2284591528 hasPrimaryLocation W22845915281 @default.
- W2284591528 hasRelatedWork W1502414128 @default.
- W2284591528 hasRelatedWork W1788283770 @default.
- W2284591528 hasRelatedWork W2045992017 @default.
- W2284591528 hasRelatedWork W2131542146 @default.
- W2284591528 hasRelatedWork W2136415956 @default.
- W2284591528 hasRelatedWork W2141524575 @default.
- W2284591528 hasRelatedWork W3122137097 @default.
- W2284591528 hasRelatedWork W4239609053 @default.
- W2284591528 hasRelatedWork W808639126 @default.
- W2284591528 hasRelatedWork W2283824598 @default.
- W2284591528 hasVolume "56" @default.
- W2284591528 isParatext "false" @default.
- W2284591528 isRetracted "false" @default.
- W2284591528 magId "2284591528" @default.
- W2284591528 workType "article" @default.