Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures

Baboulin, Marc and Dongarra, Jack and Tomov, Stanimire (2009) Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures. [MIMS Preprint]

Abstract

We address some key issues in designing dense linear algebra (DLA) algorithms that are common to both multi/many-core and special purpose architectures (in particular GPUs). We present them in the context of an LU factorization algorithm in which randomization techniques are used as an alternative to pivoting. This approach yields an algorithm based entirely on a collection of small Level 3 BLAS type computational tasks, which has emerged as a common goal in designing DLA algorithms for new architectures. Other common trends, also considered here, are block asynchronous task execution and “Block” layouts for the data associated with the separate tasks. We present numerical results and other specific experiments with DLA algorithms on NVIDIA GPUs using CUDA. The GPU results are also of interest in themselves, as we show a performance of up to 160 GFlop/s on a single Quadro FX 5600 card.

Item Type: MIMS Preprint
Additional Information: Appears also as Technical Report UT-CS-08-615, Department of Computer Science, University of Tennessee, Knoxville, TN, USA, May 2008, and as LAPACK Working Note 200.
Uncontrolled Keywords: dense linear algebra, parallel algorithms, LU factorization, multicore processors, graphics processing units.
Subjects: MSC 2010, the AMS's Mathematics Subject Classification > 65 Numerical analysis
MSC 2010, the AMS's Mathematics Subject Classification > 68 Computer science
Depositing User: Ms Lucy van Russelt
Date Deposited: 13 Jan 2009
Last Modified: 20 Oct 2017 14:12
URI: https://eprints.maths.manchester.ac.uk/id/eprint/1207
