2009.6: Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited
2009.6: Hatem Ltaief, Jakub Kurzak and Jack Dongarra (2009) Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited.
The objective of this paper is to extend and redesign the block matrix reduction applied for the family of two-sided factorizations, introduced by Dongarra et al., to the context of multicore architectures using algorithms-by-tiles. In particular, the Block Hessenberg Reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenvalue problem. Although expensive, orthogonal transformations are commonly used for this reduction because they guarantee stability, as opposed to Gaussian Elimination. Two versions of the Block Hessenberg Reduction are presented in this paper, the first one with Householder reflectors and the second one with Givens rotations. A short investigation on variants of Fast Givens Rotations is also mentioned. Furthermore, in the last Top500 list from June 2008, 98% of the fastest parallel systems in the world are based on multicores. The emerging petascale systems consisting of hundreds of thousands of cores have exacerbated the problem even more and it becomes judicious to efficiently integrate existing or new numerical linear algebra algorithms suitable for such hardware. By exploiting the concepts of algorithms-by-tiles in the multicore environment (i.e., high level of parallelism with fine granularity and high performance data representation combined with a dynamic data driven execution), the Block Hessenberg Reduction presented here achieves 72% of the DGEMM peak on a 12000 × 12000 matrix with 16 Intel Tigerton 2.4 GHz processors.
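For readers unfamiliar with the underlying reduction, the following is a minimal sketch of a Hessenberg reduction with Householder reflectors. It is the unblocked, sequential textbook form, not the tiled, dynamically scheduled algorithm of the paper, and it makes no attempt at performance. The function name `hessenberg_householder` and the list-of-lists matrix layout are assumptions made for illustration only.

```python
import math


def hessenberg_householder(a):
    """Return H = Q^T A Q in upper Hessenberg form (illustrative sketch).

    `a` is a square matrix given as a list of lists of floats.
    The input is not modified.
    """
    n = len(a)
    h = [row[:] for row in a]  # work on a copy
    for k in range(n - 2):
        # Part of column k that lies on and below the subdiagonal.
        x = [h[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # nothing to eliminate in this column
        # Householder vector v = x + sign(x[0]) * ||x|| * e1;
        # the sign choice avoids cancellation.
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])
        vtv = sum(t * t for t in v)
        m = len(v)
        # Apply the reflector from the left to rows k+1..n-1.
        for j in range(n):
            s = sum(v[i] * h[k + 1 + i][j] for i in range(m))
            f = 2.0 * s / vtv
            for i in range(m):
                h[k + 1 + i][j] -= f * v[i]
        # Apply the same reflector from the right to columns k+1..n-1,
        # which makes the overall update a similarity transformation.
        for i in range(n):
            s = sum(h[i][k + 1 + j] * v[j] for j in range(m))
            f = 2.0 * s / vtv
            for j in range(m):
                h[i][k + 1 + j] -= f * v[j]
    return h


if __name__ == "__main__":
    A = [
        [4.0, 1.0, 2.0, 3.0],
        [1.0, 3.0, 0.0, 1.0],
        [2.0, 0.0, 2.0, 1.0],
        [3.0, 1.0, 1.0, 1.0],
    ]
    for row in hessenberg_householder(A):
        print(["%.4f" % t for t in row])
```

Because each step applies the same orthogonal reflector from both sides, the result has the same eigenvalues as the input, which is why this reduction serves as a pre-processing step for the eigenvalue problem. A Givens-based variant would zero the same entries one at a time with plane rotations instead of one column at a time.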
Item Type: MIMS Preprint
Appears also as Technical Report UT-CS-08-631, Department of Computer Science, University of Tennessee, Knoxville, TN, USA, June 2008, and as LAPACK Working Note 209.
Subjects: MSC 2000 > 65 Numerical analysis; MSC 2000 > 68 Computer science
Deposited By: Ms Lucy van Russelt
Deposited On: 13 January 2009