Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures Using Tree Reduction
This paper enhances the parallelism of the tile bidiagonal transformation on multicore architectures by using tree reduction. First introduced by Ltaief et al., the two-stage bidiagonal transformation based on tile algorithms has shown very promising results on square matrices. For tall and skinny matrices, however, the panel is processed in a domino-like fashion, one tile after another, which serializes tasks that need not run sequentially. With tree reduction, the panel is split horizontally, which creates another dimension of parallelism and exposes many concurrent tasks that can be dynamically scheduled on the available cores.
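The contrast between domino-like and tree-based panel processing can be illustrated with a minimal sketch. It assumes the panel is split into p tile rows, that eliminating one tile row against another counts as one task, and that the tree is binary; the function names are hypothetical and do not come from the paper, which may use a different tree shape.

```python
def flat_schedule(p):
    """Domino-like order: tile row 0 eliminates rows 1..p-1 one after another.

    Each round holds a single (survivor, eliminated) pair, so rounds cannot overlap.
    """
    return [[(0, i)] for i in range(1, p)]


def binary_tree_schedule(p):
    """Binary-tree order (an assumed tree shape): pairs within a round are independent.

    Returns a list of rounds; each round is a list of (survivor, eliminated) pairs
    that could run concurrently on different cores.
    """
    rounds = []
    stride = 1
    while stride < p:
        # Survivors sit at multiples of 2*stride; each absorbs the row `stride` below it.
        rounds.append([(i, i + stride) for i in range(0, p - stride, 2 * stride)])
        stride *= 2
    return rounds


if __name__ == "__main__":
    p = 8  # hypothetical number of tile rows in the panel
    print(len(flat_schedule(p)), len(binary_tree_schedule(p)))  # prints: 7 3
```

Both schedules perform p - 1 eliminations in total, but the flat one needs p - 1 dependent rounds while the binary tree needs about log2(p) rounds, which is where the extra concurrency for tall and skinny matrices comes from in this simplified model.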