Hierarchical Approach to Optimization of Parallel Matrix Multiplication on Large-Scale Platforms

Provided by: Springer Healthcare
Topic: Hardware
Format: PDF
Many state-of-the-art parallel algorithms, which are widely used in scientific applications executed on high-end computing systems, were designed in the twentieth century with relatively small-scale parallelism in mind. Indeed, while in the 1990s a system with a few hundred cores was considered a powerful supercomputer, modern top supercomputers have millions of cores. In this paper, the authors present a hierarchical approach to the optimization of message-passing parallel algorithms for execution on large-scale distributed-memory systems. The idea is to reduce the communication cost by introducing hierarchy, and hence more parallelism, into the communication scheme.
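
To make the intuition concrete, the following is a minimal sketch of a toy cost model, not the authors' algorithm or their analysis. It assumes a simple linear broadcast in which the sender transmits to each receiver in turn, and a per-message cost of alpha + beta * m (latency plus per-byte time multiplied by message size). The function names and parameters are hypothetical and chosen only for illustration. In the two-level version, the root first sends to one leader per group, and the leaders then forward the message inside their groups at the same time.

```python
def flat_broadcast_cost(p, alpha, beta, m):
    """Cost of a linear broadcast: the root sends to the other p - 1 processes in turn."""
    return (p - 1) * (alpha + beta * m)


def hierarchical_broadcast_cost(p, groups, alpha, beta, m):
    """Cost of a two-level linear broadcast over `groups` equal-sized groups.

    Phase 1: the root sends to the other group leaders, one after another.
    Phase 2: every leader sends to the other members of its group; the groups
    work in parallel, so only one group's time is counted.
    """
    if groups < 1 or p % groups != 0:
        raise ValueError("groups must be a positive divisor of p")
    per_message = alpha + beta * m
    between_groups = (groups - 1) * per_message
    within_group = (p // groups - 1) * per_message
    return between_groups + within_group


def best_group_count(p, alpha, beta, m):
    """Return the divisor of p that minimizes the two-level cost in this toy model."""
    divisors = [g for g in range(1, p + 1) if p % g == 0]
    return min(divisors, key=lambda g: hierarchical_broadcast_cost(p, g, alpha, beta, m))
```

Under these assumptions, a single group reproduces the flat cost, and the modeled cost is lowest when the number of groups is close to the square root of the number of processes. Real broadcast implementations (for example, tree-based ones) have different cost formulas, so the size of any benefit depends on the algorithm and platform.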
