Block-Asynchronous Multigrid Smoothers for GPU-Accelerated Systems
This paper explores the need for asynchronous iteration algorithms as smoothers in multigrid methods. The hardware target for the new algorithms is top-of-the-line, highly parallel hybrid architectures: multicore-based systems enhanced with GPGPUs. These architectures are the most likely candidates for future high-end supercomputers. To pave the road for their efficient use, the authors must address the fact that data movement, not floating-point computation, is the performance bottleneck. Their work moves in this direction: they design block-asynchronous multigrid smoothers that perform more flops in exchange for less synchronization, and hence less data movement.
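The trade-off described above (extra local flops in exchange for fewer synchronization points) can be illustrated with a small sketch. The abstract does not give the authors' algorithm, so everything below is an assumption made for illustration: the function names `block_async_jacobi` and `residual_norm`, the parameters `block_size`, `local_iters`, and `global_iters`, the choice of Jacobi as the local update, and the diagonally dominant test matrix are all hypothetical. The sketch is also a sequential emulation: every block reads one shared stale copy of the iterate, whereas on a real GPU the blocks would run in no fixed order and see values of differing age.

```python
def block_async_jacobi(A, b, x0, block_size, local_iters, global_iters):
    """Sequential emulation of a block-asynchronous Jacobi-type smoother.

    Illustrative sketch only, not the authors' implementation. Each block
    reads off-block values once (possibly stale), then performs several
    local Jacobi sweeps that need no communication with other blocks.
    """
    n = len(b)
    x = list(x0)
    for _ in range(global_iters):
        # All blocks read the same stale copy of the iterate (an assumption
        # that stands in for the unordered reads of a real GPU run).
        stale = list(x)
        for start in range(0, n, block_size):
            end = min(start + block_size, n)
            rows = range(start, end)
            # Off-block coupling is evaluated once from stale data; this is
            # the only step that touches values owned by other blocks.
            rhs = [
                b[i] - sum(A[i][j] * stale[j]
                           for j in range(n) if j < start or j >= end)
                for i in rows
            ]
            # Extra local Jacobi sweeps: more flops, no further data exchange.
            local = stale[start:end]
            for _ in range(local_iters):
                local = [
                    (rhs[i - start]
                     - sum(A[i][j] * local[j - start] for j in rows if j != i))
                    / A[i][i]
                    for i in rows
                ]
            x[start:end] = local
    return x


def residual_norm(A, b, x):
    """Maximum-norm residual of A x = b."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n)))
               for i in range(n))


# Hypothetical test problem: a strictly diagonally dominant tridiagonal matrix.
n = 8
A = [[4.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n

# A few sweeps, as a smoother would apply them inside a multigrid cycle.
x = block_async_jacobi(A, b, [0.0] * n, block_size=4, local_iters=3,
                       global_iters=5)
print(residual_norm(A, b, x))
```

Raising `local_iters` increases the arithmetic done per block while the number of off-block reads per global sweep stays fixed, which is the sense in which such a scheme spends flops to save data movement. Strict diagonal dominance is assumed here because it guarantees that this kind of update contracts the error in the maximum norm; other matrices may need a different analysis.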