Nanyang Technological University
Stencils are an important class of computations used in many scientific disciplines. Stencil computations in scientific applications are increasingly being offloaded to GPUs to reduce running times. Because a large part of the simulation time is spent inside the stencil kernels, optimizing these kernels is important for improving computational efficiency and reducing simulation time. In this paper, the authors propose a novel in-plane method for stencil computations on GPUs and compare its performance with the conventional method implemented in the NVIDIA SDK.