Institute of Electrical and Electronics Engineers
The authors present a memory model for analyzing and improving the performance of scientific algorithms on graphics processing units (GPUs). The model is based on the texturing hardware, which uses a 2D block-based array representation to perform the underlying computations. It incorporates several characteristics of GPU architectures, including smaller cache sizes and 2D block representations, and uses the 3C's model (compulsory, capacity, and conflict misses) to analyze cache misses. The authors also present techniques for improving the performance of nested loops on GPUs.
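
To make the idea of a 2D block-based array representation concrete, the following is a minimal sketch, not the authors' model. It compares how many cache lines a short column walk touches under a plain row-major layout and under a blocked layout. All names (`row_major_offset`, `blocked_offset`, `lines_touched`) and the sizes used (a 16-element-wide array, 4x4 blocks, one block per cache line) are assumptions chosen for illustration only.

```python
def row_major_offset(x, y, width):
    """Linear offset of element (x, y) in a conventional row-major array."""
    return y * width + x


def blocked_offset(x, y, width, block):
    """Linear offset of (x, y) when the array is stored as block x block tiles.

    Assumes width is a multiple of block (an illustrative simplification).
    """
    blocks_per_row = width // block
    block_id = (y // block) * blocks_per_row + (x // block)
    within_block = (y % block) * block + (x % block)
    return block_id * block * block + within_block


def lines_touched(offsets, line_size):
    """Number of distinct cache lines covering the given element offsets."""
    return len({offset // line_size for offset in offsets})


# Hypothetical sizes: 16-element-wide array, 4x4 tiles, one tile per cache line.
width, block = 16, 4
line_size = block * block

# Walk down part of one column: x = 3, y = 0..7.
column = [(3, y) for y in range(8)]

row_major_lines = lines_touched(
    [row_major_offset(x, y, width) for x, y in column], line_size
)
blocked_lines = lines_touched(
    [blocked_offset(x, y, width, block) for x, y in column], line_size
)

print("row-major lines touched:", row_major_lines)
print("blocked lines touched:  ", blocked_lines)
```

Under these assumed sizes the column walk touches fewer cache lines in the blocked layout, because vertically adjacent elements share a tile. This is the kind of 2D locality that a block-based representation is meant to exploit. Real texture caches differ in line size, associativity, and tiling scheme, so the numbers here are illustrative rather than a measurement.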