An Application-Oriented Approach for Accelerating Data-Parallel Computation With Graphics Processing Unit

Source: Virginia Polytechnic Institute and State University

This paper presents a novel parallelization and a quantitative characterization of various optimization strategies for data-parallel computation on a Graphics Processing Unit (GPU) using NVIDIA's new GPU programming framework, Compute Unified Device Architecture (CUDA). CUDA is an easy-to-use development framework that has drawn attention from many application areas seeking dramatic speed-ups in their code. However, the performance tradeoffs in CUDA are not yet fully understood, especially for data-parallel applications. Consequently, the authors study two fundamental mathematical operations that are common in many data-parallel applications: convolution and accumulation. Specifically, they profile and optimize the performance of these operations on a 128-core NVIDIA GPU.
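
For readers unfamiliar with the two operations named in the abstract, the following is a minimal sketch in plain Python, not the paper's CUDA implementation. The function names `convolve_1d` and `tree_accumulate` are hypothetical, and the one-dimensional form of the convolution is an assumption made for brevity. The sketch only illustrates why both operations are data-parallel: each convolution output can be computed independently, and accumulation can be arranged as a pairwise (tree) reduction.

```python
def convolve_1d(signal, kernel):
    """Full 1D convolution; each output element is independent of the others."""
    n, k = len(signal), len(kernel)
    out = [0.0] * (n + k - 1)
    for i in range(len(out)):  # on a GPU, one thread could own each index i
        acc = 0.0
        for j in range(k):
            s = i - j
            if 0 <= s < n:  # skip taps that fall outside the signal
                acc += signal[s] * kernel[j]
        out[i] = acc
    return out


def tree_accumulate(values):
    """Sum by pairwise (tree) reduction, using about log2(n) rounds."""
    data = list(values)
    if not data:
        return 0.0
    stride = 1
    while stride < len(data):
        # All additions within one round touch disjoint elements,
        # so a parallel device could perform them at the same time.
        for i in range(0, len(data) - stride, 2 * stride):
            data[i] += data[i + stride]
        stride *= 2
    return data[0]
```

Here the loops run serially; the point is only the dependency structure, which is what a GPU implementation would exploit.
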
Format: PDF
Size: 1456.70
Date: Apr 2008