Fitting FFT Onto the G80 Architecture

This paper has two sources of motivation. The first is the recent success in running matrix-matrix multiply on G80 GPUs. The authors present a novel implementation of the FFT on the GeForce 8800GTX that achieves 144 Gflop/s, nearly 3x faster than the best rate achieved by the vendor's current numerical libraries. This performance comes from exploiting the Cooley-Tukey framework to make use of hardware capabilities such as the massive vector register files and the small on-chip local storage. They also consider the performance of the FFT on a few other platforms.

Provided by: UC Regents
Topic: Hardware
Format: PDF
