
Hardware
Fitting FFT Onto the G80 Architecture
This paper has two sources of motivation. The first is the recent success in running matrix-matrix multiplication on G80 GPUs. In this paper, the authors present a novel implementation of FFT on the GeForce 8800GTX that achieves 144 Gflop/s, nearly 3x faster than the best rate achieved by the vendor's current numerical libraries. This ...