On the Efficacy of a Fused CPU+GPU Processor (or APU) for Parallel Computing
The Graphics Processing Unit (GPU) has made significant strides as an accelerator in parallel computing. However, because the GPU has traditionally resided on the PCIe bus as a discrete device, the performance of GPU applications can be bottlenecked by data transfers between the CPU and GPU over PCIe. Emerging heterogeneous computing architectures that "fuse" the functionality of the CPU and GPU, e.g., AMD Fusion and Intel Knights Ferry, hold the promise of addressing the PCIe bottleneck. In this paper, the authors empirically characterize and analyze the efficacy of AMD Fusion, an architecture that combines general-purpose x86 cores and programmable accelerator cores on the same silicon die.