The University of Maine at Machias
Performance gains in today's processors come from more cores, not faster clock speeds. The authors describe a novel parallel steady-state solver that uses NVIDIA's Compute Unified Device Architecture (CUDA) platform to perform calculations on a Graphics Processing Unit (GPU). They demonstrate speed-ups of more than eight times over a CPU-only solver. They also discuss a parallel implementation that runs on multiple GPUs on separate machines, and explain how they allocate appropriate amounts of work to heterogeneous computing resources.
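One common way to balance work across devices of differing speed is to give each a share proportional to its measured throughput. The sketch below illustrates that general idea only; it is not the authors' allocation scheme, which the summary above does not detail. The function name `split_work`, the work-item count, and the throughput figures are all hypothetical.

```python
from fractions import Fraction


def split_work(total_items, throughputs):
    """Split total_items across devices in proportion to their throughput.

    Illustrative only: assumes each device's relative throughput is already
    known (for example from a short benchmark run). Not the authors' method.
    """
    if total_items < 0:
        raise ValueError("total_items must be non-negative")
    if not throughputs or any(t <= 0 for t in throughputs):
        raise ValueError("each device needs a positive throughput")

    # Exact rational arithmetic avoids rounding drift in the shares.
    rates = [Fraction(t) for t in throughputs]
    total_rate = sum(rates)
    exact = [total_items * r / total_rate for r in rates]

    # Start from the floor of each exact share.
    shares = [int(e) for e in exact]

    # Hand out the remaining items to the largest fractional remainders.
    leftover = total_items - sum(shares)
    by_remainder = sorted(
        range(len(rates)), key=lambda i: exact[i] - shares[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical relative throughputs for three devices (not measured values).
print(split_work(700, [4, 2, 1]))
```

Rounding by largest remainder guarantees that the shares always sum to the total, so no work item is dropped or assigned twice.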