Use of Multiple GPUs on Shared Memory Multiprocessors for Ultrasound Propagation Simulations

In this paper, the authors outline their effort to migrate a compute-intensive ultrasound propagation application, currently being developed in Matlab, to a cluster computer in which each node has seven GPUs. Their goal is to perform realistic simulations in hours or minutes instead of weeks or days. To reach this goal, they investigate the architectural characteristics of the target system, focusing on the PCI-Express subsystem and on new features introduced in CUDA version 4.0, especially simultaneous host-to-device, device-to-host, and peer-to-peer transfers, from which the application is expected to benefit greatly.

Provided by: Australian Computer Society
Topic: Hardware
Date Added: Feb 2012
Format: PDF
