University of Debrecen
The computational power of many-core Graphics Processing Units (GPUs) has been exploited in many applications. However, the programming techniques currently employed on GPUs are not sufficient for problems that exhibit irregular and unbalanced workloads. The problem is exacerbated when trying to exploit multiple GPUs concurrently, a configuration that is commonly available in modern systems. In this paper, the authors propose a task-based dynamic load-balancing solution for single- and multi-GPU systems. The solution allows load balancing at a finer granularity than current GPU programming APIs, such as NVIDIA's CUDA, support. They evaluate their approach using both micro-benchmarks and a molecular dynamics application that exhibits significant load imbalance.
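The general idea of task-based dynamic load balancing can be illustrated with a minimal sketch. This is not the authors' implementation: CPU threads stand in for GPU workers, and the names `run_dynamic`, `task_costs`, and `num_workers` are hypothetical, chosen only for illustration. Each worker pulls the next task from a shared queue as soon as it becomes idle, so tasks of uneven cost are not tied to a fixed, up-front partition.

```python
import queue
import threading
import time


def run_dynamic(task_costs, num_workers):
    """Run tasks of uneven cost on workers that pull from one shared queue.

    Illustrative assumption: a task is just a (task_id, cost) pair, and
    "running" it means sleeping for a time proportional to its cost.
    """
    tasks = queue.Queue()
    for task_id, cost in enumerate(task_costs):
        tasks.put((task_id, cost))

    work_done = [0] * num_workers            # total cost handled per worker
    handled_by = [None] * len(task_costs)    # worker that ran each task

    def worker(worker_id):
        while True:
            try:
                # Dynamic step: fetch whatever task is next, if any remain.
                task_id, cost = tasks.get_nowait()
            except queue.Empty:
                return
            time.sleep(cost * 0.001)         # stand-in for real computation
            work_done[worker_id] += cost     # each worker writes only its own slot
            handled_by[task_id] = worker_id  # each task is written exactly once

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return work_done, handled_by


if __name__ == "__main__":
    done, owners = run_dynamic([5, 1, 1, 1, 8, 1, 2, 1], num_workers=3)
    print("cost per worker:", done)
    print("task owners:", owners)
```

The per-worker totals vary from run to run because the assignment depends on timing. The sketch guarantees only that every task is executed exactly once and that no worker sits idle while tasks remain in the queue.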