Convergence and Scalarization for Data-Parallel Architectures

Provided by: Institute of Electrical and Electronics Engineers (IEEE)
Topic: Hardware
Format: PDF
Modern throughput processors such as GPUs achieve high performance and efficiency by exploiting data parallelism in application kernels expressed as threaded code. One drawback of this approach compared to conventional vector architectures is redundant execution of instructions that are common across multiple threads, resulting in energy inefficiency due to excess instruction dispatch, register file accesses, and memory operations. This paper proposes to alleviate these overheads while retaining the threaded programming model by automatically detecting the scalar operations and factoring them out of the parallel code.
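To make the idea of "detecting the scalar operations" concrete, the following is a minimal sketch of a uniformity analysis over a toy straight-line kernel. It is not the paper's algorithm: the instruction format, the `Instr` type, the `find_scalar_instrs` helper, and the example kernel are all hypothetical and introduced only for illustration. The sketch assumes that the only source of per-thread variation is a `thread_id` input, and it ignores control-flow divergence entirely.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    """One instruction of a toy kernel (hypothetical format)."""
    dest: str
    op: str
    srcs: tuple


def find_scalar_instrs(instrs, varying_inputs):
    """Return destinations whose value is the same in every thread.

    A value is treated as varying if any of its sources is varying;
    everything else is uniform (scalar) and could, in principle, be
    computed once instead of once per thread.
    """
    varying = set(varying_inputs)
    changed = True
    # Propagate "varying" to a fixed point so instruction order does not matter.
    while changed:
        changed = False
        for ins in instrs:
            if ins.dest not in varying and any(s in varying for s in ins.srcs):
                varying.add(ins.dest)
                changed = True
    return [ins.dest for ins in instrs if ins.dest not in varying]


# Toy kernel: y[idx] = x[idx] * (alpha * beta), with idx built from thread_id.
# block_id, block_dim, alpha, and beta are assumed identical across the
# threads being considered.
KERNEL = [
    Instr("base", "mul", ("block_id", "block_dim")),
    Instr("idx", "add", ("base", "thread_id")),
    Instr("scale", "mul", ("alpha", "beta")),
    Instr("x", "load", ("idx",)),
    Instr("y", "mul", ("x", "scale")),
]

SCALAR = find_scalar_instrs(KERNEL, {"thread_id"})
print(SCALAR)
```

In this toy kernel, `base` and `scale` depend only on inputs shared by all threads, so they are the candidates for factoring out of the parallel code, while `idx`, `x`, and `y` depend on `thread_id` and remain per-thread work.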