Association for Computing Machinery
State-of-the-art Graphics Processing Units (GPUs) employ Single-Instruction Multiple-Data (SIMD) style execution to achieve both high computational throughput and energy efficiency. As previous works have shown, there is significant computational redundancy in SIMD execution, where different execution lanes operate on the same operand values. A vector operand exhibiting such value locality is referred to as a uniform vector. In this paper, the authors first show that, besides redundancy within a uniform vector, different vectors can also have identical values. They then propose detailed architecture designs to exploit both types of redundancy.
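
The two kinds of redundancy can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design the paper proposes: each vector operand is modeled as a tuple of per-lane values, and the names `is_uniform` and `redundancy_summary` are hypothetical helpers introduced here for illustration.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def is_uniform(vec: Sequence[int]) -> bool:
    """Return True when every lane of the vector holds the same value."""
    return all(v == vec[0] for v in vec)


def redundancy_summary(vectors: Sequence[Sequence[int]]) -> Dict[str, int]:
    """Count uniform vectors and vectors that repeat an identical vector.

    Assumption: a vector is a sequence of per-lane operand values.
    """
    keys: Tuple[Tuple[int, ...], ...] = tuple(tuple(v) for v in vectors)
    counts = Counter(keys)
    # Redundancy within a vector: all lanes carry one value.
    uniform = sum(1 for k in keys if is_uniform(k))
    # Redundancy across vectors: every repeat beyond the first occurrence.
    duplicates = sum(c - 1 for c in counts.values())
    return {"total": len(keys), "uniform": uniform, "duplicates": duplicates}


# Hypothetical 4-lane operands, chosen only to show both cases.
example = [
    (7, 7, 7, 7),  # uniform vector
    (0, 1, 2, 3),  # non-uniform vector
    (0, 1, 2, 3),  # identical to the previous vector
    (4, 9, 4, 9),  # neither uniform nor repeated
]
summary = redundancy_summary(example)
print(summary)
```

In this model a uniform vector could be represented by a single value, while a repeated vector could reuse the earlier copy. How the paper's architecture detects and exploits either case is not described in the abstract.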