Their wide availability and Single-Instruction Multiple-Thread (SIMT)-style programming model have made Graphics Processing Units (GPUs) a promising choice for high-performance computing. However, because of SIMT-style processing, an instruction is executed in every thread even if its operands are identical across all the threads. To overcome this inefficiency, AMD's latest Graphics Core Next (GCN) architecture integrates a scalar unit into a SIMT unit. In GCN, the SIMT unit and the scalar unit share a single SIMT-style instruction stream. Depending on its type, each instruction is issued to either the scalar unit or the SIMT unit.
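
The issue policy described above can be illustrated with a minimal sketch. This is a toy model, not GCN's actual issue logic: the `Instr` class, the `is_scalar` flag, the opcode names, and the `lane_operations` helper are all hypothetical, and the wavefront size of 64 threads is an assumption not stated in the text. The sketch only counts how many per-thread executions a shared instruction stream costs with and without a scalar unit.

```python
from dataclasses import dataclass

WAVEFRONT_SIZE = 64  # assumption: threads executing one instruction in lockstep


@dataclass(frozen=True)
class Instr:
    opcode: str      # illustrative name only, not a real GCN opcode
    is_scalar: bool  # hypothetical flag: operands are identical for all threads


def lane_operations(stream, has_scalar_unit):
    """Count per-thread executions needed to run a shared instruction stream."""
    total = 0
    for instr in stream:
        if has_scalar_unit and instr.is_scalar:
            total += 1               # issued once to the scalar unit
        else:
            total += WAVEFRONT_SIZE  # issued to the SIMT unit, once per thread
    return total


# A single instruction stream mixing uniform and per-thread work.
stream = [
    Instr("load_base_address", is_scalar=True),
    Instr("add_loop_counter", is_scalar=True),
    Instr("load_element", is_scalar=False),
    Instr("multiply_element", is_scalar=False),
]

simt_only = lane_operations(stream, has_scalar_unit=False)
with_scalar = lane_operations(stream, has_scalar_unit=True)
print(simt_only, with_scalar)
```

In this toy stream, the two uniform instructions are executed once each instead of once per thread when a scalar unit is present, which is the redundancy the scalar unit is meant to remove.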