Affine Vector Cache for Memory Bandwidth Savings

Provided by: CNRS
Topic: Hardware
Format: PDF
Preserving memory locality is a major issue in highly multithreaded architectures such as GPUs. These architectures hide latency by keeping a large number of threads in flight. Because each thread needs a private working set, the threads collectively put tremendous pressure on on-chip memory arrays, at significant cost in area and power. The authors show that thread-private data in GPU-like implicit SIMD architectures can be compressed by a factor of up to 16 by exploiting correlations between the values held by different threads.
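The correlation the abstract alludes to is that many per-thread values (loop counters, addresses derived from the thread index) are affine in the lane index, i.e. `value[i] = base + i * stride`. A whole SIMD-width vector of such values can then be stored as two scalars. The sketch below illustrates the idea in software; the function names and the tagged-tuple encoding are hypothetical and do not reflect the paper's actual hardware design.

```python
def compress_affine(lane_values):
    """Try to compress a per-lane vector into (base, stride).

    If every lane i holds base + i * stride, the vector is affine and
    only two scalars need to be stored; otherwise fall back to storing
    all lanes. (Illustrative sketch, not the paper's hardware scheme.)
    """
    base = lane_values[0]
    stride = lane_values[1] - base if len(lane_values) > 1 else 0
    if all(v == base + i * stride for i, v in enumerate(lane_values)):
        return ("affine", base, stride)      # 2 scalars for N lanes
    return ("full", list(lane_values))       # uncompressible vector

def decompress(entry, n_lanes):
    """Reconstruct the per-lane vector from a compressed entry."""
    if entry[0] == "affine":
        _, base, stride = entry
        return [base + i * stride for i in range(n_lanes)]
    return list(entry[1])
```

For a 32-lane warp whose lanes hold consecutive addresses (stride 4), the affine form stores 2 scalars instead of 32, which is where a compression factor on the order of 16 can come from.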