Simulation and Architecture Improvements of Atomic Operations on GPU Scratchpad Memory

Provided by: Institute of Electrical & Electronic Engineers
Topic: Hardware
Format: PDF
GPUs are increasingly used as compute accelerators. With a large number of cores executing an even larger number of threads, they can attain significant speed-ups for parallel workloads. Applications that rely on atomic operations, such as histogram computation and the Hough transform, suffer from thread serialization when multiple threads update the same memory location. Previous work shows that reducing this serialization with software techniques can increase performance by an order of magnitude. The authors observe, however, that some serialization remains and still slows down these applications.
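
The serialization effect described above can be illustrated with a toy model. The sketch below is not the authors' simulator or method. It assumes a warp of 32 threads, assumes that atomic updates to the same address are processed one after another while updates to different addresses proceed in parallel, and uses histogram replication into sub-histograms as one example of a software technique (the abstract does not name the techniques it refers to). The names `serialized_steps` and `replicated_addresses` are hypothetical.

```python
from collections import Counter

WARP_SIZE = 32  # assumed warp width for this toy model


def serialized_steps(addresses):
    """Steps needed for one warp's atomic updates in the toy model.

    Updates to the same address are serialized; updates to different
    addresses are assumed to proceed in parallel, so the cost is the
    largest number of threads that collide on a single address.
    """
    if not addresses:
        return 0
    return max(Counter(addresses).values())


def replicated_addresses(bins, num_bins, copies):
    """Map each thread's bin into one of `copies` sub-histograms.

    Hypothetical layout: the thread in lane i uses sub-histogram
    (i % copies), so colliding threads are spread over several addresses.
    """
    return [(lane % copies) * num_bins + b for lane, b in enumerate(bins)]


# Worst case: all threads of a warp update the same histogram bin.
bins = [7] * WARP_SIZE
no_replication = serialized_steps(bins)
with_replication = serialized_steps(
    replicated_addresses(bins, num_bins=256, copies=8)
)
print(no_replication, with_replication)  # prints "32 4"
```

In this model, replication cuts the worst-case cost from 32 serialized steps to 4, but it does not eliminate it, which matches the abstract's point that some serialization remains after software techniques are applied.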
