Georgia Institute of Technology
Multicore processors have been effective in scaling application performance by dividing computation among multiple threads running in parallel. However, application performance does not necessarily improve as more cores are added. It can be limited by several bottlenecks, including contention for shared resources such as caches and memory. In this paper, the authors perform a scalability analysis of parallel applications on a 64-threaded Intel Nehalem-EX based system. They find that applications which scale well on a small number of cores exhibit poor scalability on a large number of cores.
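
As a minimal sketch of how such a scalability analysis is commonly quantified, the Python snippet below computes speedup and parallel efficiency from measured runtimes. The function names and the timing values are hypothetical and chosen only for illustration; they are not measurements or methods taken from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup relative to the single-threaded runtime."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_threads):
    """Parallel efficiency: speedup divided by thread count (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / n_threads


def scaling_table(timings):
    """Build (threads, speedup, efficiency) rows.

    `timings` maps thread count -> runtime in seconds and must
    contain an entry for 1 thread, which serves as the baseline.
    """
    base = timings[1]
    return [
        (n, speedup(base, t), efficiency(base, t, n))
        for n, t in sorted(timings.items())
    ]


if __name__ == "__main__":
    # Hypothetical runtimes (seconds), NOT data from the paper:
    # ideal scaling up to 8 threads, then flattening at 64 threads.
    hypothetical_timings = {1: 64.0, 8: 8.0, 64: 4.0}
    for n, s, e in scaling_table(hypothetical_timings):
        print(f"{n:>3} threads: speedup {s:5.1f}, efficiency {e:.2f}")
```

In this made-up example, efficiency stays at 1.0 up to 8 threads but falls to 0.25 at 64 threads, the kind of pattern the abstract describes for applications that scale well only at low core counts.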