Association for Computing Machinery
Extracting high performance from emerging Chip Multi-Processors (CMPs) requires that the application be divided into multiple threads. Each thread executes on a separate core, thereby increasing concurrency and improving performance. As the number of cores on a CMP continues to increase, some multi-threaded applications will benefit from the increased number of threads, whereas the performance of others will become limited by data synchronization and off-chip bandwidth. For applications limited by data synchronization, increasing the number of threads significantly degrades performance and increases on-chip power.
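
To illustrate why adding threads can hurt a synchronization-limited application, the following is a minimal sketch of a toy analytical model. The model is an assumption made here for illustration and is not taken from the text above: the parallel work `t_parallel` is divided evenly among threads, while each thread spends a fixed time `t_critical` inside a critical section that is executed serially. The function names and the parameter values in the usage note are hypothetical.

```python
def execution_time(n_threads, t_parallel, t_critical):
    """Toy model (assumed): the parallel part shrinks with more threads,
    while the serialized critical-section part grows with more threads."""
    return t_parallel / n_threads + n_threads * t_critical


def best_thread_count(t_parallel, t_critical, max_threads):
    """Return the thread count in 1..max_threads that minimizes the modeled time."""
    return min(
        range(1, max_threads + 1),
        key=lambda n: execution_time(n, t_parallel, t_critical),
    )
```

Under this model, with `t_parallel = 100.0`, `t_critical = 1.0`, and 32 available cores, the modeled time is lowest at 10 threads; using all 32 cores gives a longer modeled time than using 10, which mirrors the degradation described above.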