Exploiting Fine-Grained Data Parallelism With Chip Multiprocessors and Fast Barriers
Source: University of California
The authors examine whether chip multiprocessors (CMPs), with their lower on-chip communication latencies, can exploit data parallelism at inner-loop granularities similar to those commonly targeted by vector machines. Parallelizing code in this manner leads to a high frequency of barriers, so they explore how different barrier mechanisms affect the efficiency of the approach. To further exploit the potential of CMPs for fine-grained data-parallel tasks, they present barrier filters, a mechanism for fast barrier synchronization on CMPs that enables vector computations to be distributed efficiently across the cores.
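To illustrate why inner-loop parallelization produces so many barriers, the following is a minimal sketch, not the authors' mechanism. It splits two dependent inner loops across worker threads and places a barrier after each one. Python's `threading.Barrier` stands in as a software barrier; the barrier filters described above are a hardware mechanism and are not modeled here. The function names, the toy computation, and the worker count are all assumptions made for illustration, and Python threads will not show a real speedup on this kind of loop.

```python
import threading


def parallel_inner_loops(a, num_workers=4, outer_iters=3):
    """Run two dependent inner loops in parallel, with a barrier after each.

    Hypothetical toy kernel: tmp[i] = 2 * src[i], then
    src[i] = tmp[i] + tmp[(i + 1) % n], repeated outer_iters times.
    """
    n = len(a)
    src = list(a)
    tmp = [0] * n
    barrier = threading.Barrier(num_workers)  # software stand-in for a fast barrier

    def worker(wid):
        # Each worker owns one contiguous chunk of the index range.
        lo = wid * n // num_workers
        hi = (wid + 1) * n // num_workers
        for _ in range(outer_iters):
            for i in range(lo, hi):
                tmp[i] = 2 * src[i]
            # All of tmp must be written before any worker reads a neighbour's element.
            barrier.wait()
            for i in range(lo, hi):
                src[i] = tmp[i] + tmp[(i + 1) % n]
            # All reads of tmp must finish before the next iteration overwrites it.
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return src


def sequential_inner_loops(a, outer_iters=3):
    """Single-threaded reference for the same toy kernel."""
    n = len(a)
    src = list(a)
    for _ in range(outer_iters):
        tmp = [2 * x for x in src]
        src = [tmp[i] + tmp[(i + 1) % n] for i in range(n)]
    return src
```

Each outer iteration costs two barrier waits per worker regardless of how little work each inner loop contains, so as the loops shrink toward vector-like granularity the barrier latency becomes a larger share of the run time.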
| Size: | 1140.90 |
| Date: | Oct 2006 |



