Hierarchical Domain Partitioning for Hierarchical Architectures
Source: University of Virginia
The history of parallel computing shows that good performance depends heavily on data locality. Prior knowledge of data access patterns allows for optimizations that reduce data movement and thereby lower data access latency. Compilers and runtime systems, however, have difficulty speculating about locality among threads. Future multicore architectures are likely to present a hierarchical model of parallelism, with multiple threads on a core and multiple cores on a chip. On such systems, data affinity and localization become even more important for using per-core resources efficiently.