With increasing numbers of cores, future CMPs (Chip Multi-Processors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bank-interleaved distribution of the address space. Although such an organization is effective for avoiding access hot-spots, it can cause a significant number of non-local L2 accesses for many commonly occurring regular data access patterns. In this paper, the authors develop a compile-time framework for data locality optimization via data layout transformation.
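The locality problem described above can be illustrated with a small sketch. It is not the authors' framework: the tile count, cache-line size, element size, line-granularity interleaving, and the block partition of the array are all assumptions chosen for illustration, and every function name is hypothetical. The sketch counts how many accesses hit the L2 bank on the accessing tile, first with a plain row-major layout and then with a simple remapped layout.

```python
# Assumed parameters (not taken from the paper).
LINE_BYTES = 64                              # assumed cache-line size
NUM_TILES = 16                               # assumed number of tiles / L2 banks
ELEM_BYTES = 8                               # assumed element size (e.g. a double)
ELEMS_PER_LINE = LINE_BYTES // ELEM_BYTES    # 8 elements per cache line


def home_bank(byte_addr):
    """L2 bank holding this address under line-granularity interleaving."""
    return (byte_addr // LINE_BYTES) % NUM_TILES


def owner_tile(i, n):
    """Tile that works on element i when n elements are block-partitioned."""
    return i // (n // NUM_TILES)


def original_layout(i, n):
    """Plain contiguous layout: element i sits at byte offset i * ELEM_BYTES."""
    return i * ELEM_BYTES


def transformed_layout(i, n):
    """Hypothetical remapping: the k-th cache line of tile t's block is placed
    at global line k * NUM_TILES + t, so it maps to bank t."""
    chunk = n // NUM_TILES
    tile, offset = divmod(i, chunk)
    line_in_tile, elem_in_line = divmod(offset, ELEMS_PER_LINE)
    global_line = line_in_tile * NUM_TILES + tile
    return global_line * LINE_BYTES + elem_in_line * ELEM_BYTES


def local_fraction(layout, n):
    """Fraction of elements whose home bank is on the tile that accesses them."""
    local = sum(home_bank(layout(i, n)) == owner_tile(i, n) for i in range(n))
    return local / n


# 64 cache lines per tile; n must be a multiple of NUM_TILES * ELEMS_PER_LINE.
N = NUM_TILES * ELEMS_PER_LINE * 64

if __name__ == "__main__":
    print("original layout   :", local_fraction(original_layout, N))
    print("transformed layout:", local_fraction(transformed_layout, N))
```

Under these assumptions, the contiguous layout leaves only one access in sixteen local, because consecutive cache lines rotate through all banks, while the remapped layout places every line of a tile's block in that tile's own bank. A real compile-time framework would also have to rewrite the array index expressions in the program and handle access patterns beyond a simple block partition.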
- Format: PDF
- Size: 306.1 KB