A Software Approach to Unifying Multicore Caches
Multicore chips will have large amounts of fast on-chip cache memory, along with relatively slow DRAM interfaces. The on-chip cache memory, however, will be fragmented and spread over the chip; this distributed arrangement is hard for certain kinds of applications to exploit efficiently, and can lead to needless slow DRAM accesses. First, data accessed from many cores may be duplicated in many caches, reducing the amount of distinct data cached. Second, data in a cache distant from the accessing core may be slow to fetch via the cache coherence protocol. Third, software on each core can only allocate space in the small fraction of total cache memory that is local to that core.