Balancing DRAM Locality and Parallelism in Shared Memory CMP Systems
Modern memory systems rely on spatial locality to provide high bandwidth while minimizing memory device power and cost. The trend of increasing the number of cores that share memory, however, decreases apparent spatial locality because access streams from independent threads are interleaved. Memory access scheduling recovers only a fraction of the original locality because of buffering limits. The authors investigate new techniques to reduce inter-thread access interference. They propose to partition the internal memory banks among cores to isolate their access streams and eliminate locality interference. They implement this partitioning by extending the physical frame allocation algorithm of the OS so that physical frames mapped to the same DRAM bank can be exclusively allocated to a single thread.
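The bank-partitioned frame allocation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `BankPartitionedAllocator`, its methods, and the frame-to-bank mapping (frame number modulo the bank count) are all assumptions made for illustration. A real system would derive the bank index from the memory controller's address mapping.

```python
from collections import deque


class BankPartitionedAllocator:
    """Toy physical-frame allocator that gives each thread exclusive banks.

    Hypothetical illustration only; names and address mapping are assumed.
    """

    def __init__(self, num_frames, num_banks):
        self.num_banks = num_banks
        # One free list per DRAM bank.
        self.free = [deque() for _ in range(num_banks)]
        for frame in range(num_frames):
            self.free[self.bank_of(frame)].append(frame)
        self.owner = {}  # bank index -> owning thread id

    def bank_of(self, frame):
        # Assumed mapping: low-order frame-number bits select the bank.
        return frame % self.num_banks

    def assign_banks(self, thread, banks):
        # Validate every bank first so a failed call changes nothing.
        for b in banks:
            if not 0 <= b < self.num_banks:
                raise ValueError(f"bank {b} out of range")
            if self.owner.get(b, thread) != thread:
                raise ValueError(f"bank {b} already owned by another thread")
        for b in banks:
            self.owner[b] = thread

    def alloc(self, thread):
        # Serve the request only from banks owned by this thread.
        for b, t in self.owner.items():
            if t == thread and self.free[b]:
                return self.free[b].popleft()
        raise MemoryError("no free frame in this thread's banks")

    def release(self, frame):
        # Return the frame to the free list of the bank it maps to.
        self.free[self.bank_of(frame)].append(frame)


if __name__ == "__main__":
    allocator = BankPartitionedAllocator(num_frames=16, num_banks=4)
    allocator.assign_banks("thread_a", [0, 1])
    allocator.assign_banks("thread_b", [2, 3])
    frame = allocator.alloc("thread_a")
    print(frame, allocator.bank_of(frame))
```

Because every frame handed to a thread maps to a bank that no other thread owns, one thread's accesses cannot disturb another thread's open rows in this model. The sketch also shows the cost of the approach: a thread can only use the capacity and bank-level parallelism of the banks assigned to it.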