A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM
Modern DRAMs have multiple banks to serve multiple memory requests in parallel. However, when two requests go to the same bank, they have to be served serially, exacerbating the high latency of off-chip memory. Adding more banks to the system to mitigate this problem incurs high system cost. The authors' goal in this paper is to achieve the benefits of increasing the number of banks with a low-cost approach. To this end, they propose three new mechanisms that overlap the latencies of different requests that go to the same bank. The key observation exploited by their mechanisms is that a modern DRAM bank is implemented as a collection of subarrays that operate largely independently while sharing a few global peripheral structures.
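The intuition behind overlapping same-bank requests can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not one of the paper's three mechanisms: the `Timing` values are hypothetical cycle counts, all requests are assumed to arrive at time 0 and target different rows, and the shared global structures are modeled as a single resource needed only during the column access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Hypothetical per-request latencies in cycles (not real DRAM parameters)."""
    activate: int   # open a row in a subarray
    column: int     # column access, which uses the shared global structures
    precharge: int  # close the row


def serialized_latency(subarrays, t):
    """Baseline bank: each request waits for the previous one to finish completely."""
    return len(subarrays) * (t.activate + t.column + t.precharge)


def overlapped_latency(subarrays, t):
    """Idealized overlap: activation and precharge proceed independently in each
    subarray, and only the column access is serialized on the shared structures."""
    subarray_free = {}  # time at which each subarray can start a new activation
    shared_free = 0     # time at which the shared global structures become free
    finish = 0
    for s in subarrays:
        act_done = subarray_free.get(s, 0) + t.activate
        col_done = max(act_done, shared_free) + t.column
        shared_free = col_done
        subarray_free[s] = col_done + t.precharge
        finish = max(finish, subarray_free[s])
    return finish


timing = Timing(activate=10, column=5, precharge=10)
different = [0, 1, 2, 3]  # four requests to four different subarrays of one bank
same = [0, 0, 0, 0]       # four requests to different rows of one subarray

print(serialized_latency(different, timing))  # baseline, regardless of subarray
print(overlapped_latency(different, timing))  # overlap across subarrays
print(overlapped_latency(same, timing))       # no overlap possible
```

With these assumed numbers, four requests to different subarrays finish in 40 cycles instead of 100, while four requests to the same subarray still take 100 cycles, since nothing can be overlapped within a single subarray in this model.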