Association for Computing Machinery
In server workloads, large instruction working sets result in high L1 instruction cache miss rates. Fast access requirements preclude large instruction caches that can accommodate the deep software stacks prevalent in server applications. Prefetching has been a promising approach to mitigate instruction-fetch stalls by relying on recurring instruction streams of server workloads to predict future instruction misses. By recording and replaying instruction streams from dedicated storage next to each core, stream-based prefetchers have been shown to overcome instruction fetch stalls.