CMP Directory Coherence: One Granularity Does Not Fit All
To support legacy software, large CMPs often provide cache coherence via an on-chip directory rather than snooping. In those designs, a key challenge is maximizing the effectiveness of precious on-chip directory state. Most current directory protocols miss an opportunity by organizing all state in per-block records. To increase the "reach" of on-chip directory state, the authors apply ideas from snooping region coherence to develop a dual-grain CMP directory protocol. They enable a tradeoff between unnecessary probes (e.g., invalidations) and on-chip directory storage size by organizing a directory entry with both per-1KB-region state and per-64B-block state.
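
As a rough illustration of the two granularities, the sketch below splits a byte address into a 1KB region number and a 64B block index, and models one directory entry that keeps sharers per region and a presence flag per block. Only the 1KB and 64B sizes come from the text; the entry layout, field names, and probe rule are hypothetical assumptions made for illustration and are not the authors' protocol.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

REGION_BYTES = 1024  # region granularity named in the text
BLOCK_BYTES = 64     # block granularity named in the text
BLOCKS_PER_REGION = REGION_BYTES // BLOCK_BYTES  # 16 blocks per region


def split_address(addr: int) -> Tuple[int, int]:
    """Return (region number, block index within that region)."""
    region = addr // REGION_BYTES
    block = (addr % REGION_BYTES) // BLOCK_BYTES
    return region, block


@dataclass
class DualGrainEntry:
    """Hypothetical entry: sharers tracked per region, presence per block."""
    region_sharers: Set[int] = field(default_factory=set)
    block_present: List[bool] = field(
        default_factory=lambda: [False] * BLOCKS_PER_REGION
    )

    def record_access(self, core: int, block: int) -> None:
        # Coarse state: the core touched something in this region.
        self.region_sharers.add(core)
        # Fine state: this particular block is cached somewhere on chip.
        self.block_present[block] = True

    def probe_targets(self, block: int, writer: int) -> Set[int]:
        # Assumed rule: an uncached block needs no probes at all.
        if not self.block_present[block]:
            return set()
        # Otherwise probe every region sharer except the writer. Some of
        # these cores may not hold the block, so the probe is unnecessary.
        return self.region_sharers - {writer}


entry = DualGrainEntry()
entry.record_access(core=0, block=1)
entry.record_access(core=2, block=5)
targets = entry.probe_targets(block=1, writer=0)
```

In this toy model, core 2 is probed for block 1 even though it only cached block 5. That unnecessary probe is the cost of sharing one sharer set across 16 blocks instead of storing a sharer set per block.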