Macro Data Load: An Efficient Mechanism for Enhancing Loaded Data Reuse
This paper presents a study of macro data load, a novel mechanism for increasing the reuse of loaded data within a processor. A macro data load brings into the processor the widest data item the cache port allows. In a 64-bit processor, for example, a byte load fetches the full 64-bit word from the cache and saves it in an internal hardware structure, while using only the specified byte for itself. Later loads can opportunistically reuse the saved data inside the processor, reducing the number of comparatively expensive cache accesses.
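
To make the mechanism concrete, the following is a minimal software sketch of the idea, not the hardware design studied in the paper. It assumes a 64-bit (8-byte) cache port as in the example above. The class name `MacroLoadModel`, the buffer capacity `entries`, and the first-in-first-out replacement policy are all hypothetical choices made for illustration. Stores and loads that cross a word boundary are not modelled.

```python
from collections import OrderedDict

WORD_BYTES = 8  # 64-bit cache port, as in the abstract's example


class MacroLoadModel:
    """Toy model: every cache access fetches a full aligned word and keeps it."""

    def __init__(self, memory: bytes, entries: int = 4):
        self.memory = memory
        self.entries = entries       # hypothetical capacity of the internal structure
        self.buffer = OrderedDict()  # aligned address -> saved 8-byte word
        self.cache_accesses = 0
        self.reuses = 0

    def load(self, addr: int, size: int = 1) -> bytes:
        base = addr - addr % WORD_BYTES
        offset = addr - base
        if offset + size > WORD_BYTES:
            raise ValueError("load crosses a word boundary (not modelled)")
        word = self.buffer.get(base)
        if word is None:
            # Not saved internally: access the cache at full port width.
            word = self.memory[base:base + WORD_BYTES]
            self.cache_accesses += 1
            if len(self.buffer) >= self.entries:
                self.buffer.popitem(last=False)  # assumed FIFO replacement
            self.buffer[base] = word
        else:
            # Served from the saved word, so no cache access is needed.
            self.reuses += 1
        # The load itself uses only the bytes it asked for.
        return word[offset:offset + size]


# Sixteen consecutive byte loads touch only two aligned 64-bit words.
model = MacroLoadModel(bytes(range(32)))
loaded = [model.load(addr) for addr in range(16)]
```

In this sketch the sixteen byte loads cause two cache accesses and fourteen internal reuses. How much reuse a real workload exposes depends on its access pattern and on the size and replacement policy of the internal structure.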