Date Added: Aug 2011
Many dynamic simulation programs have complex, irregular memory reference patterns and require runtime optimizations to enhance data locality. Current approaches periodically stop the application to reorder its computation or data based on the current program state, improving data locality for the next period of execution. In this paper, the authors examine the implications of modern heterogeneous Chip Multiprocessor (CMP) architectures for this optimization paradigm. They develop three techniques to enhance the optimizations. The first is asynchronous data transformation, which moves data reordering off the critical path through dependence circumvention.
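To make the idea of runtime data reordering concrete, the following is a minimal sketch of one common locality transformation, first-touch reordering, in which elements are packed in the order the computation first accesses them. It is an illustration only and is not taken from the paper: the function names `first_touch_order` and `apply_reordering` are hypothetical, and the sketch performs the reordering synchronously (the program stops while data is moved), which corresponds to the conventional approach the abstract describes rather than to the authors' asynchronous technique.

```python
def first_touch_order(accesses, n):
    """Map each old index to a new index, ordered by first access.

    accesses: sequence of indices into a data array of length n,
              in the order the computation touches them.
    Returns new_pos, where new_pos[old] is the element's new index.
    (Hypothetical helper for illustration.)
    """
    new_pos = [-1] * n
    next_slot = 0
    for idx in accesses:
        if new_pos[idx] == -1:      # first time this element is touched
            new_pos[idx] = next_slot
            next_slot += 1
    # Elements never accessed are placed after the accessed ones,
    # keeping their original relative order.
    for old in range(n):
        if new_pos[old] == -1:
            new_pos[old] = next_slot
            next_slot += 1
    return new_pos


def apply_reordering(data, accesses, new_pos):
    """Move the data to its new layout and remap the access indices."""
    new_data = [None] * len(data)
    for old, new in enumerate(new_pos):
        new_data[new] = data[old]
    new_accesses = [new_pos[i] for i in accesses]
    return new_data, new_accesses


data = ["a", "b", "c", "d", "e"]
accesses = [3, 0, 3, 4, 0]          # irregular, indirect access pattern
new_pos = first_touch_order(accesses, len(data))
new_data, new_accesses = apply_reordering(data, accesses, new_pos)
```

After reordering, the elements the computation actually uses sit next to each other at the front of the array, and every remapped access still reads the same value as before. In a real simulation the access pattern drifts over time, which is why such a reordering would be repeated periodically.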