Locating Cache Performance Bottlenecks Using Data Profiling
Effective use of CPU data caches is critical to good performance, but poor cache use patterns are often difficult to spot with existing execution profiling tools. Typical profilers attribute costs to specific code locations. The cost of frequent cache misses on a given piece of data, however, may be spread over instructions throughout the application. The share attributed to any single instruction is then small, so the misses appear insignificant in a code profiler's output. This paper describes Data Profiling (DProf), a technique for locating cache performance bottlenecks. DProf helps programmers understand cache miss costs by attributing misses to data types instead of code. This also helps programmers locate data structures that experience misses in many places in the application's code. DProf provides programmers with a number of new views of cache miss data. These include a data profile, which reports the data types with the most cache misses, and a data flow graph, which summarizes how objects of a given type are accessed throughout their lifetimes. The paper illustrates the use of DProf through two case studies, which use DProf to find and fix cache performance bottlenecks in Linux.
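The difference between code-based and data-based attribution can be illustrated with a small sketch. This is not DProf's implementation; it only aggregates a list of synthetic miss samples two ways. The sample list, the function names, and the type names (`pkt_buf`, `hash_node`) are all hypothetical and chosen for illustration.

```python
from collections import Counter

# Synthetic cache-miss samples, one tuple per miss:
# (code location, data type of the address that missed).
# Every name and count here is invented for illustration only.
SAMPLES = [
    ("net_rx", "pkt_buf"), ("ip_route", "pkt_buf"), ("tcp_ack", "pkt_buf"),
    ("sock_queue", "pkt_buf"), ("copy_out", "pkt_buf"), ("free_buf", "pkt_buf"),
    ("hash_lookup", "hash_node"), ("hash_lookup", "hash_node"),
]


def profile_by_code(samples):
    """Code-profiler view: count misses per code location."""
    return Counter(loc for loc, _ in samples)


def profile_by_type(samples):
    """Data-profile view: count misses per data type."""
    return Counter(dtype for _, dtype in samples)


def locations_touching(samples, dtype):
    """List the distinct code locations that miss on objects of one type."""
    return sorted({loc for loc, t in samples if t == dtype})


if __name__ == "__main__":
    total = len(SAMPLES)
    print("By code location:")
    for loc, n in profile_by_code(SAMPLES).most_common():
        print(f"  {loc:12s} {n / total:.0%}")
    print("By data type:")
    for dtype, n in profile_by_type(SAMPLES).most_common():
        print(f"  {dtype:12s} {n / total:.0%}")
    print("Locations missing on pkt_buf:", locations_touching(SAMPLES, "pkt_buf"))
```

In this made-up data set, no single code location accounts for more than a quarter of the misses, yet one data type accounts for three quarters of them, spread across six locations. Grouping by type surfaces the pattern that the per-location view hides.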