Summary:
Our work introducing the notion of cache miss equations (CMEs) enables
a program's memory reference behavior to be tightly tailored
to a particular memory hierarchy. Five to ten years ago, the best
way of understanding most aspects of cache behavior was to run the program
and use either simulation techniques or rudimentary hardware counters to
measure cache performance. CMEs demonstrate that much of this performance
data can be mathematically predicted at compile time, rather than empirically
observed at run time.
CMEs perform compile-time analysis to generate a system of linear Diophantine equations describing the program's memory behavior. Solutions to these equations represent potential cache miss points. Using mathematical techniques applicable to linear Diophantine equations, the compiler can then perform mathematically precise optimizations geared toward reducing cache misses in the code. This lets the compiler improve a program's memory performance transparently to the programmer. Our work on CMEs represents the first precise, compile-time analysis of memory behavior that works on caches of arbitrary associativity.
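To make the underlying condition concrete, the following is a minimal sketch (not the CME framework itself, and all names are illustrative) of the kind of equation a CME captures. For a direct-mapped cache of `cache_size` bytes with `line_size`-byte lines, two references at addresses a and b can conflict when a - b = k * cache_size + d with |d| < line_size for some integer k, i.e. when they map to the same set but lie in different memory lines. A CME characterizes the iteration points satisfying this linear Diophantine condition symbolically; here we simply test each iteration:

```python
def same_set(addr1, addr2, cache_size, line_size):
    """True if two addresses map to the same set of a direct-mapped cache
    but lie in different memory lines -- a potential conflict miss."""
    line1, line2 = addr1 // line_size, addr2 // line_size
    n_sets = cache_size // line_size
    return line1 != line2 and line1 % n_sets == line2 % n_sets

def conflict_iterations(base_a, base_b, stride_a, stride_b, n,
                        cache_size, line_size):
    """Enumerate loop iterations i where the references
    base_a + stride_a*i and base_b + stride_b*i collide.
    A CME would describe this set by solving
        (base_b + stride_b*i) - (base_a + stride_a*i) = k*cache_size + d,
        |d| < line_size,
    over the integers rather than enumerating."""
    return [i for i in range(n)
            if same_set(base_a + stride_a * i,
                        base_b + stride_b * i,
                        cache_size, line_size)]

# Two arrays placed exactly cache_size apart conflict on every iteration;
# padding the second array by one cache line removes every conflict.
hits = conflict_iterations(0, 1024, 8, 8, 16, cache_size=1024, line_size=32)
padded = conflict_iterations(0, 1024 + 32, 8, 8, 16, cache_size=1024, line_size=32)
```

Note how the second call illustrates why padding works: shifting one array's base address changes the constant term of the equation so it no longer has solutions.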
Our basic CME framework has already led to new tiling and padding algorithms for scientific codes that improve on existing techniques based on heuristics or much simpler mathematical models. These optimizations have reduced cache misses in some programs by 60% or more, contributing to performance improvements of 2X or more. The real power of CMEs, however, lies in their ability to provide a precise, unifying framework supporting a range of interrelated optimizations. For example, we have used CMEs to build an automated diagnosis framework to discern memory performance problems and select the appropriate compiler transformation to fix them.
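As an illustration of the tiling transformation mentioned above, here is a minimal sketch (not the paper's algorithm; names and the tile size are illustrative) of loop tiling applied to matrix multiplication. The tiled version visits the matrices in T x T blocks so that each block of B is reused while it is still cache-resident, while computing exactly the same result as the naive loop nest:

```python
def matmul_naive(A, B):
    """Straightforward triple loop: B is streamed through the cache
    once per row of A, causing capacity misses for large n."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_tiled(A, B, T=4):
    """Same computation, restructured into T x T tiles so each
    block of B is reused before it is evicted from the cache."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, T):
        for kk in range(0, n, T):
            for jj in range(0, n, T):
                for i in range(ii, min(ii + T, n)):
                    for k in range(kk, min(kk + T, n)):
                        a = A[i][k]
                        for j in range(jj, min(jj + T, n)):
                            C[i][j] += a * B[k][j]
    return C
```

The point of a CME-based approach is that the tile size T need not be chosen by heuristic: the equations predict, for a given cache, which T eliminates the conflict and capacity misses of the naive loop nest.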