Lynx: A Dynamic Instrumentation System for Data-Parallel Applications on GPGPU Architectures
As parallel execution platforms continue to proliferate, there is a growing need for real-time introspection tools that provide insight into platform behavior for performance debugging, correctness checking, and driving effective resource management schemes. To address this need, the authors present the Lynx dynamic instrumentation system. Lynx provides the capability to write instrumentation routines that are selective (instrumenting only what is needed), transparent (requiring no changes to the applications' source code), customizable, and efficient. Lynx is embedded into the broader GPU Ocelot system, which provides run-time code generation of CUDA programs for heterogeneous architectures.