Institute of Electrical and Electronics Engineers
The ability to trace request execution paths is critical for diagnosing performance faults in large-scale distributed systems. Previous black-box and white-box approaches are either inaccurate or invasive. The authors present a novel semantics-assisted gray-box tracing approach, called Rake, which can accurately trace individual requests by observing network traffic. Rake infers the causality between messages by identifying polymorphic IDs in messages according to application semantics. To make Rake universally applicable, they design a Rake language so that users can easily describe the necessary semantics of their applications while reusing the core Rake component. They evaluate Rake using a few popular distributed applications, including web search, a distributed computing cluster, a content provider network, and online chat.
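
The idea of linking messages through IDs that take different forms in different messages can be illustrated with a minimal Python sketch. This is not Rake's implementation or the Rake language: the message kinds, the regular expressions, the lowercase normalization, and the function names (extract_id, group_by_request) are all assumptions made for illustration. Each observed message is reduced to a normalized ID using a per-message-type rule, and messages sharing an ID are grouped into one request trace.

```python
import re
from collections import defaultdict

# Hypothetical per-message-type rules standing in for application semantics.
# Each rule says where the request ID lives in that kind of message.
EXTRACTORS = {
    "http_request": re.compile(r"[?&]q=([^&\s]+)"),
    "backend_rpc": re.compile(r"query_id=(\S+)"),
}


def extract_id(kind, payload):
    """Return a normalized ID for the message, or None if no rule matches."""
    pattern = EXTRACTORS.get(kind)
    if pattern is None:
        return None
    match = pattern.search(payload)
    # Lowercasing is an assumed normalization so that differently
    # formatted occurrences of the same ID compare equal.
    return match.group(1).lower() if match else None


def group_by_request(messages):
    """Group (timestamp, kind, payload) messages into traces keyed by ID."""
    traces = defaultdict(list)
    for timestamp, kind, payload in sorted(messages):
        request_id = extract_id(kind, payload)
        if request_id is not None:
            traces[request_id].append((timestamp, kind))
    return dict(traces)


# Made-up captured traffic: two interleaved search requests.
messages = [
    (1.0, "http_request", "GET /search?q=Rake HTTP/1.1"),
    (1.2, "backend_rpc", "lookup query_id=rake shard=3"),
    (1.1, "http_request", "GET /search?q=Paxos HTTP/1.1"),
    (1.4, "backend_rpc", "lookup query_id=paxos shard=1"),
]

traces = group_by_request(messages)
```

In this sketch, supporting a new application would mean changing only the EXTRACTORS table while the grouping logic stays the same, which loosely mirrors the separation the abstract describes between user-supplied semantics and a reusable core.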