Kahuna: Problem Diagnosis for MapReduce-Based Cloud Computing Environments


Executive Summary

The authors present Kahuna, an approach for diagnosing performance problems in MapReduce systems. Central to Kahuna is the peer-similarity insight: nodes behave alike in the absence of performance problems, so a node that behaves differently is the likely culprit. The authors present empirical evidence for peer-similarity from the 4000-processor Yahoo! M45 Hadoop cluster. They also demonstrate Kahuna's effectiveness by experimentally evaluating two algorithms against a number of reported performance problems, using four different workloads on a 100-node Hadoop cluster running on Amazon's EC2 infrastructure.
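
The peer-similarity idea can be illustrated with a small sketch. This is not one of Kahuna's two algorithms, which the summary does not describe; it is a generic outlier check that compares each node's metric with the median of its peers. The function name `flag_outlier_nodes`, the threshold, the node names, and the sample values are all hypothetical.

```python
from statistics import median


def flag_outlier_nodes(metric_by_node, threshold=3.0):
    """Return the nodes whose metric deviates from the peer median by more
    than `threshold` times the median absolute deviation (MAD).

    Illustrative only: a generic peer comparison, not Kahuna's method.
    """
    values = list(metric_by_node.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half of the peers report exactly the median value,
        # so flag any node that differs from it at all.
        return sorted(n for n, v in metric_by_node.items() if v != med)
    return sorted(
        n for n, v in metric_by_node.items() if abs(v - med) / mad > threshold
    )


# Hypothetical per-node task durations in seconds for one job.
samples = {
    "node-01": 41.0,
    "node-02": 39.5,
    "node-03": 40.2,
    "node-04": 93.7,  # behaves unlike its peers
    "node-05": 40.8,
}
suspects = flag_outlier_nodes(samples)
print(suspects)
```

With these made-up values, only `node-04` is flagged. The median and MAD are used instead of the mean and standard deviation so that a single faulty node does not distort the baseline it is compared against.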
