Tenzing: A SQL Implementation On The MapReduce Framework
Tenzing is a query engine built on top of MapReduce for ad hoc analysis of Google data. Tenzing supports a mostly complete SQL implementation (with several extensions) combined with several key characteristics such as heterogeneity, high performance, scalability, reliability, metadata awareness, low latency, support for columnar storage and structured data, and easy extensibility. Tenzing is currently used internally at Google by 1000+ employees and serves 10000+ queries per day over 1.5 petabytes of compressed data. In this paper, the authors describe the architecture and implementation of Tenzing, and present benchmarks of typical analytical queries.