
TritonSort: A Balanced Large-Scale Sorting System


Executive Summary

The authors present TritonSort, a highly efficient, scalable sorting system. It is designed to process large datasets and has been evaluated on as much as 100 TB of input data spread across 832 disks in 52 nodes, which it sorts at a rate of 0.916 TB/min. On the annual Indy GraySort sorting benchmark, TritonSort is 60% better in absolute performance and has over six times the per-node efficiency of the previous record holder. The paper describes the hardware and software architecture necessary to operate TritonSort at this level of efficiency. Through careful management of system resources to ensure cross-resource balance, the authors are able to sort data at approximately 80% of the disks' aggregate sequential write speed.
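To put the headline figures in perspective, the short Python sketch below derives a few quantities that follow directly from the numbers quoted in the summary: total sort time, disks per node, and average sort throughput per node. It is only back-of-the-envelope arithmetic, not anything taken from the paper. The function name `derived_figures` is hypothetical, and the use of decimal units (1 TB = 10^12 bytes) is an assumption.

```python
# Figures quoted in the executive summary.
INPUT_TB = 100.0          # input data size, TB
RATE_TB_PER_MIN = 0.916   # reported sort rate, TB/min
DISKS = 832
NODES = 52

# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
BYTES_PER_TB = 10**12
BYTES_PER_MB = 10**6


def derived_figures():
    """Return quantities implied by the summary's headline numbers."""
    # Time to sort the full input at the reported rate.
    total_minutes = INPUT_TB / RATE_TB_PER_MIN

    # Disks attached to each node, assuming an even spread across nodes.
    disks_per_node = DISKS // NODES

    # Average sort throughput per node, in MB/s of sorted data.
    cluster_mb_per_s = RATE_TB_PER_MIN * BYTES_PER_TB / BYTES_PER_MB / 60
    per_node_mb_per_s = cluster_mb_per_s / NODES

    return {
        "total_minutes": total_minutes,
        "disks_per_node": disks_per_node,
        "per_node_mb_per_s": per_node_mb_per_s,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the 100 TB sort takes a little under two hours and each node has 16 disks. The per-node figure is the rate at which sorted output is produced, not the raw disk I/O rate, so it should not be compared directly with the 80% disk-bandwidth figure.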
