Cassandra - A Decentralized Structured Storage System

Executive Summary

This paper describes the development, design, and basic data model of Cassandra, a decentralized structured storage system, and gives an overview of its clients. Cassandra grew out of the need for a storage system that could manage large amounts of structured data spread across hundreds of commodity servers. The systems already in use could not provide the reliability and scalability required to run on an infrastructure of hundreds of nodes, possibly spread across different data centers. Cassandra was therefore designed to run on cheap commodity hardware and to handle high write throughput without compromising performance, reliability, wide applicability, or scalability. It was built at Facebook to meet the storage needs of the Inbox Search problem, and it combines well-known techniques to achieve its scalability and availability. At Facebook, the system handles a high write throughput, billions of writes per day, and scales with the number of users. It is now used as a backend storage system for multiple services. Planned features include compression, support for atomicity across keys, and secondary index support.

  • Format: PDF
  • Size: 130.3 KB