Association for Computing Machinery
Column-oriented database system architectures invite a re-evaluation of how and when data in databases is compressed. Storing data in a column-oriented fashion greatly increases the similarity of adjacent records on disk, and with it the opportunities for compression. The ability to compress many adjacent tuples at once also lowers the per-tuple cost of compression, in both CPU time and space overhead. In this paper, the authors describe how they extended C-Store (a column-oriented DBMS) with a compression subsystem, and show how compression schemes not traditionally used in row-oriented DBMSs can be applied to column-oriented systems.
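To make the claim about adjacent similarity concrete, the following is a minimal sketch rather than anything taken from C-Store itself. It assumes run-length encoding as the illustrative scheme (the abstract does not name specific schemes) and uses a small hypothetical table; the names `rle_encode`, `rle_decode`, and `rows` are invented for this example. The same values are laid out row by row and column by column, and the number of runs each layout produces is counted.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    decoded = []
    for value, count in pairs:
        decoded.extend([value] * count)
    return decoded


# Hypothetical table (date, region, amount), sorted by date and region.
rows = [
    ("2024-01-01", "east", 10),
    ("2024-01-01", "east", 12),
    ("2024-01-01", "west", 7),
    ("2024-01-02", "west", 7),
    ("2024-01-02", "west", 9),
    ("2024-01-02", "west", 9),
]

# Row-oriented layout: the fields of each record are stored next to each other,
# so neighbouring values come from different attributes.
row_layout = [field for row in rows for field in row]

# Column-oriented layout: all values of one attribute are stored contiguously.
column_layout = [list(column) for column in zip(*rows)]

row_runs = rle_encode(row_layout)
column_runs = [rle_encode(column) for column in column_layout]

print("runs in row layout:   ", len(row_runs))
print("runs in column layout:", sum(len(runs) for runs in column_runs))
```

In the row layout, neighbouring values belong to different attributes and rarely repeat, so run-length encoding finds almost nothing to collapse. In the column layout, repeated dates and regions sit next to each other and collapse into a few runs. Real systems would also weigh sort order, value widths, and decompression cost, which this sketch ignores.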