Storing Matrices on Disk: Theory and Practice Revisited

The authors consider the problem of storing arrays on disk to support scalable data analysis involving linear algebra. They propose Linearized Array B-tree, or LAB-tree, which supports flexible array layouts and automatically adapts to varying sparsity across parts of an array and over time. They reexamine the B-tree splitting strategy for handling insertions and the flushing policy for batching updates, and show that common practices may in fact be suboptimal. Through theoretical and empirical studies, they propose alternatives with good theoretical guarantees and/or practical performance.

Provided by: VLD Digital Topic: Data Management Date Added: Sep 2011 Format: PDF

Find By Topic