SORT: A Similarity-Ownership Based Routing Scheme to Improve Data Read Performance for Deduplication Clusters

Existing data routing schemes for deduplication clusters have not addressed data read performance, even though it is well known that reads in deduplication systems incur many random disk seeks, which significantly degrade read performance. In this paper, the authors propose SORT, a Similarity-Ownership based Routing scheme that exploits both data similarity and data ownership to improve read performance in deduplication clusters. Their experiments on real-world datasets show that SORT reduces random disk seeks by about 10% at a cost of only 0.11% in deduplication efficiency, which they report as the best tradeoff between deduplication efficiency and read performance among the existing routing schemes compared.
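The abstract does not spell out how SORT combines similarity and ownership, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a hypothetical scoring rule: each node is scored by the share of a file's sampled chunk fingerprints it already stores (similarity) and by the share of the owner's previously stored chunks it holds (ownership). The names `Node`, `route`, `store`, and `ownership_weight` are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    fingerprints: set = field(default_factory=set)    # chunk fingerprints stored on this node
    owner_chunks: dict = field(default_factory=dict)  # owner -> number of chunks routed here


def route(nodes, owner, sample, ownership_weight=0.5):
    """Pick a target node for a file from a sample of its chunk fingerprints.

    Hypothetical rule: a weighted sum of similarity and ownership scores.
    """
    total_owned = sum(n.owner_chunks.get(owner, 0) for n in nodes)

    def score(node):
        # Similarity: fraction of sampled fingerprints already on this node.
        similarity = len(sample & node.fingerprints) / len(sample) if sample else 0.0
        # Ownership: fraction of this owner's earlier chunks held by this node.
        ownership = node.owner_chunks.get(owner, 0) / total_owned if total_owned else 0.0
        return (1 - ownership_weight) * similarity + ownership_weight * ownership

    return max(nodes, key=score)


def store(node, owner, chunks):
    """Store a file's chunks on a node and return how many were new (not deduplicated)."""
    new = chunks - node.fingerprints
    node.fingerprints |= new
    node.owner_chunks[owner] = node.owner_chunks.get(owner, 0) + len(chunks)
    return len(new)
```

In this sketch, setting `ownership_weight` to 0 gives a purely similarity-based router that favors deduplication, while larger values keep an owner's data together on one node, which is the kind of locality that would reduce random seeks on reads. The weight and the scoring formula are assumptions made for illustration only.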

Provided by: AICIT | Topic: Big Data | Date Added: Oct 2011 | Format: PDF
