Extreme Binning: Scalable, Parallel Deduplication for Chunk-based File Backup

Data deduplication is an essential and critical component of backup systems: essential because it reduces storage space requirements, and critical because the performance of the entire backup operation depends on its throughput. Traditional backup workloads consist of large data streams with high locality, which existing deduplication techniques rely on to provide reasonable throughput. The authors present Extreme Binning, a scalable deduplication technique for non-traditional backup workloads that are made up of individual files with no locality among consecutive files in a given window of time.

Provided by: Institute of Electrical and Electronics Engineers
Topic: Big Data
Date Added: Sep 2009
Format: PDF
