Provided by: Science and Development Network (SciDev.Net)
Topic: Data Management
This paper presents an algorithm and data structure for a deduplication method that efficiently identifies identical data between files residing on different machines, achieving a high match rate in a short time. The algorithm identifies the parts of the source file that are identical to some part of the destination file, and sends only those parts that cannot be matched in this way. Fast and accurate lookup rests on two fundamentals: data are expressed as fixed-size chunks, and each chunk is indexed by its anchor byte values in an "Index-table". The "Index-table" is a 256x256 table structure in which the byte values at the edges of a chunk serve as the row and column numbers of its cell.
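As a rough illustration of the Index-table idea, the Python sketch below indexes the destination file's fixed-size chunks in a 256x256 table and then scans a source file for matches. It rests on assumptions the abstract does not state: the edge (anchor) bytes are taken to be the first and last byte of each chunk, the chunk size is 8 bytes, and a SHA-1 digest confirms a candidate match. All function names are hypothetical, and this is not the paper's implementation.

```python
import hashlib

CHUNK_SIZE = 8  # assumed chunk size; the abstract gives no value


def build_index_table(data: bytes, size: int = CHUNK_SIZE):
    """Index fixed-size chunks of `data` in a 256x256 table.

    Assumption: row = first byte of the chunk, column = last byte.
    Each cell holds (digest, offset) pairs for the chunks that land there.
    """
    table = [[[] for _ in range(256)] for _ in range(256)]
    for off in range(0, len(data) - size + 1, size):
        chunk = data[off:off + size]
        digest = hashlib.sha1(chunk).digest()
        table[chunk[0]][chunk[-1]].append((digest, off))
    return table


def find_match(table, chunk: bytes):
    """Return the offset of an identical indexed chunk, or None."""
    cell = table[chunk[0]][chunk[-1]]
    if not cell:
        return None  # cheap rejection: no chunk shares these edge bytes
    digest = hashlib.sha1(chunk).digest()
    for stored_digest, off in cell:
        if stored_digest == digest:
            return off
    return None


def delta(source: bytes, dest_table, size: int = CHUNK_SIZE):
    """Describe `source` as copies of destination chunks plus literal bytes."""
    ops = []
    literal = bytearray()
    i = 0
    while i + size <= len(source):
        off = find_match(dest_table, source[i:i + size])
        if off is None:
            literal.append(source[i])  # unmatched byte must be sent
            i += 1
        else:
            if literal:
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", off))  # receiver already has this chunk
            i += size
    literal.extend(source[i:])  # tail shorter than one chunk
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops


dest = b"ABCDEFGHIJKLMNOP"
ops = delta(b"xxIJKLMNOPyy", build_index_table(dest))
print(ops)
```

In this sketch only the `literal` operations carry file data across the network; each `copy` operation refers to a chunk the destination already holds. Most non-matching windows are rejected by the empty-cell check alone, before any digest is computed.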