A Survey Paper on Deduplication by Using Genetic Algorithm Alongwith Hash-Based Algorithm

Provided by: International Journal of Engineering Research and Applications (IJERA)
Topic: Data Management
Format: PDF
In today's world, as the volume of information available in digital libraries grows, most systems may be affected by the existence of replicas in their warehouses. This matters because a clean, replica-free warehouse not only allows the retrieval of higher-quality information but also leads to more concise data and reduces the computational time and resources needed to process it. Here, the authors propose a genetic programming approach combined with hash-based similarity, i.e., using the MD5 and SHA-1 algorithms.
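The hash-based part of such an approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`normalize`, `fingerprint`, `find_replicas`) and the normalization rule (lowercasing and collapsing whitespace) are assumptions made for illustration, and the genetic programming component is not shown.

```python
import hashlib


def normalize(record: str) -> str:
    # Assumed normalization (not from the paper): lowercase the text
    # and collapse runs of whitespace into single spaces.
    return " ".join(record.lower().split())


def fingerprint(record: str, algorithm: str = "md5") -> str:
    # Hash the normalized record with MD5 or SHA-1 and return the hex digest.
    data = normalize(record).encode("utf-8")
    return hashlib.new(algorithm, data).hexdigest()


def find_replicas(records, algorithm="md5"):
    # Return (first_index, duplicate_index) pairs for records whose
    # fingerprints collide with an earlier record.
    seen = {}
    replicas = []
    for index, record in enumerate(records):
        digest = fingerprint(record, algorithm)
        if digest in seen:
            replicas.append((seen[digest], index))
        else:
            seen[digest] = index
    return replicas
```

Calling `find_replicas(records, "sha1")` switches the digest to SHA-1. A sketch like this only flags records that are identical after normalization; records that differ in any other way produce unrelated digests, which is presumably where a learned similarity function would be needed.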