Provided by: International Journal of Engineering Research and Applications (IJERA)
Topic: Data Management
Date Added: Jan 2014
As the volume of information available in digital libraries grows, most systems may be affected by the existence of replicas in their warehouses. Removing these replicas matters because a clean, replica-free warehouse not only allows the retrieval of higher-quality information but also leads to more concise data and reduces the computational time and resources needed to process it. Here, the authors propose a genetic programming approach combined with hash-based similarity, i.e., using the MD5 and SHA-1 algorithms.
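
As a rough illustration of the hash-based part only, the sketch below fingerprints each record with MD5 and SHA-1 and groups records whose fingerprints match. It is not the authors' method: the genetic programming component is not shown, and the function names (`normalize`, `fingerprints`, `group_replicas`), the sample records, and the normalization rule (lowercasing and collapsing whitespace) are assumptions made for this example.

```python
import hashlib
from collections import defaultdict


def normalize(record):
    # Assumed normalization (not from the paper): lowercase the text
    # and collapse runs of whitespace into single spaces.
    return " ".join(record.lower().split())


def fingerprints(record):
    # Return the (MD5, SHA-1) hex digests of the normalized record.
    data = normalize(record).encode("utf-8")
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()


def group_replicas(records):
    # Map each fingerprint pair to the indices of the records that share it,
    # then keep only the groups that contain more than one record.
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[fingerprints(record)].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]


# Hypothetical sample records, invented for this example.
records = [
    "Genetic Programming for Record Deduplication",
    "genetic  programming for record   deduplication",
    "A Different Title",
]
print(group_replicas(records))  # [[0, 1]]
```

Matching digests only catch records that are identical after normalization. Near-duplicates with small textual differences produce unrelated digests, so they would have to be handled by another component.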