IJCTT-International Journal of Computer Trends and Technology
Many digital libraries and e-commerce websites contain duplicate content. Several systems have been proposed for removing replicated or duplicate items, and earlier approaches have been implemented across different repositories to detect duplicate records. These approaches can provide organized, alignment-based results, and they detect near-duplicate and range-based matches while reducing computation cost and time. However, their results do not contain quality data. To improve the data quality of digital libraries, new approaches are implemented in the present system.