An Optimized Approach of Modified BAT Algorithm to Record Deduplication

Provided by: International Journal of Computer Applications
Topic: Big Data
Format: PDF
Record de-duplication is the task of identifying, in a data warehouse, records that refer to the same real-world entity despite misspelled words, typos, different writing styles, or even different schema representations or data types. Existing research proposed a Genetic Programming (GP) approach to record de-duplication. That approach combines several different pieces of evidence extracted from the data content to generate a de-duplication function that can identify whether two or more entries in a repository are duplicates.
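
The idea of a de-duplication function that combines several pieces of evidence can be illustrated with a minimal sketch. This is not the GP approach or the modified BAT algorithm from the paper: the field names, weights, and threshold below are hypothetical, fixed by hand, and the string similarity comes from Python's standard `difflib` module. An optimizer such as GP would search for a good combination instead of hard-coding one.

```python
from difflib import SequenceMatcher

# Hypothetical field weights and decision threshold, chosen by hand for
# illustration only. They are assumptions, not values from the paper.
WEIGHTS = {"name": 0.5, "city": 0.2, "phone": 0.3}
THRESHOLD = 0.8


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def evidence(rec_a: dict, rec_b: dict) -> dict:
    """One piece of evidence per field: how similar the two values are."""
    return {field: similarity(rec_a[field], rec_b[field]) for field in WEIGHTS}


def is_duplicate(rec_a: dict, rec_b: dict) -> bool:
    """Combine the per-field evidence into a single weighted score and
    compare it against the threshold."""
    ev = evidence(rec_a, rec_b)
    score = sum(WEIGHTS[field] * ev[field] for field in WEIGHTS)
    return score >= THRESHOLD


if __name__ == "__main__":
    a = {"name": "John Smith", "city": "Boston", "phone": "555-1234"}
    b = {"name": "Jon Smith", "city": "Boston", "phone": "555-1234"}
    c = {"name": "Xyz Qrw", "city": "Lima", "phone": "000-9876"}
    print(is_duplicate(a, b))  # misspelled name, otherwise identical
    print(is_duplicate(a, c))  # unrelated record
```

A weighted sum is only one possible way to combine evidence; it is used here because it keeps the decision rule easy to read.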