Constraint Based Record Matching
Record matching, which identifies the records that represent the same real-world entity, is an important step in data integration. Most state-of-the-art record matching methods are supervised, including UDD, which is described as, and expected to be, unsupervised. In the web database scenario, the records to be matched are query results, which are generated dynamically on the fly. Such records are query dependent, so a pre-learned method trained on examples from previous query results may fail on the results of a new query. Thus a completely unsupervised system, Constraint Based Record Matching (CBRM), is proposed, in which the FP (frequent pattern) tree is used. In the CBRM technique, the authors prune the growth of the FP tree based on constraints specified by the user.
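
The idea of pruning FP tree growth with user constraints can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say what form the constraints take, so the sketch assumes two hypothetical ones, a minimum support count (min_support) and an item-level predicate (allowed). The names FPNode, build_constrained_fp_tree and paths, and the "field=value" encoding of record attributes, are likewise assumptions made for illustration.

```python
from collections import Counter


class FPNode:
    """One node of a frequent-pattern (FP) tree."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_constrained_fp_tree(records, min_support, allowed):
    """Build an FP tree, never inserting items that violate the constraints.

    records: iterable of iterables of string items, e.g. "field=value" tokens
    min_support: minimum number of records an item must occur in
    allowed: hypothetical user constraint, a predicate item -> bool
    """
    records = [set(r) for r in records]
    # First pass: count in how many records each item occurs.
    support = Counter(item for r in records for item in r)
    # Keep only items that are frequent AND satisfy the user constraint.
    keep = {i for i, c in support.items() if c >= min_support and allowed(i)}

    root = FPNode(None)
    # Second pass: insert each record's surviving items as one path.
    for r in records:
        # Order by descending support, ties broken alphabetically.
        items = sorted((i for i in r if i in keep), key=lambda i: (-support[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def paths(node, prefix=()):
    """Yield (path, count) for every node below `node`, in sorted order."""
    for item in sorted(node.children):
        child = node.children[item]
        path = prefix + (item,)
        yield path, child.count
        yield from paths(child, path)


# Made-up query results, purely for illustration.
records = [
    ["title=dune", "author=herbert", "price=9.99"],
    ["title=dune", "author=herbert", "price=12.50"],
    ["title=emma", "author=austen", "price=9.99"],
]
# Hypothetical constraint: ignore price fields when matching.
tree = build_constrained_fp_tree(
    records, min_support=2, allowed=lambda item: not item.startswith("price=")
)
for path, count in paths(tree):
    print(path, count)
```

In this toy run, "price=9.99" reaches the support threshold but is excluded by the constraint, so the tree contains only the single branch author=herbert, title=dune, shared by the first two records. Filtering before insertion keeps the tree from growing along branches the user has ruled out, which is one plausible reading of constraint-based pruning.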