Sources Selection Methodology for Hidden Web Data Integration

Date Added: Aug 2009
Format: PDF

In internet-scale hidden web data integration, selecting sources (web databases) is a primary challenge. This paper proposes a novel approach to web database selection for this setting. The approach is based on a benefit function that estimates how much benefit a web database would bring to the integration system, in its current state, if it were integrated. With the estimated benefits, web databases can be selected iteratively. Preliminary results show that the technique provides an effective mechanism for selecting and integrating web databases.
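
The abstract does not define the benefit function, so the following is only a minimal sketch of benefit-driven iterative selection under stated assumptions. It assumes each web database can be summarized as a set of record identifiers and that benefit is the number of records a candidate would add to what is already integrated. The names `marginal_benefit` and `select_sources`, the `budget` parameter, and the stopping rule are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def marginal_benefit(candidate: Set[str], integrated: Set[str]) -> int:
    """Assumed benefit: how many records the candidate adds beyond the current state."""
    return len(candidate - integrated)


def select_sources(sources: Dict[str, Set[str]], budget: int) -> List[str]:
    """Greedily pick up to `budget` sources, re-estimating benefit after each pick."""
    integrated: Set[str] = set()   # records already covered by the integration system
    chosen: List[str] = []         # selected source names, in selection order
    remaining = dict(sources)      # copy so the caller's mapping is not modified

    while remaining and len(chosen) < budget:
        # Iterate names in sorted order so ties resolve alphabetically.
        best = max(
            sorted(remaining),
            key=lambda name: marginal_benefit(remaining[name], integrated),
        )
        # Stop early when no remaining source adds anything new (assumed rule).
        if marginal_benefit(remaining[best], integrated) == 0:
            break
        chosen.append(best)
        integrated |= remaining.pop(best)

    return chosen


if __name__ == "__main__":
    example = {
        "a": {"r1", "r2", "r3"},
        "b": {"r3", "r4"},
        "c": {"r1", "r2"},
    }
    print(select_sources(example, budget=3))
```

In this toy input, source `c` is never selected because everything it holds is already covered once `a` is integrated, which illustrates why benefit is measured against the current state of the system rather than against each source in isolation.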