Similarity Based Web Data Extraction and Integration System for Web Content Mining
The Internet is a major source of the information that users need. However, information on the web cannot readily be analyzed and queried according to user requests. The authors propose and develop a similarity-based Web Data Extraction and Integration System (WDES and WDICS) that extracts search result pages from the web and integrates their contents so that the user can perform the intended analysis. The system also replicates search result pages locally, in a form convenient for offline browsing.