Semantic Focused Crawling for Retrieving ECommerce Information

Executive Summary

Focused crawling selectively seeks out pages that are relevant to a predefined set of topics without downloading every page on the Web. With the rapid growth of e-commerce, discovering specific information, such as details about buyers, sellers, and products, in a form suited to online business users has become a key issue for search engines. The paper presents a novel semantic approach to building an intelligent focused crawler, which evaluates a page's content relevance to the e-commerce topic using a domain ontology, and evaluates its hyperlink connections to commercial web pages using link analysis. During crawling, the domain ontology can evolve automatically through machine learning based on statistics and rules.
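
The general idea of a focused crawler can be illustrated with a minimal sketch. It is not the paper's implementation: the ontology terms and weights, the in-memory `WEB` graph, the threshold, and the function names `content_relevance` and `focused_crawl` are all hypothetical. Link analysis is reduced here to a simple heuristic in which a link inherits the relevance score of the page that cites it, and ontology evolution is left out.

```python
import heapq

# Hypothetical toy ontology: concept term -> weight (assumed, not from the paper).
ONTOLOGY = {"buyer": 1.0, "seller": 1.0, "product": 0.8, "price": 0.6}

# Hypothetical in-memory "web": url -> (page text, outgoing links).
WEB = {
    "a": ("seller lists product price", ["b", "c"]),
    "b": ("buyer reviews product", ["d"]),
    "c": ("weather forecast for today", ["e"]),
    "d": ("seller and buyer agree on price", []),
    "e": ("sports results", []),
}


def content_relevance(text, ontology=ONTOLOGY):
    """Weight of ontology terms found in the text, normalized to [0, 1]."""
    words = set(text.lower().split())
    matched = sum(weight for term, weight in ontology.items() if term in words)
    return matched / sum(ontology.values())


def focused_crawl(seed, web=WEB, threshold=0.2, max_pages=10):
    """Best-first crawl that only follows links found on relevant pages."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    relevant = []
    while frontier and len(relevant) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = web[url]
        score = content_relevance(text)
        if score < threshold:
            continue  # off-topic page: its links are never queued
        relevant.append(url)
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                # Simplified link analysis: a link inherits its parent's score.
                heapq.heappush(frontier, (-score, link))
    return relevant


if __name__ == "__main__":
    print(focused_crawl("a"))
```

In this toy graph the crawl starting at "a" visits "c", finds it off-topic, and therefore never fetches "e", which is the saving a focused crawler aims for over an exhaustive crawl.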

  • Format: PDF
  • Size: 776.1 KB