An Approach for Identifying URLs Based on Division Score and Link Score in Focused Crawler

Date Added: May 2010
Format: PDF

The rapid growth of the World Wide Web (WWW) poses unprecedented scaling challenges for general-purpose crawlers. Crawlers are programs that traverse the Web and retrieve pages by following hyperlinks. The focused crawler of a special-purpose search engine aims to selectively seek out and collect pages that are relevant to a pre-defined set of topics, rather than to explore all regions of the Web. Maintaining the currency of search engine indices by exhaustive crawling is rapidly becoming impossible due to the increasing size of the Web.
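
To make the contrast with exhaustive crawling concrete, the following is a minimal sketch of a generic best-first focused crawler. It is not the paper's method: the division score and link score named in the title are not defined in this abstract, so a simple keyword-fraction score stands in for them. The names `relevance`, `focused_crawl`, `pages`, `topic_terms`, and `threshold` are hypothetical, and the "Web" is assumed to be a small in-memory dictionary rather than live HTTP.

```python
import heapq


def relevance(text, topic_terms):
    """Fraction of words in `text` that are topic terms.

    A stand-in scoring rule (assumption), not the paper's division or link score.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in topic_terms for w in words) / len(words)


def focused_crawl(pages, seed, topic_terms, threshold=0.2, max_pages=10):
    """Best-first crawl over `pages`, a dict mapping url -> (text, outlinks).

    Pages scoring below `threshold` are neither collected nor expanded,
    so the crawl never explores regions reachable only through them.
    """
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    collected = []
    while frontier and len(collected) < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue  # dangling link in the toy graph
        text, outlinks = pages[url]
        score = relevance(text, topic_terms)
        if score < threshold:
            continue  # off-topic page: skip it and its outlinks
        collected.append(url)
        for link in outlinks:
            if link not in seen:
                seen.add(link)
                # An unvisited link inherits the score of the page that cites it.
                heapq.heappush(frontier, (-score, link))
    return collected


# Toy link graph (hypothetical data).
pages = {
    "a": ("web crawler search engine", ["b", "c"]),
    "b": ("cooking recipes pasta sauce", ["d"]),
    "c": ("focused crawler topic search", ["e"]),
    "d": ("crawler crawler crawler", []),
    "e": ("search engine index crawler", []),
}
topic_terms = {"crawler", "search", "engine", "index"}
print(focused_crawl(pages, "a", topic_terms))
```

In this toy graph the on-topic page "d" is never fetched because it is reachable only through the off-topic page "b". That is the trade-off a focused crawler accepts in exchange for not traversing the whole graph.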