Study of Webcrawler: Implementation of Efficient and Fast Crawler

A focused crawler is a web crawler that attempts to download only web pages relevant to a pre-defined topic or set of topics. Focused crawling also assumes that some labeled examples of relevant and irrelevant pages are available. The topic can be represented by a set of keywords (the authors call them seed keywords) or by example URLs. The key to designing an efficient focused crawler is deciding whether a web page is relevant to the topic. The paper defines several relevance computation strategies and provides an empirical evaluation that has shown promising results. The authors also developed a framework to evaluate topical crawling algorithms fairly under a number of performance metrics.
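The abstract does not say which relevance computation strategies the paper uses, so the following is only a minimal sketch of the general idea, assuming cosine similarity between a page's term counts and the seed keywords as the relevance measure. The names `tokenize`, `relevance`, and `Frontier` are hypothetical and chosen for illustration; fetching and link extraction are left out.

```python
import heapq
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance(page_text, seed_keywords):
    """Cosine similarity between page term counts and the seed keywords.

    Assumption: this measure stands in for the paper's strategies,
    which the abstract does not describe.
    """
    page_counts = Counter(tokenize(page_text))
    topic_counts = Counter(tokenize(" ".join(seed_keywords)))
    dot = sum(page_counts[t] * w for t, w in topic_counts.items())
    page_norm = math.sqrt(sum(c * c for c in page_counts.values()))
    topic_norm = math.sqrt(sum(w * w for w in topic_counts.values()))
    if page_norm == 0 or topic_norm == 0:
        return 0.0
    return dot / (page_norm * topic_norm)


class Frontier:
    """Priority queue of URLs; the highest-scoring URL is popped first."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, url, score):
        # Skip URLs that were already queued so each is visited at most once.
        if url not in self._seen:
            self._seen.add(url)
            # heapq is a min-heap, so store the negated score.
            heapq.heappush(self._heap, (-score, url))

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)
```

In a crawl loop under these assumptions, each fetched page would be scored with `relevance`, and its outgoing links pushed onto the `Frontier` with that score, so links found on on-topic pages are visited before links found on off-topic ones.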

Provided by: IOSR Journal of Engineering
Topic: Software
Date Added: Dec 2012
Format: PDF
