Realizing Peer-to-Peer and Distributed Web Crawler

The tremendous growth of the World Wide Web has made tools such as search engines and information retrieval systems essential. In this dissertation, the authors propose a fully distributed, peer-to-peer architecture for web crawling. The main goal behind the development of such a system is to provide an alternative yet efficient, easily implementable, and decentralized system for crawling, indexing, caching, and querying web pages. The main function of a web crawler is to recursively visit web pages, extract all URLs from each page, parse the page for keywords, and visit the extracted URLs in turn.
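The crawl loop described above (visit a page, extract its URLs, enqueue them, repeat) can be sketched as a minimal breadth-first crawler. This is an illustrative sketch, not the paper's implementation: the `fetch` callable is a hypothetical parameter standing in for a real HTTP client (e.g. `urllib.request`), and keyword parsing is omitted for brevity.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl starting from `seed`.

    `fetch(url)` must return the page's HTML as a string; in a real
    crawler this would be an HTTP request. Returns the set of URLs
    visited, bounded by `max_pages`.
    """
    visited = set()
    frontier = deque([seed])
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = fetch(url)
        except Exception:
            continue  # unreachable page: skip and move on
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)  # resolve relative links
            if absolute not in visited:
                frontier.append(absolute)
    return visited
```

In the peer-to-peer setting the paper targets, the single `frontier` queue would instead be partitioned across peers (for example by hashing each URL to a responsible node), so that no central coordinator is required.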

Provided by: International Journal of Advanced Research in Computer Engineering & Technology Topic: Collaboration Date Added: Jun 2012 Format: PDF