"Design and Implementation of Scalable, Fully Distributed Web Crawler for a Web Search Engine"

Provided by: International Journal of Computer Applications
Topic: Software
Format: PDF
"The web is a context in which traditional Information Retrieval (IR) methods are challenged. Given the volume of the web and its speed of change, the coverage of modern web search engines is relatively small. Search engines attempt to crawl the web exhaustively with crawlers for new pages, and to keep track of changes made to pages visited earlier. The centralized design of crawlers introduces limitations in the design of search engines. It has been recognized that as the size of the web grows, it is imperative to parallelize the crawling process. Content other than standard documents (multimedia content, databases, etc.) also makes searching harder, since such content is not visible to traditional crawlers."
