Web Crawler - An Overview
A Web crawler is a computer program that browses the World Wide Web in a methodical, automated manner. Web crawling is an important method for collecting data on, and keeping up with, the rapidly expanding Internet: a vast number of web pages are added every day, and existing information is constantly changing. This paper presents an overview of the various types of Web crawlers and of the policies that govern them, namely selection, revisit, politeness, and parallelization. The behavioral pattern of a Web crawler under these policies is also studied, as is the evolution of Web crawlers from the basic general-purpose crawler to the latest adaptive crawler.
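To illustrate what browsing the Web "in a methodical, automated manner" can look like, the following is a minimal Python sketch of a breadth-first crawler with a simple selection policy (stay on the seed's host). It is not taken from any crawler discussed in this paper. The names `crawl`, `LinkExtractor`, `fetch`, and `PAGES`, and the `example.test` URLs, are assumptions made for illustration. A small in-memory dictionary stands in for real HTTP requests so that the sketch runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href values of <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=10, same_host_only=True):
    """Breadth-first crawl starting at `seed`.

    `fetch` is a hypothetical callable mapping a URL to an HTML string,
    or to None when the page cannot be retrieved.
    """
    frontier = deque([seed])   # URLs waiting to be visited, in FIFO order
    seen = {seed}              # URLs already queued, to avoid duplicates
    visited = []               # URLs fetched successfully, in visit order
    host = urlparse(seed).netloc

    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            # Selection policy (assumed here): stay on the seed's host.
            if same_host_only and urlparse(link).netloc != host:
                continue
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Tiny in-memory "web" used in place of real network access.
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="http://other.test/">X</a>',
    "http://example.test/b": '<a href="/">home</a>',
}

order = crawl("http://example.test/", PAGES.get)
print(order)
```

The sketch covers only the selection policy. A crawler running against live sites would also need a politeness policy (for example, a delay between requests to the same host and a robots.txt check), a revisit policy for pages that change, and some form of parallelization. These are omitted to keep the example short.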