International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)
The World Wide Web (WWW) is a collection of text documents, images, multimedia, and other resources that are linked by URLs and hyperlinks and are usually accessed through web servers. According to one estimate, the WWW contains more than 2000 billion visible pages. Because of this large number of pages, search engines depend on web crawlers to create and maintain indices of web pages. A web crawler is a program that, given one or more seed URLs, downloads the web pages associated with these URLs, extracts any hyperlinks present in them, and iteratively continues to download the web pages identified by these hyperlinks.
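The crawl loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production crawler: the names `LinkExtractor`, `extract_links`, `crawl`, and `FAKE_WEB` are hypothetical, the `fetch` function is supplied by the caller, and the demonstration uses a small in-memory "web" in place of real HTTP requests. Concerns such as robots.txt handling, politeness delays, and URL normalization are deliberately left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs for the hyperlinks found in html."""
    parser = LinkExtractor()
    parser.feed(html)
    # Resolve relative links against the page URL and drop #fragments,
    # so that the same page is not queued under several names.
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl starting from seed_urls.

    fetch(url) is an assumed caller-supplied function that returns the
    page's HTML as a string, or None if the page cannot be downloaded.
    Returns a dict mapping each downloaded URL to its HTML.
    """
    seeds = list(dict.fromkeys(seed_urls))  # de-duplicate, keep order
    frontier = deque(seeds)                 # URLs waiting to be downloaded
    seen = set(seeds)                       # URLs already queued
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages


# Hypothetical in-memory stand-in for the web, used instead of real HTTP.
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.test/b": '<a href="/missing">gone</a>',
}

pages = crawl(["http://example.test/"], FAKE_WEB.get)
```

Passing `fetch` as a parameter keeps the loop independent of any particular download mechanism; a real crawler would substitute a function that issues HTTP requests. The `seen` set is what prevents pages that link to each other from being downloaded repeatedly, and `max_pages` bounds the crawl, since the link graph of the real web is effectively unbounded.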