International Journal of Emerging Science and Engineering (IJESE)
The World Wide Web (WWW) is a large, dynamic network and a repository of interconnected documents and other resources, linked by hyperlinks and URLs. Search engines use web crawlers to recursively traverse and download web pages in order to create and maintain their web indices. Moreover, the need to keep pages up to date causes crawlers to traverse websites repeatedly. As a result, resources such as CPU cycles, disk space, and network bandwidth become overloaded, which may crash the website and increase web traffic. However, websites can limit crawlers through the Robots Exclusion Protocol.
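As an illustration of how a crawler might honor the Robots Exclusion Protocol mentioned above, the following minimal sketch uses Python's standard `urllib.robotparser` module. The robots.txt rules, the crawler name `ExampleBot`, and the `example.com` URLs are hypothetical and chosen only for illustration; they are not taken from this work.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a made-up user agent; it falls under the "*" rules.
allowed = parser.can_fetch("ExampleBot", "https://example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/data.html")

# Crawl-delay is a widely used but non-standard directive; a polite
# crawler could wait this many seconds between successive requests.
delay = parser.crawl_delay("ExampleBot")

print(allowed, blocked, delay)
```

A polite crawler would typically consult such a parser before every request and skip any URL for which `can_fetch` returns `False`, which reduces the load it places on the site.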