Improved Focused Crawler Using Inverted WAH Bitmap Index
Focused Crawlers are software which can traverse the internet and retrieve web pages by hyperlinks according to specific topic. The traditional web crawlers cannot function well to retrieve the relevant pages effectively. The focused crawler is a special-purpose search engine which aims to selectively seek out pages that are relevant. The main characteristic of focused crawling is that the crawler does not need to collect all web pages, but selects and retrieves only the relevant pages. So the major problem is how to retrieve the maximal set of relevant and quality pages. To address this problem, the authors have designed an Interactive focused crawler which calculates the relevancy of web page.