Analyzing Fine-Grained Hypertext Features for Enhanced Crawling and Topic Distillation
Source: Indian Institute of Technology Bombay
Early Web search engines closely resembled Information Retrieval (IR) systems, which had matured over several decades. Around 1996 to 1999, it became clear that the spontaneous formation of hyperlink communities in the Web graph had much to offer Web search, leading to a flurry of research on hyperlink-based ranking of query responses. In this paper, the authors show that, over and above inter-page hyperlinks, much semantic information can be teased out of the manner in which markup tags, such as menu-bars, tables, and lists, are used to organize pages, and of the context in which hyperlinks are made from one page to another.
Size: 149.10
Date: Jan 2011



