Analyzing Fine-Grained Hypertext Features for Enhanced Crawling and Topic Distillation

Early Web search engines closely resembled Information Retrieval (IR) systems, which had matured over several decades. Around 1996 to 1999, it became clear that the spontaneous formation of hyperlink communities in the Web graph had much to offer Web search, leading to a flurry of research on hyperlink-based ranking of query responses. In this paper, the authors show that, over and above inter-page hyperlinks, much semantic information can be teased out of two further sources: the manner in which markup tags, such as those for menu bars, tables, and lists, are used to organize pages, and the context in which hyperlinks are made from one page to another.

Provided by: Indian Institute of Technology Bombay
Topic: Networking
Date Added: Jan 2011
Format: PDF
