Building Enriched Web Page Representations using Link Paths
Anchor text has long been used to enrich document representations for a variety of tasks on the World Wide Web (WWW). Anchor texts are useful because they resemble typical web queries and because they express the document's context. It is therefore common practice for web search engines to incorporate incoming anchor text into a document's standard textual representation. However, this approach does not suffice for documents with very few inlinks, and it does not capture the document's full context. To remedy these problems, the authors employ link paths, which contain anchor texts from paths through the web ending at the document in question.
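The idea of a link path can be illustrated with a minimal sketch. It is not the authors' implementation: the toy link graph, the function name `link_path_anchor_texts`, and the `max_hops` cutoff are all assumptions made for illustration. The sketch walks backwards from a target page and gathers the anchor text of every link that lies on a path of at most `max_hops` links ending at that page, so that a page with a single inlink can still borrow context from pages further upstream.

```python
from collections import defaultdict

# Hypothetical toy link graph: (source_url, target_url, anchor_text).
LINKS = [
    ("a.example", "b.example", "research groups"),
    ("b.example", "c.example", "information retrieval lab"),
    ("d.example", "c.example", "IR lab"),
    ("c.example", "e.example", "publications"),
]


def link_path_anchor_texts(links, target, max_hops=2):
    """Collect anchor texts of links on paths of at most max_hops links ending at target.

    Returns (hops_from_target, anchor_text) pairs, where hop 1 is a direct
    inlink. The max_hops cutoff is an illustrative assumption.
    """
    # Index the graph by destination so it can be walked backwards.
    inlinks = defaultdict(list)
    for source, dest, anchor in links:
        inlinks[dest].append((source, anchor))

    collected = []
    frontier = [target]
    visited = {target}  # guards against cycles in the link graph
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for page in frontier:
            for source, anchor in inlinks[page]:
                collected.append((hop, anchor))
                if source not in visited:
                    visited.add(source)
                    next_frontier.append(source)
        frontier = next_frontier
    return collected


# Direct inlinks only (the standard anchor-text representation).
direct = link_path_anchor_texts(LINKS, "e.example", max_hops=1)
# Link paths of up to two links ending at the same page.
enriched = link_path_anchor_texts(LINKS, "e.example", max_hops=2)
```

Here `e.example` has only one inlink, so `direct` holds a single anchor text, while `enriched` also picks up the two anchor texts pointing at `c.example`. How anchor texts at different distances should be weighted or combined is left open in this sketch.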