Software

Using Clustering and Blade Clusters in the TeraByte Task


Executive Summary

Web search engines exploit conjunctive queries and special ranking criteria, which differ from the disjunctive queries typically used for ad-hoc retrieval. The authors wanted to assess the effectiveness of those techniques in the TeraByte task, in particular scoring criteria such as link popularity, proximity boosting, home page score, descriptions, and anchor text. Since conjunctive queries sometimes produce low recall, they tested a new approach to query expansion, which extracts additional query terms from a clustering of the snippets returned by the first query. The technique proved effective, almost doubling the Mean Average Precision. However, the improvement was just enough to compensate for the drop that the proximity boost, contrary to expectations, introduced.
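The query-expansion idea can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say which clustering algorithm or term-selection rule they used, so the greedy Jaccard clustering, the stopword list, the similarity threshold, and every function name below are assumptions chosen only to keep the example small and deterministic.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "is", "on", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_snippets(snippets, threshold=0.2):
    """Greedy single-pass clustering (a stand-in for the real algorithm).

    Each snippet joins the first cluster whose seed snippet is similar
    enough; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of token sets; element 0 is the seed
    for snippet in snippets:
        tokens = set(tokenize(snippet))
        for cluster in clusters:
            if jaccard(tokens, cluster[0]) >= threshold:
                cluster.append(tokens)
                break
        else:
            clusters.append([tokens])
    return clusters


def expand_query(query, snippets, extra_terms=2, threshold=0.2):
    """Add the most frequent non-query terms of the largest snippet cluster."""
    query_terms = tokenize(query)
    clusters = cluster_snippets(snippets, threshold)
    if not clusters:
        return query_terms
    largest = max(clusters, key=len)
    counts = Counter(
        t for tokens in largest for t in tokens if t not in query_terms
    )
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return query_terms + [term for term, _ in ranked[:extra_terms]]


snippets = [
    "Jaguar car dealers and jaguar car prices",
    "Used jaguar car prices from dealers",
    "Jaguar habitat in the rainforest",
]
expanded = expand_query("jaguar", snippets)
```

In this toy run the two car-related snippets form the largest cluster, so the expanded query gains terms from that cluster. A real system would issue the expanded query as a second retrieval pass to recover documents that the strict conjunctive first query missed.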

  • Format: PDF
  • Size: 155.7 KB