Data Centers

Improving the Performance of List Intersection

Date Added: Aug 2009
Format: PDF

List intersection is a central operation, utilized excessively for query processing on text and databases. The authors present list intersection algorithms for an arbitrary number of sorted and unsorted lists tailored to the characteristics of modern hardware architectures. Two new list intersection algorithms are presented for sorted lists. The first algorithm, termed Dynamic Probes, dynamically decides the probing order on the lists exploiting information from previous probes at runtime. This information is utilized as a cache-resident micro-index. The second algorithm, termed Quantile-based, deduces in advance a good probing order, thus avoiding the overhead of adaptivity and is based on detecting lists with non-uniform distribution of document identifiers. For unsorted lists, they present a novel hash-based algorithm that avoids the overhead of sorting.