Effect of Inverted Index Partitioning Schemes on Performance of Query Processing in Parallel Text Retrieval Systems

Shared-nothing parallel text retrieval systems require the inverted index that represents a document collection to be partitioned among a number of processors. In general, the index can be partitioned by either the terms or the documents in the collection, and the choice of partitioning greatly affects the query processing performance of the parallel system. In this paper, the authors investigate the effect of these two index partitioning schemes on query processing. They conduct experiments on a 32-node PC cluster, considering the case where the index is stored entirely on disk.
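The difference between the two schemes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy documents, and the round-robin assignment of terms and documents to processors are all assumptions made for illustration, and the index is held in memory rather than on disk.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list of the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def partition_by_term(index, num_procs):
    """Term-based: each processor holds complete posting lists for a subset of terms.

    Assumption: terms are assigned round-robin in sorted order.
    """
    parts = [dict() for _ in range(num_procs)]
    for i, term in enumerate(sorted(index)):
        parts[i % num_procs][term] = index[term]
    return parts


def partition_by_document(index, num_procs):
    """Document-based: each processor holds postings only for its own documents.

    Assumption: document ids are integers assigned by doc_id modulo num_procs.
    """
    parts = [defaultdict(list) for _ in range(num_procs)]
    for term, doc_ids in index.items():
        for doc_id in doc_ids:
            parts[doc_id % num_procs][term].append(doc_id)
    return [dict(p) for p in parts]


# Hypothetical toy collection.
docs = {
    0: "parallel text retrieval",
    1: "inverted index partitioning",
    2: "parallel index",
}
index = build_inverted_index(docs)
term_parts = partition_by_term(index, 2)
doc_parts = partition_by_document(index, 2)
```

In this sketch, a query term under term-based partitioning is answered by the single processor that owns its posting list, while under document-based partitioning every processor may hold a fragment of that list and must be consulted.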

Provided by: Springer Healthcare
Topic: Big Data
Date Added: Aug 2006
Format: PDF
