A Novel Parallel Architecture Design of Information Retrieval System for Scientific Papers
Source: Science and Development Network (SciDev.Net)
Indexing converts a raw document collection into an easily searchable representation. Indexing at larger scale poses challenges, such as how to distribute the indexing computation efficiently across a cluster of nodes. The MapReduce framework can be an effective tool for parallelizing tasks such as inverted index construction. When searching the full contents of a document collection, scanning the documents one by one is inefficient because response time grows considerably. Larger collections are therefore usually scanned, analyzed, and indexed before any query is run against them, which greatly reduces search response time.
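To make the idea concrete, the following is a minimal single-process sketch of inverted index construction written in the map/shuffle/reduce style. It is not the architecture proposed in the paper: the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_index`, `search`), the tokenizer, and the in-memory shuffle are all assumptions made for illustration. On a real cluster, the framework would run the map and reduce steps on different nodes and perform the shuffle itself.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase runs of ASCII letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def map_phase(doc_id, text):
    # Map: emit one (term, doc_id) pair per distinct term in a document.
    return [(term, doc_id) for term in set(tokenize(text))]


def shuffle(pairs):
    # Shuffle: group emitted pairs by term (done by the framework on a cluster).
    grouped = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term, doc_ids):
    # Reduce: turn the grouped doc ids into a sorted posting list.
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    # docs is a hypothetical mapping of doc_id -> document text.
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text))
    grouped = shuffle(pairs)
    return dict(reduce_phase(term, ids) for term, ids in grouped.items())


def search(index, query):
    # Answer a conjunctive query by intersecting posting lists,
    # without rescanning any document.
    terms = tokenize(query)
    if not terms:
        return []
    postings = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings))
```

Once the index is built, a query touches only the posting lists of its terms, which is why indexing ahead of time shortens response time compared with scanning every document per query.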
| Size: | 259.62 |
| Date: | Apr 2012 |



