A Hybrid Approach for Estimating Document Frequencies in Unstructured P2P Networks

Scalable search and retrieval over numerous web document collections distributed across different sites can be achieved by adopting a peer-to-peer (P2P) communication model. Terms and their document frequencies are core components of text information retrieval and therefore need to be computed, aggregated, and distributed throughout the system. This is challenging in unstructured P2P networks, because a peer's local document collection may not accurately reflect the global collection, for example when documents are unevenly distributed across peers.
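To make the skew problem concrete, the following is a minimal sketch, not the hybrid estimation method the paper proposes. It assumes two hypothetical peers (`peer_a`, `peer_b`) holding disjoint, made-up documents, whitespace tokenization, and a smoothed IDF formula chosen here purely for illustration. It compares the inverse document frequency of one term as seen locally by each peer with the value obtained from exact, centralized aggregation.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # A set ensures a term is counted once per document.
        df.update(set(doc.lower().split()))
    return df


def idf(df, n_docs, term):
    """Smoothed inverse document frequency: log((N + 1) / (df + 1)).

    The smoothing is an illustrative assumption, not taken from the paper.
    """
    return math.log((n_docs + 1) / (df.get(term, 0) + 1))


# Hypothetical peers with topically skewed local collections.
peers = {
    "peer_a": ["jazz piano trio", "jazz guitar solo", "jazz drum fill"],
    "peer_b": ["database index scan", "query plan cache", "jazz festival news"],
}

# Each peer computes document frequencies over its own collection.
local_df = {name: document_frequencies(docs) for name, docs in peers.items()}

# Exact global statistics: sum the per-peer counts. This is only valid
# because the example assumes no document is replicated across peers.
global_df = sum(local_df.values(), Counter())
global_n = sum(len(docs) for docs in peers.values())

for name, docs in peers.items():
    print(name, "local idf(jazz) =", round(idf(local_df[name], len(docs), "jazz"), 3))
print("global idf(jazz) =", round(idf(global_df, global_n, "jazz"), 3))
```

In this toy setup one peer treats "jazz" as uninformative because it appears in every local document, while the other treats it as rare; the aggregated value falls between the two. In an unstructured network no single node can perform this exact summation cheaply, which is why the frequencies have to be estimated.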

Provided by: Norwegian University of Science and Technology
Topic: Data Centers
Date Added: Jul 2010
Format: PDF
