Science and Development Network (SciDev.Net)
Peer-to-peer (P2P) databases are becoming prevalent on the Internet for distributing and sharing documents, applications, and other digital media. Answering large-scale ad hoc analysis queries, such as aggregation queries, on these databases poses unique challenges. Exact solutions can be time-consuming and difficult to implement because of the distributed and dynamic nature of P2P databases. In this paper, the authors present novel sampling-based techniques for approximately answering ad hoc aggregation queries in such databases.
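To make the idea of sampling-based approximate aggregation concrete, the following is a minimal sketch and not the authors' technique. It assumes, purely for illustration, that the full list of peers is available and can be sampled uniformly at random, which a real P2P network generally does not allow. The names `estimate_sum` and `make_network` are hypothetical. The sketch estimates a filtered SUM by querying only a sample of peers and scaling the average local result by the number of peers.

```python
import random


def estimate_sum(peers, sample_size, predicate, rng):
    """Estimate the sum of all values satisfying `predicate` across `peers`.

    Assumption (illustrative only): peers can be sampled uniformly at
    random without replacement from a known list.
    """
    sampled = rng.sample(peers, sample_size)
    # Each sampled peer answers the aggregate over its own local data.
    local_sums = [sum(v for v in peer if predicate(v)) for peer in sampled]
    # Scale the mean local sum by the total number of peers.
    return len(peers) * sum(local_sums) / sample_size


def make_network(num_peers, rows_per_peer, rng):
    """Build a toy network: each peer holds a list of integer values."""
    return [
        [rng.randint(1, 100) for _ in range(rows_per_peer)]
        for _ in range(num_peers)
    ]


if __name__ == "__main__":
    rng = random.Random(42)
    peers = make_network(num_peers=1000, rows_per_peer=50, rng=rng)

    def over_50(v):
        return v > 50

    exact = sum(v for peer in peers for v in peer if over_50(v))
    approx = estimate_sum(peers, sample_size=100, predicate=over_50, rng=rng)
    print(f"exact={exact} approx={approx:.0f}")
```

Under the uniform-sampling assumption this estimator is unbiased, and its error shrinks as the sample grows. When every peer is sampled it reduces to the exact answer.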