Adaptively Parallelizing Distributed Range Queries

Source: VLDB Endowment

The authors consider how best to parallelize range queries in a massive-scale distributed database. Traditional systems have focused on maximizing parallelism, for example by laying out data to achieve the highest throughput. However, in a massive-scale database such as the authors' PNUTS system or BigTable, maximizing parallelism is not necessarily the best strategy. The system has more than enough servers to saturate a single client, returning results faster than the client can consume them. When multiple queries run concurrently, maximizing parallelism for all of them causes disk contention, which degrades performance for every query.
Format: PDF
Size: 227.20
Date: Aug 2009