Leveraging Cloud Computing in Geodatabase Management

Date Added: Jun 2010
Format: PDF

In this paper, the authors use cloud computing technologies to scale out data management in geographical databases. In particular, they tackle the problem of indexing data in parallel. First, spatial data is partitioned and indexed in a Hadoop MapReduce cluster. Two main partitioning strategies are evaluated: a linear-complexity method based on Z-order values and an iterative algorithm based on X-means clustering. The advantages and drawbacks of each method are weighed with respect to query performance. Second, interactive queries are processed from a local site using the index data structures built in the cloud. The authors perform an experimental study on a real dataset of 110 million spatial objects representing property parcels in the United States.
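To give a sense of what partitioning by Z-order values can look like, here is a minimal Python sketch. It is not the authors' implementation: the function names (`interleave_bits`, `z_value`, `partition_by_z`), the 16-bit grid resolution, the longitude/latitude scaling, and the equal-size split are all assumptions made for illustration. The sketch also sorts the points for simplicity, which costs O(n log n), whereas the abstract describes the paper's method as linear-complexity.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude); an assumed representation


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) value."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # bits of x land on even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # bits of y land on odd positions
    return z


def z_value(lon: float, lat: float, bits: int = 16) -> int:
    """Map a lon/lat pair onto an integer grid and return its Z-order value."""
    max_cell = (1 << bits) - 1
    gx = int((lon + 180.0) / 360.0 * max_cell)  # assumed scaling to grid columns
    gy = int((lat + 90.0) / 180.0 * max_cell)   # assumed scaling to grid rows
    return interleave_bits(gx, gy, bits)


def partition_by_z(points: List[Point], num_partitions: int,
                   bits: int = 16) -> List[List[Point]]:
    """Order points along the Z-curve and cut the sequence into contiguous chunks."""
    if not points:
        return []
    ordered = sorted(points, key=lambda p: z_value(p[0], p[1], bits))
    size = -(-len(ordered) // num_partitions)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]
```

Because points that are close on the Z-curve tend to be close in space, each chunk can then be indexed independently, for example by a separate worker in a MapReduce job. Calling `partition_by_z(parcels, 4)` on a list of parcel centroids (a hypothetical input) returns up to four lists of spatially nearby points.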