H2RDF: Adaptive Query Processing on RDF Data in the Cloud

Executive Summary

In this paper, the authors present H2RDF, a fully distributed RDF store that combines the MapReduce processing framework with a NoSQL distributed data store. The system has two distinguishing features that enable efficient processing of both simple and multi-join SPARQL queries over a virtually unlimited number of triples: join algorithms that execute joins according to query selectivity to reduce processing, and an adaptive choice between centralized and distributed (MapReduce-based) join execution for fast query responses. The system efficiently answers both simple joins and complex multivariate queries, and it scales to 3 billion triples on a small cluster of 9 worker nodes.
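
The two ideas in the summary, ordering joins by selectivity and choosing between centralized and distributed execution, can be illustrated with a minimal sketch. This is not H2RDF's actual cost model or API: the `TriplePattern` class, the `estimated_matches` counts, the `CENTRALIZED_LIMIT` threshold, and both function names are hypothetical, chosen only to show the shape of such a decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriplePattern:
    """One SPARQL triple pattern with a hypothetical estimated match count."""
    text: str
    estimated_matches: int


# Hypothetical cutoff: estimated inputs above this size use the distributed path.
CENTRALIZED_LIMIT = 100_000


def order_by_selectivity(patterns):
    """Put the most selective pattern (fewest estimated matches) first."""
    return sorted(patterns, key=lambda p: p.estimated_matches)


def choose_execution(patterns, limit=CENTRALIZED_LIMIT):
    """Return 'centralized' for small estimated inputs, otherwise 'mapreduce'."""
    total = sum(p.estimated_matches for p in patterns)
    return "centralized" if total <= limit else "mapreduce"


# Illustrative query with made-up cardinalities.
query = [
    TriplePattern("?x rdf:type ub:Student", 2_000_000),
    TriplePattern("?x ub:advisor ?y", 300_000),
    TriplePattern("?y ub:name 'Smith'", 12),
]

plan = order_by_selectivity(query)   # joins start from the smallest input
mode = choose_execution(query)       # large total input, so the distributed path
```

In this sketch a single highly selective pattern stays on the centralized path, while the full three-pattern query exceeds the assumed limit and is routed to MapReduce. A real system would base the choice on stored statistics and a cost estimate rather than a fixed threshold.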

  • Format: PDF
  • Size: 831.34 KB