In this paper, the authors present H2RDF, a fully distributed RDF store that combines the MapReduce processing framework with a NoSQL distributed data store. Their system has two distinguishing characteristics that enable efficient processing of both simple and multi-join SPARQL queries on a virtually unlimited number of triples: join algorithms that execute joins in order of query selectivity to reduce processing, and an adaptive choice between centralized and distributed (MapReduce-based) join execution for fast query responses. Their system efficiently answers both simple joins and complex multivariate queries and easily scales to 3 billion triples using a small cluster of 9 worker nodes.
- Format: PDF
- Size: 831.34 KB
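
The two ideas named in the abstract, selectivity-ordered joins and an adaptive choice between centralized and MapReduce-based execution, can be illustrated with a minimal Python sketch. This is not H2RDF's actual planner or cost model, which the abstract does not describe. The `TriplePattern` class, the `estimated_matches` field, the `CENTRALIZED_INPUT_LIMIT` threshold, the `ex:` predicates, and the match counts are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriplePattern:
    """One SPARQL triple pattern with an assumed cardinality estimate."""
    text: str
    estimated_matches: int  # hypothetical statistic, not from the paper


# Hypothetical cutoff: below this much join input, run on a single node.
CENTRALIZED_INPUT_LIMIT = 1_000_000


def order_by_selectivity(patterns):
    """Return patterns most selective first (fewest estimated matches)."""
    return sorted(patterns, key=lambda p: p.estimated_matches)


def choose_join_strategy(patterns, limit=CENTRALIZED_INPUT_LIMIT):
    """Pick 'centralized' for small estimated inputs, else 'mapreduce'."""
    total_input = sum(p.estimated_matches for p in patterns)
    return "centralized" if total_input <= limit else "mapreduce"


# Made-up patterns and estimates, for illustration only.
patterns = [
    TriplePattern("?x rdf:type ex:Student", 5_000_000),
    TriplePattern("?x ex:advisor ?y", 200_000),
    TriplePattern("?y ex:name 'Smith'", 12),
]

for p in order_by_selectivity(patterns):
    print(p.text, p.estimated_matches)
print(choose_join_strategy(patterns))
```

Under these assumed numbers, the most selective pattern is joined first, and the query as a whole exceeds the single-node cutoff. A query touching only the two smaller patterns would stay under it and run centrally.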