A Semantics-Oriented Storage Model for Big Heterogeneous RDF Data
The increasing availability of RDF data covering different domains is enabling ad-hoc integration of different kinds of data to suit varying needs. This usually results in large collections of data, such as the Billion Triple Challenge datasets or SNOMED CT, that are "Big" not just in volume but also in the variety of property and class types. However, the techniques used by most RDF data processing systems fail to scale adequately in these scenarios. One major reason is that the storage models adopted by most of these systems, e.g., vertical partitioning, do not align well with the semantic units in the data and queries.