Association for Computing Machinery
Flexible exploration of large RDF datasets with unknown relationships can be enabled using 'unbound-property' graph pattern queries. Relational-style processing of such queries using normalized relations results in redundant information in intermediate results, because the adjoining bound (fixed) properties are repeated. Such redundancy increases disk I/O, network transfer costs, and required disk space when processing RDF query workloads on MapReduce-based systems. This paper proposes packing and lazy unpacking strategies to minimize the redundancy in intermediate results while processing unbound-property queries.
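
The redundancy can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the paper's algorithm or its MapReduce implementation: the triples, the bound properties `label` and `type`, and the helper names `flat_results`, `packed_results`, and `unpack` are all hypothetical. The query is assumed to be a star pattern on one subject with two bound properties and one unbound property `?p`, which matches every property of the subject.

```python
from itertools import product

# Hypothetical toy data: (subject, property, object) triples.
TRIPLES = [
    ("gene1", "label", "BRCA1"),
    ("gene1", "type", "Gene"),
    ("gene1", "xref", "db:1"),
    ("gene1", "xref", "db:2"),
    ("gene1", "synonym", "RNF53"),
]

# Assumed bound (fixed) properties of the star pattern.
BOUND = ("label", "type")


def group_by_subject(triples):
    """Collect the (property, object) pairs of each subject."""
    groups = {}
    for s, p, o in triples:
        groups.setdefault(s, []).append((p, o))
    return groups


def bound_combinations(pairs, bound):
    """All combinations of objects for the bound properties (empty if one is missing)."""
    values = [[o for p, o in pairs if p == b] for b in bound]
    return list(product(*values)) if all(values) else []


def flat_results(triples, bound=BOUND):
    """Relational-style result: the bound values are repeated in every row."""
    rows = []
    for s, pairs in group_by_subject(triples).items():
        for combo in bound_combinations(pairs, bound):
            for p, o in pairs:  # each (p, o) is one match of the unbound property
                rows.append((s, *combo, p, o))
    return rows


def packed_results(triples, bound=BOUND):
    """Packed result: bound values stored once, unbound matches kept nested."""
    packed = []
    for s, pairs in group_by_subject(triples).items():
        for combo in bound_combinations(pairs, bound):
            packed.append((s, combo, tuple(pairs)))
    return packed


def unpack(packed):
    """Lazily expand packed records into flat rows, only when a consumer asks."""
    for s, combo, pairs in packed:
        for p, o in pairs:
            yield (s, *combo, p, o)


def flat_cells(rows):
    """Number of stored values in the flat representation."""
    return sum(len(row) for row in rows)


def packed_cells(packed):
    """Number of stored values in the packed representation."""
    return sum(1 + len(combo) + 2 * len(pairs) for _, combo, pairs in packed)
```

On this toy input the flat form produces five rows that each repeat the `label` and `type` values, while the packed form holds a single record. Iterating over `unpack(packed_results(TRIPLES))` recovers the same flat rows, so in this sketch the expansion can be deferred until a later operator actually needs individual rows.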