Association for Computing Machinery
Scalable processing of semantic web queries has become a critical need given the rapid growth in the availability of semantic web data. The MapReduce paradigm is emerging as a platform of choice for large-scale data processing and analytics due to its ease of use, cost-effectiveness, and potential for unlimited scaling. However, processing queries on semantic web triple models is a challenge on the mainstream MapReduce platform, Apache Hadoop, and on its extensions such as Pig and Hive. This is because such queries require numerous joins, which lead to lengthy and expensive MapReduce workflows.
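To make the join cost concrete, the following is a minimal sketch in plain Python, not Hadoop code, of how a query with three triple patterns might be evaluated as a chain of simulated reduce-side joins. The toy data, the helper names `match` and `mr_join`, and the naive plan that spends one MapReduce cycle per join are all assumptions made for illustration; actual Pig or Hive plans may group joins differently.

```python
from collections import defaultdict
from itertools import product

# Toy triple store of (subject, property, object); all data is hypothetical.
TRIPLES = [
    ("p1", "label", "Widget"),
    ("p1", "producer", "m1"),
    ("m1", "country", "DE"),
    ("p2", "label", "Gadget"),
]

def match(prop, s_var, o_var):
    """Select triples with the given property and bind them to variable names."""
    return [{s_var: s, o_var: o} for s, p, o in TRIPLES if p == prop]

def mr_join(left, right, var):
    """Simulate one MapReduce cycle performing a reduce-side join on `var`."""
    # Map and shuffle: group rows from each side by the join variable's value.
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[var]][0].append(row)
    for row in right:
        groups[row[var]][1].append(row)
    # Reduce: pair up left and right rows that share the same key.
    return [{**l, **r} for ls, rs in groups.values() for l, r in product(ls, rs)]

# Query: ?p label ?l . ?p producer ?m . ?m country ?c
patterns = [("label", "?p", "?l"), ("producer", "?p", "?m"), ("country", "?m", "?c")]
join_vars = ["?p", "?m"]  # variable shared with each subsequent pattern

result = match(*patterns[0])
cycles = 0
for pattern, var in zip(patterns[1:], join_vars):
    result = mr_join(result, match(*pattern), var)
    cycles += 1  # each join step is a separate cycle in this naive plan

print(cycles, result)
```

Under this naive plan, a query with n triple patterns needs n - 1 join cycles, and on a real cluster each cycle would add its own job startup, shuffle, and intermediate write to disk.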