Association for Computing Machinery
As data-intensive applications evolve, many research projects involving big data require efficient extraction and analysis of specific data subsets rather than the whole dataset. Social media data analysis is one such example. While social media platforms such as Twitter provide tremendous amounts of data about all kinds of social activities, most research analyses focus on specific social events, such as presidential elections or protests. To support such research use cases, the storage platform needs to provide not only a scalable solution for the overall large dataset, but also mechanisms for efficiently querying the target subsets and applying post-query analyses.