Navigating Big Data With High-Throughput, Energy-Efficient Data Partitioning
The global pool of data is growing by 2.5 quintillion bytes per day, and 90% of all existing data was produced in the last two years alone. The era of big data has clearly arrived. This paper explores the targeted deployment of hardware accelerators to improve the throughput and energy efficiency of large-scale data processing. In particular, data partitioning is a critical operation for manipulating large data sets: it is often the limiting factor in database performance and accounts for a significant fraction of the overall runtime of large data queries.