Performance Considerations of Data Acquisition in Hadoop System
Data have become increasingly important in recent years, especially for large companies, and mining useful information from these data is of great benefit. The oil and gas industry has to deal with vast amounts of data, both real-time and historical. Because the volume of data is so large, processing it is usually infeasible or very time-consuming. The authors' project investigates the use of Hadoop to solve this problem. To run Hadoop jobs, the data must first reside in the Hadoop file system, which raises the problem of data acquisition.