International Journal of Computer Science and Information Technologies
As more software moves to Data Analytics-as-a-Service (DAaaS), web applications have become ubiquitous, and log file analysis has become a necessary task for understanding client behavior. Log files are generated quickly, at a rate of 1-10 Mb/s per server, and a single data center can generate tens of terabytes of log data in a day. Analyzing datasets of this size requires a parallel processing system and a reliable data storage mechanism. A virtual database system is an effective solution for integrating the data, but it becomes inefficient for large datasets.