Analyzing Failures of a Semi-Structured Supercomputer Log File Efficiently by Using PIG on Hadoop

Provided by: International Journal of Computer Science and Engineering (IJCSE)
Topic: Data Management
Format: PDF
Data sets used to fuel the recently popular concept of 'business intelligence' are becoming increasingly large. Conventional database management software is no longer efficient enough; however, parallel database management systems and massive-scale data processing systems such as MapReduce look promising. Although MapReduce is a good option, it is difficult to work with, because the programmer has to think at the mapper and reducer level. In this paper, the authors present a simple yet efficient way to mine useful information, in which a program can be written as a series of steps.