Enhance the Efficiency of Clustering by Minimizing the Processing Time Using Hadoop MapReduce

Provided by: International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)
Topic: Big Data
Format: PDF
The volume of data grows day by day with the development of information technology. Extracting the required information from huge amounts of data is a complex and time-consuming process. Clustering can be considered the most important unsupervised learning technique in data mining. K-means clustering is a traditional and popular cluster analysis method in data mining, but it is not well suited to large volumes of unstructured data. Therefore, this paper proposes k-means clustering implemented with the Hadoop MapReduce technique. Hadoop MapReduce is a software framework for easily writing applications that process vast amounts of data in parallel on large clusters.
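
To illustrate how k-means might be split into map and reduce steps, here is a minimal single-process sketch in plain Python. It is not the paper's implementation and does not use the Hadoop API. The function names (nearest, map_phase, reduce_phase, kmeans), the squared Euclidean distance, and the stop-when-unchanged rule are all assumptions made for illustration. On a real cluster, the map step would run in parallel over input splits and the framework would group the emitted pairs by key before reducing.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map: emit (cluster_index, (point, 1)) for every input point."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum the points of each cluster and divide by the count."""
    sums = {}
    counts = defaultdict(int)
    for key, (point, n) in pairs:
        if key not in sums:
            sums[key] = list(point)
        else:
            sums[key] = [s + p for s, p in zip(sums[key], point)]
        counts[key] += n
    updated = list(centroids)  # a cluster that received no points keeps its old centroid
    for key, total in sums.items():
        updated[key] = tuple(s / counts[key] for s in total)
    return updated


def kmeans(points, centroids, max_iter=20):
    """Repeat map and reduce until the centroids stop changing (assumed stop rule)."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (9.0, 10.0)]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
```

Each iteration corresponds to one MapReduce job: the mapper only needs the current centroids and its own share of the points, which is what allows the assignment step to be distributed across machines.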
