International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE)
K-means is a well-known clustering algorithm in the field of data mining. It is simple to implement, and its speed allows it to run on large data sets. However, it also has a drawback. Advances in data collection techniques are generating enormous amounts of data, leaving scientists with the challenging task of processing it, and the performance of k-means is not sufficient when it has to deal with data sets of this scale. To solve this problem, this paper proposes a method in which k-means is implemented on the OpenCL heterogeneous computing platform with the help of the Hadoop MapReduce framework.
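To make the idea concrete, the following is a minimal sketch of how a single k-means iteration can be expressed as a map step and a reduce step. It is plain, single-process Python and uses neither Hadoop nor OpenCL; the function names (`nearest`, `map_phase`, `reduce_phase`, `kmeans_iteration`) are hypothetical and chosen only for illustration, and the sketch should not be read as the implementation proposed in this paper.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map step: emit one (cluster index, point) pair per input point."""
    for point in points:
        yield nearest(point, centroids), point


def reduce_phase(pairs, centroids):
    """Reduce step: average the points of each cluster to obtain new centroids."""
    groups = defaultdict(list)
    for index, point in pairs:
        groups[index].append(point)
    # A centroid that received no points keeps its previous position.
    updated = list(centroids)
    for index, members in groups.items():
        updated[index] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return updated


def kmeans_iteration(points, centroids):
    """Run one k-means iteration as a map step followed by a reduce step."""
    return reduce_phase(map_phase(points, centroids), centroids)
```

Repeating `kmeans_iteration` until the centroids stop changing yields the usual k-means loop. The per-point distance computation in the map step is independent for every point, which is the kind of data-parallel work that could plausibly be offloaded to an OpenCL device, while a framework such as Hadoop distributes the map and reduce steps across machines; how the work is actually divided is an assumption here and is not stated in the text above.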