International Journal of Computer Applications
Big data if used properly can bring huge benefits to the business, science and humanity. The various properties of big data like volume, velocity, variety, variation and veracity render the existing techniques of data analysis ineffective. Big data analysis needs fusion of techniques for data mining with those of machine learning. The k-means algorithm is one such algorithm which has presence in both the fields. This paper describes an approximate algorithm based on k-means. It is a novel method for big data analysis which is very fast, scalable and has high accuracy.