Provided by: Universal Insurance Managers, Inc.
Topic: Big Data
K-means clustering is a partitioning method that divides a set of n objects into k clusters so that intracluster similarity is high and intercluster similarity is low. K-means uses Euclidean distance to measure similarity between objects. It performs poorly when clusters differ in size or density or have non-globular shapes, and it is sensitive to outliers. Another problem with k-means is variable selection: k-means-type algorithms cannot select variables automatically because they treat all variables equally in the clustering process.
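
The partitioning procedure described above can be illustrated with a minimal sketch in Python. It assumes the standard iterative (Lloyd-style) formulation of k-means. The function names `euclidean` and `kmeans`, the `max_iter` parameter, the sample data, and the choice of the first k objects as starting centroids are all illustrative assumptions, not details taken from the source.

```python
import math


def euclidean(a, b):
    """Euclidean distance; every variable (coordinate) carries equal weight."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100):
    """Partition `points` (n objects) into k clusters.

    Assumption for illustration: the first k objects serve as the
    initial centroids. Random or smarter seeding is common in practice.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each object joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical data: two compact, well-separated groups of three objects.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Because `euclidean` sums the squared differences over every coordinate with the same weight, a noisy or irrelevant variable influences the assignment as much as an informative one. This is the equal-treatment limitation noted above. The mean-based update step is also why a single extreme outlier can drag a centroid away from the bulk of its cluster.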