A Parallel Clustering Method Study Based on MapReduce
Clustering is widely regarded as one of the most important tasks in data mining. The goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. Many practical problems can be addressed with clustering methods, which have been widely applied in areas such as marketing, biology, libraries, insurance, earthquake studies, and the World Wide Web (WWW). Many clustering methods have been studied, including k-means, Fisher clustering, and Kohonen clustering.