Big Data

Graph Partition and Identification of Cluster Number in Data Analysis

Date Added: Feb 2011
Format: PDF

Modern computing, with applications in areas such as the Internet, biology, and social science, involves large-scale data analysis. Relations among data can be modeled as graphs, and the graph partitioning problem can be effectively approximated by spectral approaches. A critically important problem in graph partitioning is determining the cluster number k. Although the eigen-gap heuristic is a theoretically supported principle for this problem, it is difficult to apply to real-world data and complex graphs. In this paper, considering the general data analysis scenario, the authors present an algorithm that determines the cluster number k and performs the clustering task simultaneously.
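For readers unfamiliar with the eigen-gap heuristic mentioned in the abstract, the following is a minimal sketch of the textbook version, not the authors' algorithm, which the abstract does not describe. It assumes a symmetric, non-negative affinity matrix W and uses the symmetric normalized Laplacian. The function name eigengap_cluster_count and the parameter max_k are hypothetical, chosen only for illustration.

```python
import numpy as np

def eigengap_cluster_count(W, max_k=10):
    """Estimate the cluster number k with the standard eigen-gap heuristic.

    Assumption: W is a symmetric, non-negative affinity (adjacency) matrix.
    This is the generic heuristic, not the method proposed in the paper.
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)

    # D^{-1/2}, leaving zero-degree (isolated) nodes at 0 to avoid division by zero.
    inv_sqrt = np.zeros_like(degrees)
    nonzero = degrees > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degrees[nonzero])

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]

    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(L)

    # gaps[i] = lambda_{i+2} - lambda_{i+1} in 1-based notation; pick the k
    # whose gap between lambda_k and lambda_{k+1} is largest.
    upper = min(max_k, len(eigvals) - 1)
    gaps = np.diff(eigvals[:upper + 1])
    k = int(np.argmax(gaps)) + 1
    return k, eigvals

# Toy graph: two disconnected 4-node cliques.
W = np.zeros((8, 8))
W[:4, :4] = 1.0
W[4:, 4:] = 1.0
np.fill_diagonal(W, 0.0)
k, _ = eigengap_cluster_count(W)
print(k)  # prints 2
```

On a clean graph like this toy example the largest gap is unambiguous. On noisy real-world graphs the gaps tend to be of similar size, which is the difficulty the abstract points to.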