Date Added: Aug 2009
Identifying groups of functionally related genes from high-throughput gene expression data is an important step towards elucidating gene functions at a global scale. Most existing approaches treat gene expression data as points in a metric space and apply conventional clustering algorithms to identify sets of genes that are close to each other in that space. However, these approaches usually ignore the topology of the underlying biological networks. In this paper, the authors propose a network-based clustering method that is biologically more realistic. Given a gene expression data set, they apply a rank-based transformation to obtain a sparse co-expression network, and then use a novel spectral clustering algorithm to identify natural community structures in the network, which correspond to gene functional modules.
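The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the rank-based transformation or the novel spectral algorithm, so the sketch assumes a simple "connect each gene to its k most correlated genes" rule and swaps in Newman's well-known leading-eigenvector modularity split as a stand-in. The function names `rank_based_network` and `modularity_bipartition` and the parameter `k` are hypothetical.

```python
import numpy as np


def rank_based_network(expr, k=3):
    """Build a sparse co-expression network from a genes-by-samples matrix.

    Assumption (not from the paper): each gene is linked to the k genes
    ranked highest by Pearson correlation, instead of using a fixed
    correlation threshold. Rows must not be constant.
    """
    corr = np.corrcoef(expr)                  # gene-by-gene Pearson correlation
    np.fill_diagonal(corr, -np.inf)           # keep a gene from ranking itself
    n = corr.shape[0]
    top = np.argsort(-corr, axis=1)[:, :k]    # k best-ranked partners per gene
    adj = np.zeros((n, n), dtype=int)
    adj[np.repeat(np.arange(n), k), top.ravel()] = 1
    return np.maximum(adj, adj.T)             # undirected: keep an edge if either side chose it


def modularity_bipartition(adj):
    """Split a network in two using the leading eigenvector of the
    modularity matrix B = A - d d^T / (2m).

    This is a stand-in for the paper's spectral algorithm, not a
    reproduction of it.
    """
    deg = adj.sum(axis=1).astype(float)
    two_m = deg.sum()                         # twice the number of edges
    mod = adj - np.outer(deg, deg) / two_m    # modularity matrix
    _, vecs = np.linalg.eigh(mod)             # eigenvalues come back in ascending order
    leading = vecs[:, -1]                     # eigenvector of the largest eigenvalue
    return (leading > 0).astype(int)          # sign pattern gives the two communities
```

Chaining the two, `modularity_bipartition(rank_based_network(expr, k=3))` returns one 0/1 label per gene. Finding more than two modules would need the split to be applied recursively with a stopping rule, which this sketch leaves out.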