International Journal of Computer Applications
Biclustering is a data mining technique that identifies coherent patterns in microarray gene expression data. A bicluster of a gene expression dataset is a subset of genes that exhibit similar expression patterns across a subset of conditions. Biclustering is a powerful analytical tool for biologists and has generated considerable interest over the past few decades. Many biclustering algorithms optimize a mean squared residue score to discover biclusters in a gene expression dataset. In this paper, a two-phase method for finding a bicluster is developed. In the first phase, a modified version of the k-means algorithm is applied to the gene expression data to generate k clusters.
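The mean squared residue mentioned above is not defined in this passage. As a minimal sketch, assuming the commonly used Cheng and Church formulation, the score of a bicluster with gene set I and condition set J is the average of (a_ij - a_iJ - a_Ij + a_IJ)^2 over all its entries, where a_iJ, a_Ij and a_IJ are the row, column and overall means of the submatrix. The function name, argument names and toy data below are hypothetical and serve only as illustration; they are not taken from the paper.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    Assumes the Cheng and Church definition; the paper's own variant
    may differ.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    # Row means a_iJ, column means a_Ij, and overall mean a_IJ.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [0.0, 8.0, 2.0, 5.0],
]

# Genes 0-2 differ only by a constant shift across conditions 0-2.
coherent = mean_squared_residue(expression, rows=[0, 1, 2], cols=[0, 1, 2])
whole = mean_squared_residue(expression, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
```

Under this definition, a submatrix whose rows differ only by an additive shift scores essentially zero, while the full toy matrix scores higher, which is why a low residue is read as a sign of a coherent bicluster.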