Improved Biclustering Algorithm for Gene Expression Data

Date Added: Oct 2011

Biclustering algorithms cluster the rows and columns of a data matrix simultaneously. In gene expression analysis, they are used to find subsets of genes that exhibit similar expression patterns under a subset of conditions. Cheng and Church introduced the mean squared residue measure to capture the coherence of a subset of genes over a subset of conditions. They also provided a set of heuristic algorithms, based primarily on node deletion, that find either a single bicluster or a set of biclusters; in the latter case, each discovered bicluster is masked with random values before the search continues. The mean squared residue remains a popular measure of bicluster quality. One drawback, however, is that it is biased toward flat biclusters with low row variance.
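To make the measure concrete, the following is a minimal sketch of the mean squared residue as it is commonly defined for a bicluster with row set I and column set J: the average over the submatrix of (a_ij - a_iJ - a_Ij + a_IJ)^2, where a_iJ, a_Ij, and a_IJ are the row, column, and overall means. The function names, the list-of-lists matrix layout, and the toy data are assumptions made for illustration and do not come from the paper. The row variance helper is included only to illustrate the bias noted above: a constant (flat) bicluster and a bicluster with a clear additive pattern both score zero, so the residue alone cannot tell them apart.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols."""
    n_rows, n_cols = len(rows), len(cols)
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Residue: the part of the entry not explained by its row mean,
            # its column mean, and the overall mean of the bicluster.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


def row_variance(matrix, rows, cols):
    """Average squared deviation of each entry from its own row mean."""
    n_rows, n_cols = len(rows), len(cols)
    total = 0.0
    for i in rows:
        values = [matrix[i][j] for j in cols]
        mean = sum(values) / n_cols
        total += sum((v - mean) ** 2 for v in values)
    return total / (n_rows * n_cols)


# Toy data (hypothetical): rows are genes, columns are conditions.
flat = [[5, 5, 5], [5, 5, 5]]                    # constant expression
additive = [[1, 2, 3], [4, 5, 6], [11, 12, 13]]  # rows differ by an offset
noisy = [[1, 2, 3], [4, 5, 9]]                   # one entry breaks the pattern

for name, m in [("flat", flat), ("additive", additive), ("noisy", noisy)]:
    r, c = list(range(len(m))), list(range(len(m[0])))
    print(name, mean_squared_residue(m, r, c), row_variance(m, r, c))
```

In this sketch, the flat and additive matrices both have a residue of (numerically) zero, but only the additive one has nonzero row variance. This is why a search that minimizes the residue alone can settle on uninteresting flat biclusters.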