Figure 1.

Performance of biDCG on synthetic data representing non-overlapping biclusters.

Panels (A) and (C) show the re-ordered heat maps computed with biDCG from synthetic expression matrices representing 10 constant non-overlapping biclusters with noise levels of 0.05 and 0.25, respectively, while panels (B) and (D) show the corresponding results for additive biclusters with noise levels of 0 and 0.1, respectively. See text for the definitions of “constant” and “additive” biclusters.
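
As a rough illustration of the two bicluster types named in this caption, the sketch below implants a single bicluster into a background matrix. It assumes a common convention that this page does not confirm (the definitions are deferred to the main text): a constant bicluster has the same value in every cell, an additive bicluster has cell values equal to a row effect plus a column effect, and the noise level is the standard deviation of Gaussian noise added to every cell. The function name, parameters, background value, and effect ranges are all hypothetical.

```python
import random


def implant_bicluster(n_rows, n_cols, rows, cols, kind="constant",
                      noise=0.0, seed=0):
    """Return an n_rows x n_cols matrix (list of lists) with one bicluster.

    Assumed conventions (hypothetical, not taken from the article):
    - background cells are 0
    - "constant": every bicluster cell equals 1
    - "additive": cell (i, j) equals row_effect[i] + col_effect[j]
    - noise: standard deviation of Gaussian noise added to every cell
    """
    rng = random.Random(seed)
    matrix = [[0.0] * n_cols for _ in range(n_rows)]

    if kind == "constant":
        for i in rows:
            for j in cols:
                matrix[i][j] = 1.0
    elif kind == "additive":
        row_effect = {i: rng.uniform(0.0, 1.0) for i in rows}
        col_effect = {j: rng.uniform(0.0, 1.0) for j in cols}
        for i in rows:
            for j in cols:
                matrix[i][j] = row_effect[i] + col_effect[j]
    else:
        raise ValueError(f"unknown bicluster kind: {kind!r}")

    # Add independent Gaussian noise to every cell, bicluster or background.
    if noise > 0:
        for i in range(n_rows):
            for j in range(n_cols):
                matrix[i][j] += rng.gauss(0.0, noise)
    return matrix
```

Under these assumptions, an additive bicluster can be recognized by the fact that the difference between any two of its columns is the same in every bicluster row, which is not visible from a single cell value.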

Figure 2.

Effects of noise on the relevance and recovery levels of biclusters identified by biDCG and Bimax.

The biclustering techniques biDCG and Bimax [19] were applied to synthetic expression matrices designed to represent 10 biclusters, either constant (left panels, A and C) or additive (right panels, B and D). In both cases, the average relevance (i.e., the extent to which a generated bicluster represents a true bicluster) and the average recovery level (i.e., the extent to which true biclusters are recovered) are plotted as a function of the noise level added to the expression matrices.
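
The two quantities described in this caption can be sketched in code. The version below assumes the gene match score commonly used in biclustering benchmarks (average best Jaccard overlap between gene sets); the exact definition used for this figure is given in the article text and may differ. Biclusters are represented here by their gene sets only, and all function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(first, second):
    """Average, over biclusters in `first`, of the best Jaccard match in `second`."""
    if not first or not second:
        return 0.0
    best = [max(jaccard(f, s) for s in second) for f in first]
    return sum(best) / len(first)


def relevance(found, true):
    """How well each generated bicluster represents some true bicluster."""
    return match_score(found, true)


def recovery(found, true):
    """How well each true bicluster is recovered by some generated bicluster."""
    return match_score(true, found)


# One of two implanted biclusters is found exactly, the other is missed.
true_biclusters = [{1, 2, 3, 4}, {5, 6, 7, 8}]
found_biclusters = [{1, 2, 3, 4}]
print(relevance(found_biclusters, true_biclusters))  # 1.0
print(recovery(found_biclusters, true_biclusters))   # 0.5
```

The score is asymmetric by design: in the usage example every generated bicluster is correct (relevance 1.0), but only half of the true biclusters are recovered (recovery 0.5).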

Figure 3.

Effects of overlaps on the relevance and recovery levels of biclusters identified by biDCG and Bimax.

The biclustering techniques biDCG and Bimax [19] were applied to synthetic expression matrices designed to represent 10 biclusters, either constant (left panels, A and C) or additive (right panels, B and D). In both cases, the average relevance (i.e., the extent to which a generated bicluster represents a true bicluster) and the average recovery level (i.e., the extent to which true biclusters are recovered) are plotted as a function of the overlap level introduced in the expression matrices.

Figure 4.

Iterative refinements of the biclusters identified by biDCG.

The biclustering method biDCG was applied to two synthetic expression matrices designed to represent 10 biclusters, either constant (left panel, A) or additive (right panel, B), both with an overlap of 8 between the biclusters (see text for details). The initial biclusters (shown as white boxes), defined by simple applications of DCG to the whole matrix, do not correctly match the implanted biclusters; for example, DCG identified 11 biclusters in the constant case (panel A). Iterative refinement of the biclusters, however, leads to the correct identification of all 10 reference biclusters, shown as green submatrices.

Figure 5.

BiDCG analysis of lung cancer data.

The set of patients described in Bhattacharjee et al. [39] includes 21 patients with squamous cell lung carcinomas (SQ), 20 patients with pulmonary carcinoids (COID), 6 patients with small cell lung carcinomas (SCLC), and 17 healthy patients with normal lungs (NL). Gene expression patterns over 1543 relevant genes were collected for each patient. The biDCG procedure applied to these data identified 7 biclusters, marked in white on the specially constructed heat map shown in panel A. Bicluster SQ_A, for example, identifies a set of genes, also named SQ_A, that best identifies patients with SQ lung cancers. Similarly, the three gene subsets NL_A, NL_B, and NL_C can be thought of as containing signature genes for healthy patients, while the gene subsets COID_A, COID_B, and COID_C contain genes that best identify COID patients. Panel B shows the DCG tree on all patients based on all genes, while panels C, D, and E show the equivalent DCG trees based on the gene subsets SQ_A, COID_A, and NL_A, respectively. The color coding for the DCG trees is: purple, SQ; red, NL; green, SCLC; and blue, COID.

Table 1.

Dataset A: biclusters significantly enriched by any GO Biological Process category.

Figure 6.

BiDCG analysis of lung cancer data for patients with adenocarcinoma (AD).

We consider 65 patients with AD from the dataset described in Bhattacharjee et al. [39]. Expression levels of 675 relevant genes are available for each patient. The biDCG procedure applied to these data identified 7 biclusters, marked in white on the specially constructed heat map. Each of these biclusters identifies a set of genes that can serve as a signature for a specific type of patient, a so-called dual relationship.

Table 2.

Dataset B: biclusters significantly enriched by any GO Biological Process category.

Table 3.

Proportion of biclusters significantly enriched by any GO Biological Process category for four biclustering methods. Results are shown for datasets A and B (see text for details).
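
For readers unfamiliar with how such a proportion can be computed, the sketch below uses a one-sided hypergeometric test, a standard choice for GO enrichment. This is an assumption: the page does not state which test, significance threshold, or multiple-testing correction was used for this table, and none of the function names, the default threshold, or the input layout come from the article.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the GO term.

    k: annotated genes in the bicluster, n: bicluster size,
    K: annotated genes in the background, N: background size.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def proportion_enriched(biclusters, alpha=0.05):
    """Fraction of biclusters with at least one GO term below `alpha`.

    `biclusters` is a list of lists of (k, n, K, N) tuples, one tuple per
    GO term tested for that bicluster (hypothetical layout). No
    multiple-testing correction is applied in this sketch.
    """
    if not biclusters:
        return 0.0
    hits = sum(
        1 for terms in biclusters
        if any(hypergeom_pvalue(*t) < alpha for t in terms)
    )
    return hits / len(biclusters)
```

In practice a correction for testing many GO categories per bicluster would normally be applied before counting a bicluster as enriched; it is left out here to keep the sketch short.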
