Identifying Causal Genes and Dysregulated Pathways in Complex Diseases

doi:10.1371/journal.pcbi.1001095

Figure 1.

Outline of our method.

(A) We first selected target genes that were differentially expressed in disease cases, using a multi-set cover approach. (B) In the second step, we detected genome-wide associations between expression changes of target genes and genomic alterations, which allowed us to identify potential causal genomic regions. (C) In the third step, we determined causal paths from genomic alterations (i.e., causal genes) to target genes by modeling and solving a current flow problem through a circuit of molecular interactions. (D) To select a final set of causal genes, we designed a weighted multi-set cover algorithm. We constructed a bipartite graph between candidate causal genes and disease cases and labeled each edge with the set of target genes that were affected by the causal gene and differentially expressed in the corresponding disease case. In the final set cover, the causal genes shown in boxes covered each disease case with at least two target genes, allowing one exception.
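The selection step in (D) can be illustrated with a greedy heuristic. This is only a minimal sketch, not the algorithm used in the paper: the function name `greedy_multiset_cover`, the input layout, the per-gene `weights` (treated here as costs, lower is preferred) and the gene and case labels are all hypothetical. The parameters `k=2` and `max_exceptions=1` mirror the caption's "at least two target genes, allowing one exception".

```python
def greedy_multiset_cover(edges, weights, k=2, max_exceptions=1):
    """Greedy sketch of a weighted multi-set cover.

    edges:   {causal_gene: {disease_case: set_of_target_genes}}, i.e. the
             labeled bipartite graph (hypothetical layout).
    weights: {causal_gene: positive cost}; an assumption, lower is preferred.
    A case counts as covered once k distinct target genes explain it.
    """
    cases = {case for per_case in edges.values() for case in per_case}
    covered = {case: set() for case in cases}  # targets explaining each case
    chosen = []

    def unmet():
        return [case for case in cases if len(covered[case]) < k]

    while len(unmet()) > max_exceptions:
        best, best_score = None, 0.0
        for gene, per_case in edges.items():
            if gene in chosen:
                continue
            # Gain: newly satisfied coverage units, capped at k per case.
            gain = sum(
                min(k, len(covered[case] | targets)) - min(k, len(covered[case]))
                for case, targets in per_case.items()
            )
            score = gain / weights[gene]
            if score > best_score:
                best, best_score = gene, score
        if best is None:  # no remaining gene improves coverage
            break
        chosen.append(best)
        for case, targets in edges[best].items():
            covered[case] |= targets
    return chosen


# Toy input with made-up labels; not data from the study.
toy_edges = {
    "G1": {"case1": {"t1", "t2"}, "case2": {"t1"}},
    "G2": {"case2": {"t3"}, "case3": {"t2", "t3"}},
    "G3": {"case4": {"t4"}},
}
toy_weights = {"G1": 1.0, "G2": 1.0, "G3": 1.0}
selected = greedy_multiset_cover(toy_edges, toy_weights)
```

In this toy input, "case4" can never reach two target genes, so it plays the role of the single permitted exception. A greedy choice is a common approximation for set cover problems, but it does not guarantee a minimum-weight solution.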

Figure 2.

Lists of selected target, causal and hub genes.

Target and hub genes labeled red were up-regulated, while those labeled green were down-regulated. Causal genes are marked red if they were found in amplified genomic regions and green if they were found in deleted regions. We defined hubs as genes that appeared in more than 10 causal pathways through the interaction network. Numbers in parentheses indicate the genes' actual occurrences.
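The hub definition above (more than 10 causal pathways) amounts to a simple count over paths. The following is a minimal sketch under stated assumptions: each causal path is represented as a list of gene names, a gene is counted once per path, and only interior nodes are counted when `interior_only` is true. The function name, the path representation and the `interior_only` flag are hypothetical and not taken from the paper.

```python
from collections import Counter


def find_hubs(paths, min_paths=10, interior_only=True):
    """Return {gene: count} for genes appearing in more than `min_paths` paths.

    paths: list of gene-name lists, one list per causal path (assumed layout).
    interior_only: if True, skip each path's two endpoints (an assumption).
    """
    counts = Counter()
    for path in paths:
        nodes = path[1:-1] if interior_only else path
        counts.update(set(nodes))  # count each gene at most once per path
    return {gene: n for gene, n in counts.items() if n > min_paths}


# Toy paths with placeholder names; not data from the study.
toy_paths = [["C1", "H1", f"T{i}"] for i in range(11)] + [["C2", "H2", "T0"]]
hubs = find_hubs(toy_paths)
```

Here `H1` lies on 11 paths and qualifies as a hub, while `H2` lies on only one and does not.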

Table 1.

Functional analysis of genes selected in each step.

Table 2.

Functional analysis of final causal genes.

Figure 3.

The overlap of two different sets of causal/target genes.

(A) The Venn diagram shows the overlap of two different sets of target genes. (B) Although these target gene sets were almost disjoint, the corresponding sets of causal genes overlapped by up to 45%. We concluded that our method compensated for the disparity between the initial target gene sets by determining strongly overlapping sets of causal genes.
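An overlap percentage depends on the denominator chosen. The caption does not state which definition was used, so this sketch simply reports two common ones side by side: the Jaccard index and the shared fraction of the smaller set. The function name and the toy gene labels are hypothetical.

```python
def overlap_stats(set_a, set_b):
    """Compare two gene sets using two common overlap measures."""
    a, b = set(set_a), set(set_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        # Shared genes relative to all genes in either set.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Shared genes relative to the smaller of the two sets.
        "fraction_of_smaller": len(shared) / min(len(a), len(b)) if a and b else 0.0,
    }


# Toy sets with placeholder names; not data from the study.
stats = overlap_stats({"g1", "g2", "g3", "g4"}, {"g3", "g4", "g5"})
```

For the toy sets, two genes are shared, which gives a Jaccard index of 0.4 and a fraction of two thirds relative to the smaller set.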

Figure 4.

Chromosomal analysis of causal genes.

(A) The upper panel shows the profile of genomic alterations in glioblastomas, where we observed large areas of genomic amplification on chromosome 7 and deletions on chromosome 10. The lower panel shows the occurrences of predicted causal genes (yellow bars), a profile that coincides well with the profile of alterations in the upper panel. Focusing on causal genes in the final set cover (green bars), we recovered the initial patterns. (B) We constructed a matrix showing the number of pairs of target and causal genes on their corresponding chromosomes. We found that causal genes on chromosomes 7 and 10 had numerous links to target genes on chromosomes 2, 3, 6, 10, 11, 12, 19 and 20 (boxed area). (C) Focusing on target genes in these chromosomal areas, we marked the presence of a causal path through a molecular interaction network between a target and a causal gene in peach in the heat map. Bars indicate the differential expression of the corresponding genes (green: down, red: up). We found a large cluster of up-regulated target genes that were regulated by an array of largely down-regulated causal genes (boxed area).
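The matrix in (B) is a tally of causal and target gene pairs by chromosome. A minimal sketch of such a tally follows, assuming the pairs are available as `(causal_gene, target_gene)` tuples and that a gene-to-chromosome lookup exists. The function name, the input layout and the toy labels are hypothetical.

```python
from collections import Counter


def chromosome_pair_counts(pairs, chrom_of):
    """Count (causal chromosome, target chromosome) combinations.

    pairs:    iterable of (causal_gene, target_gene) tuples (assumed layout).
    chrom_of: {gene: chromosome label} lookup (assumed to cover every gene).
    """
    counts = Counter()
    for causal, target in pairs:
        counts[(chrom_of[causal], chrom_of[target])] += 1
    return counts


# Toy input with placeholder gene names; not data from the study.
toy_chrom = {"c1": "7", "c2": "10", "t1": "2", "t2": "12", "t3": "2"}
toy_pairs = [("c1", "t1"), ("c1", "t3"), ("c2", "t2"), ("c2", "t1")]
matrix = chromosome_pair_counts(toy_pairs, toy_chrom)
```

Each key of the resulting counter corresponds to one cell of the matrix, with the causal gene's chromosome first and the target gene's chromosome second.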

Table 3.

Enrichment of GO biological processes in causal subnetworks.

Figure 5.

The network of causal paths from PTEN.

We observed that PTEN might exert its influence on target genes (the endpoints of each causal path) through prominent transcription factors such as TP53, MYC and MYB.

Figure 6.

The network of causal paths from and to EGFR.

(A) The network of causal paths that included EGFR as a causal gene. This network was rather small. (B) The much larger network of causal paths in which EGFR was a target gene. Specifically, we observed that EGFR might be influenced by numerous causal genes through prominent transcription factors.
