Figure 1.

Pictorial outline of methodology.

A. Genes overlapping the wingspan of associated SNPs are identified; the proteins they encode are termed associated proteins. B. Associated proteins are used to recover direct and indirect networks. Direct networks (left) are built from direct interactions between associated proteins according to the InWeb database (colored proteins). Connections between proteins within the same locus are not considered. Indirect networks (right) are built by allowing associated proteins to connect through a protein encoded elsewhere in the genome (grey). Several network parameters that quantify connectivity, defined in the text, are computed for each network. C. Random networks are built using the within-degree node-label permutation method described in Text S1. An empirical distribution is constructed for each network parameter and used to evaluate the significance of the observed networks. D. Using the same permutation method to score individual proteins, a subset of proteins per locus is nominated as candidates for harboring causal variants (red circles). The scores used to nominate candidates, described in Text S1, are Bonferroni corrected for the number of possible candidates within each locus. E. Candidate genes from D (selected using nominal p-values) are tested for co-expression.
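The permutation step in panel C can be illustrated with a minimal sketch. It assumes that "within-degree node-label permutation" means shuffling protein labels only among network positions that have the same number of interactions; the exact procedure is defined in Text S1 and may differ. The toy interaction list, the locus assignments, and the function names are all hypothetical.

```python
import random
from collections import defaultdict

def count_direct_edges(edges, label_of, locus_of):
    """Count edges joining two associated proteins that lie in different loci."""
    n = 0
    for u, v in edges:
        a, b = label_of[u], label_of[v]
        if a in locus_of and b in locus_of and locus_of[a] != locus_of[b]:
            n += 1
    return n

def permute_labels_within_degree(edges, nodes, rng):
    """Map each network position to a label drawn from positions of equal degree."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    by_degree = defaultdict(list)
    for node in nodes:
        by_degree[degree[node]].append(node)
    label_of = {}
    for group in by_degree.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        label_of.update(zip(group, shuffled))
    return label_of

# Hypothetical toy interaction network and locus assignments (not real data).
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("E", "A")]
locus_of = {"A": "locus1", "B": "locus2", "C": "locus2"}

# Observed statistic: labels sit at their own positions.
identity = {n: n for n in nodes}
observed = count_direct_edges(edges, identity, locus_of)

# Null distribution: the same statistic under degree-preserving relabeling.
rng = random.Random(0)
null = [
    count_direct_edges(edges, permute_labels_within_degree(edges, nodes, rng), locus_of)
    for _ in range(1000)
]
```

Because labels move only between positions of equal degree, each random network keeps the topology of the interaction database while breaking any link between disease association and connectivity.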

Figure 2.

RA and CD direct networks are significantly interconnected.

Direct network connectivity (the number of edges in the direct network) was computed for each disease network and for 50,000 random networks. The histogram shows the distribution expected at random, and an arrow marks the observed disease network for (A) RA and (B) CD. See Figure S6 for the remaining parameters and for the parameters of height, lipids and T2D.
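As a rough illustration of how an observed edge count can be compared with a distribution from random networks, the sketch below computes a one-sided empirical p-value. The add-one convention in the formula is an assumption, not necessarily the definition used by the authors, and the edge counts are made up for the example.

```python
def empirical_p_value(observed, null_values):
    """One-sided empirical p-value: share of null values >= observed.

    The +1 in numerator and denominator (an assumed convention) keeps the
    estimate from ever being exactly zero.
    """
    exceed = sum(1 for x in null_values if x >= observed)
    return (exceed + 1) / (len(null_values) + 1)

# Hypothetical edge counts from ten random networks (the figure uses 50,000).
null_edge_counts = [3, 5, 4, 6, 2, 5, 7, 4, 3, 5]
observed_edges = 7

p = empirical_p_value(observed_edges, null_edge_counts)
```

With 50,000 random networks, the smallest p-value this estimator can report is 1/50,001, which sets the resolution of the test.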

Figure 3.

Candidate RA and CD genes are preferentially expressed in immune tissues.

We obtained tissue expression data for 126 different cell types from a publicly available database and grouped the cell types into immune, gastrointestinal (GI), neuronal and ‘other’ [46]. For each tissue, we compared the expression of RA (A) and CD (B) candidate genes to that of the rest of the genes in the genome using a one-tailed rank-sum test, resulting in a p-value for each tissue (-log(p) is plotted on the y-axis). A significant difference for a given tissue indicates that the candidate genes are enriched for expression in that tissue compared to all genes in the genome. To test whether our network prioritization identified genes that were co-enriched in specific tissues beyond what was expected from all genes in associated regions, we calculated the same p-values for the rest of the genes in RA and CD associated loci (i.e., the genes that were not prioritized via our network permutations). In this figure, we plot the enrichment score of each tissue for the candidate genes (purple) and for the non-prioritized genes in the remaining regions of association (black). The tissue category is indicated along the bottom: immune (red), GI (yellow), neuronal (green) and other (blue). Tissues are ordered by decreasing enrichment score of the candidate genes.
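The per-tissue comparison can be sketched as follows. This is a minimal illustration of a one-tailed rank-sum test using a normal approximation without tie or continuity correction, which is a simplification; the authors' exact implementation is not stated here. The expression values and tissue names are hypothetical, and base 10 is assumed for the logarithm.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_p_greater(candidates, background):
    """One-tailed p-value that candidates rank higher than background."""
    n1, n2 = len(candidates), len(background)
    ranks = average_ranks(list(candidates) + list(background))
    w = sum(ranks[:n1])                       # rank sum of the candidate genes
    mean_w = n1 * (n1 + n2 + 1) / 2           # expectation under the null
    var_w = n1 * n2 * (n1 + n2 + 1) / 12      # variance under the null, no tie correction
    z = (w - mean_w) / math.sqrt(var_w)
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability

# Hypothetical expression values: (candidate genes, all other genes) per tissue.
expression = {
    "immune_example": ([9.1, 8.7, 9.5, 8.9], [5.0, 6.1, 4.8, 7.2, 5.5, 6.6, 5.9, 4.9]),
    "neuronal_example": ([5.2, 6.0, 4.7, 5.8], [5.0, 6.1, 4.8, 7.2, 5.5, 6.6, 5.9, 4.9]),
}

# Enrichment score per tissue, then tissues ordered by decreasing score.
scores = {
    tissue: -math.log10(rank_sum_p_greater(cand, rest))
    for tissue, (cand, rest) in expression.items()
}
ordered_tissues = sorted(scores, key=scores.get, reverse=True)
```

Running the same function on the non-prioritized genes from associated loci, in place of the candidate genes, would give the comparison series plotted in black.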

Table 1.

RA and CD candidate genes are preferentially expressed in immune tissues.

Figure 4.

Final disease networks.

The resulting networks built from candidate genes are depicted for RA and CD (A and B, respectively). Using only the candidate genes, we plotted the direct network together with any other proteins connected to it, after filtering those proteins for expression in at least one of the tissues found to be specific to the core network. In total, 610 such proteins connect to the RA network and 293 connect to the CD network. Large circles represent disease proteins, and small circles represent the connected proteins. Small red circles indicate proteins connected to the core network that lie in newly identified associated regions (10 proteins in CD and 1 protein in RA). The large circles are colored by locus.
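The expression filter described in this caption can be pictured with a small sketch. It assumes each connected protein already has a set of tissues in which it is considered expressed; how "expressed" is defined is not stated here, and the protein and tissue names are hypothetical.

```python
def filter_by_expression(connected_proteins, expressed_in, core_tissues):
    """Keep proteins expressed in at least one tissue specific to the core network."""
    return [
        protein
        for protein in connected_proteins
        if expressed_in.get(protein, set()) & core_tissues
    ]

# Hypothetical inputs: proteins connected to the direct network, and tissue calls.
connected_proteins = ["P1", "P2", "P3"]
expressed_in = {
    "P1": {"t_cell", "liver"},
    "P2": {"brain"},
    # "P3" has no expression call and is therefore dropped.
}
core_tissues = {"t_cell", "b_cell"}

kept = filter_by_expression(connected_proteins, expressed_in, core_tissues)
```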
