Browse Subject Areas

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Gene Co-Expression Network Analysis for Identifying Modules and Functionally Enriched Pathways in Type 1 Diabetes

Gene Co-Expression Network Analysis for Identifying Modules and Functionally Enriched Pathways in Type 1 Diabetes

  • Ignacio Riquelme Medina, 
  • Zelmina Lubovac-Pilav


Type 1 diabetes (T1D) is a complex disease, caused by the autoimmune destruction of the insulin producing pancreatic beta cells, resulting in the body’s inability to produce insulin. While great efforts have been put into understanding the genetic and environmental factors that contribute to the etiology of the disease, the exact molecular mechanisms are still largely unknown. T1D is a heterogeneous disease, and previous research in this field is mainly focused on the analysis of single genes, or using traditional gene expression profiling, which generally does not reveal the functional context of a gene associated with a complex disorder. However, network-based analysis does take into account the interactions between the diabetes specific genes or proteins and contributes to new knowledge about disease modules, which in turn can be used for identification of potential new biomarkers for T1D. In this study, we analyzed public microarray data of T1D patients and healthy controls by applying a systems biology approach that combines network-based Weighted Gene Co-Expression Network Analysis (WGCNA) with functional enrichment analysis. Novel co-expression gene network modules associated with T1D were elucidated, which in turn provided a basis for the identification of potential pathways and biomarker genes that may be involved in development of T1D.


Type 1 diabetes (T1D) is a complex trait, which develops when the insulin producing beta cells are destroyed resulting in a decreased production and secretion of insulin. According to National Center for Chronic Disease Prevention and Health Promotion (CDC), diabetes is one of the most common chronic diseases worldwide, having a prevalence of 9.3% of the population in United States [1]. Furthermore, diabetes is also a major contributor of renal diseases, amputation, cardiovascular disease [2,3], and has been projected to be the seventh leading cause of deaths in 2030 [4].

T1D is a heterogeneous disease with both underlying genetic and environmental factors that influence the development and progression of the disease [5]. Important chromosomal regions that have been shown to contribute to disease susceptibility are the human leukocyte antigen (HLA) region at chromosome 6 and insulin gene region at chromosome 11 [3]. Nevertheless, only small percentage of genetically susceptible individuals progress to clinical disease, which indicates the involvement of environmental triggers.

Previous research in this field has been primarily focused on analysis of single susceptibility genes [610] or performing Genome Wide Association Studies (GWAS) to identify genetic determinants of disease [1115]. In addition, majority of the studies are focused on the beta cells whereas novel findings are pointing to the importance of the immune system in the disease development [16,17]. This study is based on the public data derived from the samples from peripheral blood mononuclear cells (PBMC), involved in innate immune activation that may play both pathological and protective role in T1D [18]. PBMC are suitable for the assessment of immunological markers in T1D children, as stated in earlier study [19].

Current studies on T1D do not take in account the interactions between the genes or proteins, which are crucial for understanding molecular mechanisms underlying complex disease. Recently, importance of network-based approaches to human disease has been highlighted [20]. Cellular interconnectedness effects the disease progression and should be taken into account in the identification of disease genes and pathways, which in turn, may provide new targets for drug development.

In this study, we hypothesize that pathogenesis of T1D is reflected by the perturbation of intercellular and intracellular networks, which may lead to identification of specific disease modules caused by a variation in one or more of the components. We adopted a well-established network-based approach, Weighted Gene Co-Expression Network Analysis (WGCNA) [21] to identify modules in co-expression gene networks that may be associated with T1D. To the best of our knowledge, this approach has not been applied in previous T1D studies. This method, in combination with functional enrichment and network topology measures, is also applied here to identify potential biomarkers for T1D that will aid in the understanding of the mechanisms of T1D.

We identified co-expression modules that show significant disruption, by comparing global co-expression network in T1D to the corresponding network derived from healthy controls. Within the identified co-expression disease modules that were chosen for further analysis, we found several significantly enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways with association to T1D, such as Type I diabetes mellitus, mTOR signaling pathway etc. Besides confirming some genes with previously known T1D involvement, such as fas cell surface death receptor (FAS), interleukin 1 beta (IL1B) and interleukin 8 (IL8), we also identified interesting candidate genes that could be further analyzed further to understand their roles in T1D.

Materials and Methods

Affymetrix Microarray

Microarray data GSE9006 from NCBI Gene Expression Omnibus (GEO) database was collected from peripheral blood mononuclear cell (PBMC) samples from 43 children with newly diagnosed T1D and 24 healthy controls [22]. For 20 patients, one and four month follow-up samples were also included in the analysis. The data was normalized using the global scaling normalization method and filter was applied based on Affymetrix flag calls, according to the same procedure as in [22]. Probe sets were selected for further analysis if present in at least 50% of samples in any group using R Bioconductor affy package [23]. After the initial filtering, 17,310 genes were left. The data set was preprocessed further by applying Significance Analysis of Microarrays (SAM) [24] to remove genes that show no or very low changes in expression between healthy and disease samples, but also for the purpose of getting manageable size of the data for WGCNA analysis. R Bioconductor package siggenes [25] that implements SAM was used (delta = 0.296), leaving the total of 10005 genes in the final set for further analysis.

Weighted Gene Co-Expression Analysis (WGCNA)

WGCNA is a method for deriving co-expression networks [21], and it is implemented in R WGCNA package [26]. The method for constructing network is as follows: first, a similarity co-expression matrix was calculated with Pearson's correlation cor(i,j) for all genes, and transformed to an adjacency matrix (AM) by using the soft thresholding power beta, to which co-expression similarity is raised (se Eq 1). (1) where aij represents the resulting adjacency that measures the connection strengths.

We chose the power beta based on criteria of approximating scale-free topology of the network, as prescribed in the original publication [26]. Power of beta = 9 was chosen based on the scale-free topology criterion. This criterion shows that the power parameter, beta, is the lowest integer such that the resulting network satisfies approximate scale-free topology (linear regression model fitting index R2 that should be larger than 0.8).

Then, topological overlap matrix (TOM) [27] was computed from AM, and TOM was in turn converted into a dissimilarity TOM. Finally, hierarchical clustering was used to produce tree (dendrogram) from dissimilarity TOM. By using dynamic tree cutting, different number of clusters (modules) was obtained from the tree. The resulting modules contained genes that are densely interconnected and we constructed two different networks, one using the healthy samples and the other using the T1D samples.

WGCNA can be used for summarizing obtained modules by using concept of eigengene, and further screening for suitable gene targets by calculating module membership (kME) measures, also known as eigengene-based connectivity [21,26]. Eigengenes are defined as the first principal component of the expression matrix for each module, and represent the weighted average of the expression profile for each module. The eigengenes can be used to merge clusters with a similar expression profile, leading to the final set of modules as a result of constructing the network.

Preservation of modules

Module preservation statistics in WGCNA was used to evaluate whether a given module defined in reference data set (healthy network) can also be found in the test data set (disease network). The overall significance of the preservation statistics was assessed using Zsummary Eq (2) that combines multiple preservation statistics into a single overall measure of preservation, which considers both aspects of density and connectivity preservation [28].


Based on the thresholds proposed in original method proposal [28], resulting Zsummary < 2 indicates no preservation, 2< Zsummary <10 indicates weak to moderate evidence of preservation, and Zsummary >10 means strong evidence of module preservation.

Pathway enrichment of the significant modules

We performed pathway enrichment analysis of selected modules by using two different tools, a network-based gene set enrichment analysis, EnrichNet ( [29] and meta-database ConcensusPathDB ( [30]. This includes enrichment in predefined pathways by, for example,—KEGG ( and Gene Ontology—GO terms (

Topological analysis with betweenness centrality measure

Modules obtained from WGCNA can be exported and analyzed with other tools. We performed topological centrality analysis by using R package Igraph [31]. More specifically, betweenness centrality (BC) of a node was used as a network topology measure [32]. BC is defined as the number of shortest paths between every two other nodes in the module that pass through that node Eq (3). (3) Where V is the set of nodes, σst is the number of shortest paths between nodes s and t, and σst(v) is the number of those paths that pass through node v.

Simply stated, it measures the relevance of a node (gene) as capable of transferring communication between the genes in the module. High values of BC indicate “high traffic nodes”, and hereby more biologically informative nodes in a module.

Visualization and exploring modules with VisANT

Module visualization and further analysis was performed with VisANT ( software, which allows visualization and analysis of networks of biological associations and interactions [33].


Co-expression network generation with WGCNA

The details of the gene co-expression network construction with WGCNA are explained in [21]. By applying the steps described in Materials and Methods, two different networks were generated; one for 24 healthy samples and the other for T1D samples (43 samples, and for 20 patients there are replicates—1 month and 4 months after diagnosis). Briefly, signed network adjacency matrices were obtained by raising the Pearson correlation matrices to a power beta = 9 which approximates scale-free topology. The adjacencies were transformed to dissimilarity matrix for subsequent average linkage hierarchical clustering using flashClust R package [34]. Resulting trees (dendrograms) are illustrated in Fig 1A (healthy samples) and Fig 1B (T1D samples) and each leaf (vertical lines) corresponds to a specific gene. This illustration is intended to show apparent changes in tree structures between two different networks that need further inspection.

Fig 1. Gene dendrogram generated with WGCNA for (A) healthy samples and (B) T1D samples.

Each leaf (vertical lines) in the dendrogram corresponds to a gene.

For further analysis, we cut the tree to generate modules (clusters) from the resulting dendrogram. A dynamic branch cutting method called “hybrid” is used to determine modules, which is implemented in cutreeHybrid function. Fig 2 shows resulting dendrogram for healthy samples with different cut-off levels corresponding to different sets of modules. Modules on the bottom of the figure are illustrated with different colors, and represent the branches of the clustering tree, which can be split by using deepSplit argument, which allows a rough control over sensitivity; we used following parameters: deepSplit = 1, cut height = 0.99, and minimum module size = 27. This parameter setting resulted in 55 modules with average size 235. Finally, module eigengenes are calculated, which provides quantitative assessments in further analysis.

Fig 2. Dendrogram denoting modules in healthy network.

Modules are illustrated with different colors obtained with different module detection sensitivity parameter called deepSplit. Each row with colored set of modules is detected with a certain deepSplit (between 0 and 3). The number of modules and average size are: deepSplit 0 (modules: 31, average size: 379.58) deepSplit 1 (modules: 55, average size: 235.05) deepSplit 2 (modules: 84, average size: 163.98) deepSplit 3 (modules: 105, average size: 137.3).

Comparing the modules between healthy and T1D networks

After generating 55 modules from network, based on healthy samples, we compared the results with the corresponding set of modules in T1D network. Initially, visual representation was done, to obtain a general idea of how modular structure changes between networks. In Fig 3, the same colors that are assigned to the genes based on the module membership in healthy network (Fig 3A) are applied to the corresponding genes in the T1D network (Fig 3B). Few modules are preserved, but there are considerable changes between the two networks which we further investigated using a quantitative way of assessing module preservation.

Fig 3. Visual representation of the changes in the module structure between (A) healthy network and (B) T1D network.

In contrast to the idea of the original paper which proposes identifying modules with strong evidence of preservation between reference and test network [28], we aimed to identify modules that are weakly preserved between networks (see Materials and Methods). We hypothesized that modules that are weakly preserved in T1D may highlight dysregulated pathways in disease that were acquired or lost when compared to a healthy network. Table 1 lists identified modules, along with their size and Zsummary values. Modules with Zsummary< 5 are considered to have low preservation, which is the cut-off used to select modules for further analysis. Following modules are selected based on that criterion: Royalblue, Navajowhite, Yellowgreen and Bisque. Grey and Gold modules are excluded from analysis, as these are artificial modules containing all genes that were not assigned to any module.

Table 1. Resulting modules in healthy network compared to modules in T1D network.

Module enrichment analysis

Pathway enrichment analysis of the interesting modules was performed with ConsensusPathDB [30] and EnrichNet [29] and results are shown for Royalblue module. Table 2 shows functionally enriched pathways obtained from ConsensusPathDB by setting q-value < 0.05. Results of pathway enrichment analysis obtained from EnrichNet are presented in Table 3, where XD-score denotes significance of the enriched pathway. Findings with higher scores are more significant than low-scoring results. Only significant hits with overlap size ≥ 2 (genes that are overlapping in the same pathway) were considered.

Both ConsensusPathDB and EnrichNet identified KEGG pathway “Type I diabetes mellitus” as functionally enriched in Royalblue module. Genes that are identified to be part of this pathway are FAS and IL1B. Another significantly enriched pathway identified by both tools (Tables 2 and 3) is “Graft-versus-host disease”, an immune-mediated disease that also involves FAS and IL1B [35]. Examples of other pathways identified by ConsensusPathDB only (Table 2) are “NOD-like receptor signaling pathway” and “Toll-like receptor signaling pathway”. These pathways are known to be key components in the innate immune system that may promote process leading to autoimmune diabetes [36,37].

Topological analysis with Betweenness centrality measure

Topological analysis of the modules obtained from WGCNA was focused on the betweenness centrality (BC) of the genes within the modules. Since this measure reflects influence over the “information transfer” between different nodes (genes), we identified genes for which betweenness is considerably changed between the two networks (healthy/T1D). BC values for the genes in Royalblue module are presented in Table 4.

Table 4. Betweenness centrality (BC) ranks for genes belonging to Royalblue module (in both healthy and T1D network).

Using betweenness value to rank genes in the healthy network, we identified DDX52, as the gene with highest betweenness (BChealthy = 2 658.15), suggesting that it has a central role in information transfer in this module. However, the DDX52 gene in T1D network shows an aberrant structure, with only one connection and BCT1D = 0. This gene is part of the DEAD box helicases protein family [38] which functions to separate the strands of short mRNA duplexes. Several of the proteins in this family, such as DDX3 and DDX42, are related to regulation of the immune activity [39]. DDX52 has also been shown to be under-expressed in relation to the immune response in another T1D study [40]. In this study, DDX52 plays an important topological role in the healthy network since it is in the center of one of the most interconnected regions (Fig 4). However, in the diabetes network, DDX52 only interacts with one direct neighbor (Fig 5). The function of DDX52 in T1D is still not known, but due to these major differences in the topology involving this protein, it is worth investigating whether this could be one of the factors that is aberrant at early stage at development of T1D.

NDUFA4 is a contrasting example of the gene with a large difference in BC value changing from being low-betweenness gene in healthy to high-betweenness gene in T1D network (Table 4). NDUFA4 encodes a subunit of the complex 1 of the mitochondrial electron transport chain with NADH dehydrogenase and oxidoreductase functions. The role of this gene is not known in T1D, but it is found to be over-expressed in T1D compared to controls and pre-T1D patients [41].

Another gene with the betweenness centrality that largely deviates between healthy and T1D network is SNX14 (sorting nexin 14). The role of this gene in T1D is not known, but there is one previous study showing SNX14 is significantly down-regulated (FDR<0.05) in children that progressed to T1D [42].

Betweenness centrality analysis of the rest of selected modules: Navajowhite, Yellowgreen, and Bisque (S1S3 Tables) revealed a number of genes with known or potential role in T1D. The gene with the highest BC value in Navajowhite disease module is CAPRIN1 (S1 Table), which encodes proteins involved in synaptic plasticity in neurons and cell proliferation and migration [43]. This gene has been associated with autoimmune diseases in mouse [44], but so far the similar role is yet to be confirmed in human autoimmune diseases. In Yellowgreen module, SCAF11 (SIP-1) shows high BC in healthy and very low BC in T1D (S2 Table). SCAF11 is known to be involved in Behcet´s disease [45] that has some features of autoimmunity. Furthermore, another gene identified according to the same criteria is ITFG1 (or TIP), which modulates T-cell function and has protective effect in graft-versus-host disease model [46]. In the last module we analyzed (Bisque) there were also several interesting genes that show large difference between BC values in healthy versus T1D module (S3 Table). IL1A or IL-1 (BCT1D = 60.7; BChealthy = 0) is a pro-inflammatory cytokine that takes part in the “diabetes type I pathway”.


Network-based analysis provides higher level connections between molecules and their involvement in different pathways, which is a good starting point for investigating complex diseases, such as T1D. The present study focuses on co-expression module-based analysis using WGCNA in combination with other network topology information. The results of the study reveal biological pathways that are enriched in co-expression modules and show aberrant structure in T1D network compared to the corresponding modules in co-expression network of healthy controls.

Five modules were chosen for further analysis, based on the loss of preservation, as explained in the result section. The results of the analysis of top-ranked Royalblue module will be discussed here in detail. Lists of the enriched terms in other modules can be found in S4S7 Tables. Functional enrichment analysis of this module was performed with EnrichNet and ConsensusPathDB. EnrichNet utilizes information from the molecular network structure connecting two gene/protein sets to score distances between input set of genes and pathways in a reference database. EnrichNet tries to overcome some limitations of the traditional over-representation based enrichment analysis, by calculating XD-score, which is relative to the average distance to all pathways in a reference database. The analysis resulted in several pathways and processes with a clear connection to different mechanisms that are associated to T1D. One of the KEGG pathways that showed enrichment was “Type I diabetes mellitus” and the analyzed module contains two genes from this pathway—FAS and IL1B. FAS belongs to the TNF-receptors superfamily and plays a major role in the programmed cell death. Apoptosis mediated by FAS appears to be the main mechanism in T cell-mediated damage to insulin-producing beta cells [47]. IL1B is a cytokine that serves as an important mediator in the inflammatory process and is also part of the main mechanisms of beta cell death in diabetes [48]. Among other pathways that are significantly enriched (Table 2), we found “Non-alcoholic fatty liver disease (NAFLD)”, “TNF-signaling pathway”, “NOD-like receptor signaling pathway” etc.

To further understand the interplay between significant pathways within the module and identify which genes they have in common, we used ClueGO ( [49]. Resulting network (Fig 6) illustrates network of six significantly enriched KEGG pathways along with the genes that are shared between these pathways (Fig 6A). Fig 6B illustrates significant KEGG pathways where bars show number of genes associated to each term. There are three genes shared by several pathways—IL1B (six pathways), FAS (four pathways), and CXCL8 (three pathways). This also confirms the relevance of the derived modules for identifying key players in T1D. Interestingly, analysis with VisANT confirms that these three genes are connected in a T1D network generated from the Royalblue module, while there is no such connection in the corresponding healthy module. CXCL8 provides the main connection between FAS—IL1B and the main cluster (the most interconnected region) in the T1D module (Fig 5), containing genes related to the immune system. In corresponding healthy module (Fig 4), these direct connections between CXCL8, FAS and IL1B are absent. CXCL8 (also known as IL8) produces interleukin 8, a chemokine which plays an important role in the inflammatory response and it is produced by many cell types [50]. In addition, high levels of interleukin 8 have been found in the saliva [51] and in the blood [38] of children with T1D. Due to the position of this gene in the diabetes network (Fig 5) and the confirmation of its increased expression levels in diabetic patients, interleukin 8 s may play an important part in the development of diabetes.

Fig 6. Functionally enriched KEGG pathways identified by ClueGO.

(A) The size of the nodes reflects significance of the term. Network includes genes that are shared between different KEGG pathways (B) Chart represents significant KEGG pathways where bars show number of genes associated to each term. Level of significance for terms is marked using 1) **; if the term is over-significant (p-value<0.001), 2) *; if the terms is significant (0.001<p-value<0.05), and 2). (dot); 0.05<p-value<0.01).

Another interesting gene that is part of the tightly connected cluster in T1D module (Fig 5) is ADAM17, which encodes the tumor necrosis factor-alpha converting enzyme (TNFR). ADAM17 plays a central role in cell regulation and thus it is related to many diseases, including T1D [52]. Earlier studies indicated that intermembrane activity of ADAM17 is possible factor that influences concentrations of TNFRs in blood in T1D patients [53,54].

There are other interesting genes that build the tightly interconnected cluster in Fig 5. We found genes that are related to cancer (NBN, FAM84B), Huntington's disease (DCTN4), Parkinson's disease (NUCKS1), Alzheimer's disease (SMC3), amongst others. Parkinson's disease is an autoimmune disease [55], and recent findings support the theory that Alzheimer's also appears to have an autoimmune component to the disease [56]. Previous work on Huntington's disease showed an activation of the immune system and altered immune response before the manifestation of clinical symptoms [57]. In summary, the part of the identified module aberrant in T1D may give the indication of the process shared amongst the above mentioned diseases. This result agrees with the enrichment analysis that found KEGG pathways “Alzheimer´s disease” and “Huntington´s disease” functionally enriched (Table 2). Further analysis of the dense cluster identified in the disease module (Fig 5) reveals another gene, TAB3, a part of “NF-kappa B signaling pathway” that is associated with the activation of the immune systems, particularly in response to external factors, such as inflammation. This pathway is found functionally enriched in Royalblue module (Table 2) and it was recently proposed that this pathway have important implications on the development of novel therapeutic strategies for T1D [58]. Since several of the genes that build the densely connected cluster in the Royalblue module are related to the immune diseases, it is of great interest to further investigate other genes in that cluster (i.e. ATXN7, PGGT1B, MIS18BP1) that have not been previously associated with T1D, and may contribute to extending our knowledge about the disease.

Topological analysis based on betweenness centrality measure (BC) revealed some high-betweenness proteins that may act as important links between different pathways. One of the genes identified based on betweenness centrality criteria that may play a role in T1D is NDUFA4. This finding, together with the results from the enrichment analysis, highlights NDUFA4 as part of three significantly enriched pathways (Table 2) with potential connection to diabetes, and identifies this gene as an interesting candidate for further investigation. One of these pathways is “Non-alcoholic fatty liver disease (NAFLD)”, which has been linked to T1D in children in previous studies [59,60].

Another interesting gene that shows a large difference between BC in healthy versus T1D Bisque module (see Results) is IL1A or IL-1, a pro-inflammatory cytokine that takes part in the “diabetes type I pathway”. The inhibition of IL-1 action has clinical efficacy in several inflammatory diseases including hereditary auto-inflammatory disorders and type 2 diabetes mellitus. [61]. Due to its modulating effect on the interaction between the innate and adaptive immune systems, IL-1 has suggested has been evaluated as a potential target in the autoimmune diabetes mellitus [61]. This finding also confirms that the approach we propose here offers insights into pathways and genes with known involvement in T1D and may serve as a good starting point for identifying novel mechanisms. SH2B2 is another gene in the same module which is a part of “insulin signaling pathway”, but its role in T1D is not known.

Apart from focusing on modules with low preservation between healthy and T1D network, we contrasted our results by analyzing the most preserved module (Pink module), and investigated which role this module may have in the disease. The resulting list of the pathways that are enriched in Pink module with p-value<0.05 are: “HTLV-I infection”, “T cell receptor signaling pathway”, and “Changas disease” (S8 Table). T cell receptor signaling pathway is related to the immune system and known to be associated to T1D. There are two gene members in this module that account for the enrichment in this pathway: DLG1 and NFATC3. DLG1 is involved in signal transduction, cell proliferation, and synaptogenesis, which are important functions that seem to be preserved when comparing heathy and disease module. Mutations in DLG1 are known to be associated to Chrohn´s disease [62], which is an immune related disease. NFATC3 has a crucial role in the development of the immune system and T-cell development [63]. This finding suggests that the highly preserved module in this case encloses functions that are important for the immune system and also shows preserved density and connectivity pattern between healthy and disease module.

In summary, our results show that the application of WGCNA, along with topological analysis and functional enrichment can be used to detect modules associated with T1D in children. Detected modules are utilized for exploratory analysis of dysregulated pathways in disease, as exemplified by the identified pathways such as “Type I diabetes mellitus”, Non-alcoholic fatty liver disease (NAFLD)”, “NOD-like receptor signaling pathway”. In addition, the approach we report here may help identifying candidate genes that are likely to be associated with the disease, such as IL-1, that was found to be significantly increased in newly diagnosed T1D patients [64]. Other identified genes that confirm the relevance of this approach are IL1B, FAS, CXCL8 etc. Previous work on network analysis in T1D was manly focused on using protein-protein interaction (PPI) interaction networks to find candidate genes in disease [65]. While other studies were focused on specific regions of interest for T1D, such as Major Histocompatibility Complex (MHC) region on chromosome 6p21 [66], our study identifies global network structure that allows us to explore key pathways and candidate genes from T1D co-expression modules derived from whole transcriptome. This work highlights the importance of the systems biology approach to study complex disease by analyzing the inherent modularity of the T1D co-expression network. Future efforts should be made to further investigate system-level properties of the modules associated to T1D, as well as other topological properties of the genes involved in identified modules.

Supporting Information

S1 Table. Betweenness centrality (BC) ranks for genes belonging to Navajowhite module.


S2 Table. Betweenness centrality (BC) ranks for genes belonging to Yellowgreen module.


S3 Table. Betweenness centrality (BC) ranks for genes belonging to Bisque module.


S4 Table. Function enrichment results for Navajowhite module.

Table shows GO biological process and molecular function for gene members of Navajowhite module (p-values<0.05), gene count>2.


S5 Table. Function enrichment results for Yellowgreen module.

Table shows GO biological process and molecular function for gene members of Yellowgreen module (p-values<0.05), gene count>2.


S6 Table. Function enrichment results for Bisque module.

Table shows GO biological process and molecular function for gene members of Bisque module (p-values<0.05), gene count>2.


S7 Table. Pathway enrichment results for Bisque module.

Table shows resulting KEGG pathways enriched in Bisque module.


S8 Table. Pathway enrichment results for Pink module.

Table shows resulting KEGG pathways enriched in Pink module.



We would like to thank Dr. Maggie Louie for her insightful reading of the manuscript.

Author Contributions

Conceived and designed the experiments: IRM ZLP. Performed the experiments: IRM. Analyzed the data: IRM ZLP. Contributed reagents/materials/analysis tools: IRM ZLP. Wrote the paper: IRM ZLP.


  1. 1. CDC (2014) National Diabetes Statistics Report. 1 p.
  2. 2. Borch-Johnsen K (1989) The prognosis of insulin-dependent diabetes mellitus. An epidemiological approach. Dan Med Bull 36: 336–348. pmid:2680308
  3. 3. Pociot F, McDermott MF (2002) Genetics of type 1 diabetes mellitus. Genes Immun 3: 235–249. pmid:12140742
  4. 4. Mathers CD, Loncar D (2006) Projections of global mortality and burden of disease from 2002 to 2030. PLoS Med 3: e442. pmid:17132052
  5. 5. Knip M, Veijola R, Virtanen SM, Hyoty H, Vaarala O, Akerblom HK (2005) Environmental triggers and determinants of type 1 diabetes. Diabetes 54 Suppl 2: S125–136. pmid:16306330
  6. 6. Luo DF, Buzzetti R, Rotter JI, Maclaren NK, Raffel LJ, Nistico L, et al. (1996) Confirmation of three susceptibility genes to insulin-dependent diabetes mellitus: IDDM4, IDDM5 and IDDM8. Hum Mol Genet 5: 693–698. pmid:8733139
  7. 7. Wenzlau JM, Juhl K, Yu L, Moua O, Sarkar SA, Gottlieb P, et al. (2007) The cation efflux transporter ZnT8 (Slc30A8) is a major autoantigen in human type 1 diabetes. Proc Natl Acad Sci U S A 104: 17040–17045. pmid:17942684
  8. 8. Bennett ST, Lucassen AM, Gough SC, Powell EE, Undlien DE, Pritchard LE, et al. (1995) Susceptibility to human type 1 diabetes at IDDM2 is determined by tandem repeat variation at the insulin gene minisatellite locus. Nat Genet 9: 284–292. pmid:7773291
  9. 9. Nejentsev S, Walker N, Riches D, Egholm M, Todd JA (2009) Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science 324: 387–389. pmid:19264985
  10. 10. Nistico L, Buzzetti R, Pritchard LE, Van der Auwera B, Giovannini C, Bosi E, et al. (1996) The CTLA-4 gene region of chromosome 2q33 is linked to, and associated with, type 1 diabetes. Belgian Diabetes Registry. Hum Mol Genet 5: 1075–1080. pmid:8817351
  11. 11. Davies JL, Kawaguchi Y, Bennett ST, Copeman JB, Cordell HJ, Pritchard LE, et al. (1994) A genome-wide search for human type 1 diabetes susceptibility genes. Nature 371: 130–136. pmid:8072542
  12. 12. Hakonarson H, Grant SF, Bradfield JP, Marchand L, Kim CE, Glessner JT, et al. (2007) A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene. Nature 448: 591–594. pmid:17632545
  13. 13. Wellcome Trust Case Control C (2007) Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447: 661–678. pmid:17554300
  14. 14. Cooper JD, Smyth DJ, Smiles AM, Plagnol V, Walker NM, Allen JE, et al. (2008) Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci. Nat Genet 40: 1399–1401. pmid:18978792
  15. 15. Hakonarson H, Grant SF (2009) Genome-wide association studies in type 1 diabetes, inflammatory bowel disease and other immune-mediated disorders. Semin Immunol 21: 355–362. pmid:19574059
  16. 16. Janeway CA Jr T P, Walport M, et al. (2001) Immunobiology: The Immune System in Health and Disease. 5th edition.; Science NYG, editor.
  17. 17. Shalev A (2014) Minireview: Thioredoxin-interacting protein: regulation and function in the pancreatic beta-cell. Mol Endocrinol 28: 1211–1220. pmid:24911120
  18. 18. Goldberg E, Krause I (2009) Infection and type 1 diabetes mellitus—a two edged sword? Autoimmun Rev 8: 682–686. pmid:19393195
  19. 19. Axelsson S, Faresjo M, Hedman M, Ludvigsson J, Casas R (2008) Cryopreserved peripheral blood mononuclear cells are suitable for the assessment of immunological markers in type 1 diabetic children. Cryobiology 57: 201–208. pmid:18761006
  20. 20. Barabasi AL, Gulbahce N, Loscalzo J (2011) Network medicine: a network-based approach to human disease. Nat Rev Genet 12: 56–68. pmid:21164525
  21. 21. Zhang B, Horvath S (2005) A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 4: Article17.
  22. 22. Kaizer EC, Glaser CL, Chaussabel D, Banchereau J, Pascual V, White PC (2007) Gene expression in peripheral blood mononuclear cells from children with diabetes. J Clin Endocrinol Metab 92: 3705–3711. pmid:17595242
  23. 23. Gautier L, Cope L, Bolstad BM, Irizarry RA (2004) affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20: 307–315. pmid:14960456
  24. 24. Tusher VG, Tibshirani R, Chu G (2001) Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 98: 5116–5121. pmid:11309499
  25. 25. Schwender H (2012) Multiple testing using SAM and Efron's empirical Bayes approaches. R package version 1.42.0.
  26. 26. Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559. pmid:19114008
  27. 27. Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi AL (2002) Hierarchical organization of modularity in metabolic networks. Science 297: 1551–1555. pmid:12202830
  28. 28. Langfelder P, Luo R, Oldham MC, Horvath S (2011) Is my network module preserved and reproducible? PLoS Comput Biol 7: e1001057. pmid:21283776
  29. 29. Glaab E, Baudot A, Krasnogor N, Schneider R, Valencia A (2012) EnrichNet: network-based gene set enrichment analysis. Bioinformatics 28: i451–i457. pmid:22962466
  30. 30. Kamburov A, Stelzl U, Lehrach H, Herwig R (2013) The ConsensusPathDB interaction database: 2013 update. Nucleic Acids Res 41: D793–800. pmid:23143270
  31. 31. Csardi G, Nepusz T. (2006) The igraph software package for complex network research. InterJournal Complex Systems: 1695.
  32. 32. Freeman LC (1978) Centrality in social networks conceptual clarification. Social Networks 1: 215–239.
  33. 33. Hu Z, Chang YC, Wang Y, Huang CL, Liu Y, Tian F, et al. (2013) VisANT 4.0: Integrative network platform to connect genes, drugs, diseases and therapies. Nucleic Acids Res 41: W225–231. pmid:23716640
  34. 34. Langfelder P, Horvath S (2012) Fast R Functions for Robust Correlations and Hierarchical Clustering. J Stat Softw 46.
  35. 35. Holtan SG, Pasquini M, Weisdorf DJ (2014) Acute graft-versus-host disease: a bench-to-bedside update. Blood 124: 363–373. pmid:24914140
  36. 36. Zipris D (2010) Toll-like receptors and type 1 diabetes. Adv Exp Med Biol 654: 585–610. pmid:20217515
  37. 37. Zipris D (2008) Innate immunity and its role in type 1 diabetes. Curr Opin Endocrinol Diabetes Obes 15: 326–331. pmid:18594272
  38. 38. Jarmoskaite I, Russell R (2011) DEAD-box proteins as RNA helicases and chaperones. Wiley Interdiscip Rev RNA 2: 135–152. pmid:21297876
  39. 39. Gustafson EA, Wessel GM (2010) DEAD-box helicases: posttranslational regulation and function. Biochem Biophys Res Commun 395: 1–6. pmid:20206133
  40. 40. Chen YG, Cabrera SM, Jia S, Kaldunski ML, Kramer J, Cheong S, et al. (2014) Molecular signatures differentiate immune states in type 1 diabetic families. Diabetes 63: 3960–3973. pmid:24760139
  41. 41. Reynier F, Pachot A, Paye M, Xu Q, Turrel-Davin F, Petit F, et al. (2010) Specific gene expression signature associated with development of autoimmune type-I diabetes using whole-blood microarray analysis. Genes Immun 11: 269–278. pmid:20090770
  42. 42. Kallionpaa H, Elo LL, Laajala E, Mykkanen J, Ricano-Ponce I, Vaarma M, et al. (2014) Innate immune activity is detected prior to seroconversion in children with HLA-conferred type 1 diabetes susceptibility. Diabetes 63: 2402–2414. pmid:24550192
  43. 43. Solomon S, Xu Y, Wang B, David MD, Schubert P, Kennedy D, et al. (2007) Distinct structural features of caprin-1 mediate its interaction with G3BP-1 and its induction of phosphorylation of eukaryotic translation initiation factor 2alpha, entry to cytoplasmic stress granules, and selective interaction with a subset of mRNAs. Mol Cell Biol 27: 2324–2342. pmid:17210633
  44. 44. Jordan MA (2011) Dissecting the genetics of autoimmune diseases.
  45. 45. Delunardo F, Conti F, Margutti P, Alessandri C, Priori R, Siracusano A, et al. (2006) Identification and characterization of the carboxy-terminal region of Sip-1, a novel autoantigen in Behcet's disease. Arthritis Res Ther 8: R71. pmid:16611372
  46. 46. Fiscella M, Perry JW, Teng B, Bloom M, Zhang C, Leung K, et al. (2003) TIP, a T-cell factor identified using high-throughput screening increases survival in a graft-versus-host disease model. Nat Biotechnol 21: 302–307. pmid:12598909
  47. 47. Chervonsky AV, Wang Y, Wong FS, Visintin I, Flavell RA, Janeway CA Jr., et al. (1997) The role of Fas in autoimmune diabetes. Cell 89: 17–24. pmid:9094710
  48. 48. Kurrer MO, Pakala SV, Hanson HL, Katz JD (1997) Beta cell apoptosis in T cell-mediated autoimmune diabetes. Proc Natl Acad Sci U S A 94: 213–218. pmid:8990188
  49. 49. Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, et al. (2009) ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25: 1091–1093. pmid:19237447
  50. 50. Qazi BS, Tang K, Qazi A (2011) Recent advances in underlying pathologies provide insight into interleukin-8 expression-mediated inflammation and angiogenesis. Int J Inflam 2011: 908468. pmid:22235381
  51. 51. Dakovic D, Colic M, Cakic S, Mileusnic I, Hajdukovic Z, Stamatovic N (2013) Salivary interleukin-8 levels in children suffering from type 1 diabetes mellitus. J Clin Pediatr Dent 37: 377–380. pmid:24046985
  52. 52. Gooz M (2010) ADAM-17: the enzyme that does it all. Crit Rev Biochem Mol Biol 45: 146–169. pmid:20184396
  53. 53. Gohda T, Niewczas MA, Ficociello LH, Walker WH, Skupien J, Rosetti F, et al. (2012) Circulating TNF receptors 1 and 2 predict stage 3 CKD in type 1 diabetes. J Am Soc Nephrol 23: 516–524. pmid:22266664
  54. 54. Niewczas MA, Ficociello LH, Johnson AC, Walker W, Rosolowsky ET, Roshan B, et al. (2009) Serum concentrations of markers of TNFalpha and Fas-mediated pathways and renal function in nonproteinuric patients with type 1 diabetes. Clin J Am Soc Nephrol 4: 62–70. pmid:19073786
  55. 55. Barker RA, Cahn AP (1988) Parkinson's disease: an autoimmune process. Int J Neurosci 43: 1–7. pmid:3215724
  56. 56. Dinkins MB, Dasgupta S, Wang G, Zhu G, He Q, Kong JN, et al. (2015) The 5XFAD Mouse Model of Alzheimer's Disease Exhibits an Age-Dependent Increase in Anti-Ceramide IgG and Exogenous Administration of Ceramide Further Increases Anti-Ceramide Titers and Amyloid Plaque Burden. J Alzheimers Dis 46: 55–61. pmid:25720409
  57. 57. Bjorkqvist M, Wild EJ, Thiele J, Silvestroni A, Andre R, Lahiri N, et al. (2008) A novel pathogenic pathway of immune activation detectable before clinical onset in Huntington's disease. J Exp Med 205: 1869–1877. pmid:18625748
  58. 58. Zhao Y, Krishnamurthy B, Mollah ZU, Kay TW, Thomas HE (2011) NF-kappaB in type 1 diabetes. Inflamm Allergy Drug Targets 10: 208–217. pmid:21495968
  59. 59. Rashid M, Roberts EA (2000) Nonalcoholic steatohepatitis in children. J Pediatr Gastroenterol Nutr 30: 48–53. pmid:10630439
  60. 60. Manton ND, Lipsett J, Moore DJ, Davidson GP, Bourne AJ, Couper RT (2000) Non-alcoholic steatohepatitis in children and adolescents. Med J Aust 173: 476–479. pmid:11149304
  61. 61. Mandrup-Poulsen T, Pickersgill L, Donath MY (2010) Blockade of interleukin 1 in type 1 diabetes mellitus. Nat Rev Endocrinol 6: 158–166. pmid:20173777
  62. 62. Xu S, Zhou F, Tao J, Song L, Ng SC, Wang X, et al. (2014) Exome sequencing identifies DLG1 as a novel gene for potential susceptibility to Crohn's disease in a Chinese family study. PLoS One 9: e99807. pmid:24937328
  63. 63. Macian F (2005) NFAT proteins: key regulators of T-cell development and function. Nat Rev Immunol 5: 472–484. pmid:15928679
  64. 64. Aribi M, Moulessehoul S, Kendouci-Tani M, Benabadji AB, Hichami A, Khan NA (2007) Relationship between interleukin-1beta and lipids in type 1 diabetic patients. Med Sci Monit 13: CR372–378. pmid:17660728
  65. 65. Gao S, Wang X (2009) Predicting Type 1 Diabetes Candidate Genes using Human Protein-Protein Interaction Networks. J Comput Sci Syst Biol 2: 133. pmid:20148193
  66. 66. Brorsson C, Hansen NT, Lage K, Bergholdt R, Brunak S, Pociot F, et al. (2009) Identification of T1D susceptibility genes within the MHC region by combining protein interaction networks and SNP genotyping data. Diabetes Obes Metab 11 Suppl 1: 60–66. pmid:19143816