Measles Contributes to Rheumatoid Arthritis: Evidence from Pathway and Network Analyses of Genome-Wide Association Studies

Growing evidence from epidemiological studies indicates the association between rheumatoid arthritis (RA) and measles. However, the exact mechanism for this association is still unclear now. We consider that the strong association between both diseases may be caused by shared genetic pathways. We performed a pathway analysis of large-scale RA genome-wide association studies (GWAS) dataset with 5,539 cases and 20,169 controls of European descent. Meanwhile, we evaluated our findings using previously identified RA loci, protein-protein interaction network and previous results from pathway analysis of RA and other autoimmune diseases GWAS. We confirmed four pathways including Cytokine-cytokine receptor interaction, Jak-STAT signaling, T cell receptor signaling and Cell adhesion molecules. Meanwhile, we highlighted for the first time the involvement of Measles and Intestinal immune network for IgA production pathways in RA. Our results may explain the strong association between RA and measles, which may be caused by the shared genetic pathway. We believe that our results will be helpful for future genetic studies in RA pathogenesis and may significantly assist in the development of therapeutic strategies.


Introduction
Rheumatoid arthritis (RA) is a chronic autoimmune disease characterized by inflammatory polyarthritis [1]. RA affects approximately 1% of the adult population worldwide [2]. According to the National Institute of Arthritis and Musculoskeletal and Skin Diseases, about 1.3 million adults in the U.S. suffer from RA [3]. RA is a complex disease caused by a combination of genetic susceptibility and environmental factors [4]. The complex genetic architecture of RA makes genetic analysis difficult. Recently, much effort has been devoted to finding common RA variants; especially genome-wide association studies (GWAS) [4,5,6,7,8,9]. However, these known genetic factors just explain 50-60% of the genetic variance for susceptibility to ACPA-positive and 30-50% susceptibility for ACPA-negative RA [10]. Considering the complex genetic architecture, it is apparent that additional risk variants remain to be discovered.
Previous studies reported an increased antibody level to measles virus in RA patients [11]. The following studies confirmed the association between measles virus and RA. Rosenau BJ et al investigated 50 patients with rheumatoid factor (RF)-negative RA. The result showed that 11 of 50 (22%) samples had IgM antibodies to measles virus recombinant nucleoprotein [12]. Recently, Heijstek MW et al examined the persistence of measles antibodies between 400 juvenile idiopathic arthritis (JIA) patients and 2176 healthy controls aged 1-19 years. The results indicated that measles-specific geometric mean antibody concentrations in JIA patients were higher (P,0.001) compared with healthy controls [13].
Thus growing evidence from epidemiological studies indicates the association between RA and measles. However, the exact mechanism for this association is still unclear now. Both RA and measles are complex human diseases. We consider that the strong association between both diseases may be caused by shared genetic pathways. Fortunately, pathway-based method to study existing GWAS datasets has been applied into RA [3,14,15,16,17,18,19,20]. However, no study reported the involvement of measles in RA. We found that all the pathway analyses used the Wellcome Trust Case-Control Consortium (WTCCC) or North American Rheumatoid Arthritis Consortium (NARAC) dataset or both datasets [3,14,15,16,17,18,19,20]. We think that the study power would benefit from examining large-scale GWAS datasets. In this research, we performed a pathway analysis of large-scale RA GWAS dataset with 5,539 cases and 20,169 controls of European descent. Meanwhile, we evaluated our findings by combining existing knowledge including previously identified RA loci, protein-protein interaction network and previous results from pathway analysis of RA and other autoimmune diseases GWAS (Figure 1).

Materials and Methods
The RA GWAS dataset The RA GWAS came from the meta-analysis of RA including 5,539 cases and 20,169 controls of European ancestry [8]. This study comprises six GWAS case-control collections, which included Brigham Rheumatoid Arthritis Sequential Study (BRASS, 483 cases and 1,449 controls), CANADA (589 cased and 1,472 controls), Epidemiological Investigation of Rheumatoid Arthritis (EIRA, 1,173 cased and 1,089 controls), NARAC I (867 cases and 1,041 controls), NARAC III (902 cased and 4,510 controls) and WTCCC (1,525 cased and 10,608 controls). Here, we used the summary results of this study for our analysis. A more detailed description can be obtained from the original publication [8].

RA susceptibility genes identified by previous GWAS
We got 35 SNPs, which were validated or suggestive European RA risk associations, RA risk associations previously validated in East Asian case-control samples, and SNPs known to be highly differentiated across European populations [5,21,22,23,24,25,26,27,28]. Most SNPs have achieved P,5.00E-08 in combined analysis from previous studies [8]. The other SNPs have been validated by replication in independent samples but may not have attained P,5.00E-08 in any single study [5,21,22,23,24,25,26,27,28]. These 35 SNPs correspond to 35 RA susceptibility genes. More detailed information is described in Table S1 from the original study [8]. Considering that these findings were reported before 2010, we further accessed the GWAS Catalog (July 20, 2013) and selected 48 SNPs with P,5.00E-08, which corresponded to 41 susceptibility genes. In the end, we finally selected 57 RA susceptibility genes without duplication.

Gene-based test for RA GWAS by VEGAS
A new gene-based test for GWAS approach, called VEGAS, was used to conduct a gene-based association study [29]. The method incorporates information from all SNPs within a gene and accounts for the gene sizes, SNP density, and the LD between SNPs. In brief, this method first assigns SNPs to each of 17,787 autosomal genes according to the positions on the UCSC Genome Browser hg18 assembly. SNPs within genes (including 650 kb from the 59 and 39 UTR) are selected. Next, for a given gene with SNPs, association P-values are converted to uppertail chi-squared statistics with one degree of freedom (df). The gene-based test statistic is then the sum of all the chi-squared 1 df statistics within that gene. Then, multivariate normal simulations are used to account for the LD structure of SNPs within genes using the HapMap2 (CEU) genotype data [29]. In the first stage, 1000 simulations are performed. If the resulting empirical P value is less than 0.1, 10000 simulations are performed. If the empirical P value from 10000 simulations is less than 0.0001, the program will perform 1000000 simulations. For computational reasons, if the empirical P value is 0, then no more simulations will be performed. An empirical P value of 0 from 1000000 simulations can be interpreted as P,10E-06, which exceeds a Bonferroni-corrected

Identifying RA risk pathways by hypergeometric test
We investigated the enrichment of RA genes identified by VEGAS and RA susceptibility genes identified by previous GWAS. For a given pathway in KEGG, hypergeometric test in Genecodis was used to detect an overrepresentation of the disease related genes among all the genes in the pathway [31]. The Pvalue of observing K disease related genes in the pathway can be calculated by where N is the total number of genes that are of interest, S is the number of all disease related genes and m is the number of genes in the pathway, K is the number of disease related genes in the pathway. GeneCodis (version 3) is a valuable tool to functionally interpret results from experimental techniques in genomics [32]. This web tool integrates different sources of information to finding groups of genes with similar biological meaning [32]. Effort has been made to remove noisy and redundant output from the enrichment results with the inclusion of a recently reported algorithm that summarizes significantly enriched terms and generates functionally coherent modules of genes and terms. A new comparative analysis has been added to allow the differential analysis of gene sets. New sources of biological information have been included, such as genetic diseases, drugsgenes interactions and Pubmed information among others [32].

Comparison with previous pathway analyses of RA and other autoimmune disease
We compared our original findings with that of previous pathway analyses of RA GWAS [3,14,15,16,17,18,19,20]. All studies used the WTCCC or NARAC dataset or both datasets. All the pathway analysis results are publicly available from the original studies or the corresponding supplementary materials [3,14,15,16,17,18,19,20]. We also compared our original findings with that of previous pathway analyses of other autoimmune diseases including Crohn's disease (CD), celiac disease (CeID), type 1 diabetes (TID) and multiple sclerosis (MS). All the pathway analysis results are publicly available from the original studies or the corresponding supplementary materials [15,20,33,34,35,36].

Protein-protein interaction network analysis by STRING
We investigated the potential interactions between the proteins encoded by RA susceptibility genes using Search Tool for the Retrieval of Interacting Genes (STRING) (version 9.0). STRING is a database of known and predicted protein interactions, including direct (physical) and indirect (functional) associations, which are derived from four sources including genomic context, high-throughput experiments, coexpression and previous knowledge [37]. STRING provides uniquely comprehensive coverage and ease of access to both experimental as well as predicted interaction information. Interactions in STRING are provided with a confidence score, and accessory information such as protein domains and 3D structures is made available, all within a stable and consistent identifier space. New features in STRING include an interactive network viewer that can cluster networks on demand, updated on-screen previews of structural information including homology models, extensive data updates and strongly improved connectivity and integration with third-party resources. Version 9.0 of STRING covers more than 1100 completely sequenced organisms [37]. Recent publication describes the update to version 9.1 of STRING, introducing several improvements including the automated mining of scientific texts for interaction information, to now also include full-text articles, redesigned the algorithm for transferring interactions from one model organism to the other; and statistical information on any functional enrichment observed in networks [38].

Pathway analysis of RA GWAS
After removal of the Major histocompatibility complex (MHC) region, we got 394 RA genes with adjusted P,0.01 by VEGAS. We then conducted a KEGG pathway analysis of these 394 significant RA genes. 152 KEGG pathways, which included at least one of 394 RA genes, were available for analysis. We identified 12 significant pathways with P, = 1.00E-03 after False discovery rate (FDR) multiple testing corrections (The threshold indicating significant results is 0.05), among which Measles (hsa05162) was the most significant signal (P = 1.57E-08) ( Table 1). All detailed results are described in Table S1.

Pathway of RA susceptibility genes
We conducted a pathway analysis of 35 RA susceptibility genes identified by previous GWAS in European descent. 45 KEGG pathways, which included at least one of 35 RA genes, were available for analysis. After FDR multiple testing corrections (The threshold indicating significant results is 0.05), we identified 10 significant pathways (P, = 0.001), which contained at least three RA susceptibility genes. Interestingly, the results supported our previous findings. 6 pathways identified by pathway-based test of RA GWAS were replicated. These pathways included T cell receptor signaling (hsa04660), Cytokine-cytokine receptor interaction (hsa04060), Measles (hsa05162), Jak-STAT signaling (hsa04630), Cell adhesion molecules (hsa04514) and Intestinal immune network for IgA production (hsa04672), among which Cytokine-cytokine receptor interaction was the most significant pathway (P = 1.90E-10) and Measles was the second significant pathway (P = 1.35E-09) ( Table 2).

Comparison with previous pathway analyses of RA and other autoimmune diseases
After comparison with previous pathway analyses of RA GWAS, we confirmed our original findings. Four pathways including Cytokine-cytokine receptor interaction (hsa04060), Jak-STAT signaling pathway (hsa04630), T cell receptor signaling (hsa04660) and Cell adhesion molecules (hsa04514) were identified by at least one pathway analysis. Among the pathway analysis methods, cumulative trend test adjusts for gene length by permutations and LD based gene block method [15,18]. Here, we list the pathways, methods, the corresponding pathway Pvalues and references (Table 3). For more detailed information, please refer to the original studies [3,14,15,16,17,18,19,20].

Network analysis of RA susceptibility genes and RA GWAS
Based on the results above, we confirmed four RA risk pathways and highlighted for the first time the involvement of Measles and Intestinal immune network for IgA production pathways in RA. In order to further verify these findings, we conducted a proteinprotein interaction network analysis of 35 RA susceptibility genes. Interestingly, we found significant connectivity among proteins encoded by RA susceptibility genes according to protein-protein interaction network in STRING (9.0) using default settings (observed interaction, 103; expected interaction, 1.88; P,1.00E-10) (Figure 2). Using the 394 RA susceptibility genes from RA GWAS, we further found significant connectivity among proteins encoded by these genes (observed interaction, 471; expected interaction, 4.36; P,1.00E-10) (Figure 3). Meanwhile, the results showed interaction between the Measles and Intestinal immune network for IgA production pathways and another four RA risk pathways.

Verification by PubMed and Google Scholar literature search
In order to verify our findings, we searched the PubMed and Google Scholar databases. The results indicated that genes involved in the measles pathway were associated with RA. The evidence from literature search was described in Table 4.
Recently, Heijstek et al. investigated the effects of live attenuated measles-mumps-rubella (MMR) vaccination on disease activity in juvenile idiopathic arthritis (JIA) patients [39]. 137 patients with JIA aged 4 to 9 years were available for analysis. Patients were randomly assigned to receive MMR booster vaccination (n = 68) or no vaccination (control group; n = 69). At 12 months, seroprotection rates were higher in revaccinated patients vs. controls, as were antibody concentrations against measles, mumps and rubella [39].

Discussion
Here, considering the association between RA and measles reported by previous epidemiological studies, we investigated the shared genetic pathways by integrating GWAS with pathway and protein-protein interaction network. In the end, we confirmed four RA risk pathways reported by previous pathway analyses, including Cytokine-cytokine receptor interaction pathway, Jak-STAT signaling pathway, T cell receptor signaling pathway and Cell adhesion molecules pathway. Until now, targeting two of these four pathways is also a reality. For the Cytokine-cytokine receptor interaction pathway, a broad range of cytokines are active in the joints of RA patients [40]. Anti-cytokine therapy of RA have been proposed, among which anti-TNF therapy of RA has proved effective [41]. Meanwhile, anti-other cytokines can also be beneficial, such as many interleukin (IL) family genes including IL-1, IL-6, IL-23, and IL-2 families [40]. For the Jak-STAT signaling pathway, many cytokines exert their effect by the Jak-STAT signal transduction [42]. Functional inhibition of JAK3 is sufficient for efficacy in collagen-induced arthritis in mice [43]. Targeting JAKs is also a reality in RA, such as tofacitinib targeting JAK1, JAK2 and JAK3 to treat RA (Phase III), VX-509 and R-348 targeting JAK3 to treat RA (Phase II) [44]. Recently, two studies examine the effects of JAK inhibitor tofacitinib in RA to elucidate the role of JAK in disease process [45,46]. The results show that tofacitinib regulates RA synovitis through inhibition of interferon-c and IL-17 production [45]. JAK inhibition with tofacitinib also suppresses arthritic joint structural damage through decreased receptor activator of NF-kB ligand (RANKL) production [46].
For the T cell receptor signaling pathway, evidence is emerging that altered T cell receptor signaling thresholds could contribute to human autoimmune arthritis, including RA and the spondyloarthropathies (SpA) [47]. T cell activation has been identified in RA pathogenesis by RA GWAS [48]. It is also reported that defective activation of T cell receptor-proximal signaling proteins, such as small GTPase Rap1, contribute to the pathologic behavior of RA synovial T cells [49]. Evidence indicated that maintenance of T cell Rap1 signaling in murine T cells could reduce disease incidence and severity in collagen-induced arthritis [49]. T cell receptor signal strength also controls arthritis severity in proteoglycan-specific TCR transgenic mice [50]. For the Cell adhesion molecules pathway, the elevated production of cell adhesion molecules is crucial to the pathogenesis of RA [51]. Some cell adhesion molecules, such as E-selectin and intercellular adhesion molecule-1 (ICAM-1) are modulated in RA patients who respond clinically to drug treatment [52]. Recently, Klimiuk PA et al. analyzed serum concentrations of soluble intercellular adhesion molecule-1 (sICAM-1), vascular cell adhesion molecule-1 (sVCAM-1), and E-selectin (sE-selectin) in patients with early rheumatoid arthritis (RA) before and after 6 months of treatment with methotrexate (MTX). The results indicated that patients with early RA were characterized by high serum concentrations of sICAM-1, sVCAM-1, and sE-selectin. Therapy with MTX resulted in clinical improvement and diminished serum levels of soluble cell adhesion molecules in the RA patients [53].
Growing evidence from epidemiological studies indicates the association between RA and measles. An increased antibody level to measles virus was observed in JIA patients vs. healthy controls [11,13]. Recently, Heijstek et al. investigated the effects of live attenuated measles-mumps-rubella (MMR) vaccination on disease activity in JIA patients [39]. 137 patients with JIA aged 4 to 9 years were available for analysis. Patients were randomly assigned to receive MMR booster vaccination (n = 68) or no vaccination (control group; n = 69). At 12 months, seroprotection rates were higher in revaccinated patients vs. controls, as were antibody concentrations against measles, mumps and rubella [39].
We highlighted for the first time the involvement of Measles in RA by pathway analysis method. We further confirmed these findings by previously identified RA loci, protein-protein interaction network. Our results may explain the strong association between RA and measles, which may be caused by shared genetic pathway. We believe that our results will be helpful for future genetic studies in RA pathogenesis and may significantly assist in the development of therapeutic strategies by targeting the Measles pathway and reducing antibody level to measles virus in RA. Future replication studies using animal models are required to replicate our findings before the findings can be of clinical use.
Here, we analyzed 25,708 individuals from European population. The study power would benefit from examining this largescale RA GWAS dataset, which was originally analyzed by Stahl EA et al. [8]. The power was calculated based on the odds ratios under a multiplicative genotypic relative risk (GRR) model at four thresholds for significant SNPs with a~0:01, a~10 {4 , a~10 {6 and a~5|10 {8 . The results indicated that this study had a great power to detect genetic associations at a~0:01. For heterozygote GRR of 1.1, significant SNPs with a~0:01 and minor allele frequency (MAF) = 0.2 had a power 0.8. For heterozygote GRR of 1.2, significant SNPs with a~0:01 and minor allele frequency (MAF) = 0.1 had a power 1. More detailed information was described in the supplementary materials of the original study [8]. These results were consistent with recent findings. Recently, Hunt et al. genotyped 25 GWAS risk genes in 41,911 UK residents of white European origin (24,892 cases with six autoimmune diseases and 17,019 controls) [54]. The results showed that the missing heritability for common autoimmune diseases may be a result of many common-variant loci of weak effect [54]. Meanwhile, in order to reduce the sources of bias in pathway analysis, we adjusted gene length and LD patterns in human genome using VEGAS, which increased the reliability of our results. Our findings indicate that integration of GWAS dataset with pathway and protein-protein interaction network can uncover novel RA risk pathways. We think that this strategy may be applied into other phenotypes or diseases.
In our research, we selected hypergeometric test for pathway analysis, which is a commonly used competitive test that is incorporated in a number of bioinformatics tools as described in the review paper about the genome-wide pathway analysis to unravel the etiology of complex diseases [55]. In Table 3, pathway analysis software ClueGO and BINGO were also used hypergeometric test. Jia et al. suggested that for gene sets consisting of markers highly associated (P, = 1.00E-03) with disease, among gene set enrichment analysis (GSEA), hypergeometric test and SNP ratio test (SRT), the hypergeometric test performed best with the highest power [56].
Our study also has some limitations. First, except the human studies, there was no any animal study, which investigated the measles related genes and RA before. Second, we identified the involvement of Measles in RA by pathway analysis of RA GWAS. Until now, no evidence indicates that Measles pathway is significantly enriched for genetic association to Measles by pathway analysis of Measles GWAS. Considering these limitations, we will replicate this pathway by pathway analysis of Measles GWAS in future work, when Measles GWAS is available for us.

Gene
Supporting evidence Ref

TNFAIP3
In conclusion, we have demonstrated an increase in TNFAIP3 expression in PBMCs from patients with RA compared with healthy controls [57] TNFAIP3 Together, these observations indicate a critical and cell-specific function for TNFAIP3 (A20) in the etiology of rheumatoid arthritis, supporting the idea of developing A20 modulatory drugs as cell-targeted therapies. [58]

IL2
Our results replicate and firmly establish the association of STAT4 and CTLA4 with RA and provide highly suggestive evidence for IL2/IL21 loci as a risk factor for RA. [59]

IL2
The KIAA1109-TENR-IL2-IL21 gene cluster, that encodes an interleukin (IL-21) that plays an important role in Th17 cell biology, is the 20th locus for which there is a genome-wide (P,or = 5610 (28)) level of support for association with RA. [60]

IL2RA
The present genetic and serologic data suggest that inherited altered genetic constitution at the IL2RA locus may predispose to a less destructive course of RA. [61]

CD28
Modulation of CD28 expression with anti-tumor necrosis factor alpha therapy in rheumatoid arthritis [62] STAT1 Activation of the STAT1 pathway in rheumatoid arthritis [63] Tyk2 Our results demonstrate a critical contribution of Tyk2 in the development of arthritis, and we propose that Tyk2 might be an important candidate for drug development. [64] IKBKE Combination therapy with low dose IFNb and an IKBKE inhibitor might improve efficacy of either agent alone and offers a novel approach to RA. [65] doi:10.1371/journal.pone.0075951.t004