The NOD2 Single Nucleotide Polymorphisms rs2066843 and rs2076756 Are Novel and Common Crohn's Disease Susceptibility Gene Variants

Background
The aims were to analyze two novel NOD2 variants (rs2066843 and rs2076756) in a large cohort of patients with inflammatory bowel disease and to elucidate their phenotypic consequences.

Methodology/Principal Findings
Genomic DNA from 2700 Caucasians including 812 patients with Crohn's disease (CD), 442 patients with ulcerative colitis (UC), and 1446 healthy controls was analyzed for the NOD2 SNPs rs2066843 and rs2076756 and the three main CD-associated NOD2 variants p.Arg702Trp (rs2066844), p.Gly908Arg (rs2066845), and p.Leu1007fsX1008 (rs2066847). Haplotype and genotype-phenotype analyses were performed. The SNPs rs2066843 (p = 3.01×10⁻⁵; OR 1.48, 95% CI 1.23-1.78) and rs2076756 (p = 4.01×10⁻⁶; OR 1.54, 95% CI 1.28-1.86) were significantly associated with susceptibility to CD but not to UC. Haplotype analysis revealed a number of significant associations with CD susceptibility with omnibus p values <10⁻¹⁰. The SNPs rs2066843 and rs2076756 were in linkage disequilibrium with each other and with the three main CD-associated NOD2 mutations (D'>0.9). However, in CD, the SNPs rs2066843 and rs2076756 were observed more frequently than the other three common NOD2 mutations (minor allele frequencies for rs2066843 and rs2076756: 0.390 and 0.380, respectively). In CD patients homozygous for these novel NOD2 variants, genotype-phenotype analysis revealed higher rates of a penetrating phenotype (rs2076756: p = 0.015) and fistulas (rs2076756: p = 0.015) and significant associations with CD-related surgery (rs2076756: p = 0.003; rs2066843: p = 0.015). However, in multivariate analysis only disease localization (p<2×10⁻¹⁶) and behaviour (p = 0.02) were significantly associated with the need for surgery.

Conclusion/Significance
The NOD2 variants rs2066843 and rs2076756 are novel and common CD susceptibility gene variants.


Introduction
Crohn's disease (CD) and ulcerative colitis (UC) are chronic inflammatory bowel diseases (IBD) characterized by an exaggerated immune response of the intestinal mucosa and a dysfunctional epithelial barrier [1,2,3,4]. The identification of nucleotide-binding oligomerization domain 2 (NOD2, GeneID: 64127), also known as caspase recruitment domain-containing protein 15 (CARD15), as the first susceptibility gene in CD in 2001 [5,6] has provided significant new insights into the pathogenesis of IBD, focusing on the genetic background of the innate immune response and its interaction with bacterial antigens [7,8,9,10]. Most recently, genome-wide association studies and subsequent replication studies have provided further insights into IBD pathogenesis by identifying and confirming susceptibility genes such as the interleukin-23 receptor (IL23R) gene [11,12], SLC22A4/5 [13] and the autophagy-related 16-like 1 (ATG16L1) gene [14].
NOD2 is a cytoplasmic protein that functions mainly as an NF-κB pathway-activating sensor for bacterial muramyl dipeptide (MDP), which is found in the cell wall of Gram-positive and Gram-negative bacteria [8,15,16,17]. In addition, NOD2 seems to be a negative regulator of Toll-like receptor 2-mediated T helper cell type 1 responses and modulates the ileal expression of antimicrobial peptides such as alpha-defensins and the expression of proinflammatory cytokines and chemokines in the intestinal mucosa [18,19,20]. The identification of NOD2 as a susceptibility gene for CD therefore suggests an important role of genetically determined enteric bacteria-host interactions and an inappropriate activation of the mucosal immune system in IBD. Large genotype-phenotype analyses by us [21,22] and others [23,24,25] also demonstrated a significant association of NOD2 variants with ileal involvement, a stricturing phenotype and early disease onset in CD patients.
The NOD2 gene is located on chromosome 16q in the IBD1 locus and contains 11 constant exons and a twelfth alternatively spliced exon in the 5′-region. So far, three main NOD2 variants have been identified as overrepresented in CD patients: two amino acid substitutions, p.Arg702Trp encoded by exon 4 and p.Gly908Arg encoded by exon 8, and the frameshift mutation p.Leu1007fsX1008 located in exon 11. There is also evidence that further NOD2 variants are involved in IBD pathogenesis, as demonstrated by us [26] and others in recent association studies [18,19,20]. However, the extent of disease modification and the phenotypic consequences of other NOD2 variants such as the SNPs rs2066843 and rs2076756 have not been investigated so far.
We therefore genotyped 2700 individuals of Caucasian origin and performed a large and detailed genotype-phenotype analysis for the NOD2 variants rs2066843 and rs2076756 analyzing the influence of these variants on the disease susceptibility and phenotype of patients with CD and UC.

Study population
The study population (n = 2700) comprised 1254 IBD patients of Caucasian origin (812 patients with CD and 442 patients with UC) and 1446 healthy, unrelated controls. The participants were recruited in two cohorts: the discovery sample was recruited from the University Hospital Munich-Grosshadern and comprised 519 CD patients, 232 UC patients and 770 controls, while the replication cohort, recruited from the University Hospitals Bochum and Munich-Innenstadt, consisted of 293 CD patients, 210 UC patients and 676 controls. Patients with indeterminate colitis were excluded from the study. All individuals gave written, informed consent prior to the study. The study was approved by the local Ethics committee and adhered to the ethical principles for medical research involving human subjects of the Helsinki Declaration. Phenotypic parameters were collected blind to the results of the genotype analysis and included demographic and clinical data (behaviour and anatomic location of IBD, disease-related complications, surgical or immunosuppressive therapy). Two senior gastroenterologists analyzed the data, which were recorded by patient chart analysis and a detailed questionnaire based on an interview at the time of enrolment. For the analysis of demographic and phenotypic data, the diagnosis of CD and UC was based on established international guidelines using endoscopic, radiological, and histopathological parameters [27]. CD patients were classified according to the Montreal classification [28] including age at diagnosis (A), location (L), and behaviour (B) of disease. In patients with UC, anatomic location was also assessed in accordance with the Montreal classification based on the criteria ulcerative proctitis (E1), left-sided UC (distal UC; E2), and extensive UC (pancolitis; E3). The clinical characteristics of the IBD study population are summarized in Table 1.

DNA extraction and NOD2 genotyping
Genomic DNA was isolated from peripheral blood leukocytes by standard procedures using the DNA blood mini kit from Qiagen (Hilden, Germany). Genotyping of the NOD2 variants p.Arg702Trp (rs2066844), p.Gly908Arg (rs2066845), and p.Leu1007fsX1008 (rs2066847) was performed as described previously [26] (primer and probe sequences are available on request). The NOD2 SNPs rs2066843 and rs2076756 were genotyped by PCR and melting curve analysis using a pair of fluorescence resonance energy transfer (FRET) probes, a sensor and an anchor probe, in a LightCycler 480 Instrument (Roche Diagnostics, Mannheim, Germany). The donor fluorescent molecule (fluorescein) at the 3′-end of the sensor probe (rs2066843) or of the anchor probe (rs2076756) is excited at its specific fluorescence excitation wavelength (533 nm), and the energy is transferred to the acceptor fluorescent molecule at the 5′-end of the anchor probe (rs2066843; LightCycler Red 640) or of the sensor probe (rs2076756; LightCycler Red 670). The specific fluorescence signal emitted by the acceptor molecule is detected by the optical unit of the LightCycler 480 Instrument. The sensor probe exactly matches one allele of each SNP, whereas the other allele produces a mismatch that results in a lower melting temperature. The total volume of the PCR was 5 µl containing 25 ng of genomic DNA, 1× LightCycler 480 Genotyping Master (Roche Diagnostics), 2.5 pmol of each primer and 0.75 pmol of each FRET probe (TIB MOLBIOL, Berlin, Germany). The PCR comprised an initial denaturation step (95 °C for 10 min) and 45 cycles (95 °C for 10 sec, primer annealing temperature as given in Supplemental Table S1 for 10 sec, 72 °C for 15 sec).
The melting curve analysis comprised an initial denaturation step (95 °C for 1 min), a step rapidly lowering the temperature to 40 °C and holding for 60 sec, and a heating step slowly (1 acquisition/°C) increasing the temperature up to 95 °C while continuously measuring the fluorescence intensity. The results of the melting curve analysis were confirmed by sequence analysis of samples representing all possible genotypes. For sequencing, the total volume of the PCR was 100 µl containing 250 ng of genomic DNA, 1× PCR buffer (Qiagen, Hilden, Germany), a final MgCl2 concentration of 1.5 mM, 0.5 mM of a dNTP mix (Sigma, Steinheim, Germany), 2.5 units of HotStar Plus Taq DNA polymerase (Qiagen) and 10 pmol of each primer (TIB MOLBIOL, Berlin, Germany). The PCR used for sequencing comprised an initial denaturation step (95 °C for 5 min), 35 cycles (denaturation at 94 °C for 30 sec, primer annealing at 60 °C for 30 sec, extension at 72 °C for 30 sec) and a final extension step (72 °C for 10 min). The PCR products were purified using the QIAquick PCR Purification Kit (Qiagen) and sequenced by a commercial sequencing company (Sequiserve, Vaterstetten, Germany). All sequences of primers and FRET probes and the primer annealing temperatures used for genotyping and for sequence analysis are given in the supplementary data section (Supplemental Tables S1 and S2). The results of the genotyping for the CD-associated ATG16L1 variant rs2241880 (p.Thr300Ala) were available from a previous study [29].
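The principle behind allele discrimination by melting curve analysis can be illustrated with a minimal sketch. It is not the instrument software's algorithm: it assumes an idealized sigmoid melting curve, hypothetical melting temperatures (64.5 °C for the perfectly matched allele, 58.5 °C for the mismatched allele), and a hypothetical helper `melting_peak` that locates the maximum of the negative derivative of fluorescence with respect to temperature (-dF/dT) at one acquisition per °C.

```python
import math


def melting_peak(temps, fluorescence):
    """Return the temperature at which -dF/dT is largest (the melting peak)."""
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    best_temp, best_slope = None, float("-inf")
    for i in range(len(temps) - 1):
        dt = temps[i + 1] - temps[i]
        # Negative finite-difference slope between two acquisitions.
        slope = -(fluorescence[i + 1] - fluorescence[i]) / dt
        if slope > best_slope:
            best_slope = slope
            # Report the midpoint of the interval with the steepest drop.
            best_temp = (temps[i] + temps[i + 1]) / 2.0
    return best_temp


def synthetic_melt_curve(tm, temps, width=1.5):
    """Idealized sigmoid fluorescence curve (assumption, not measured data)."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


# Heating from 40 to 95 degrees C with 1 acquisition per degree.
temps = list(range(40, 96))
matched = synthetic_melt_curve(64.5, temps)     # hypothetical matched allele
mismatched = synthetic_melt_curve(58.5, temps)  # hypothetical mismatched allele

peak_matched = melting_peak(temps, matched)
peak_mismatched = melting_peak(temps, mismatched)
```

In this sketch the mismatched allele yields the lower melting peak, which is the property used to tell the two alleles apart; a heterozygous sample would show both peaks.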

Statistical analyses
Each genetic marker was tested for Hardy-Weinberg equilibrium in the control population. Fisher's exact test or the χ² test was used for comparisons between categorical variables where appropriate. Single-marker allelic tests were performed with Pearson's χ² test. Student's t-test was applied for quantitative variables. All tests were two-tailed and p-values <0.05 were considered significant. Odds ratios were calculated for the minor allele at each SNP. For evaluation of phenotypic consequences, we conducted a logistic regression analysis. Data were evaluated by using the SPSS 13.0 software (SPSS Inc., Chicago, IL, U.S.A.) and R 2.4.1 (http://cran.r-project.org). Haplotype analysis and calculation of linkage disequilibrium (LD) were conducted using PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/).
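The single-marker tests described above can be sketched in a few lines. This is an illustration under stated assumptions, not the SPSS, R or PLINK implementation: the function names are hypothetical, the Hardy-Weinberg check uses a 1-degree-of-freedom χ² goodness-of-fit test, the allelic test is Pearson's χ² on a 2×2 table of allele counts without continuity correction, and the confidence interval uses the Woolf (log odds ratio) approximation, which the text does not specify. All counts in the example are invented.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele a
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, chi2_sf_1df(chi2)


def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square on allele counts plus odds ratio for the minor allele."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    # Woolf approximation for the standard error of log(OR) (assumption).
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se)
    return {
        "chi2": chi2,
        "p": chi2_sf_1df(chi2),
        "or": odds_ratio,
        "ci": (ci_low, ci_high),
    }


# Invented genotype counts that sit exactly in Hardy-Weinberg proportions.
hwe_stat, hwe_p = hwe_chi2(25, 50, 25)

# Invented allele counts: 40/60 minor/major in cases, 30/70 in controls.
result = allelic_test(40, 60, 30, 70)
```

Each individual contributes two alleles, so the allele counts passed to `allelic_test` would be twice the number of genotyped individuals per group.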

Results
The novel NOD2 variants rs2066843 and rs2076756 are associated with susceptibility to CD but not to UC
In the controls, the genotype frequencies of the SNPs rs2066843 and rs2076756 were in agreement with the predicted Hardy-Weinberg equilibrium. Significant differences in the allele frequencies of the SNPs rs2066843 and rs2076756 were observed in CD patients but not in UC patients compared to healthy controls. As summarized in Table 2, the SNPs rs2066843 and rs2076756 were strongly associated with CD. In the group of CD patients, the frequency of the rarer T allele of the rs2066843 variant was 0.390, whereas in the controls it was 0.299 (p = 3.01×10⁻⁵). The association of the NOD2 SNPs with CD susceptibility was found in both our initial discovery cohort from the University Hospital Munich-Grosshadern (Supplemental Table S3) and the replication cohort from the University Hospital Munich, Campus Innenstadt and from Ruhr-University Bochum (Supplemental Table S4). In this study, rs2066843 and rs2076756 were in LD in all studied subpopulations (CD, UC, healthy controls; Supplemental Tables S5, S6, S7).
To analyze potential disease associations with certain NOD2 haplotypes, we performed a detailed haplotype analysis (Table 3 and Supplemental Table S8). As shown in Table 3, a number of haplotypes were significantly associated with CD susceptibility, including several with omnibus p values of less than 10⁻¹⁰. The strongest association of a haplotype including one of the SNPs rs2066843 or rs2076756 was found for the haplotype comprising the NOD2 SNPs rs2066844-rs2066845-rs2066847-rs2076756 (omnibus p-value = 1.14×10⁻²³). In contrast, no significant associations were found with UC susceptibility (Supplemental Table S8).
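The LD measure D' reported for these SNPs can be made concrete with a short sketch. It assumes that the four two-locus haplotype frequencies are already known; software such as PLINK has to estimate them from genotype data, a step this example omits. The function name and the frequencies are hypothetical.

```python
def ld_measures(f11, f12, f21, f22):
    """Return (D, D', r^2) for two biallelic SNPs.

    f11..f22 are haplotype frequencies; the first index is the allele at
    SNP 1 and the second the allele at SNP 2 (1 = minor, 2 = major).
    """
    total = f11 + f12 + f21 + f22
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p1 = f11 + f12  # minor allele frequency at SNP 1
    q1 = f11 + f21  # minor allele frequency at SNP 2
    d = f11 - p1 * q1
    # D is scaled by its largest possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p1 * (1.0 - q1), (1.0 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1.0 - p1) * (1.0 - q1))
    d_prime = 0.0 if d_max == 0 else abs(d) / d_max
    denom = p1 * (1.0 - p1) * q1 * (1.0 - q1)
    r2 = 0.0 if denom == 0 else d * d / denom
    return d, d_prime, r2


# Invented frequencies in which the SNP 1 minor allele only ever occurs
# together with the SNP 2 minor allele (one haplotype is absent).
d, d_prime, r2 = ld_measures(0.30, 0.00, 0.10, 0.60)
```

With one of the four haplotypes absent, D' reaches 1 even though r² stays below 1, which is why a high D' between a common SNP and a rarer mutation does not imply that the two carry the same information.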

Genotype-phenotype analysis of rs2066843 and rs2076756 NOD2 variants in CD patients
Since the three common NOD2 variants p.Arg702Trp (rs2066844), p.Gly908Arg (rs2066845), and p.Leu1007fsX1008 (rs2066847) have been shown to be associated with a certain CD phenotype [21,22,23,24,30], we also performed a detailed genotype-phenotype correlation in IBD patients. In univariate analysis, CD patients homozygous for the SNP rs2066843 were found to have a lower body mass index (p = 0.039) and less colonic involvement (p = 0.041) compared to the wildtype patients, and we also observed a trend towards a predominantly penetrating disease phenotype (B3) (p = 0.068; Table 4). In addition, a higher need for CD-related surgery was observed in homozygous carriers of the SNP rs2066843 (p = 0.015) compared to the wildtype group (Table 5).
When CD patients were analyzed by rs2076756 genotype status, a significantly younger age at disease onset was observed in homozygous carriers (mean 25.8 ± 11.4 years) compared to wildtype patients (p = 0.023; Table 6). Similar to the analysis of SNP rs2066843, homozygous carriers of SNP rs2076756 had less colonic involvement than wildtype patients (p = 0.032) but showed a trend towards ileocolonic disease location (p = 0.058) and had a significantly higher rate of penetrating disease phenotype (B3) (p = 0.015; Table 6). The significant association of SNP rs2076756 with a severe disease phenotype was also reflected by the significantly higher percentage of patients with CD-related surgery (p = 0.003) and internal fistulas (p = 0.015) among homozygous carriers of this SNP (Table 7). Moreover, there was also a trend towards stenotic complications (p = 0.067) in homozygous CD patients (Table 7). However, given the large number of associations investigated, most associations lost significance following Bonferroni correction.
Given the increased prevalence of the three common NOD2 variants among carriers of the risk allele of rs2066843 and rs2076756 and previous reports demonstrating a severe phenotype in carriers of these three NOD2 variants, we next investigated whether the phenotypic effects of rs2066843 and rs2076756 were independent of the three main CD-associated NOD2 variants p.Arg702Trp (rs2066844), p.Gly908Arg (rs2066845), and p.Leu1007fsX1008 (rs2066847). As shown in Table 8, when homozygous carriers of these SNPs were stratified for the presence or absence of the three main CD-associated NOD2 variants, there were no significant differences in the phenotypic characteristics found to be significant in Tables 5, 6 and 7, suggesting that the CD-modifying effect of rs2066843 and rs2076756 is independent of the three main NOD2 variants.
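The effect of a Bonferroni correction on the nominal p-values reported above for rs2076756 can be sketched as follows. The number of tests is not stated in the text, so the value `n_tests = 10` is purely hypothetical and chosen only for illustration; the function name is likewise an assumption.

```python
def bonferroni(p_values, n_tests):
    """Multiply each p-value by the number of tests, capped at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


# Nominal p-values for rs2076756 as reported in the text.
nominal = {
    "age at onset": 0.023,
    "colonic involvement": 0.032,
    "ileocolonic location": 0.058,
    "penetrating phenotype (B3)": 0.015,
    "CD-related surgery": 0.003,
    "internal fistulas": 0.015,
    "stenotic complications": 0.067,
}

n_tests = 10  # hypothetical; the actual number of comparisons is not given
adjusted = dict(zip(nominal, bonferroni(list(nominal.values()), n_tests)))
still_significant = [name for name, p in adjusted.items() if p < 0.05]
```

Under this assumed number of tests only the association with CD-related surgery would remain below 0.05; with a larger number of comparisons even that association would not survive.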
To analyze potential therapeutic consequences such as the need for surgery, we next conducted a logistic regression analysis with R, using the need for surgery as the dependent variable and the SNP genotype as the independent variable, with localization and behaviour as covariates. This revealed that disease localization has a significant influence on the need for surgery (p = 0.02). In addition, disease behaviour is significantly associated with the need for surgery (p = 2.0×10⁻¹⁶, independently of localization). However, adding the NOD2 genotype status as a further explanatory variable does not improve the model fit (F-test: p = 0.36 for rs2066843, p = 0.32 for rs2076756).
In UC patients, the analysis revealed no significant associations of the SNPs rs2066843 and rs2076756 with phenotypic characteristics such as age, age at diagnosis, male-to-female ratio, body mass index (BMI), family history, anatomic location and disease behaviour, use of immunosuppressive agents, or UC-related complications (data not shown).
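The idea of asking whether genotype improves the fit of a logistic model can be illustrated with a deliberately simplified sketch. It differs from the analysis described above in two labeled ways: it omits the covariates localization and behaviour, and it uses a likelihood-ratio test rather than the F-test reported. For a logistic model with an intercept and a single binary predictor, the maximum-likelihood fit reproduces the observed proportion in each group, so both log-likelihoods have a closed form. All counts and function names are hypothetical.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_loglik(events, total):
    """Maximized Bernoulli log-likelihood for one group (fitted p = events/total)."""
    p = events / total
    return _xlogy(events, p) + _xlogy(total - events, 1.0 - p)


def genotype_lr_test(events_ref, n_ref, events_hom, n_hom):
    """Likelihood-ratio test: intercept-only model vs. intercept + genotype."""
    # Full model: separate surgery probability for each genotype group.
    ll_full = binomial_loglik(events_ref, n_ref) + binomial_loglik(events_hom, n_hom)
    # Reduced model: one common surgery probability.
    ll_reduced = binomial_loglik(events_ref + events_hom, n_ref + n_hom)
    lr = max(0.0, 2.0 * (ll_full - ll_reduced))
    # Chi-square upper tail with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


# Invented counts: surgeries / patients among wildtype and homozygous carriers.
lr_null, p_null = genotype_lr_test(30, 60, 20, 40)  # identical proportions
lr_diff, p_diff = genotype_lr_test(10, 50, 30, 50)  # clearly different proportions
```

In a model that also contains localization and behaviour, the same comparison is made between the model with and without the genotype term; a genotype that is significant on its own, as in univariate analysis, can then lose its contribution if the covariates already explain the outcome.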
No evidence for epistasis between NOD2 variants and the CD-associated ATG16L1 variant rs2241880 (p.Thr300Ala)
Very recent studies indicate that NOD2 recruits the autophagy protein ATG16L1 to the plasma membrane at the bacterial entry site [31]. In contrast, CD-associated mutants failed to recruit ATG16L1 to the plasma membrane and wrapping of invading bacteria by autophagosomes was impaired [31]. Moreover, dendritic cells from CD patients expressing CD-associated NOD2 or ATG16L1 risk variants are defective in autophagy induction, bacterial trafficking and antigen presentation [32]. We therefore hypothesized that there may be epistasis between CD-associated NOD2 and ATG16L1 variants regarding susceptibility to CD. However, as shown in Supplemental Table S9, none of the 5 CD-associated NOD2 variants showed evidence for epistasis with the CD-associated ATG16L1 variant rs2241880 (p.Thr300Ala).

Table 4. Association between rs2066843 genotype and CD disease characteristics based on the Montreal classification [28].

Table 5 (footnotes). Immunosuppressive agents included azathioprine, 6-mercaptopurine, 6-thioguanine, methotrexate, and/or infliximab. (3) Only surgery related to CD-specific problems (e.g. fistulectomy, colectomy, ileostomy) was included. For each variable, the number of patients included is given. doi:10.1371/journal.pone.0014466.t005

Table 6. Association between rs2076756 genotype and CD disease characteristics based on the Montreal classification [28].

Discussion
The identification of NOD2 as the first CD susceptibility gene in 2001 represents a landmark finding that implicated bacterial recognition and innate immunity as key processes involved in the pathogenesis of CD. Since genotype-phenotype analyses of the NOD2 variants p.Arg702Trp, p.Gly908Arg, and p.Leu1007fsX1008 have also provided strong evidence for the existence of a NOD2-related CD phenotype, these studies have not only changed our understanding of IBD pathogenesis but also have implications for clinical practice [21,22,23,25].
Here, we investigated the two novel CD-associated NOD2 variants rs2066843 and rs2076756 in a large German IBD patient cohort, confirming these NOD2 variants as susceptibility gene variants for CD but not for UC. We demonstrated a highly significant association of the SNPs rs2066843 and rs2076756 with CD. The haplotype analysis revealed a number of significant associations with CD susceptibility, including several associations with omnibus p values of less than 10⁻¹⁰.
In addition, for the first time, our genotype-phenotype analysis revealed a significant association of the SNPs rs2066843 and rs2076756 with phenotypic characteristics such as early disease onset and a severe penetrating disease phenotype complicated by fistulas. Moreover, univariate analysis revealed a frequent need for CD-related surgery associated with the SNPs rs2066843 and rs2076756 in CD patients. However, the two novel CD-associated NOD2 variants could not be identified as independent variables for the need for surgery in CD patients after logistic regression analysis, suggesting that ileal disease localization and a stricturing or penetrating phenotype are clinically more relevant predictors of the need for CD-related surgery than the NOD2 genotypes alone. In addition, the association of rs2066843 and rs2076756 with CD was less pronounced than that of the NOD2 variant rs2066847 (p.Leu1007fsX1008), which, in patients homozygous for this variant, results in a more severe phenotype [21,22] than that found for rs2066843 and rs2076756 in this study. Early deep-sequencing studies of the NOD2 locus by the Hugot group also suggested that the NOD2 "gene-dosage" effect is more important for the development of the CD phenotype than the type of NOD2 mutation [25]. In the study by the Hugot group, patients with "double-dose" NOD2 mutations (homozygous or compound heterozygous) were characterized by a younger age at onset, a more frequent stricturing phenotype, and less frequent colonic involvement than patients who had no mutation [25]. Although the phenotypic effects of rs2066843 and rs2076756 were independent of the presence of the three common NOD2 variants p.Arg702Trp, p.Gly908Arg, and p.Leu1007fsX1008, the observed associations resemble the phenotype previously shown to be related to the three common NOD2 variants in our cohort [21,22]. The explorative character of our study has to be acknowledged.
Given the large number of associations investigated, most associations would lose significance following Bonferroni correction. However, considering the size of our cohort and given that the highest percentages of carriers with ileocolonic involvement, stenoses, fistulas, and CD-related surgery were found without exception among the homozygous carriers of these two novel NOD2 SNPs, an association with a severe CD phenotype is very likely. Moreover, this result has to be seen against the background of a study population from a tertiary referral center with a very high percentage of severe CD, as demonstrated by fistulas, stenoses and CD-related surgery in more than half of the study population regardless of the genotype. The phenotypic associations found in homozygous carriers of these SNPs were observed with similar frequencies in the subgroup of CD patients with additional CD-associated NOD2 mutations and in patients without these mutations, suggesting a specific disease-modifying effect of rs2066843 and rs2076756.
Our findings highlight the essential role of the NOD2 region as a CD susceptibility gene encoding a protein which acts as a sensor for bacterial muramyl dipeptide (MDP) in the cell wall of Gram-positive and Gram-negative bacteria [8,15,16,17]. However, the exact mechanism by which NOD2 variants influence intestinal inflammation is still under investigation. First, NOD2 mutations are associated with diminished mucosal alpha-defensin expression in CD [33], although this finding is challenged by the results of a recent study [34] demonstrating that in ileal CD reduced alpha-defensin expression is the result of inflammation and not of NOD2 mutation status. Second, there is evidence for an impaired dendritic cell function in CD patients with NOD2 variants [35] and a loss of synergy between TLR9 and NOD2 in innate immune responses to CpG DNA [36]. Third, there might be cross-tolerization between NOD1 and NOD2 leading to increased recognition of both pathogenic and commensal bacteria in NOD2-deficient macrophages pre-exposed to microbial ligands [37]. Fourth, the study by Kobayashi et al. demonstrated that Nod2-deficient mice are more susceptible to bacterial infection and that Nod2 protein is thus a critical regulator of bacterial immunity within the intestine [38]. Finally, very recent evidence suggests that NOD1 and NOD2 are essential for the recruitment of ATG16L1 during bacterial infection [31], which is impaired in patients with CD-associated NOD mutants [31]. We therefore analyzed CD-associated NOD2 and ATG16L1 variants for epistasis. However, we found no evidence for significant gene-gene interactions between these variants regarding CD susceptibility. This is in line with another large study demonstrating no epistasis between the three main CD-associated NOD2 variants and ATG16L1 [39], while another study found a weak gene-gene interaction between the ATG16L1 variant rs2241880 and the three common CD-associated NOD2 variants (p = 0.039) [14].
Despite the growing evidence for a strong genetic background in IBD by various genome-wide analyses and cohort studies [7,11,13,14,40,41,42,43,44], CD still remains a complex disorder modulated not only by susceptibility genes but also influenced by environmental factors. Moreover, there are genetic differences between different ethnic cohorts. For example, studies in Asian CD cohorts [45,46] could not show any evidence for a role of NOD2 in CD susceptibility in Asian populations. Genetic counselling based on the NOD2 genotype as well as the analysis of environmental risk factors could therefore benefit the individual at risk and change daily clinical practice.
Taken together, we conclude that the identification of the two novel NOD2 variants rs2066843 and rs2076756 might have implications for future risk assessment in CD patients, considering their high minor allele frequencies among Caucasian CD patients and their association with a severe disease phenotype. However, the association of these two NOD2 variants with CD was less pronounced than that of rs2066847 (p.Leu1007fsX1008), which demonstrated an even more severe phenotypic effect than rs2066843 and rs2076756 in our previous studies [21,22]. So far, a major functional effect on the NF-κB pathway has only been shown for the p.Leu1007fsX1008 variant [6]. Therefore, further functional analyses of rs2066843 and rs2076756 as well as detailed genotype-phenotype analyses in very large cohorts, including a comprehensive assessment of low-frequency variants in the NOD2 gene region, are needed to clarify the contribution of these novel variants to the pathogenesis of CD. Currently, these two novel variants will not replace the other common CD-associated NOD2 variants p.Arg702Trp (rs2066844), p.Gly908Arg (rs2066845), and p.Leu1007fsX1008 (rs2066847) when evaluating genetic risk factors in CD patients.

Table S3. Allele frequencies of the SNPs rs2066843 and rs2076756 in patients with Crohn's disease (CD), ulcerative colitis (UC) and controls in the initial discovery cohort from the University Hospital Munich-Grosshadern. Minor allele frequencies (MAF), allelic test P-values, and odds ratios (OR, shown for the minor allele) with 95% confidence intervals (CI) are depicted for both the CD and UC case-control cohorts.