Investigation of Multiple Susceptibility Loci for Inflammatory Bowel Disease in an Italian Cohort of Patients

Background Recent GWAs and meta-analyses have outlined about 100 susceptibility genes/loci for inflammatory bowel diseases (IBD). In this study we aimed to investigate the influence of SNPs tagging the genes/loci PTGER4, TNFSF15, NKX2-3, ZNF365, IFNG, PTPN2, PSMG1, and HLA in a large pediatric- and adult-onset IBD Italian cohort. Methods Eight SNPs were assessed in 1,070 patients with Crohn's disease (CD) and 1,213 with ulcerative colitis (UC), 557 of whom were diagnosed at the age of ≤16 years, and in 789 healthy controls. Correlations with sub-phenotypes and major variants of the NOD2 gene were investigated. Results The SNPs tagging the TNFSF15, NKX2-3, ZNF365, and PTPN2 genes were associated with CD (P values ranging from 0.037 to 7×10−6). The SNPs tagging the PTGER4, NKX2-3, ZNF365, IFNG, PSMG1, and HLA loci were associated with UC (P values ranging from 0.047 to 4×10−5). In the pediatric cohort, the associations of TNFSF15 and NKX2-3 with CD, and of PTGER4, NKX2-3, ZNF365, IFNG, and PSMG1 with UC, were confirmed. An association between TNFSF15 and pediatric UC was also observed. In UC patients, a correlation was observed between NKX2-3 and need for surgery (P = 0.038), and between HLA and steroid-responsiveness (P = 0.024). Moreover, in our CD cohort, a significant association was demonstrated between the TNFSF15 SNP and colonic involvement (P = 0.021), and between ZNF365 and ileal location (P = 0.024). Conclusions We confirmed in a large Italian cohort the associations of newly identified genes with CD and UC, in both adult and pediatric patients, with some influence on sub-phenotypes.


Introduction
The pathogenesis of inflammatory bowel diseases (IBD), namely Crohn's disease (CD) and ulcerative colitis (UC), is still incompletely understood, but it is widely accepted that the two conditions result from an inappropriate and exaggerated mucosal immune response to constituents of the intestinal flora in a genetically susceptible host [1,2]. In the past year, genome-wide association (GWA) studies have identified several genes involved in the pathogenesis of IBD and, subsequently, GWAS meta-analyses have confirmed more than 70 genes or loci that confer susceptibility to CD and 47 to UC, mostly in adult populations [3,4,5]. In particular, GWA and replication studies identified variants in genes including protein tyrosine phosphatase non-receptor type 2 (PTPN2), NK2 transcription factor related locus 3 (NKX2-3), and tumor necrosis factor superfamily member 15 (TNFSF15) [6,7,8]. A subsequent North American GWA study identified two additional loci for UC located on chromosomes 1p36 and 12q15, each of them harboring multiple genes, including several with a definite role in inflammation and immunity, such as group II secreted phospholipase A2 (PLA2G2E), interferon gamma (IFNG), interleukin 26 (IL26), and interleukin 22 (IL22). In addition, combined genome-wide significant evidence for association was found at two further loci, namely HLA on chromosome 6p21 and IL23R (interleukin-23 receptor) on chromosome 1p31 [9]. Finally, a GWA study carried out in pediatric-onset IBD patients identified two novel IBD loci located on chromosomes 20q13 and 21q22, close to the tumor necrosis factor receptor superfamily member 6B (TNFRSF6B) and proteasome-assembling chaperone 1 (PSMG1) genes [10].
In the present study, we investigated whether candidate loci reported in the meta-analysis and GWA studies on chromosomes 5p13 (PTGER4), 12q15 (IFNG, IL22, IL26), 6p21 (HLA), and 21q22 (PSMG1), as well as PTPN2, NKX2-3, and TNFSF15, were associated with IBD in a large and phenotypically well-characterized Italian cohort of patients, and we also attempted to elucidate their involvement in early-onset disease. In addition, we tested for potential epistasis between these variants and IBD-associated NOD2/CARD15 SNPs, as well as for possible genotype-phenotype correlations.

Ethics statement
The IBD cohort and unaffected controls were recruited from adult individuals referred to the IRCCS "Casa Sollievo della Sofferenza" Hospital in San Giovanni Rotondo, and from pediatric centers of the Italian Society of Pediatric Gastroenterology, Hepatology and Nutrition (SIGENP). Written informed consent was obtained from all adult participants and, for patients under the age of 19 years, from their parents. Ethical approval was obtained from the Ethics Review Board of the "Casa Sollievo della Sofferenza" Hospital, San Giovanni Rotondo, and each participating center approved the recruitment protocol. The study was supported by a grant from the Italian Ministry of Health (RC0902GA33).
Extensive clinical characterization was available for all patients. The diagnosis of CD or UC was established by conventional clinical, radiological, endoscopic, and histological findings [11]. The CD phenotype was classified based on age at disease onset (A), maximal extent of disease (L), and behavior (B) according to the Montreal classification [12,13]. In patients with UC, the colonic location was also classified according to the Montreal classification, distinguishing ulcerative proctitis (E1), left-sided colitis (E2), and extensive colitis (E3). For all IBD patients, further clinical characteristics were analyzed, namely the occurrence of previous resective surgery, IBD family history, smoking habits, extraintestinal manifestations, and response to medical therapy. More specifically, the need for corticosteroids, immunosuppressants (thiopurines and methotrexate), and anti-TNF agents was evaluated. In addition, on the basis of a review of medical records, patients treated with corticosteroids (CS) were classified as CS-responders (at least one course of systemic steroids with clinical remission reported in the medical history) or CS-refractory (when an unsuccessful clinical response led to alternative therapies such as surgery, anti-TNF agents, or other immunosuppressive drugs) [14,15]. Patients with incomplete or unclear information were excluded from this analysis.

SNPs analysis and genotyping
We selected 8 polymorphisms for genotyping: three (rs4613763, rs4263839, rs11190140) identified by the CD GWA study meta-analysis [3], two (rs10761659, rs2542151) reported by the WTCCC [6] as showing the strongest association signals, two (rs2395185, rs1558744) identified by the first UC GWA study [9], and one (rs2836878) identified by a GWA analysis of early-onset IBD [10]. The genotypic variants in the NOD2/CARD15 gene (rs2066844, rs2066845, rs2066847) had already been analyzed for all patients and controls. Genomic DNA was extracted from peripheral blood leukocytes by standard procedures using the DNA blood maxi kit from Qiagen (Hilden, Germany) in accordance with the manufacturer's instructions. All genotyping was performed at the Molecular Laboratory of the Gastroenterology Unit at the San Giovanni Rotondo Hospital, Italy, using Custom TaqMan® SNP assays (Applied Biosystems, Foster City, CA) following the manufacturer's instructions. The overall success rate of the genotyping assay was over 98%.

Statistical analysis
Statistical analysis was performed using Haploview software version 4.1 (http://www.broad.mit.edu/personal/mpg/haploview) and SPSS software version 13.0 (Chicago, IL, USA). Hardy-Weinberg equilibrium (HWE) tests were performed for all investigated polymorphisms independently among cases and controls. For the case-control analysis, comparisons of genotype and allele frequencies were performed using the χ² test or Fisher's exact test, where appropriate. Genotype-phenotype associations were first analyzed by means of univariate analysis and subsequently expressed as odds ratios (OR) with 95% confidence intervals (95% CI) by means of stepwise logistic regression analysis. To detect gene-gene interactions, we used logistic regression based on forward stepwise selection procedures, with the number of risk alleles as the predictor variable. P values of less than 0.05 were considered significant.
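As an illustration of the tests named above, the following minimal Python sketch computes a χ² goodness-of-fit test for HWE and an allelic 2×2 χ² test with an odds ratio and 95% CI. It is not the Haploview or SPSS procedure used in the study: the function names, the Woolf (log-based) confidence interval, and the genotype counts in the usage note are all hypothetical choices made for illustration.

```python
import math


def chi2_p_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def hwe_test(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts (AA, AB, BB).
    Assumes both alleles are present (no expected count of zero).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_p_1df(chi2)


def allelic_test(case_geno, control_geno):
    """Allelic 2x2 chi-square test with odds ratio and Woolf 95% CI.

    Each argument is a tuple of genotype counts
    (risk/risk, risk/other, other/other). Assumes no zero allele count.
    """
    a = 2 * case_geno[0] + case_geno[1]        # risk alleles in cases
    b = case_geno[1] + 2 * case_geno[2]        # other alleles in cases
    c = 2 * control_geno[0] + control_geno[1]  # risk alleles in controls
    d = control_geno[1] + 2 * control_geno[2]  # other alleles in controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)  # SE of log(OR)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se),
          math.exp(math.log(odds_ratio) + 1.96 * se))
    return chi2, chi2_p_1df(chi2), odds_ratio, ci
```

With invented counts, hwe_test(25, 50, 25) gives a χ² of 0 (perfect equilibrium), and allelic_test((30, 50, 20), (20, 50, 30)) gives a χ² of 4.0 with an odds ratio of about 1.49. When expected cell counts are small, Fisher's exact test would be the appropriate alternative, as stated in the text.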

Case-control analysis with IBD patients
A total of 2283 individuals with IBD, including 1070 with CD and 1213 with UC, were analyzed. The control group consisted of 789 individuals of the same ethnicity who did not have IBD (neither CD nor UC). Clinical and demographic features of the IBD cases are listed in Table 1. There were 296 CD patients and 261 UC patients with an initial diagnosis before their 16th birthday. The adult cohort consisted of 774 patients with CD and 952 with UC.
To determine whether the previously identified variants were shared by both adult and pediatric individuals, we stratified all IBD patients according to their age at initial diagnosis. For the adult subset, associations were confirmed for all SNPs for both CD and UC, with two notable exceptions: the rs4613763 variant, which was associated with CD (P = 0.031) (Table 4), and the rs2836878 variant, which lost its association with UC (Table 5). In the childhood-onset cohort, the genotype frequencies of all considered polymorphisms remained significantly associated with UC, with the exception of the rs2395185 variant (Table 5). In pediatric CD patients, the association was confirmed for the rs4263839 (P = 0.008) and rs11190140 (P = 0.007) variants (Table 4).
We asked whether the identified genotype/phenotype associations would differ after stratifying the IBD population with respect to age at diagnosis. For the HLA-rs2395185 SNP, the association with a positive family history persisted in both the adult-onset (P = 0.003) and the pediatric-onset (P = 0.006) subsets of patients (Table 6). Despite the large number of pediatric-onset IBD patients investigated, only a trend was found for the genotype/phenotype frequencies of the rs10761659, rs4263839, and rs11190140 SNPs, which did not reach statistical significance.

Gene-gene interactions
The possible interactions of the tested variants (rs4613763, rs2395185, rs4263839, rs11190140, rs10761659, rs1558744, rs2542151, and rs2836878) with polymorphisms in the established susceptibility gene NOD2/CARD15 (the three main polymorphisms rs2066844, rs2066845, and rs2066847: at least 1 variant against wild type) were evaluated. After correction for multiple testing, there was no significant evidence for interaction among the considered SNPs (P > 0.05) (data not shown), suggesting that each gene contributes independently to disease risk.
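The predictor coding described above can be sketched as follows. This is a hypothetical illustration, not the SPSS procedure used in the study: it assumes the interaction is tested through a product term in a model of the form logit P(case) = b0 + b1·x_snp + b2·x_nod2 + b3·x_snp·x_nod2, and it uses a Bonferroni threshold as one possible correction, since the text does not state which multiple-testing method was applied. All function names are invented for this example.

```python
def risk_allele_count(genotype, risk_allele):
    """Number of copies (0, 1 or 2) of the risk allele in a genotype such as 'AG'."""
    return sum(1 for allele in genotype if allele == risk_allele)


def nod2_carrier(variant_counts):
    """1 if at least one of the three NOD2/CARD15 variants is carried, else 0.

    variant_counts holds the number of variant alleles at
    rs2066844, rs2066845 and rs2066847 (wild type = all zeros).
    """
    return 1 if any(count > 0 for count in variant_counts) else 0


def interaction_row(genotype, risk_allele, nod2_variant_counts):
    """Predictors for one individual: (x_snp, x_nod2, x_snp * x_nod2).

    The product term is an assumed way of modeling the interaction.
    """
    x_snp = risk_allele_count(genotype, risk_allele)
    x_nod2 = nod2_carrier(nod2_variant_counts)
    return (x_snp, x_nod2, x_snp * x_nod2)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction (assumed method)."""
    return alpha / n_tests
```

For example, interaction_row("AG", "A", (0, 1, 0)) returns (1, 1, 1), and with eight SNPs tested against NOD2/CARD15 a Bonferroni threshold would be bonferroni_threshold(0.05, 8), that is 0.00625. The rows produced this way would then be passed to any logistic regression routine, and the coefficient of the product term tested against zero.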

Discussion
Recent GWA studies have enhanced our understanding of the complex genetic architecture of IBD. Most associations appear to be common to both types of IBD, while some genes/loci may be specific to adult- or pediatric-onset disease, and the factors that determine age of onset are unknown at present.
In addition, our analysis reveals a significant association with the pediatric UC cohort for six out of the eight investigated variants, each of them with a P value < 0.016 (PTGER4, TNFSF15, NKX2-3, ZNF365, IFNG, and PSMG1): this finding suggests that these genes may also be involved in susceptibility to pediatric-onset UC.
To date, no studies on the tested polymorphisms have indicated associations with specific sub-phenotypes. We were able to demonstrate, by stepwise logistic regression analysis, a correlation in UC patients between NKX2-3 and need for surgery (P = 0.038), and between HLA and both steroid-responsiveness (P = 0.024) and a positive family history. Moreover, in our CD cohort, a significant association was demonstrated between the TNFSF15 SNP and colonic involvement (P = 0.021), and between ZNF365 and ileal location (P = 0.024).
The clinical significance of these associations, if any, remains to be investigated.
A summary of previously described allelic distributions of SNPs of genes/loci analyzed is depicted in Supplementary Table S1.
Kugathasan et al [10] carried out a GWA analysis in pediatric-onset IBD and identified a significant association of both CD and UC with an intergenic SNP, rs2836878, located on chromosome 21q22 in a small region of linkage disequilibrium that harbors no genes but is close to the PSMG1 gene. The association was also confirmed by the largest GWA study conducted so far in early-onset IBD (UC: P = 2.65×10−9) [16]. In keeping with this finding, the PSMG1 locus also contributes to disease susceptibility in adult UC [17]. McGovern et al [18] confirmed the association after combining data from two new GWA studies and performing a meta-analysis with a published study. A trend toward a similar association at the rs2836878 variant was observed in a Canadian CD cohort but did not achieve statistical significance [19] (Table S1). Our results confirm the association, in particular with pediatric-onset UC, pointing to the relevance of the 21q22 region for UC. Larger studies that include functional data on the PSMG1 gene will be required to confirm the association at this locus.
A significant association of genetic variants of the TNFSF15 (TL1A) gene on chromosome 9q33 with CD was observed in a large cohort of Japanese patients, in several European cohorts [8,20,21], in US Jewish patients [22], in the combined data from the NIDDK IBD Consortium and the WTCCC [3], in Korean patients [23], and in a UC GWA study [7]. TNFSF15 is the only gene that has been associated with IBD in both Asian and European patients [8,19] (Table S1). TNFSF15 is a member of the tumor necrosis factor (TNF) superfamily that binds to death domain receptor 3 (DR3, TNFRSF25) and is expressed in endothelial cells, lymphocytes, plasma cells, monocytes, and dendritic cells [24,25]. Our analysis confirms TNFSF15 as a susceptibility locus for CD in both the adult and pediatric populations (P < 0.001 and P = 0.008, respectively), and also shows an association with early-onset UC (P = 0.016).
The WTCCC [6] reported in CD cases a novel association involving a cluster of SNPs around the rs10883365 variant on chromosome 10q24.2, which maps within the NKX2-3 (NK2 transcription factor related, locus 3) gene, a member of the NKX family of homeodomain-containing transcription factors. The results of Parkes et al [26] supported these findings in an independent set of CD cases and controls of European descent. In addition, a large-scale meta-analysis [3] of CD cohorts and a replication study on IBD samples [7] found an association with the rs11190140 polymorphism, which is in complete linkage disequilibrium with the risk allele at SNP rs10883365 from the WTCCC study (r² = 1.0). A modest association with UC was also reported in a nonsynonymous SNP scan [27], in a GWA scan for UC [17], and in the UC GWA meta-analysis and replication studies [18] (Table S1). Similarly, in the present study the rs11190140 variant was associated with an increased risk for both CD and UC (P = 0.003 and P = 4×10−5, respectively), in both the adult and early-onset cohorts. Abnormal expression of NKX2-3 may alter gut migration of antigen-responsive lymphocytes and influence the intestinal inflammatory response. NKX2-3-deficient mice develop splenic and gut-associated lymphoid tissue abnormalities with disordered segregation of T and B cells [28].
Table 4. Genotype distribution of associated SNPs in adult- and childhood-onset Crohn's disease (CD) cohort.
Table 5. Genotype distribution of associated SNPs in adult- and childhood-onset ulcerative colitis (UC) cohort.
A highly attractive candidate gene for IBD, owing to its anti-inflammatory function and its involvement in susceptibility to type 1 diabetes and rheumatoid arthritis [29], is PTPN2 (protein tyrosine phosphatase, non-receptor type 2), located on chromosome 18p11, which encodes a tyrosine phosphatase expressed in T cells that acts as a negative regulator of inflammation. A novel association between rs2542151 of the PTPN2 gene and CD was identified [6], replicated [26,30], and confirmed [3]. In contrast, Franke et al [7] observed this association with UC, a finding replicated in the recent GWA study on UC [17]. In addition, an association with pediatric CD was also demonstrated [16,31] (Table S1). In the present investigation the association was observed only in adult CD (P = 0.015), supporting its role as an adult susceptibility gene.
The UC GWAS in European ancestry samples [9] identified loci on chromosomes 1p36 and 12q15 that harbor genes involved in inflammation and immunity, such as PLA2G2E (phospholipase A2, group IIE), IFNG (interferon-γ), IL22 (interleukin-22), and IL26 (interleukin-26); these associations were replicated in a GWA meta-analysis [18] (Table S1). We were able to replicate the association between the rs1558744 SNP on chromosome 12q15 and UC (P = 0.007) and, interestingly, after stratifying the cohort with respect to age at diagnosis, the association was also observed in the pediatric-onset patients (adult P = 0.03; pediatric P = 0.006).
Several independent genome-wide scans in inflammatory bowel disease have shown evidence of linkage to the MHC region [32,33], which is characterized by extensive LD blocks (up to 3 Mb) and numerous genes (250 genes), mainly involved in immune-related functions. Recent genome-wide association studies in UC confirmed the association with the HLA region, with the maximal association signal at rs2395185, in a region spanning the BTNL2 to HLA-DQB1 genes [9,18]. Recently, SNP and HLA data convincingly showed that the main signal is located in a narrow genomic window containing the HLA-DRB1 gene and strongly suggested that the more common HLA-DRB1*1101 allele plays a primary role in both UC and CD susceptibility [34]. This association with the DRB1 locus is consistent with other published GWAs in UC [9,35,36], including one in a Japanese population [37]. The same region was identified in the meta-analysis of CD genome-wide association studies [3]. A recent genome-wide association study in early-onset IBD [16] validated this known adult-onset IBD locus in its CD, UC, and IBD datasets, further supporting the importance of this region in IBD risk (Table S1). Allelic and genotype association analysis in our cohort showed that the polymorphism was significantly associated with overall susceptibility to UC (P = 0.001), and in particular with the adult subset (P = 0.001).
In the study of Libioulle et al [38], a region on chromosome 5p13.1 contributing to CD susceptibility was identified. The disease-associated alleles were found to correlate with expression levels of the prostaglandin receptor EP4, which binds prostaglandin E2 (PGE2) and is encoded by PTGER4. In the same region, the meta-analysis of three GWAS [3] identified rs4613763 as the most strongly associated SNP. No highly significant association of the PTGER4 region was documented in the studies of the NIDDK [39] and the WTCCC [6]. More recently, GWAS showed significant evidence for association with UC as well [18] (Table S1). Similarly, our data indicated a significant association with adult CD (P = 0.031) and early-onset UC (P = 0.002).
In the WTCCC GWA study [6], a locus on chromosome 10q21 around rs10761659, a non-coding intergenic SNP mapping 14 kb telomeric to a zinc finger gene known as ZNF365, was detected. The locus was replicated in both pediatric CD and UC [16,31], and in adult-onset CD [3,40] (Table S1). We were able to confirm this association with CD (P = 0.007) and UC (P = 4×10−5), and after stratifying the cohort with respect to age at diagnosis the association was confirmed in adult CD (P = 0.002), and in both adult (P = 0.0002) and pediatric (P = 0.005) UC patients.
In conclusion, our study has confirmed recently described associations, in particular between the PTGER4, HLA, TNFSF15, NKX2-3, ZNF365, IFNG, PTPN2, and PSMG1 genes and IBD, in adults and, in some cases for the first time, also in children. Furthermore, we were able to identify an influence of the investigated genes on clinical expression and disease localization in CD and UC patients, but the clinical significance of these associations remains to be investigated and replicated. Further characterization, fine mapping, and functional studies of these genetic regions are needed to uncover the pathogenetic role of these newly identified genes/loci.

Supporting Information
Table S1 Allelic distributions of single nucleotide polymorphisms (SNP) analyzed in the previously published studies and in the current study. (DOCX)