Genetic Variants of Wnt Transcription Factor TCF-4 (TCF7L2) Putative Promoter Region Are Associated with Small Intestinal Crohn's Disease

Reduced expression of Paneth cell antimicrobial α-defensins, human defensin (HD)-5 and -6, characterizes Crohn's disease (CD) of the ileum. TCF-4 (also named TCF7L2), a Wnt signalling pathway transcription factor, orchestrates Paneth cell differentiation, directly regulates the expression of HD-5 and -6, and was previously associated with the decrease of these antimicrobial peptides in a subset of ileal CD. To investigate a potential genetic association of TCF-4 with ileal CD, we sequenced 2.1 kb of the 5′ flanking region of TCF-4 in a small group of ileal CD patients and controls (n = 10 each). We identified eight single nucleotide polymorphisms (SNPs), of which three (rs3814570, rs10885394, rs10885395) were in linkage disequilibrium and found more frequently in patients; one of them (rs3814570) was located in a predicted regulatory region. We carried out high-throughput analysis of this SNP in three cohorts of inflammatory bowel disease (IBD) patients and controls. Overall, 1399 healthy individuals, 785 ulcerative colitis (UC) patients, 225 CD patients with colonic disease only and 784 CD patients with ileal involvement were used to determine frequency distributions. In a combined analysis, we found an association of rs3814570 with ileal CD but not with colonic CD or UC (allele positivity: OR 1.27, 95% CI 1.07 to 1.52, p = 0.00737). The association was strongest in ileal CD patients with stricturing behaviour (allele frequency: OR 1.32, 95% CI 1.08 to 1.62, p = 0.00686) or with additional involvement of the upper gastrointestinal tract (GIT) (allele frequency: OR 1.38, 95% CI 1.03 to 1.84, p = 0.02882). The newly identified genetic association of TCF-4 with ileal CD provides evidence that the decrease in Paneth cell α-defensins is a primary factor in disease pathogenesis.


Introduction
Inflammatory bowel disease (IBD), a chronic inflammation of the intestine, is commonly classified into ulcerative colitis (UC) and Crohn's disease (CD) on the basis of clinical features and histopathology [1]. Whereas UC is typically restricted to the colon, CD can occur at many sites, predominantly in the small intestinal ileum, the colon, or in both locations. Emerging details of disease pathogenesis support the current concept that ongoing immune activation in IBD is driven by bacterial microbiota, possibly as a result of an attenuated antimicrobial barrier in genetically predisposed individuals [1][2][3]. Both UC and CD have a complex polygenic, multifactorial background, with a coincidence of susceptibility genes and environmental factors involved in pathogenesis. It is likely that different genetically affected factors explain the various clinical patterns of IBD, especially the location of disease in CD, which is stable over time [4][5][6]. Different explanations for disease location, including a central role of small intestinal Paneth cells and other defects in intestinal innate immunity, were the focus of recent discussion [2]. For ileal CD, reduced expression of the small intestinal Paneth cell α-defensins HD-5 and -6 (DEFA5 and DEFA6) has been described in several cohorts [7][8][9][10][11][12]. The defensin deficiency is proposed to attenuate the antibacterial host defense capacity of the intestinal mucosa, and may initiate and/or perpetuate the chronic inflammation characterizing the disease at this site [7][8][9][10][11][12]. We recently reported one mechanism to explain, in part, the decrease of these antimicrobial peptides [9,13]: a reduced expression of the Wnt pathway transcription factor TCF-4 (also known as transcription factor 7-like 2), which directly controls Paneth cell defensin expression (HD-5, HD-6, and orthologous mouse cryptdin peptides [9,13]).
Wnt proteins are a family of secreted morphogens that play an important role in regulating cell fate and differentiation during embryogenesis [14]. The Wnt signalling pathway is induced by binding of Wnt family proteins to cell surface receptors, leading to stabilization of cytoplasmic β-catenin, translocation of this regulatory protein into the nucleus, formation of a complex with transcription factors of the Tcf/Lef family and subsequently the activation of various target genes [13]. In the small intestine, epithelial cells transit through differentiation steps initiated in progenitor cells, which reside adjacent to Paneth cells at the base of the crypts [15]. Wnt signalling helps to maintain an undifferentiated state of the intestinal stem cells [16,17] and, paradoxically, also regulates positioning, differentiation and maturation of Paneth cells [13,18]. The Paneth cell gene program is critically dependent on TCF-4 [13]. Using a rodent model, we observed that even moderate changes (a 50% decrease of TCF-4 levels) are sufficient to compromise mouse Paneth cell cryptdin expression as well as the corresponding antimicrobial function against several bacterial species. We also reported that a reduced level of TCF-4 expression and activity was associated with a decrease of Paneth cell α-defensin levels in CD of the small intestine. The decrease of TCF-4 expression was found to be independent of inflammation in the tissue specimens, and also independent of the 1007fsinsC SNP in NOD2, a mutation in this pattern recognition receptor which has previously been associated with ileal CD [9]. We hypothesized that decreased TCF-4 expression might be the result of primary genetic variants in TCF-4, at least in some patients with ileal CD. Since TCF-4 mRNA levels were decreased in these studies, an aberration in the promoter region of TCF-4 could be a possible explanation.
Thus, the aim of this study was to sequence the promoter region of the TCF-4 gene in a group of patients with ileal CD to identify potential polymorphisms and to perform a subsequent association study on candidate genetic variants in well-defined cohorts of patients. We identified a total of 8 SNP variants, of which three (rs3814570, rs10885394, rs10885395) were in linkage disequilibrium and seemed to exhibit a higher frequency in ileal CD patients. One of these SNPs was found to be located in a putative regulatory region. We carried out high-throughput analysis of this SNP in three IBD cohorts from Oxford, Leuven and Vienna [19][20][21]. Herein we report an association of the SNP rs3814570 with ileal involvement of CD, but not with colonic CD or UC.

Patients and human material
For genetic analysis, we obtained DNA samples from a patient cohort of Caucasians with Crohn's disease (N = 259) or ulcerative colitis (N = 149) from the University Hospital in Vienna, as well as a control group of unrelated, healthy Caucasian blood donors in Stuttgart (N = 833). For subsequent testing, we obtained DNA samples from Caucasians with Crohn's disease (N = 277), UC (N = 74) and healthy controls (N = 242) from the University of Leuven, Belgium (3), as well as a third Caucasian cohort from Oxford comprising healthy individuals (N = 324), UC patients (N = 562) and CD patients (N = 473). In line with the Montreal classification (4), three subgroups were defined: ileal disease only (L1), colonic disease only (L2) and ileo-colonic disease (L3). A total of 1399 randomly recruited healthy control individuals, 785 UC patients, 225 CD (L2) patients with disease limited to the colon and 784 CD patients with ileal involvement (L1+L3) were used to elucidate the frequency distribution of SNPs [19][20][21]. The numbers of patient subgroups and controls in the different cohorts are shown in Table 1 and detailed statistical analyses are provided in Table 2. To exclude major differences between the groups in age or gender, CD patients as well as controls were subgrouped according to these criteria (Table 3). Additional points of interest were the behaviour and the aggressiveness of the disease. We therefore tested separately for an association with inflammatory, stricturing and penetrating behaviour, as well as for an association of the variant with surgery for Crohn's disease. Finally, we examined patients with an additional involvement of the upper gastrointestinal tract (L4). The study was approved by the ethics committees of the Medical University Vienna, Austria, the University Hospital Tübingen, Germany, the University of Leuven, Belgium and the Oxford Radcliffe Hospital Trust.
All patients gave informed and written consent for their DNA to be analyzed for this study.

Sequencing of TCF-4 promoter and gene region
To determine possible genetic variants in the TCF-4 promoter, we sequenced the 2.1 kb upstream region in randomly selected healthy controls (n = 10) and patients with ileal CD (n = 10). In addition, we sequenced the region of the TCF-4 gene in which functional insertions and deletions have been reported in colonic cancer [22]. Subsequently, a sequence analysis of known TCF-4 exons was carried out, including ~100 bp intron boundaries, to identify additional potential variants of this gene in these regions. Primers for the promoter and exon sequencing were designed using ENSG00000148737 of the Ensembl genome browser database. Sequencing was performed according to standard procedures and the primers are provided upon request.

TCF-4 genotyping
Leukocyte DNA was isolated by standard procedures (QIAamp DNA Blood Mini Kit, Qiagen, Hilden, Germany) from whole blood samples. Genotyping of the samples from the Vienna and Leuven cohorts was performed using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) of allele-specific primer extension products with a system from Bruker (Daltonik, Leipzig, Germany). Presence of TCF-4 SNPs detected by MALDI-TOF MS was confirmed by TaqMan® analysis and direct sequencing in a subset of samples. MALDI-TOF MS based genotyping of the DNA samples obtained from Oxford was carried out using a MassARRAY® Compact System from Sequenom (San Diego, USA). Primers were designed using reference sequence NT_030059 and will be provided on request.

NOD2 genotype analysis
Genotyping for the common NOD2 variants (SNP8, SNP12, and SNP13) was performed in the Vienna patient samples using TaqMan technology (Applied Biosystems, Foster City, California, USA), as described previously [7].

Computer analysis and statistics
An in silico screen of a 10 kb TCF-4 upstream region was performed using "Promoter 2.0: for the recognition of PolII promoter sequences." The TESS (Transcription Element Search System) database software was used to assess potential binding sites for transcription factors in the candidate sequence. Polymorphisms were tested for Hardy-Weinberg equilibrium in the three cohorts with the Finetti specialized software (http://ihg2.helmholtz-muenchen.de/cgi-bin/hw/hwa1.pl), using a log-likelihood ratio chi-square test. For genetic analysis (comparing IBD subgroups versus controls) we used the same software to calculate odds ratios and confidence intervals (CI) and to perform Pearson's goodness-of-fit chi-square tests. Differences in genotype frequencies were subject to both t tests and Armitage's trend tests. P-values below 0.05 were considered significant. Linkage disequilibrium between TCF-4 SNPs was calculated, and haplotype blocks were identified, using Haploview. To exclude a coincidental association of the SNP rs3814570, the significance of p-values < 0.05 was verified using Benjamini-Hochberg correction in the overall group.
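The Hardy-Weinberg check described above can be illustrated with a minimal sketch of a log-likelihood ratio (G) test with one degree of freedom. This is not the code of the web tool cited in the text; the function name `hwe_lr_test` and the genotype counts are hypothetical and chosen only for illustration.

```python
import math


def hwe_lr_test(n_aa, n_ab, n_bb):
    """Log-likelihood ratio (G) test for Hardy-Weinberg equilibrium (1 df).

    Takes observed genotype counts for a biallelic SNP and returns
    the G statistic and its p-value.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1.0 - p                      # frequency of allele b
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Genotypes with an observed count of zero contribute nothing (0 * ln 0 := 0).
    g = 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # Chi-square survival function for 1 df: P(X > g) = erfc(sqrt(g / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Hypothetical genotype counts (CC, CT, TT); these are NOT data from the study.
g_stat, p_hwe = hwe_lr_test(550, 380, 70)
print(f"G = {g_stat:.3f}, p = {p_hwe:.3f}")
```

A p-value above 0.05 would give no evidence against Hardy-Weinberg equilibrium for the tested group; the closed-form p-value used here is only valid for one degree of freedom.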

SNP selection and haplotypes
To investigate potential genetic linkage of TCF-4 to ileal CD, we screened for SNPs by sequencing 2.1 kb of the 5′ flanking region of TCF-4 in a random group of 10 ileal CD patients and 10 healthy controls. We found eight SNPs in this putative promoter region (Figure 1), of which three (rs3814570, rs10885394, rs10885395) were in linkage disequilibrium (LD) in both the patient and control groups. In the control group, two of ten individuals carried the variants; in patients with ileal CD, six of ten individuals were heterozygous for the SNPs. On the basis of these findings, we studied a well-defined cohort of patients with CD and healthy controls from Vienna, Austria. In both the control and CD groups, we found LD between the 3 SNPs that defined a novel haplotype block (Figure 2a).
An in silico promoter and transcription factor binding-site analysis of the sequenced region revealed a potential regulatory region close to the location of rs3814570. Because of (i) the observed decreased expression of TCF-4 mRNA, (ii) the higher frequency of the promoter variant in patients and (iii) the presence of a putative regulatory locus, we tested the hypothesis that rs3814570 exhibits an association with small intestinal involvement of CD. To exclude additional major variants in the gene region and possible LD of the identified promoter SNPs with other potentially functional variants in the TCF-4 gene, we sequenced known coding exons, with ~100 bp overlapping intron boundaries, in 10 randomly chosen controls (6 identical to promoter analysis) as well as 25 patients with ileal CD (7 identical to promoter analysis) (Figure S1). We found ten additional putative SNPs, of which two were in LD, but none exhibited LD with the described promoter SNPs (data not shown). A further search for haplotypes in TCF-4 was conducted based on published data from the HapMap project (Figure 2b), and no haplotype block including rs3814570 or additional SNPs in the gene region was identified.
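As a rough illustration of how pairwise LD between two biallelic SNPs is quantified, the following sketch computes D, D′ and r² from phased haplotype counts. It assumes that phase is known, whereas tools such as Haploview estimate haplotype frequencies from unphased genotypes; the function name `pairwise_ld` and all counts are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic SNPs from phased haplotype counts.

    n_AB is the number of haplotypes carrying allele A at the first SNP
    and allele B at the second SNP, and so on for the other three counts.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    d = p_AB - p_A * p_B           # raw disequilibrium coefficient
    r2 = d * d / denom             # squared allelic correlation
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    return d, d_prime, r2


# Hypothetical haplotype counts in which the two minor alleles always co-occur.
d, d_prime, r2 = pairwise_ld(30, 0, 0, 70)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

Values of D′ and r² close to 1 indicate that the two variants are inherited together almost without exception, which is the situation described for the three promoter SNPs.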

A TCF-4 promoter variant is associated with ileal CD predisposition
Analysis of SNP rs3814570 frequency distribution was carried out in a total of 1399 controls (T allele frequency = 25.59%), 785 UC patients, 225 CD patients with colonic disease only and 784 CD patients with ileal involvement (Table 2).
Table 2. TCF-4 (TCF7L2) rs3814570 frequency distribution and statistical analysis of combined cohort samples.
Since there were differences in allele frequencies between the cohorts (Tables S1, S2, and S3), we tested whether those apparent frequency differences were statistically significant. In general, the Oxford cohort exhibited a lower T allele frequency in controls (23.30%) compared to Leuven (26.65%) as well as to Vienna (26.17%). The same was true for CD patients (T allele frequency in Oxford: 27.38%, Leuven: 30.14% and Vienna: 28.96%), but this could partly be explained by the different percentage of colonic CD patients in the groups. For CD with ileal involvement only, the frequency distributions in the cohorts were more similar (T allele frequency in Oxford: 28.30%, Leuven: 30.82% and Vienna: 30.64%) and not significantly different. Even though we found a possible difference in frequency distribution between the Oxford control group and both the Leuven controls (allele frequency: OR 1.20, 95% CI 0.91 to 1.57, p = 0.19618) and the Vienna controls (allele frequency: OR 1.17, 95% CI 0.94 to 1.44, p = 0.15453), the differences did not achieve statistical significance. The elevated SNP frequency in ileal CD patients was seen in three independent European cohorts, and a distinct significant association of the minor variant of rs3814570 with ileal CD could be observed in the combined analysis of all samples (Table 2).
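The allele-frequency comparison between control groups can be sketched as follows. Allele counts are reconstructed from the cohort sizes and the rounded T allele frequencies quoted in the text, assuming that every individual was successfully genotyped. A Woolf (log-based) confidence interval and an uncorrected Pearson chi-square test are likewise assumptions, since the exact procedure of the software used is not stated. The function names are hypothetical.

```python
import math


def allele_counts(n_individuals, minor_freq):
    """Reconstruct (minor, major) allele counts from a rounded allele frequency."""
    total = 2 * n_individuals  # two alleles per individual
    minor = round(total * minor_freq)
    return minor, total - minor


def allele_odds_ratio(a, b, c, d, z=1.959964):
    """Odds ratio, Woolf 95% CI and Pearson chi-square p-value for a 2x2 table.

    a, b: minor and major allele counts in the first group
    c, d: minor and major allele counts in the second (reference) group
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df
    return odds_ratio, lower, upper, p_value


# Leuven controls (N = 242, T = 26.65%) versus Oxford controls (N = 324, T = 23.30%).
leuven_t, leuven_other = allele_counts(242, 0.2665)
oxford_t, oxford_other = allele_counts(324, 0.2330)
or_, ci_low, ci_high, p_val = allele_odds_ratio(leuven_t, leuven_other, oxford_t, oxford_other)
print(f"OR {or_:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}, p = {p_val:.5f}")
```

Under these assumptions the sketch gives values close to those reported above for the Leuven versus Oxford control comparison (OR 1.20, 95% CI 0.91 to 1.57), which suggests, but does not prove, that the reported figures were derived in a similar way.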

The association of rs3814570 with ileal CD is independent of gender but slightly more pronounced in patients >40 years
To rule out confounding by age or gender, we subgrouped all controls as well as the CD patient groups according to these criteria (Table 3). There were no consistent differences in allele frequency between men and women in either controls or patients; we therefore exclude a gender-specific effect of the variant. Interestingly, we found a stronger association of the variant when comparing patients with ileal, but not solely colonic, CD of the age group A3 (>40 years) with controls of the same age group in the overall analysis, as well as in two separate cohorts (Leuven and Oxford). In the overall analysis, the difference was statistically significant for homozygous carriers (homozygous carriers: OR 2.02, 95% CI 1.01 to 4.05, p = 0.04347).

rs3814570 shows the highest frequency in patients with stricturing ileal Crohn's disease
We grouped the patients according to disease behaviour into B1 (inflammatory), B2 (stricturing) and B3 (penetrating) (Table 4). In the overall analysis, we found the highest frequency within the ileal CD subgroup with stricturing behaviour (T allele frequency: 31.25%). This was also apparent in 2 separate cohorts (T allele frequency in Oxford: 29.81%, Leuven: 35.83%) but was not seen in L2 CD patients. The association of the SNP with stricturing ileal CD compared to healthy controls was highly significant in the overall analysis (allele frequency: OR 1.32, 95% CI 1.08 to 1.62, p = 0.00686), and an increased number of homozygous carriers was also observed (homozygous carriers: OR 1.71, 95% CI 1.11 to 2.63, p = 0.01460). To identify a possible association with aggressiveness of disease, we also grouped the patients into those who had undergone at least one surgery for CD and those who had not (Table 4).
No consistent result was observed; even though in two cohorts a trend towards a higher frequency in the ileal CD group with surgery (T allele frequency in Oxford: 28.93% and Leuven:

rs3814570 contributes to the risk of an additional L4 phenotype in patients with ileal CD
To specifically address the question of upper GIT involvement (L4), we separated the patient groups into further subgroups according to this additional phenotype. In general, the number of patients with upper GIT involvement was quite low: Leuven patients with additional L4 phenotype: 12 patients L3; 4 patients L2; 6 patients L1; Oxford patients with additional L4 phenotype: 36 patients L3; 4 patients L2; 10 patients L1; Vienna patients with additional L4 phenotype: 40 patients L3; 10 patients L2; 11 patients L1. Comparing the allele frequencies with controls, we found a slight increase in patients with ileal CD and additional L4 phenotype (T allele frequency: 32.17%). This was not the case for L2 patients with upper GIT involvement. The stronger association of the rare variant was also statistically significant in the overall analysis (allele frequency: OR 1.38, 95% CI 1.03 to 1.84, p = 0.02882).
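The Methods state that p-values below 0.05 were verified with Benjamini-Hochberg correction. A minimal sketch of that adjustment is given here; the function name `benjamini_hochberg` is hypothetical, and treating the five p-values quoted in the text as a single family of tests is purely an illustrative assumption, since the set of tests that entered the original correction is not specified.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# P-values quoted in the text; grouping them into one family is an assumption.
quoted = {
    "ileal CD, allele positivity": 0.00737,
    "stricturing ileal CD, allele frequency": 0.00686,
    "ileal CD with L4, allele frequency": 0.02882,
    "ileal CD, age >40, homozygous carriers": 0.04347,
    "stricturing ileal CD, homozygous carriers": 0.01460,
}
adjusted = benjamini_hochberg(list(quoted.values()))
for label, p_adj in zip(quoted, adjusted):
    print(f"{label}: adjusted p = {p_adj:.5f}")
```

Under this illustrative grouping all five adjusted values remain below 0.05; a different or larger family of tests would change the adjusted values.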

rs3814570 is independent of NOD2
Given that the 3020insC frameshift mutation (SNP13) in NOD2 is a known susceptibility factor for CD of the ileum and is associated with reduced HD-5 and -6 levels, we investigated whether the observed association of rs3814570 with ileal CD is independent of NOD2 in the Vienna and Leuven cohorts. We previously reported that the effects of reduced TCF-4 on Paneth cell α-defensins in ileal CD patients were independent of the effects of the SNP13 NOD2 variant, since patients with this NOD2 mutation showed a much more marked decrease of HD-5 and -6 expression [9]. If the two factors are independent, excluding patients harbouring NOD2 SNP13 should yield similar allele frequencies of rs3814570 in the remaining ileal CD patients. Indeed, comparing all Leuven ileal CD patients (n = 232) to a subgroup excluding patients harbouring SNP13 (n = 191), there were no differences in allele frequency (OR 0.99) or allele positivity (OR 0.98). The same was true for the Vienna ileal CD patients (n = 204): following SNP13 exclusion (n = 154), the OR was 1.06 for allele frequency and 1.04 for allele positivity. Thus, exclusion of patients with the NOD2 frameshift mutation SNP13 does not alter the observed allele frequencies of rs3814570 in patients with ileal CD, supporting independent effects of this TCF-4 SNP and NOD2 SNP13 in ileal CD.

Discussion
In a hypothesis-driven candidate gene approach, we investigated the association of sequence polymorphisms in the TCF-4 (TCF7L2) promoter with ileal Crohn's disease. The reported findings represent the third identified genetic association with a link to Paneth cells in ileal CD. Recently, Cadwell et al. reported that Crohn's disease patients homozygous for the disease risk allele of ATG16L1 display Paneth cell abnormalities which were also present in ATG16L1 HM mice [23]. Earlier, we and others have shown that the 3020insC (SNP13) mutation in NOD2, an intracellular muramyl dipeptide receptor present in Paneth cells, is associated with particularly reduced levels of HD-5 and -6 [7,8,12]. Such a distinct deficiency in innate defence also characterizes NOD2 knock-out mice [24]. The characteristic decrease of HD-5 and -6 in ileal CD results in impaired innate immunity at the small intestinal barrier, which is marked by reduced antibacterial activity in the epithelium and is proposed to disrupt the host-microbe balance at the mucosa [7,8,12]. This decrease is also apparent in patients with wild-type NOD2 or with either of the missense mutations (SNP8 and SNP12) but is more pronounced in patients with the frameshift (SNP13) mutation. Different studies point to a delicate balance between commensal microbes and the intestinal mucosa (for review [25,26]). We propose that a perturbation in this dynamic interplay has an important role in IBD pathogenesis [27]. In summary, NOD2 SNP13 can in part explain a loss in HD-5 and -6 levels but is found in only a minority of patients with ileal CD, whereas diminished defensin levels are present in the majority [8] and have an immediate effect on antimicrobial activity against, and composition of, the intestinal microflora [8]. A different functional link in ileal CD leading to diminished Paneth cell α-defensins HD-5 and -6 is a reduced mRNA expression of TCF-4, which has been previously reported by our group [9].
The identification of TCF-4 as a new factor in the pathogenesis of ileal CD provides a more general mechanism for the deficit in HD-5 and -6. Since TCF-4 binds to and directly regulates the promoter regions of HD-5 and HD-6, diminished TCF-4 expression, possibly as a consequence of a genetic mutation, could account for a decrease in both of these defensins. The current data support the hypothesis of a genetic association between a rare SNP variant of TCF-4 and ileal involvement in CD in a subset of patients. This variant, the rs3814570 T allele in the TCF-4 promoter region, was most prevalent in CD localized to the ileum, and no association with either colonic CD or UC was found. The strongest association with the variant was present in ileal CD patients with stricturing disease behaviour as well as in those with an additional involvement of the upper GIT. The fact that genetic variants in TCF-4, a factor indispensable for Paneth cell function, are specifically associated with ileal CD provides further evidence that a decrease in HD-5 and -6 is predisposing and can be seen as a primary defect in the disease. The reported association between the TCF-4 SNP and small intestinal involvement was found in all cohorts from Vienna, Oxford and Leuven. Variability of allele frequencies in controls between the Oxford cohort and the two cohorts from the European mainland might be explained by population differences as a consequence of a heterogeneous ethnic history. For NOD2, an even greater heterogeneity among Europeans has been reported [28]. In a letter regarding the frequency variability of polymorphisms in DLG5, another gene reported to exhibit an association with IBD, Tenesa et al. caution against pooling data from different populations, because effects that are real but differ between cohorts might be concealed [29].
The overall analysis of controls versus ileal CD showed a marginally lower difference when compared with the individual results of the Vienna and Oxford cohorts (Figure 3), so this might play a role in our analysis. However, the differences in overall allele frequencies between the control cohorts were not statistically significant.
Associations of genetic variations of the TCF-4 gene with other diseases exist, but data are limited. An association of two noncoding SNPs in the TCF-4 gene has been observed with diabetes mellitus [30] and, in another study, an association with deletions and insertions of adenines in the coding region was reported in patients with colorectal cancer [22]. We did not find any genetic association of these polymorphisms with UC or CD (or with any of the clinical subgroups) in the samples from Vienna (Figure 2a and data not shown for repetitive A polymorphic region).
Given that Wnt/TCF-4 plays a major role in Paneth cell maturation, aside from its direct function in the expression of Paneth cell α-defensins [31,32], the observed link between ileal CD and TCF-4 suggests that impaired cell differentiation might be involved in the disorder. This would differ from many other views on IBD pathogenesis, which emphasize the role of dysregulated immune function in otherwise normally functioning cells. If a hypothesis of aberrant cell maturation indeed proves significant, effective new therapeutic strategies might target steps in differentiation in addition to regulating or influencing impaired downstream effector molecules like HD-5 and -6.

Figure S1 Sequencing of TCF-4 (TCF7L2) exon regions and intron boundaries. Sequencing of exon regions was performed in a representative and limited number of healthy controls as well as Crohn's disease patients with known clinical phenotype (small intestinal CD). The relative location of identified variants is marked via grey dashes (upper part) and their allele frequency is demonstrated via bars for controls as well as patients (lower part). P < 0.05 is considered statistically significant.
Table S2 TCF-4 (TCF7L2) rs3814570 frequency distribution and statistical analysis of Vienna cohort samples. The different distribution of genotypes is shown for each group and subgroup: controls, inflammatory bowel disease (IBD), Crohn's disease (CD), ulcerative colitis (UC), CD with solely colonic involvement (L2), CD with solely ileal (L1) and ileo-colonic CD (L3). Differences in genotype distribution compared to controls as well as the number of carriers (allele positivity) were subject to t-tests as well as Armitage's trend test. Found at: doi:10.1371/journal.pone.0004496.s003 (0.05 MB DOC)
Table S3 TCF-4 (TCF7L2) rs3814570 frequency distribution and statistical analysis of Leuven cohort samples. The different distribution of genotypes is shown for each group and subgroup: controls, inflammatory bowel disease (IBD), Crohn's disease (CD), ulcerative colitis (UC), CD with solely colonic involvement (L2), CD with solely ileal (L1) and ileo-colonic CD (L3). Differences in genotype distribution compared to controls as well as the number of carriers (allele positivity) were subject to t-tests as well as Armitage's trend test. Found at: doi:10.1371/journal.pone.0004496.s004 (0.05 MB DOC)