The Crohn’s disease associated SNP rs6651252 impacts MYC gene expression in human colonic epithelial cells

Crohn’s disease (CD) is a debilitating inflammatory bowel disease (IBD) that arises from chronic inflammation in the gastrointestinal tract. Genome-wide association studies (GWAS) have identified over 200 single nucleotide polymorphisms (SNPs) that are associated with a predisposition for developing IBD. For the majority, the causal variant and target genes affected are unknown. Here, we investigated the CD-associated SNP rs6651252 that maps to a gene desert region on chromosome 8. We demonstrate that rs6651252 resides within a Wnt responsive DNA enhancer element (WRE) and that the disease associated allele augments binding of the TCF7L2 transcription factor to this region. Using CRISPR/Cas9 directed gene editing and epigenetic modulation, we find that the rs6651252 enhancer regulates expression of the c-MYC proto-oncogene (MYC). Furthermore, we found MYC transcript levels are elevated in patient-derived colonic segments harboring the disease-associated allele in comparison to those containing the ancestral allele. These results suggest that Wnt/MYC signaling contributes to CD pathogenesis and that patients harboring the disease-associated allele may benefit from therapies that target MYC or MYC-regulated genes.


Introduction
Crohn's disease (CD) and ulcerative colitis (UC) are the two main classes of inflammatory bowel disease (IBD), and arise from chronic inflammation in the gastrointestinal (GI) tract [1]. CD can present anywhere along the GI tract whereas UC is confined primarily to the colon [1,2]. In a generally accepted view, IBD results from one or more environmental triggers in a genetically susceptible individual [3,4]. While the precise environmental exposure is debatable, the fact that 5-23% of IBD patients have a first-degree relative who is also afflicted with the disease supports a role for genetic inheritance [5].

Cell lines
The HCT116 and DLD-1 cell lines were obtained from the American Type Culture Collection (cat. numbers CCL-221 and CCL-247) while HEK293T cells were obtained from Invitrogen. HCT116 and HEK293T cells were maintained in DMEM (Corning) supplemented with 10% FBS, 5 mM L-glutamine, and 1% penicillin/streptomycin. DLD-1 cells were maintained in RPMI supplemented with 10% FBS and 1% penicillin/streptomycin. Cells were cultured in an incubator at 37˚C in 5% CO2.

Plasmids
To generate the rs6651252 WRE luciferase plasmid (rs6651252-luc), a 555 bp DNA segment containing rs6651252 was amplified by PCR, with genomic DNA isolated from HCT116 cells using the DNeasy kit (Qiagen) serving as the template. The PCR product was subcloned into the pGL3-promoter luciferase vector (Promega) as a KpnI-NheI fragment. Site-directed mutagenesis was conducted to convert the ancestral T variant to the disease-associated C variant using the QuikChange mutagenesis kit (Agilent) following the manufacturer's instructions.
To generate the rs6651252 WRE expanded luciferase construct [rs6651252 (exp)-luc], a 2.7 kb segment containing the rs6651252 WRE was amplified by PCR and the product was likewise subcloned as a KpnI-NheI fragment into the pGL3-promoter luciferase vector. Primer sequences used in the PCR reactions can be found in S1 Table.
For the epigenetic silencing assay, the phU6-sgRNA plasmid and the pLV hU6-sgRNA hUbC-dCas9-KRAB-T2a-Puro plasmid encoding the dCas9-KRAB fusion protein were obtained from Addgene (#53188 and #71236, respectively). Using the 2.7 kb segment of DNA, the CRISPR guide RNA design tool (crispr.mit.edu) was used to identify unique guide sequences. For each of eight guides, paired oligonucleotides were designed with BbsI overhanging restriction sites to facilitate subcloning into the phU6-gRNA plasmid vector. Complementary DNA oligonucleotides (10 μM) corresponding to each of the eight guide RNA sequences were annealed in 1x T4 ligase buffer by heating to 95˚C for 5 min and cooling at 5˚C/min to room temperature. To digest the vector and ligate the annealed guides, 25 ng of phU6-gRNA plasmid was mixed with 1 μl annealed guide pair, 1 μl of 10x T4 ligase buffer, 2.5 U BbsI, 0.5 μl of T4 ligase and 6 μl H2O. The reactions were subjected to 25 cycles at 37˚C for 5 min and 23˚C for 5 min on a DNAengine thermocycler (BioRad). Primer sequences used to generate the guide RNAs can be found in S1 Table.
To create the rs6651252 WRE deletion cell line, guide RNAs were designed to flank a 692 bp region surrounding rs6651252, and corresponding oligonucleotides were annealed as described above. The fragments were ligated into the pSpCas9(BB)-2A-GFP (PX458) CRISPR/Cas9 plasmid (Addgene, #48138), which was first digested with BbsI. Primer sequences used to generate guide RNAs and to assess the rs6651252 status are listed in S1 Table. Sanger sequencing was used to verify each plasmid insert and enhancer deletions in the knockout clones.
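As a rough illustration of the oligonucleotide design step, the sketch below builds a complementary oligo pair for one guide so that the annealed duplex carries overhangs compatible with a BbsI-digested vector. It assumes the CACC/AAAC overhang convention and the optional leading G commonly used with PX458; the overhangs appropriate for the phU6-gRNA vector may differ, and the example 20-nt guide and the function names are placeholders, not sequences from this study (those are listed in S1 Table).

```python
# Hypothetical sketch of paired-oligo design for BbsI cloning.
# Overhangs and the leading G follow a common PX458 convention (an assumption).

COMPLEMENT = str.maketrans("ACGT", "TGCA")
BBSI_SITES = ("GAAGAC", "GTCTTC")  # BbsI recognition site and its reverse complement


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_guide_oligos(guide, top_overhang="CACC", bottom_overhang="AAAC"):
    """Return (top, bottom) oligos, both written 5'->3', for one 20-nt guide."""
    guide = guide.upper()
    if len(guide) != 20 or set(guide) - set("ACGT"):
        raise ValueError("expected a 20-nt guide containing only A, C, G, T")
    if any(site in guide for site in BBSI_SITES):
        raise ValueError("guide contains a BbsI site and would be cut during cloning")
    # Prepend a G when the guide does not start with one (U6 promoter preference).
    insert = guide if guide.startswith("G") else "G" + guide
    top = top_overhang + insert
    bottom = bottom_overhang + reverse_complement(insert)
    return top, bottom


example_guide = "ACGTTGCATGCAAGCTTGCA"  # placeholder sequence, not from S1 Table
top, bottom = design_guide_oligos(example_guide)
print("top:   ", top)
print("bottom:", bottom)
```

After annealing, the two oligos pair along the guide sequence and leave the 4-nt overhangs single-stranded for ligation.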

Luciferase reporter assays
Luciferase assays were conducted as described previously [22]. Briefly, approximately 2.5 x 10^4 cells were seeded per well in a 24-well plate. Transfections were conducted using Lipofectamine 2000 following the manufacturer's guidelines. Each reaction contained 50 ng of the luciferase reporter plasmid and 2 ng of pLRL-SV40 Renilla, which served as a transfection control. Where indicated, 50 ng of pcDNA3.1-β-catenin S45F [28], 50 ng of pME18 Lef [28], and 50 ng of pLV hU6-sgRNA hUbC-dCas9-KRAB-T2a-Puro were added to the transfection. The phU6-sgRNA plasmids encoding the guide RNAs (25 ng each) were included as indicated. The total amount of DNA was adjusted to 2 μg per reaction using pBluescript (Stratagene). Transfection mixtures were incubated on cells for 6 h, after which the media was replaced with normal growth media. Each reaction was conducted in quadruplicate. After 24 h, cells were lysed in 200 μl passive lysis buffer and luciferase levels were measured using the dual luciferase assay kit (30005-2; Biotium) on a Glomax 20/20 single chamber luminometer (Promega).
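The normalization implied by this dual-reporter design can be sketched as follows: each well's firefly signal is divided by its Renilla signal to correct for transfection efficiency, and the result is expressed relative to the mean of the empty-vector control wells. The readings are made-up numbers for illustration and the function names are hypothetical; the exact analysis used in the study is not spelled out in the text.

```python
from statistics import mean, stdev


def normalized_ratios(firefly, renilla):
    """Firefly/Renilla ratio for each well (corrects for transfection efficiency)."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla readings must be paired per well")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_over_control(sample_ratios, control_ratios):
    """Express each sample ratio relative to the mean control ratio."""
    control_mean = mean(control_ratios)
    return [r / control_mean for r in sample_ratios]


# Made-up quadruplicate readings (relative light units), for illustration only.
control = normalized_ratios([1000, 1100, 900, 1000], [500, 550, 450, 500])
reporter = normalized_ratios([4000, 4400, 3600, 4200], [500, 550, 450, 500])

folds = fold_over_control(reporter, control)
print(f"mean fold over control = {mean(folds):.2f} (SD {stdev(folds):.2f}, n = {len(folds)})")
```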

CRISPR/Cas9-mediated deletion of the rs6651252 WRE
CRISPR/Cas9 modified clonal HCT116 cell lines were generated following the protocol outlined previously [29]. Briefly, 500 ng of each CRISPR/Cas9 plasmid (PX458), encoding guide RNA sequences that flanked rs6651252, were transfected into HCT116 cells using Lipofectamine 2000 (Invitrogen) for 6 h. After 24 h, the cells were harvested and a FACSDiva (Becton Dickinson) machine was used to seed 2 cells per well of a 96-well plate. After expanding the clones, genomic DNA was isolated using the Lyse&Go kit (Pierce) and the rs6651252 region was amplified by PCR using the DreamTaq Green (Thermo) polymerase and primers listed in S1 Table. The products were resolved on a 1% agarose gel, excised with a scalpel, and purified using the MinElute PCR Purification Kit (Qiagen). The products were sequenced to identify clones harboring rs6651252 WRE deletions.

DNA pull-down assay
The DNA pull-down assay was conducted as previously described with minor modifications [30]. Probes were designed with 10 nucleotides flanking each side of the TCF consensus motif within the rs6651252 WRE. The paired probes were annealed as described previously [30]. The annealed probes (15 μM) were incubated with 0.1 mg of streptavidin-coated magnetic beads (Promega Z5481) for 1.5 h at room temperature. Prior to use, the beads were washed three times in 1X sodium citrate buffer (30 mM NaCl, 0.035 mM C6H5O7Na3) and then resuspended in 100 μl binding buffer (1 M NaCl, 10 mM Tris, 1 mM EDTA, pH = 8.0). To prepare the protein lysates, 2.5 x 10^7 HCT116 cells were harvested and lysed in RIPA buffer supplemented with freshly added protease inhibitors (1 mM PMSF, 10 μl/ml aprotinin, 10 μg/ml leupeptin). For each binding reaction, 200 μg of protein lysate was incubated at 4˚C with 150 μg sonicated salmon sperm DNA for 30 min to reduce non-specific interactions. This lysate was added to probes conjugated with the magnetic beads and the reactions were incubated for 2 h at room temperature on a rotating platform. The protein/DNA complexes were collected using a magnetic stand, washed three times in RIPA buffer, and eluted in 50 μl of 2x Laemmli loading buffer. The proteins were resolved on an 8% polyacrylamide gel and analyzed by standard western blotting as previously described [31]. The blots were incubated overnight with anti-TCF7L2 antibodies (05-511 Millipore, 1:750 dilution) followed by incubation with HRP-conjugated anti-mouse secondary antibody (Jackson Immunoresearch, 1:5000 dilution) for 2 h prior to ECL treatment and exposure to film. In competition assays, 250 ng, 500 ng, or 1500 ng of annealed and unlabeled probes were added concurrently to the binding reactions containing the indicated biotinylated probes. Probe sequences are listed in S1 Table.

CRISPR/dCas9 repression assay
In the epigenetic repression assays, 250 ng of the pLV hU6-sgRNA hUbC-dCas9-KRAB-T2a-Puro plasmid and 25 ng each of the phU6-gRNA plasmids harboring guide RNAs that tiled the 2.7 kb segment containing the rs6651252 WRE were electroporated into 1.0 x 10^7 HCT116 or HEK293T cells using the Amaxa Cell Line Nucleofector Kit V (Lonza) and an Amaxa Nucleofector II electroporator following the manufacturer's guidelines. MYC gene expression was assessed by RT-qPCR 72 h after electroporation.

Quantitative reverse transcription PCR
To assess MYC gene expression, RNAs were collected and cDNAs synthesized using protocols that were described previously [29]. For experiments involving patient tissues, approximately 1 g of flash-frozen and full thickness colonic tissue was homogenized in 1 ml of TRIzol (Thermo Fisher) in an Eppendorf tube using a disposable micropestle [32]. Following a 5 min incubation at room temperature, 200 μl of chloroform was added and the mixture was shaken vigorously before centrifugation at 12,000 x g for 15 min at 4˚C. The upper layer was removed, mixed with an equal volume of 100% ethanol, and the RNAs were purified using an RNeasy Mini kit (Qiagen). For both cell line and patient sample experiments, cDNA was synthesized from 500 ng of RNA using the iScript cDNA synthesis kit (BioRad), following the manufacturer's protocol. Data are presented as relative levels using the 2^ΔCT method with GAPDH and TUBB3 serving as the reference genes. Primer sequences used in the RT-qPCR experiments are listed in S1 Table.
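As a worked illustration of the relative-quantification arithmetic, the sketch below computes a 2^ΔCT value for a target transcript against two reference genes. It assumes that ΔCT is the mean reference CT minus the target CT (so that larger values mean more transcript), that the two reference CTs are averaged arithmetically (equivalent to a geometric mean of the reference quantities), and that amplification efficiency is 100%. None of these details are stated in the text, the CT values are invented, and the function name is hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, reference_cts):
    """Relative transcript level as 2^(mean reference CT - target CT).

    Assumes 100% amplification efficiency, i.e. one cycle = two-fold change.
    """
    delta_ct = mean(reference_cts) - target_ct
    return 2.0 ** delta_ct


# Invented CT values for illustration; the two reference CTs stand in for GAPDH and TUBB3.
myc_level = relative_expression(target_ct=24.0, reference_cts=[18.0, 20.0])
myc_level_one_cycle_earlier = relative_expression(target_ct=23.0, reference_cts=[18.0, 20.0])

print(myc_level)
print(myc_level_one_cycle_earlier / myc_level)
```

Under these assumptions, a target that crosses threshold one cycle earlier is scored as twice as abundant.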

Patient samples
The samples evaluated in this study were collected with patient consent as part of the biorepository within the Department of Surgery, Division of Colon and Rectal Surgery at the Pennsylvania State University College of Medicine. The Pennsylvania State University College of Medicine Institutional Review Board (IRB) evaluated and approved protocol number HY98-057EP-A "A Proposal for Creation of an Inflammatory Bowel Disease Registry" for creation and maintenance of our biorepository. The allelic status of IBD associated SNPs is assessed using a custom oligonucleotide array [33].

Statistics
Each experiment was performed at least three times. For ChIP-qPCR and RT-qPCR, each sample was amplified in quadruplicate reactions per experiment. Significance of the results was evaluated with either a pairwise Student's t-test or one-way ANOVA followed by a Tukey test. Data were considered significant at a p-value of less than 0.05.

β-Catenin and TCF7L2 bind the rs6651252 locus in colonic epithelial cells
In our prior work, we conducted a ChIP-Seq screen to identify β-catenin-bound genomic regions in a colonic epithelial cell line [27]. We hypothesized that some of these regions could demarcate enhancer elements that contain IBD-associated SNPs. By overlapping β-catenin ChIP-Seq peak regions with positions of IBD SNPs, we identified the CD-associated SNP, rs6651252, as a leading candidate (Fig 1A). Analysis of ENCODE data deposited on the genome browser (http://genome.ucsc.edu/) found that elevated levels of H3K27 acetylated histones, which are a marker of active enhancers, colocalized to this region [34,35]. To confirm these findings, we conducted ChIP-qPCR assays in the HCT116 and DLD-1 colonic epithelial cell lines and used the non-intestinal cell line, HEK293T, as a control. Using primers that flanked rs6651252, we detected robust β-catenin and TCF7L2 binding to this region in the colonic epithelial cell lines but not in HEK293T (Fig 1B and 1C). Very little signal was detected using primers that annealed to a region in the TUBULIN gene, which attests to the specificity of our ChIP assays. Furthermore, we detected elevated levels of H3K27 acetylated histones at rs6651252 in the intestinal cell lines but not in HEK293T, suggesting that this CD-associated SNP may localize to an active and cell-type specific regulatory DNA enhancer element (Fig 1D) [34,35].
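The overlap step described above amounts to an interval lookup: for each SNP, check whether its coordinate falls inside any ChIP-Seq peak on the same chromosome. A minimal sketch follows, assuming BED-style half-open intervals (start included, end excluded). The peak coordinates, SNP positions, and SNP names are placeholders for illustration and are not taken from the ChIP-Seq data set or any genome build.

```python
from collections import defaultdict


def snps_in_peaks(snps, peaks):
    """Return names of SNPs whose position lies inside at least one peak.

    snps  : iterable of (name, chrom, pos)
    peaks : iterable of (chrom, start, end), half-open [start, end)
    """
    peaks_by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        peaks_by_chrom[chrom].append((start, end))

    hits = []
    for name, chrom, pos in snps:
        # Linear scan per chromosome; adequate for a few hundred SNPs.
        if any(start <= pos < end for start, end in peaks_by_chrom.get(chrom, ())):
            hits.append(name)
    return hits


# Placeholder coordinates, for illustration only.
example_peaks = [("chr8", 1000, 1500), ("chr8", 5000, 5600), ("chr1", 200, 900)]
example_snps = [("snp_A", "chr8", 1200), ("snp_B", "chr8", 3000), ("snp_C", "chr1", 900)]

hits = snps_in_peaks(example_snps, example_peaks)
print(hits)
```

For genome-scale peak sets, a sorted index or an interval tree would be a more efficient choice than the linear scan used here.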

TCF7L2 binds DNA harboring disease-associated rs6651252 variant with stronger affinity
The rs6651252 SNP maps immediately adjacent to a consensus TCF binding motif (Fig 2A) [27]. The ancestral allele at this position is a T while the disease-associated allele is a C with a minor allele frequency of 0.14 [7]. We conducted DNA pull-down assays to determine whether TCF7L2 bound to this fragment of DNA and to test whether the rs6651252 allelic variants impacted its association. Nuclear protein lysates from HCT116 cells were incubated with biotinylated DNA probes, the complexes were precipitated using streptavidin-conjugated magnetic beads, and eluted proteins were subjected to western blot analysis. We found that TCF7L2 bound to this element and that mutating the TCF consensus motif blocked its association (Fig 2B). Moreover, the probe containing the C variant precipitated more TCF7L2 than the probe containing the T variant (Fig 2B). In addition, we found that increasing amounts of unlabeled C probe competed more effectively for TCF7L2 binding to the biotinylated T probe than equivalent amounts of unlabeled T probe (Fig 2C). Therefore, the C variant of rs6651252 potentiates binding of TCF7L2 to this DNA element.

rs6651252 demarcates a Wnt responsive DNA enhancer element
We next used heterologous luciferase reporter assays to determine whether rs6651252 was embedded within a DNA enhancer element. Using genomic DNA isolated from a human colonic epithelial cell line, we PCR amplified a 555 bp segment containing rs6651252 and inserted it upstream of the minimal SV40 promoter in the pGL3-luciferase vector (Fig 3A). We refer to this plasmid as rs6651252-luc; sequencing of the insert showed that it contained the ancestral T allele. In comparison to HCT116 and DLD-1 cells transfected with the vector backbone alone, rs6651252-luc drove higher levels of luciferase (Fig 3B and 3C). In HEK293T cells, rs6651252-luc produced lower levels of luciferase relative to the control vector (S1 Fig). Next, we used site-directed mutagenesis to substitute the rs6651252 T variant with the disease-associated C variant (Fig 3D). In both HCT116 and DLD-1 cells, this plasmid drove higher levels of luciferase relative to rs6651252-luc containing the T variant (Fig 3E and 3F). These results demonstrate that a DNA segment harboring rs6651252 is a cell-type specific enhancer element and that the C variant augments its activity.

The rs6651252 WRE regulates MYC gene expression
In a recent study, Meddens et al. used 4C-seq to identify the gene targets of enhancer elements containing embedded IBD-associated SNPs [36]. In that report, rs6651252 was found juxtaposed to the MYC and POU5F1B gene loci on chromosome 8 through long-range chromatin loops [36]. Whether the rs6651252 enhancer regulated MYC or POU5F1B expression was not demonstrated. To test if these genes were regulated targets, we used CRISPR/Cas9 to delete the rs6651252 enhancer in the HCT116 cell line (Fig 4A) [29]. After propagating independent clonal lines, we used PCR to assess rs6651252-enhancer status. We successfully identified multiple lines with either heterozygous or homozygous deletions (Fig 4B). The PCR-based genotyping was confirmed using Sanger sequencing. In comparison to a control clone that lacked deletions, two independent knockout clones displayed reduced levels of MYC expression.
To confirm that MYC gene expression is regulated by the distal rs6651252 WRE, we used a CRISPR/Cas9-based approach to epigenetically repress the function of this DNA regulatory element [37]. In this assay, guide RNAs are used to recruit a mutant Cas9 (dCas9, lacking endonuclease activity) fused to the KRAB transcriptional repressor. We generated eight guide RNAs that tiled a 2.7 kb segment to target dCas9-KRAB to the rs6651252 locus (Fig 5A). Prior to using this system, we performed various control experiments to validate the approach. First, we conducted luciferase assays using the same vector backbone used previously in Fig 3, except that the 2.7 kb segment containing rs6651252 was inserted upstream of the SV40 promoter. In transfected HEK293T cells, plasmids containing β-catenin and LEF (a TCF family member) cDNAs activated expression of luciferase (Fig 5B). Whereas inclusion of plasmids encoding the guide RNAs alone resulted in a slight decrease, co-transfecting guides along with dCas9-KRAB blocked β-catenin/LEF-driven luciferase activity (Fig 5B). Similarly, gRNA/dCas9-KRAB complexes effectively reduced rs6651252 (exp)-driven luciferase levels in HCT116 cells (Fig 5C). Using ChIP-qPCR assays in transfected HEK293T cells, we found that inclusion of the guide RNAs increased levels of dCas9-KRAB recruited to the rs6651252 locus (Fig 5D). Having validated the system, we introduced the guide RNAs and dCas9-KRAB into HCT116 cells, and this significantly reduced MYC expression in these cells (Fig 5E). Together, these experiments demonstrate that the rs6651252 enhancer regulates MYC expression in colonic epithelial cells.

The rs6651252 C variant correlates with increased MYC expression in patient colonic tissues
To determine whether rs6651252 impacts MYC gene expression in vivo, we obtained colonic segments that were surgically resected from CD patients. As controls, we obtained intestinal tissues from patients who underwent resection for non-IBD related issues such as slow transit or volvulus. We genotyped rs6651252 in genomic DNAs isolated from collected blood and found that while all control patients were homozygous for the ancestral T variant, half of the CD patients were TT and the other half were heterozygous TC. We then isolated RNA from flash-frozen tissues and assessed MYC gene expression using RT-qPCR. This analysis found that tissues from CD patients who were heterozygous for the risk variant (TC) contained higher levels of MYC transcripts compared to tissues from CD patients homozygous for the ancestral allele (TT) or from controls (Fig 6).

Discussion
GWAS have identified over 200 SNPs that are associated with a predisposition for developing IBD [3,6-8]. While some of these are found within protein-coding regions of the genome, most map to intergenic and gene-poor loci [3]. Recent work has shown that many of these non-coding SNPs are found within regions of accessible chromatin, demarcated by elevated levels of H3K27ac [13]. Most often, these SNPs are assumed to impact the nearest gene promoter [2,3]. However, it is known that enhancers are capable of impacting more than one gene and can bypass the nearest gene to influence expression of a neighboring gene [38]. Due to the inherent difficulty in studying non-coding regions of DNA, particularly those that may function in a cell-type specific manner, the causative impact of these non-coding SNPs on gene function largely remains a mystery. The current study focused on the CD-associated SNP rs6651252, which maps to the 8q24 locus. This locus is a large non-coding region of the genome that contains numerous SNPs that have been shown to impact ovarian, prostate and colorectal cancers [39].
We demonstrate that rs6651252 demarcates a WRE and that the disease-associated allele potentiates enhancer activity through higher affinity binding of TCF7L2. In an earlier study, Meddens et al. reported evidence that this rs6651252 enhancer is juxtaposed to the MYC promoter, implicating MYC as a direct target of this enhancer [36]. Our work confirms and extends this finding as we found that either deletion or epigenetic silencing of this element reduces MYC expression in human colonic epithelial cells. Moreover, in our survey of human colonic tissues resected from IBD patients, we find that the disease-associated rs6651252 allele is associated with increased MYC expression. Together, these findings are similar to those reported for the cancer-associated SNP rs6983267, which also resides within this gene desert region on chromosome 8 [40][41][42]. While the rs6983267 disease-associated allele augments TCF binding and enhancer activity, whether it differentially impacts MYC gene expression in normal tissues or cancers is still a matter of debate [43].
While much attention has been given to understanding the role of MYC in colorectal carcinogenesis, much less is known about its role in IBD [44]. Using the dextran sodium sulfate (DSS) model of acute colitis in mice, we reported that slight elevation of MYC (~2.5 fold) promotes restitution of the colonic epithelium [45,46]. Furthermore, we found that lithium treatment, which inhibits glycogen synthase kinase and stabilizes MYC, confers a favorable response to colonic regeneration after acute DSS-induced damage in mice [47]. These findings indicate that short-term MYC stabilization may provide favorable outcomes in IBD by promoting restitution of the epithelial monolayer [45][46][47][48]. However, an earlier study reported that levels of MYC transcripts are elevated in intestinal tissues isolated from IBD patients in comparison to controls [49]. Furthermore, higher levels of MYC protein were found in inflamed IBD intestinal tissue in comparison to control, non-IBD colonic segments [49]. In addition, MYC expression is elevated, and the MYC chromosomal locus is frequently amplified, in colitis-associated cancer (CAC) [50][51][52]. Therefore, while transient stabilization of MYC is beneficial, long-term and sustained MYC expression is likely detrimental in IBD and CAC.
Our analysis of surgically resected colonic tissues indicates that CD patients harboring the rs6651252 risk allele display elevated levels of MYC transcripts relative to levels found in tissues from CD patients that are homozygous for the ancestral allele or controls (Fig 6). Because we have a limited number of control tissues in our Colorectal Disease Biobank, we were unable to identify control colonic tissues that were heterozygous for the risk allele. This precludes interpretations that elevated MYC may contribute to disease onset. However, our analysis of genotyped diseased tissues suggests that elevated MYC may contribute to disease pathogenesis in CD patients harboring the risk allele. It follows that patients who are homozygous for the disease variant may have a more pronounced phenotype, and as we continue to recruit patients into our biorepository, we hope to identify such patients and will explore this possibility in a future study.
One limitation of our study of human colonic tissues is that full-thickness specimens were analyzed. While we favor a model whereby the rs6651252 disease-associated allele elevates MYC expression in the colonic epithelium, we cannot dismiss the possibility that other cell types, such as resident lymphocytes, could be contributing to the overall differences we detect. In support of this possibility, Mokry et al. demonstrated that rs6651252 resides in a region of accessible chromatin in specific CD4+ sub-cellular populations [13]. Due to the importance of T-cells in IBD pathologies [53], and the role of Wnt signaling in T-cell biology [54], it is possible that the rs6651252 WRE influences MYC expression in cells that function in the adaptive cellular immune response. Purifying specific populations of cells from genotyped samples and assessing MYC gene expression in these cells will be required to fully understand how rs6651252 impacts CD pathogenesis.
In summary, our work shows that rs6651252 demarcates a WRE within the 8q24 locus. The CD-associated allele facilitates stronger TCF7L2 binding to the WRE, potentiates enhancer activity, and increases MYC gene expression. While additional work is needed to further define the cell types in which this rs6651252 WRE operates, and the constellation of genes whose expression it impacts, these findings suggest that CD patients harboring this allele may benefit from therapies that target MYC or MYC-regulated genes. Along these lines, MYC gene expression is sensitive to inhibitors that target the bromodomain and extra-terminal family of proteins (iBETs), such as JQ1 [55,56]. Our findings presented here support the idea that JQ1 should be further evaluated in pre-clinical mouse models of IBD, with the hope that someday it could be used to augment current treatment strategies for IBD patients.