We have developed a transcriptome-wide approach to identify genes affected by promoter CpG island DNA hypermethylation and transcriptional silencing in colorectal cancer. By screening cell lines and validating tumor-specific hypermethylation in a panel of primary human colorectal cancer samples, we estimate that nearly 5% or more of all known genes may be promoter methylated in an individual tumor. When directly compared to gene mutations, we find larger numbers of genes hypermethylated in individual tumors, and a higher frequency of hypermethylation within individual genes harboring either genetic or epigenetic changes. Thus, to enumerate the full spectrum of alterations in the human cancer genome, and to facilitate the most efficacious grouping of tumors to identify cancer biomarkers and tailor therapeutic approaches, both genetic and epigenetic screens should be undertaken.
Loss of gene expression in association with aberrant accumulation of 5-methylcytosine in gene promoter CpG islands is a common feature of human cancer. Here, we describe a method to discover these genes that permits identification of hundreds of novel candidate cancer genes in any cancer cell line. We now estimate that as much as 5% of colon cancer genes may harbor aberrant gene hypermethylation and we term these the cancer “promoter CpG island DNA hypermethylome.” Multiple mutated genes recently identified via cancer resequencing efforts are shown to be within this hypermethylome and to be more likely to undergo epigenetic inactivation than genetic alteration. Our approach allows derivation of new potential tumor biomarkers and potential pathways for therapeutic intervention. Importantly, our findings illustrate that efforts aimed at complete identification of the human cancer genome should include analyses of epigenetic, as well as genetic, changes.
Citation: Schuebel KE, Chen W, Cope L, Glöckner SC, Suzuki H, Yi J-M, et al. (2007) Comparing the DNA Hypermethylome with Gene Mutations in Human Colorectal Cancer. PLoS Genet 3(9): e157. doi:10.1371/journal.pgen.0030157
Editor: Jeannie T. Lee, Massachusetts General Hospital, United States of America
Received: April 12, 2007; Accepted: July 31, 2007; Published: September 21, 2007
Copyright: © 2007 Schuebel et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: Supported by National Institute of Environmental Health Sciences grant ES11858, National Cancer Institute grants CA043318 and CA06973–44 and American Surgical Association Foundation.
Competing interests: The commercial rights to the MSP technique belong to Oncomethylome Sciences. SBB and JGH serve as consultants to Oncomethylome Sciences and are entitled to royalties from any commercial use of this procedure.
Abbreviations: CRC, colorectal cancer; DAC, 5-aza-2′-deoxycytidine; DKO, double knockout; MSP, methylation-specific PCR; RT-PCR, reverse transcriptase PCR; TSA, trichostatin A
It is now well established that loss of proper gene function in human cancer can occur through both genetic and epigenetic mechanisms [1,2]. The number of genes mutated in human tumor samples is being clarified. Recently, Sjöblom et al. sequenced 13,023 genes in colorectal cancer (CRC) and breast cancer, and estimated an average of 14 significant mutations per tumor, suggesting that a relatively small number of genetic events may be sufficient to drive tumorigenesis. In contrast, the full spectrum of epigenetic alterations is not well delineated. The best-defined epigenetic alteration of cancer genes involves DNA hypermethylation of clustered CpG dinucleotides, or CpG islands, in promoter regions associated with the transcriptional inactivation of the affected genes. These promoters are located proximal to nearly half of all genes and are thought to remain primarily methylation free in normal somatic tissues. The exact number of such epigenetic lesions in any given tumor is not precisely known, although a growing number of screening approaches, none covering the whole genome efficiently, are identifying an increasing number of candidate genes [5–13]. Given the large number of potential target promoters present in the genome, we hypothesized that many more hypermethylated genes await discovery.
Herein, we describe a whole human transcriptome microarray screen to identify genes silenced by promoter hypermethylation in human CRC. The approach readily identifies candidate cancer genes in single tumors with a high efficiency of validation. By comparing the list of candidate hypermethylated genes with mutated genes recently identified in CRC, we establish key relationships between the altered tumor genome and the gene hypermethylome. Our studies provide a platform to understand how epigenetic and genetic alterations drive human tumorigenesis.
Developing the Whole Transcriptome Approach
Our first step towards a global identification of hypermethylation-dependent gene expression changes was made by comparing, in a genome-wide expression array-based approach, wild-type HCT116 CRC cells with isogenic partner cells carrying individual and combinatorial genetic deletions of two major human DNA methyltransferases (Figure 1A). Importantly, in the DNMT1(−/−)DNMT3B(−/−) double knockout (DKO) HCT116 cells, which have virtually complete loss of global 5-methylcytosine, all previously individually examined hypermethylated genes lacking basal expression in wild-type cells undergo promoter demethylation with concomitant gene re-expression [10,14–16]. By stratifying genes according to altered signal intensity on a 44K Agilent Technologies array platform, we observe a unique spike of gene expression increases in the DKO cells when compared to the isogenic wild-type parental cells, or isogenic cell lines in which DNMT1 or DNMT3B have been individually deleted and which harbor minimal changes in DNA methylation (Figure 1B). This minimal change in the DNMT1(−/−) cells may, in part, be due to recently identified alternative transcripts arising from the DNMT1 locus [17,18].
(A) RNA from the indicated cell lines was isolated, labeled, hybridized, scanned, and fluorescent spot intensities normalized by background subtraction and Loess transformation using Agilent Technologies 44K human microarrays. Parental wild-type HCT116 cells (WT) and isogenic knockout counterparts for DNA methyltransferase 1 (DNMT1−/−) or 3b (DNMT3B−/−) are compared in our study. DKO cells are doubly deficient for both DNMT1 and DNMT3B.
(B) Gene-expression changes in HCT116 cells with genetic disruption of various DNA methyltransferases. A 3-D scatter plot indicating the gene-expression levels in HCT116 cells with genetic disruption of DNMT1 (x-axis), DNMT3B (z-axis), and both DNMT1 and DNMT3B (DKO; y-axis) in fold scale. Individual gene-expression changes are in black with the average for three experiments (red spots) or from an individual experiment (blue spots) for those genes in DKO cells with greater than 4-fold expression change.
(C) HCT116 cells were treated with 300 nM TSA for 18 h or 5 μM DAC for 96 h and processed as described above.
(D) Gene-expression changes for HCT116 cells treated with TSA (x-axis) or DAC (y-axis) are plotted by fold change. Yellow spots indicate genes from DKO cells with 2-fold changes and above. Notice the loss of sensitivity when compared to gene-expression increases seen in DKO cells (80% of genes with greater than 4-fold increases in the DKO cells show only greater than 1.3-fold increases in DAC-treated cells). Green spots indicate randomly selected genes verified to have complete promoter methylation in wild-type cells and reexpression in DKO cells and after DAC treatment, while red spots indicate selected genes that were identified as false positives (see Figures 4, 6, and 7 for validation results). Blue spots indicate the location of the 11 guide genes used in this study, which were previously shown to be hypermethylated and completely silenced in HCT116 cells (see Figure 3 for description). A distinct group of genes, including five of 11 guide genes, displays increases of greater than 2-fold after DAC treatment but no increase after TSA treatment. These genes form the top tier of candidate hypermethylated genes as discussed in the text.
(E) Relatedness of whole-transcriptome expression patterns identified by dendrogram analysis. Individual single genetic disruptions of DNMT1 and DNMT3B, DKO and DAC treatment, and TSA treatment form three distinct categories of gene expression changes.
(A) Gene-expression changes for the indicated cells treated with TSA (x-axis) or DAC (y-axis) are plotted by log-fold change, and individual genes are shown in black.
(B) Validation of the DNA hypermethylome. The characteristic spike of hypermethylated genes defined by treatment of cells with DAC or TSA consists of two tiers, with distinct features. The top tier of genes was identified as a zone in which gene expression did not increase with TSA (<1.4 fold) and displayed no detectable expression in wild-type cells, but increased greater than 2-fold with DAC treatment. The next tier of genes was identified as a cluster of genes for which expression changes of TSA and wild type were identical to those in the top tier, but increased between 1.4-fold and 2-fold with DAC treatment. Gene expression validation by RT-PCR and MSP indicated a validation frequency of 91% for top-tier genes in HCT116 cells, including genes that increased in DKO cells by greater than 2-fold. Next-tier genes in HCT116 cells were confirmed at a frequency of 49%, and in the SW480 top tier, with a frequency of 65%.
(C) Shared candidate hypermethylated genes in CRC cell lines. We identified a total of 5,906 unique genes in all six cell lines with expression changes falling within the criteria of top- or next-tier categories. Overlaps in gene expression changes among two, three, four, five, or six cell lines are indicated; these range from 1,414 genes shared among two cell lines to 78 genes that were shared among all six cell lines.
We tested our approach using a pharmacologic strategy based on our previous approach, but now markedly modified to provide whole-transcriptome coverage, to identify silenced hypermethylated genes in any cancer cell line. For densely hypermethylated and transcriptionally inactive genes, the DNA demethylating agent 5-aza-2′-deoxycytidine (DAC) has a well established capacity to induce gene re-expression [19,20]. On the other hand, for these same genes, the class I and II histone deacetylase inhibitor trichostatin A (TSA) will not alone induce reexpression [10,21]. We now use this lack of TSA response for such genes to provide a new informatics filter to identify the majority of DNA hypermethylated genes in cancer. After treatment of HCT116 cells with either DAC or TSA (Figure 1C), we identified a zone in which gene expression did not increase with TSA (<1.4-fold) and displayed no detectable expression in mock-treated cells. Within this zone, we observed a characteristic spike of DAC-induced gene expression that almost completely encompasses the genes with increased expression in DKO cells (compare yellow spots in Figure 1D with blue spots in Figure 1B). This gene spike is absolutely dependent upon analysis of only genes that fail to respond to histone deacetylase inhibition, as underscored by a cluster analysis that shows the close relationship between genes in DKO and DAC-treated cells, with a separate grouping of gene-expression changes after TSA treatment alone or in single knockouts (Figure 1E). These data confirm previous studies covering much less of the genome, and using only treatment of cells with DAC and TSA together, in which genes with dense CpG islands that were reexpressed by TSA harbored only partial or no detectable hypermethylation [10,21].
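The filter described above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the authors' analysis pipeline: the class and function names are hypothetical, the gene records in the usage note are invented, and the handling of values that fall exactly on a threshold is an assumption. The cutoffs themselves are the ones stated in the text (TSA response below 1.4-fold, no detectable basal expression, and DAC-induced increases of 2-fold or more for the top tier or between 1.4- and 2-fold for the next tier).

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the text; boundary handling (>= vs >) is an assumption.
TSA_MAX_FOLD = 1.4    # genes responding to TSA at or above this are excluded
DAC_TOP_FOLD = 2.0    # top tier: DAC-induced increase of 2-fold or more
DAC_NEXT_FOLD = 1.4   # next tier: DAC-induced increase between 1.4- and 2-fold


@dataclass
class GeneResponse:
    """Hypothetical per-gene summary of the array measurements."""
    name: str
    basal_detected: bool  # expression detected in mock-treated cells
    tsa_fold: float       # fold change after TSA treatment
    dac_fold: float       # fold change after DAC treatment


def classify(gene: GeneResponse) -> Optional[str]:
    """Return 'top', 'next', or None (outside the candidate zones)."""
    # Candidate genes must be silent at baseline and unresponsive to TSA.
    if gene.basal_detected or gene.tsa_fold >= TSA_MAX_FOLD:
        return None
    if gene.dac_fold >= DAC_TOP_FOLD:
        return "top"
    if gene.dac_fold >= DAC_NEXT_FOLD:
        return "next"
    return None
```

For example, a silent gene with a TSA fold change of 1.0 and a DAC fold change of 3.2 would be labeled `"top"`, whereas the same DAC response in a gene that also increases 1.8-fold with TSA would be excluded.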
Importantly, a similar spike of gene expression increases could be seen in five additional human CRC cell lines, SW480, CaCO2, RKO, HT29, and COLO320 (Figure 2A), as well as cell lines derived from lung, breast, ovary, kidney, and brain (unpublished data), confirming that this approach works universally in cancer cell lines and identifies overlapping gene sets (Figure 2C). However, it is important to note that—possibly because DAC incorporates into the DNA of dividing cells, and our treatments were performed for only 96 h—sensitivity for detecting the gene increases in the pharmacological approach is reduced in HCT116 cells compared to that seen in DKO cells (Figure 1D). To address the sensitivity with which our new array approach identifies CpG island hypermethylated genes, we first examined 11 genes known to be hypermethylated, completely silenced and reexpressed after DAC treatment in HCT116 cells (Figure 3A). All tested genes remained within the TSA nonresponsive zone (Figure 3B), and the direction of expression changes correlated well in DAC treated and DKO cells (Figure 3C). Importantly, for the DAC increase, five of the guide genes (45%) increased 2-fold or more and three more genes, or a total of 73%, increased 1.3-fold or more (Figure 3D). We estimate, then, that we can detect over 70% of DNA hypermethylated genes in a given cancer cell line and we test this hypothesis in studies directly below.
Validating the Methylation Status of Candidate Genes Derived from the Screening Approach
Based on the sensitivity differences observed between DKO- and DAC-induced gene increases (compare Figure 1B and 1D; also Figure 3B and 3C) and behavior of the guide genes in the array platform, we designated, within the TSA-negative zone, a top tier (2-fold increase or above) and a next tier of genes (increasing between 1.4- and 2-fold) to identify hypermethylated cancer genes (Figure 2B). Importantly, we introduced an additional filter for selecting genes from these zones based on their having no basal expression in untreated cells, since this full lack of transcription is characteristic of promoter CpG island methylated genes in cell culture. Indeed, based on these selection criteria, in HCT116 cells, 32 of 35 (91%, Figure 4) of randomly chosen CpG island–containing genes spanning the top-tier response zone of 532 genes (Figure 5), and 31 of 48 such SW480 cell genes (65%, Figure 6) from among 318 top tier genes proved to be CpG hypermethylated as measured by methylation-specific PCR (MSP), and silenced in the cell line of origin as measured by reverse transcriptase PCR (RT-PCR). We also examined the efficiency of discovery for hypermethylated genes in the next tier of DAC-treated HCT116 cells. Of the 1,190 genes identified in this region, 17 of 35 (49%) randomly selected genes containing a CpG island were hypermethylated with concordant gene silencing (Figure 7). Our verification rates then demonstrate around 65% efficiency of our approach, which is close to our original estimate and which is excellent compared to previous screens for identifying new cancer hypermethylated genes [6,23]. With this level of verified hypermethylation, we calculate that the hypermethylome in HCT116 cells consists of an estimated 1,067 genes and an estimated 579 genes for the SW480 cells (see Table S1 for a detailed description of calculations). The hypermethylome would be estimated to range from 532 genes in CaCO2 to 1,389 genes in RKO cells (Table S1).
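The HCT116 estimate can be checked with simple arithmetic. The sketch below assumes that the estimate is the sum, over tiers, of tier size multiplied by the validation rate for that tier; the authoritative calculation is the one in Table S1, which is not reproduced here, so this is a plausible reconstruction rather than the authors' procedure. Function and variable names are illustrative. Under this assumption, using the rounded percentages quoted in the text (91% and 49%) reproduces the figure of 1,067 genes, while the unrounded fractions give roughly 1,064.

```python
def validation_rate(validated: int, tested: int) -> float:
    """Validation rate rounded to two decimals, as the percentages are quoted."""
    return round(validated / tested, 2)


def estimate_hypermethylome(tiers):
    """Sum of tier size x validation rate over (size, rate) pairs."""
    return sum(size * rate for size, rate in tiers)


# HCT116 numbers quoted in the text.
top_rate = validation_rate(32, 35)    # 0.91
next_rate = validation_rate(17, 35)   # 0.49

hct116_tiers = [(532, top_rate), (1190, next_rate)]
hct116_estimate = estimate_hypermethylome(hct116_tiers)

print(round(hct116_estimate))  # 1067
```

The SW480 estimate of 579 genes cannot be rechecked the same way from the text alone, because the size of the SW480 next tier is not stated.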
(A) Gene names, Agilent Technologies probe name, Genbank accession number, and references for the 11 guide genes previously shown to be hypermethylated and completely silenced in HCT116 cells.
(B, C) Blue spots and gene names indicate the location of the 11 guide genes in a plot of TSA (x-axis) versus DAC (y-axis) gene expression changes on a log (B) or fold-change (C) scale. Five of 11 guide genes, circled in green, display increases of greater than 2-fold after DAC treatment but no increase after TSA treatment, and these same genes have greater than 3-fold increases in DKO cells.
(D) Direct comparison of guide genes in DKO and DAC plots. A distinct group of five guide genes, indicated by a green circle, showing greater than 3-fold expression changes in DKO cells and greater than 2-fold in DAC-treated cells, define the upper tier of candidate hypermethylated genes as discussed in the text. Another three genes increased 1.3-fold, and three failed to increase with DAC treatment, allowing criteria for the next tier of gene expression to be established as described in the text.
We next asked whether our top and next-tier regions truly enriched for hypermethylated genes by examining a randomly selected subset of 22 control genes located outside these zones. These genes were located in the responsive TSA zones (Zones 1 and 2, Figure S1A) or below the threshold of DAC responsiveness in the TSA nonresponsive zone (Zone 3, Figure S1A) in HCT116 cells. Of the tested genes, only 9% (2 of 22, Figure S1B) showed detectable methylation with concomitant gene silencing, confirming the specificity of our approach and validating the criteria we used to establish the top and next-tier approach. We can then predict that for cancer cell lines, with use of our filters, ~90% of promoter CpG island DNA methylated genes lie in the negative TSA-responsive zone.
List of HCT116 candidate hypermethylated genes selected for verification of expression (by RT-PCR of HCT116 and DKO cells) and promoter methylation (by MSP of HCT116 and DKO cells) status. Gene descriptions are indicated on the left side of the panel and gene names are shown next to the PCR results. Water (RT-PCR and MSP), in vitro methylated DNA (for MSP), and actin beta (ACTB) were used as controls for each individual gene; a representative sample is shown. Green arrows identify genes that verified the array results, red arrows those that did not.
A fundamental question in cell culture–based approaches is whether they identify genes that are targets for inactivation in primary tumors. To address this, 20 CpG island–containing genes from the verified gene lists were randomly selected from the HCT116 top tier (17 genes), HCT116 next tier (two genes), or SW480 top tier (one gene) and analyzed for methylation in a panel of CRC cell lines. All of the tested genes were hypermethylated in two or more cell lines (Figure 8). We then examined the status of these 20 genes in a panel of 20 to 61 primary colon cancers and 20 to 40 normal-appearing colon tissue samples obtained from cancer-free individuals. Most of the genes (65%) were completely unmethylated or rarely methylated in the normal colonic tissue samples, but were methylated in a vast majority (86%) of the primary tumors (Figure 8). Of the 20 genes analyzed, 13 genes (65%) satisfied criteria for “tumor-specific methylation” with high-frequency methylation in cell lines, low (<5%) or undetectable methylation in normal colon, and frequent methylation in primary tumor samples (Figure 8). The efficiency of our strategy suggests a discovery rate of approximately one in two for identification of hypermethylated genes in cell lines and approximately one in three for identification of cancer-specific hypermethylated genes. Our estimate of approximately 400 hypermethylated genes per primary tumor now can be matched with predictions of Costello et al. for hypermethylation of CpG islands, based on screening with Restriction Landmark Genomic Scanning approaches.
Green spots show the location of individual genes with names indicated in blue. The top tier of gene-expression changes within the spike shown in Figure 1D has been magnified, and values for DAC and TSA expression changes are shown in log scale.
Validating Potential Biologic Relevance of Newly Identified Genes
We next tested two of the genes harboring tumor-specific methylation for their likely biological significance in primary colon cancers. One, the neuralized homolog (Drosophila) (NEURL) gene, is located in a chromosome region with high deletion frequency in brain tumors, and its product has been identified as a ubiquitin ligase required for Notch ligand turnover [25–27]. Activation of this key developmental pathway influences cell-fate determination in flies and vertebrates [28,29] and activation of Notch, through unknown mechanisms, is thought to play an inhibitory role in normal differentiation during colorectal cancer. The second gene, FOXL2, belongs to the forkhead domain–containing family of transcription factors implicated in diverse processes including establishing and maintaining differentiation programs. Intriguingly, this gene is essential for proper ovarian development and germline mutations in humans lead to a plethora of craniofacial anomalies and premature ovarian failure. We find both of these genes to be frequently DNA hypermethylated in a panel of colorectal cell lines (five of nine cell lines for NEURL and seven of nine for FOXL2, Figure 9A and 9C), and bisulfite sequencing revealed methylation of all CpG residues in the central CpG island regions of both genes in HCT116 and RKO cell lines, with complete demethylation in DKO cells (Figure 9B and 9D). For both genes, this hypermethylation perfectly correlated with loss of basal expression and ability to reexpress the genes with DAC treatment (Figure 9A and 9C). Importantly, promoter methylation of both genes, as assessed by bisulfite sequencing (Figure 9B and 9D), is absent in normal human colon or rectum, but frequent in primary colon cancers (Figure 9E and 9F), suggesting that hypermethylation arose as a cancer-specific phenomenon, although slight methylation was observed at the FOXL2 locus in normal tissue from aged patients (unpublished data).
Finally, the pattern of hypermethylation of the FOXL2 and NEURL genes in cell culture fits with a biology important to a subset of colon cancers. As many as one in eight colorectal cancers, predominantly those from the right side of the colon, harbor a defect in mismatch-repair capacity [34,35], primarily due, in nonfamilial cancers, to inactivation of MLH1 by epigenetic mechanisms. Such tumors belong to a group with high frequency of hypermethylated gene promoters [37,38]. The hypermethylation of FOXL2 and, especially, NEURL aggregates with these tumor types not only among the colon cancer cell lines (HCT116, DLD1, LoVo, RKO, and SW48), but also when analyzed in a series of primary human colon cancers (Fisher's exact test p-value of 0.024 for FOXL2 and 0.001 for NEURL, Figure 9G).
Initial in vitro studies suggest that both FOXL2 and NEURL might possess tumor-suppressor activity. When overexpressed in colon cancer cell lines, full-length FOXL2 and NEURL (Figure 10A and 10C) generate a 10-fold and 20-fold reduction, respectively, in colony growth of HCT116 cells (Figure 10C), with surviving clones being severely reduced in size (Figure 10B), comparable to results obtained with the bona fide tumor suppressor p53 (Figure 10F). Similar results were seen in RKO and DLD1 cells (Figure 10D and 10E), both of which have complete gene silencing at the FOXL2 and NEURL loci. While the precise molecular mechanisms for the growth suppression remain to be determined, Notch signaling has recently been shown to play an important role in differentiation of intestinal crypt cells, where deletion of the Notch effector molecule RBPJκ or treatment with a highly selective γ-secretase inhibitor was found to be sufficient for conversion of crypt cells to goblet cells [28,29]. Similarly, the closely related FOXL2 transcription factor family member FOXL1 has recently been shown to play a role in epithelial–mesenchymal transition of the intestinal epithelium.
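To illustrate the kind of test reported for Figure 9G, the following sketch computes a two-sided Fisher's exact p-value for a 2×2 table of methylation status against microsatellite status. The per-group counts from Figure 9G are not given in the text, so the counts in the usage note are purely hypothetical, and whether the reported values came from a one- or two-sided test is also not stated; the two-sided form used here is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be, for example, MSI and MSS tumors; columns methylated
    and unmethylated. Margins are held fixed, as in the standard test.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts for illustration only (not from Figure 9G):
# 9 of 10 MSI tumors methylated versus 8 of 30 MSS tumors methylated.
p_value = fisher_exact_two_sided(9, 1, 8, 22)
print(f"two-sided p = {p_value:.4f}")
```

A small p-value in such a table would indicate that methylation of the gene is not distributed independently of microsatellite status, which is the sense in which the text describes hypermethylation as aggregating with mismatch repair–deficient tumors.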
Comparison of Newly Identified DNA Hypermethylated Genes to Mutated Genes Identified from Sequencing of Cancer Genomes
While it is clear that genetic and epigenetic mechanisms are both important to initiation and progression of human tumorigenesis, the relative contributions of each of these alterations need to be clarified on a global basis. Studies of classic tumor suppressor genes such as VHL in renal cancer and MLH1 in colon cancer indicate that important cancer genes can have an incidence of inactivation by either genetic or epigenetic mechanisms [36,40]. However, a genome-wide analysis to query such relationships has not been performed.
Genes were selected for verification of expression (by RT-PCR of HCT116 and DKO cells) and promoter methylation (by MSP of HCT116 and DKO cells) status. Gene names are indicated on the left side of the panel and gene abbreviations are shown next to the PCR results. Water (RT-PCR and MSP), in vitro methylated DNA (for MSP), and actin beta (ACTB) were used as controls for each individual gene; a representative sample is shown. Green arrows identify genes that verified the array results, red arrows those that did not, as discussed in the text.
In a recent genome-wide sequencing of cancer genes, Sjöblom et al. observed that newly discovered gene mutations in colon and breast cancers generally had a low incidence of occurrence, with 90% of the genes identified harboring a mutation frequency of less than 10%. Furthermore, a typical patient's colon or breast tumor was estimated to have an average of only 14 mutations and there appeared to be little overlap between individual tumors for the newly discovered mutations. These low frequencies raise the question of whether alternative mechanisms might account for inactivation of these genes in additional tumors. Obviously, the much higher number of candidate hypermethylated genes we now identify in individual tumors suggests that this epigenetic change might provide an alternative inactivating route to mutations for many tumor suppressor genes. We now show that screening tumors for both genetic and epigenetic changes indicates that this is the case.
Genes were selected for verification of expression (by RT-PCR of SW480 and DAC-treated SW480 cells) and promoter methylation (by MSP of SW480 and DAC-treated SW480 cells) status. Full gene names are indicated on the left side of the panel and abbreviated gene names are shown next to the PCR results. Water (RT-PCR), in vitro methylated DNA (for MSP), and actin beta (ACTB) were used as controls for each individual gene; a representative sample is shown. Green arrows identify genes that verified the array results, red arrows those that did not, as discussed in the text.
We first located the 189 newly identified, mutated cancer (CAN) genes, described by Sjöblom et al., within the top and next tiers of our colorectal cancer cell line hypermethylome and found 56 genes present in these zones in one or more of the cell lines. Of these, 45 contained CpG islands. Twenty-six of these 45 genes (58%), similar to the verification rate for all candidate genes identified as discussed above, proved to be hypermethylated in at least one of the six cell lines, and were selected for further study. Importantly, exactly half (13 of 26 genes) of these genes were expressed at high levels (Figure 11A) and were not methylated in normal colon (Figure 11B) but were methylated in primary CRC tumors (Figure 11C), giving a frequency of 50% for identification of tumor-specific methylation when starting with genes harboring cell line methylation. We also randomly selected, for verification of methylation and expression status in cell lines, CAN genes that fell primarily in zone 3 of the microarray, that is, within the TSA-negative zone but below the 1.4-fold cutoff for stimulation by DAC. As seen earlier for other randomly selected genes in this region, these randomly selected CAN genes had a significantly reduced (four of 15, or 27%) frequency of methylation as compared to the 56 top and next-tier CAN genes discussed above (Figure S1C). Interestingly, however, this rate is much more similar to that for the well-characterized hypermethylated guide genes (~30% as shown in Figure 3A–3C) than for the other randomly selected zone 3 genes (9%, compare Figure S1B and S1C), perhaps indicating the importance of epigenetic inactivation of these mutated genes. Indeed, relevant to this point, for the majority of the examined CAN genes within the hypermethylome region, the incidence of hypermethylation is strikingly higher than that for mutations (Figure 11D).
Thus, unlike mutations in the individual genes, which are restricted to tumors from only a few patients, hypermethylation of the majority of the genes is a property shared among many tumors. These findings of both mutations and, alternatively, epigenetic silencing in these previously uncharacterized genes solidify their probable roles as tumor suppressor genes.
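The successive filtering of the CAN genes can be summarized as a simple funnel. The short sketch below only recomputes the proportions from the counts quoted in the text; the step labels are paraphrased and the function name is illustrative.

```python
# Counts quoted in the text; labels are paraphrased for illustration.
FUNNEL = [
    ("CAN genes described by Sjöblom et al.", 189),
    ("in top or next tier of at least one cell line", 56),
    ("containing a CpG island", 45),
    ("hypermethylated in at least one cell line", 26),
    ("tumor-specific methylation", 13),
]


def step_percentages(funnel):
    """Each step as a whole-number percentage of the step before it."""
    return [
        (label, count, round(100 * count / prev))
        for (label, count), (_, prev) in zip(funnel[1:], funnel[:-1])
    ]


for label, count, pct in step_percentages(FUNNEL):
    print(f"{label}: {count} ({pct}% of previous step)")

# Zone 3 comparison quoted in the text.
zone3_can_rate = round(100 * 4 / 15)     # randomly selected zone 3 CAN genes
zone3_random_rate = round(100 * 2 / 22)  # other randomly selected zone 3 genes
print(f"zone 3 CAN genes: {zone3_can_rate}%, other zone 3 genes: {zone3_random_rate}%")
```

The recomputed values match the percentages given in the text (58% of CpG island–containing candidates hypermethylated, 50% of those tumor specific, and 27% versus 9% for the zone 3 comparison).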
Methylation analysis of verified hypermethylome genes in human tissue samples. Twenty genes from the verified gene lists were randomly selected from the HCT116 top tier (BOLL, DDX43, DKK3, FOXL2, HOXD1, JPH3, NEF3, NEURL, PPP1R14A, RAB32, STK31, and TLR2), HCT116 next tier (SALL4 and TP53AP1), or SW480 top tier (ZFP42) and analyzed for methylation in CRC cell lines (white columns), normal colon (red columns), or primary tumors (green columns). Percentage of methylation is indicated on the y-axis, and the abbreviated gene name on the x-axis. We tested at least six different cell lines, 16 to 40 colonic samples from noncancer patients, and between 18 and 61 primary CRC samples for each gene.
We describe a gene-expression approach with the capacity to define, for any human cancer type for which representative cell-culture lines are available, a substantial fraction of the cancer gene promoter CpG island DNA hypermethylome. Studies of these genes will contribute to understanding the molecular pathways driving tumorigenesis; provide useful new DNA hypermethylation biomarkers to monitor cancer risk assessment, early diagnosis, and prognosis; and permit better monitoring of gene reexpression during cancer prevention and/or therapeutic strategies.
Through use of our approach to analyze mutated genes identified by a genome-wide sequencing strategy, we document that many more epigenetically altered genes than genetically altered genes exist in any given tumor. The importance of this fact emerges in our finding that for newly discovered genes that are affected by both mechanisms, the incidence for hypermethylation of any given gene among colon cancers appears to be generally much higher than for mutations. Interestingly, many of the new genes found by Sjöblom et al. harbored heterozygous mutations and it would be, thus, difficult to predict whether the genes were affected by activating or inactivating events from such data alone. As first suggested by Zardo et al., our data may clarify, in initial screening studies, the latter category, as promoter DNA hypermethylation and gene silencing often affect genes independent of loss of heterozygosity frequency. Thus, discovery of genes targeted by hypermethylation as an inactivating event should help guide prioritization of genes to study in cancer gene resequencing efforts.
Finally, our data indicate that, in any given cancer type, one may markedly underestimate both the full range of gene alterations and associated abnormalities of cellular pathways by failing to screen for both genetic and epigenetic abnormalities. Our findings indicate that assessing both mechanisms for loss of gene function reveals far more sharing of pathway disruption among individual colon tumors than genetic analyses alone would predict. Optimal approaches to grouping of tumors according to molecular alterations in key pathways should, then, depend on defining both genetic and epigenetic gene changes. Thus, our findings should encourage any genome-wide cancer gene screening strategies to include finding DNA hypermethylated genes and prioritizing these to be sequenced for mutations, as well as prioritizing newly discovered mutated genes to be studied for promoter methylation.
(A–D) Methylation and expression analyses. Cell line abbreviations are indicated at the top (A, C), with the upper panel indicating methylation tested by MSP and expression tested by RT-PCR before (−) and after (+) DAC treatment. U indicates unmethylated and M indicates methylated alleles. DKO and water (H2O) controls are indicated on the right panel. Graphical display of the NEURL (B) or FOXL2 (D) promoter CpG islands, with bisulfite sequencing primers indicated in black, MSP primers indicated in red, and CpG nucleotides as open circles. Transcription start sites are indicated with a green square, and the 5′ and 3′ ends are indicated by numbers with respect to the transcription start site. Bisulfite sequencing results (lower panels) in cell lines (HCT116, RKO, or DKO) or human tissues (normal colon or rectum); unmethylated CpGs are indicated by open circles, methylated CpGs by shaded circles.
(E) Methylation analysis of the NEURL CpG island in human tumors. Upper panel shows results of primary CRC samples analyzed by MSP. Positive samples analyzed further by bisulfite sequencing are denoted with an arrow. Lower panel shows bisulfite sequencing results for 15 cloned alleles of each tumor sample, with the location relative to the transcription start site indicated in bp. Open circles indicate unmethylated CpG dinucleotides and closed circles indicate methylated dinucleotides.
(F) Methylation analysis of the FOXL2 CpG island in human tumors. Upper panel shows results of primary CRC samples analyzed by MSP. Positive samples analyzed by bisulfite sequencing are denoted with an arrow. Lower panel shows bisulfite sequencing results for 15 cloned alleles of each tumor sample, with the location relative to the transcription start site indicated in bp. Open circles indicate unmethylated CpG dinucleotides and closed circles indicate methylated dinucleotides.
(G) Results of MSP methylation status of FOXL2 and NEURL in colon cancers classified as being microsatellite stable (MSS) or having microsatellite instability (MSI) by classic criteria.
(A) Expression vectors encoding full length NEURL or FOXL2, or empty vector, were transfected into HCT116 cells, selected for hygromycin resistance, and stained.
(B) Resulting colonies were visualized by light microscopy.
(C–E) Colony number resulting from transfection with the indicated plasmid in HCT116 cells (C), RKO (D), or DLD1 cells (E).
(F) Growth suppression of HCT116 cells by p53. Colony formation (left panel), colony visualization (middle panel), and quantitation (right panel) are indicated.
Materials and Methods
Cell culture and treatment.
HCT116 cells and isogenic genetic knockout derivatives were maintained as previously described. For drug treatments, log-phase CRC cells were cultured in McCoy's 5A media (Invitrogen, http://www.invitrogen.com/) containing 10% BCS and 1× penicillin/streptomycin with 5 μM DAC (stock solution: 1 mM in PBS; Sigma, http://www.sigmaaldrich.com/) for 96 h, replacing media and DAC every 24 h. Cell treatment with 300 nM TSA (stock solution: 1.5 mM dissolved in ethanol; Sigma) was performed for 18 h. Control cells underwent mock treatment in parallel with addition of an equal volume of PBS or ethanol without drugs.
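The working concentrations above follow from the stated stock concentrations by the usual dilution relation C1·V1 = C2·V2. The following is a minimal Python sketch of that arithmetic, not part of the published protocol. The function name `stock_volume_ul` and the 10 ml culture volume are assumptions introduced only for illustration; the stock and final concentrations are those given in the text.

```python
def stock_volume_ul(stock_conc_um, final_conc_um, final_volume_ml):
    """Microliters of stock needed to reach the final concentration.

    Uses C1 * V1 = C2 * V2 with both concentrations in micromolar.
    """
    if final_conc_um > stock_conc_um:
        raise ValueError("final concentration cannot exceed stock concentration")
    # Convert the final volume from ml to microliters before dividing.
    return final_conc_um * final_volume_ml * 1000.0 / stock_conc_um


# Hypothetical 10 ml culture volume; concentrations are from the text.
CULTURE_ML = 10.0

# DAC: 1 mM (1000 uM) stock in PBS, 5 uM final, a 1:200 dilution.
dac_ul = stock_volume_ul(1000.0, 5.0, CULTURE_ML)

# TSA: 1.5 mM (1500 uM) stock in ethanol, 300 nM (0.3 uM) final, a 1:5000 dilution.
tsa_ul = stock_volume_ul(1500.0, 0.3, CULTURE_ML)
```

Under these assumptions, mock-treated controls would receive the same volumes of PBS or ethanol without drug.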
(A) Expression of matched CAN genes in normal human colon measured by RT-PCR. Partially expressed and no expression indicate weak or absent RT-PCR amplification, respectively. (B, C) Methylation analysis of CAN genes. Fifty-six CAN genes were located in the top or next tier of one microarray in one or more cell lines. Of these, 45 genes contained CpG islands. Selected genes from this list with methylation in cell lines (26 genes) were analyzed for methylation in normal colon (B) and primary CRC (C). Frequency of methylation of these genes is shown as a percentage.
(D) Relationship between methylation status, analyzed by MSP, and mutation for 13 genes overlapping the CAN and hypermethylome gene lists.
Total RNA was harvested from log-phase cells using TRIzol (Invitrogen) and the RNeasy kit (Qiagen, http://www1.qiagen.com/) according to the manufacturer's instructions, including a DNase digestion step. RNA was quantified using the NanoDrop ND-100 (http://www.nanodrop.com/) followed by quality assessment with the 2100 Bioanalyzer (Agilent Technologies, http://www.agilent.com/). RNA concentrations for individual samples were greater than 200 ng/μl, with 28S/18S ratios greater than 2.2 and RNA integrity of 10 (10 scored as the highest). Sample amplification and labeling procedures were carried out using the Low RNA Input Fluorescent Linear Amplification Kit (Agilent Technologies) according to the manufacturer's instructions. The labeled cRNA was purified using the RNeasy mini kit (Qiagen) and quantified. RNA spike-in controls (Agilent Technologies) were added to RNA samples before amplification. Samples (0.75 μg) labeled with Cy3 or Cy5 were mixed with control targets (Agilent Technologies), assembled on Oligo Microarray, hybridized, and processed according to the Agilent microarray protocol. Scanning was performed with the Agilent G2565BA microarray scanner using settings recommended by Agilent Technologies.
All arrays were subject to quality checks recommended by the manufacturer. Images were visually inspected for artifacts, and distributions of signal and background intensity of both red and green channels were examined to identify anomalous arrays. No irregularities were observed, and all arrays were retained and used. All calculations were performed using the R statistical computing platform and packages from the Bioconductor bioinformatics software project [44–46]. The log ratio of red signal to green signal was calculated after background subtraction and loess normalization as implemented in the limma package from Bioconductor. Individual arrays were scaled to have the same inter-quartile range (75th percentile minus 25th percentile). Log-fold changes were averaged over dye-swap replicate microarrays to produce a single set of expression values for each condition. We have deposited primary array data in the GEO database at the National Center for Biotechnology Information (NCBI).
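The processing steps described above can be illustrated with a minimal Python sketch. This is not the authors' R/limma code. It omits the intensity-dependent loess normalization entirely, and the function names, the use of the mean inter-quartile range as the common scaling target, and the toy input values are all assumptions made for illustration.

```python
import math
import statistics


def log_ratios(red, green, red_bg, green_bg):
    """Return log2(red/green) per spot after background subtraction."""
    ratios = []
    for r, g, rb, gb in zip(red, green, red_bg, green_bg):
        r_net, g_net = r - rb, g - gb
        if r_net <= 0 or g_net <= 0:
            raise ValueError("net signal must be positive to take a log ratio")
        ratios.append(math.log2(r_net / g_net))
    return ratios


def iqr(values):
    """Inter-quartile range: 75th percentile minus 25th percentile."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def scale_to_common_iqr(arrays):
    """Rescale each array of log ratios so that all share one IQR.

    Assumption: the common target is the mean of the individual IQRs.
    """
    spreads = [iqr(a) for a in arrays]
    target = sum(spreads) / len(spreads)
    return [[x * target / s for x in a] for a, s in zip(arrays, spreads)]


def average_dye_swap(forward, swapped):
    """Average a dye-swap pair; the swapped array's sign is flipped first."""
    return [(f - s) / 2.0 for f, s in zip(forward, swapped)]


# Toy values only, not data from the study.
spot = log_ratios([500.0], [150.0], [100.0], [50.0])
scaled = scale_to_common_iqr([[-2.0, -1.0, 0.0, 1.0, 2.0],
                              [-4.0, -2.0, 0.0, 2.0, 4.0]])
fold = average_dye_swap([1.0, -0.5], [-1.0, 0.5])
```

In a dye-swap pair the two samples exchange Cy3 and Cy5 labels, so the log ratio of the swapped array has the opposite sign; subtracting it before halving averages the two estimates while cancelling dye-specific bias.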
Methylation and gene expression analysis.
RNA was isolated with TRIzol Reagent (Invitrogen) according to the manufacturer's instructions. For RT-PCR, 1 μg of total RNA was reverse transcribed using Ready-To-Go You-Prime First-Strand Beads (Amersham Biosciences, http://www.amersham.com/) with addition of random hexamers (0.2 μg per reaction). RT-PCR primers were designed using Primer3 (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi). For MSP analysis, DNA was extracted following a standard phenol-chloroform extraction method. Bisulfite modification of genomic DNA was carried out using the EZ DNA methylation Kit (Zymo Research, http://www.zymoresearch.com/). Primer sequences specific to the unmethylated and methylated promoter sequences were designed using MSPPrimer (http://www.mspprimer.org). MSP was performed as previously described. All PCR products (15 μl of 50-μl total volume for RT-PCR and 7.5 μl of 25-μl total volume for MSP) were loaded directly onto 2% agarose gels containing GelStar Nucleic Acid Gel Stain (Cambrex, http://www.cambrex.com/) and visualized under ultraviolet illumination. Primer sequences and conditions for MSP, bisulfite sequencing, and RT-PCR are available upon request from the authors.
Human tumor analysis.
Formalin-fixed, paraffin-embedded tissues from primary CRCs were obtained from the archive of the Department of Pathology of the University Hospital Maastricht, Maastricht, The Netherlands and Johns Hopkins University Hospital. Approval was obtained from the Medical Ethical Committees of the University of Maastricht and the University Hospital Maastricht and Johns Hopkins University Hospital. DNA was isolated using the Puregene DNA isolation kit (Gentra Systems, http://www1.qiagen.com/). FOXL2 and NEURL methylation was analyzed by nested MSP. MSI analysis was performed by analysis of the BAT-26 mononucleotide repeat.
The primer sequences and PCR conditions for the BAT-26 mononucleotide repeat were as described previously.
Colony formation assay.
One million HCT116, RKO, or DLD1 cells were plated in six-well dishes (Falcon) and transfected with 5 μg of plasmid (pIRES-Neo3, Invitrogen) using Lipofectamine 2000 according to the manufacturer's instructions. Following a 24-h recovery period, selection in complete medium containing 4 μg/ml gentamycin (Invitrogen) was performed for 10 d. Staining, visualization, and counting of triplicate wells were performed as previously described.
Figure S1. Sensitivity of Detecting Hypermethylated Genes Compared with Control Genes
(150 KB PPT)
Table S1. Quantitative Estimate of Hypermethylome Size
(50 KB PPT)
We have deposited primary array data in the GEO database at the NCBI (http://www.ncbi.nlm.nih.gov/geo/). The accession numbers for the array experiments described in the paper are: GSM107602, DAC_vs_mock; GSM107603, TSA_VS_mock; GSM107604, DNMT1_vs_WT; GSM107605, DNMT3B_vs_WT; GSM107606, DKO_vs_WT; GSM107607, WT_vs_DKO; GSM107660, DAC_vs_mock_2; GSM107662, TSA_vs_mock_2; GSM107663, WT_vs_DNMT1; and GSM107664, WT_vs_DNMT3B.
The GEO series in which all ten arrays are linked may be found under accession number GSE4763.
We wish to thank Ingrid Hedenfalk and Jeffrey Trent for advice on array analysis; Karen Stefanisko, Joyce Ohm, Johann Brandes, and Chris Strock for helpful comments and suggestions; Marco Riojas for technical assistance; and Kathy Bender for manuscript preparation and submission. We especially thank Victor Velculescu for many helpful discussions.
KES, WC, LC, SCG, HS, JMY, TAC, LVN, WVC, SvdB, MvE, AHT, KJ, MT, NA, and SBB conceived and designed the experiments. WC, LC, SCG, HS, JMY, TAC, LVN, WVC, SvdB, MvE, AHT, KJ, WY, MT, KI, NA, and JGH performed the experiments. KES, WC, LC, SCG, HS, JMY, TAC, LVN, WVC, SvdB, MvE, AHT, KJ, WY, MT, KI, NA, JGH, and SBB analyzed the data. KES, WC, LC, SCG, HS, JMY, TAC, LVN, WVC, SvdB, MvE, AHT, KJ, WY, MT, NA, and SBB contributed reagents/materials/analysis tools. KES and SBB wrote the paper.
- 1. Ponder BA (2001) Cancer genetics. Nature 411: 336–341.
- 2. Herman JG, Baylin SB (2003) Gene silencing in cancer in association with promoter hypermethylation. N Engl J Med 349: 2042–2054.
- 3. Sjöblom T, Jones S, Wood LD, Parsons DW, Lin J, et al. (2006) The consensus coding sequences of human breast and colorectal cancers. Science 314: 268–274.
- 4. Antequera F, Bird A (1993) Number of CpG islands and genes in human and mouse. Proc Natl Acad Sci U S A 90: 11995–11999.
- 5. Costello JF, Fruhwald MC, Smiraglia DJ, Rush LJ, Robertson GP, et al. (2000) Aberrant CpG-island methylation has non-random and tumour-type-specific patterns. Nat Genet 24: 132–138.
- 6. Gius D, Cui H, Bradbury CM, Cook J, Smart DK, et al. (2004) Distinct effects on gene expression of chemical and genetic manipulation of the cancer epigenome revealed by a multimodality approach. Cancer Cell 6: 361–371.
- 7. Hu M, Yao J, Cai L, Bachman KE, van den Brule F, et al. (2005) Distinct epigenetic changes in the stromal cells of breast cancers. Nat Genet 37: 899–905.
- 8. Keshet I, Schlesinger Y, Farkash S, Rand E, Hecht M, et al. (2006) Evidence for an instructive mechanism of de novo methylation in cancer cells. Nat Genet 38: 149–153.
- 9. Paz MF, Wei S, Cigudosa JC, Rodriguez-Perales S, Peinado MA, et al. (2003) Genetic unmasking of epigenetically silenced tumor suppressor genes in colon cancer cells deficient in DNA methyltransferases. Hum Mol Genet 12: 2209–2219.
- 10. Suzuki H, Gabrielson E, Chen W, Anbazhagan R, van Engeland M, et al. (2002) A genomic screen for genes upregulated by demethylation and histone deacetylase inhibition in human colorectal cancer. Nat Genet 31: 141–149.
- 11. Toyota M, Ho C, Ahuja N, Jair KW, Li Q, et al. (1999) Identification of differentially methylated sequences in colorectal cancer by methylated CpG island amplification. Cancer Res 59: 2307–2312.
- 12. Ushijima T (2005) Detection and interpretation of altered methylation patterns in cancer cells. Nat Rev Cancer 5: 223–231.
- 13. Weber M, Davies JJ, Wittig D, Oakeley EJ, Haase M, et al. (2005) Chromosome-wide and promoter-specific analyses identify sites of differential DNA methylation in normal and transformed human cells. Nat Genet 37: 853–862.
- 14. Rhee I, Bachman KE, Park BH, Jair KW, Yen RW, et al. (2002) DNMT1 and DNMT3b cooperate to silence genes in human cancer cells. Nature 416: 552–556.
- 15. Akiyama Y, Watkins N, Suzuki H, Jair KW, van Engeland M, et al. (2003) GATA-4 and GATA-5 transcription factor genes and potential downstream antitumor target genes are epigenetically silenced in colorectal and gastric cancer. Mol Cell Biol 23: 8429–8439.
- 16. Toyota M, Sasaki Y, Satoh A, Ogi K, Kikuchi T, et al. (2003) Epigenetic inactivation of CHFR in human tumors. Proc Natl Acad Sci U S A 100: 7818–7823.
- 17. Egger G, Jeong S, Escobar SG, Cortez CC, Li TW, et al. (2006) Identification of DNMT1 (DNA methyltransferase 1) hypomorphs in somatic knockouts suggests an essential role for DNMT1 in cell survival. Proc Natl Acad Sci U S A 103: 14080–14085.
- 18. Spada F, Haemmer A, Kuch D, Rothbauer U, Schermelleh L, et al. (2007) DNMT1 but not its interaction with the replication machinery is required for maintenance of DNA methylation in human cells. J Cell Biol 176: 565–571.
- 19. Christman JK (2002) 5-Azacytidine and 5-aza-2′-deoxycytidine as inhibitors of DNA methylation: mechanistic studies and their implications for cancer therapy. Oncogene 21: 5483–5495.
- 20. Jones PA (1985) Altering gene expression with 5-azacytidine. Cell 40: 485–486.
- 21. Cameron EE, Bachman KE, Myohanen S, Herman JG, Baylin SB (1999) Synergy of demethylation and histone deacetylase inhibition in the re-expression of genes silenced in cancer. Nat Genet 21: 103–107.
- 22. Herman JG, Graff JR, Myohanen S, Nelkin BD, Baylin SB (1996) Methylation-specific PCR: A novel PCR assay for methylation status of CpG islands. Proc Natl Acad Sci U S A 93: 9821–9826.
- 23. Yamashita K, Upadhyay S, Osada M, Hoque MO, Xiao Y, et al. (2002) Pharmacologic unmasking of epigenetically silenced tumor suppressor genes in esophageal squamous cell carcinoma. Cancer Cell 2: 485–495.
- 24. Nakamura H, Yoshida M, Tsuiki H, Ito K, Ueno M, et al. (1998) Identification of a human homolog of the Drosophila neuralized gene within the 10q25.1 malignant astrocytoma deletion region. Oncogene 16: 1009–1019.
- 25. Deblandre GA, Lai EC, Kintner C (2001) Xenopus neuralized is a ubiquitin ligase that interacts with XDelta1 and regulates Notch signaling. Dev Cell 1: 795–806.
- 26. Lai EC, Deblandre GA, Kintner C, Rubin GM (2001) Drosophila neuralized is a ubiquitin ligase that promotes the internalization and degradation of delta. Dev Cell 1: 783–794.
- 27. Pavlopoulos E, Pitsouli C, Klueg KM, Muskavitch MA, Moschonas NK, et al. (2001) neuralized Encodes a peripheral membrane protein involved in delta signaling and endocytosis. Dev Cell 1: 807–816.
- 28. Fre S, Huyghe M, Mourikis P, Robine S, Louvard D, et al. (2005) Notch signals control the fate of immature progenitor cells in the intestine. Nature 435: 964–968.
- 29. van Es JH, van Gijn ME, Riccio O, van den Born M, Vooijs M, et al. (2005) Notch/gamma-secretase inhibition turns proliferative cells in intestinal crypts and adenomas into goblet cells. Nature 435: 959–963.
- 30. Radtke F, Clevers H (2005) Self-renewal and cancer of the gut: Two sides of a coin. Science 307: 1904–1909.
- 31. Lehmann OJ, Sowden JC, Carlsson P, Jordan T, Bhattacharya SS (2003) Fox's in development and disease. Trends Genet 19: 339–344.
- 32. Uda M, Ottolenghi C, Crisponi L, Garcia JE, Deiana M, et al. (2004) Foxl2 disruption causes mouse ovarian failure by pervasive blockage of follicle development. Hum Mol Genet 13: 1171–1181.
- 33. Crisponi L, Deiana M, Loi A, Chiappe F, Uda M, et al. (2001) The putative forkhead transcription factor FOXL2 is mutated in blepharophimosis/ptosis/epicanthus inversus syndrome. Nat Genet 27: 159–166.
- 34. Ionov Y, Peinado MA, Malkhosyan S, Shibata D, Perucho M (1993) Ubiquitous somatic mutations in simple repeated sequences reveal a new mechanism for colonic carcinogenesis. Nature 363: 558–561.
- 35. Parsons R, Li GM, Longley MJ, Fang WH, Papadopoulos N, et al. (1993) Hypermutability and mismatch repair deficiency in RER+ tumor cells. Cell 75: 1227–1236.
- 36. Herman JG, Umar A, Polyak K, Graff JR, Ahuja N, et al. (1998) Incidence and functional consequences of hMLH1 promoter hypermethylation in colorectal carcinoma. Proc Natl Acad Sci U S A 95: 6870–6875.
- 37. Toyota M, Ahuja N, Ohe-Toyota M, Herman JG, Baylin SB, et al. (1999) CpG island methylator phenotype in colorectal cancer. Proc Natl Acad Sci U S A 96: 8681–8686.
- 38. Weisenberger DJ, Siegmund KD, Campan M, Young J, Long TI, et al. (2006) CpG island methylator phenotype underlies sporadic microsatellite instability and is tightly associated with BRAF mutation in colorectal cancer. Nat Genet 38: 787–793.
- 39. Perreault N, Sackett SD, Katz JP, Furth EE, Kaestner KH (2005) Foxl1 is a mesenchymal Modifier of Min in carcinogenesis of stomach and colon. Genes Dev 19: 311–315.
- 40. Herman JG, Latif F, Weng Y, Lerman MI, Zbar B, et al. (1994) Silencing of the VHL tumor-suppressor gene by DNA methylation in renal carcinoma. Proc Natl Acad Sci U S A 91: 9700–9704.
- 41. Egger G, Liang G, Aparicio A, Jones PA (2004) Epigenetics in human disease and prospects for epigenetic therapy. Nature 429: 457–463.
- 42. Zardo G, Tiirikainen MI, Hong C, Misra A, Feuerstein BG, et al. (2002) Integrated genomic and epigenomic analyses pinpoint biallelic gene inactivation in tumors. Nat Genet 32: 453–458.
- 43. Ihaka R, Gentleman RC (1996) A language for data analysis and graphics. J Comput Graphical Stat 5: 299–314.
- 44. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, et al. (2004) Bioconductor: Open software development for computational biology and bioinformatics. Genome Biol 5: R80.
- 45. Smyth GK, Speed T (2003) Normalization of cDNA microarray data. Methods 31: 265–273.
- 46. Smyth GK, Yang YH, Speed T (2003) Statistical issues in cDNA microarray data analysis. Methods Mol Biol 224: 111–136.
- 47. Hoang JM, Cottu PH, Thuille B, Salmon RJ, Thomas G, et al. (1997) BAT-26, an indicator of the replication error phenotype in colorectal cancers and cell lines. Cancer Res 57: 300–303.
- 48. Suzuki H, Watkins DN, Jair KW, Schuebel KE, Markowitz SD, et al. (2004) Epigenetic inactivation of SFRP genes allows constitutive WNT signaling in colorectal cancer. Nat Genet 36: 417–422.