Comparing the DNA Hypermethylome with Gene Mutations in Human Colorectal Cancer

We have developed a transcriptome-wide approach to identify genes affected by promoter CpG island DNA hypermethylation and transcriptional silencing in colorectal cancer. By screening cell lines and validating tumor-specific hypermethylation in a panel of primary human colorectal cancer samples, we estimate that nearly 5% or more of all known genes may be promoter methylated in an individual tumor. When directly compared to gene mutations, we find larger numbers of genes hypermethylated in individual tumors, and a higher frequency of hypermethylation within individual genes harboring either genetic or epigenetic changes. Thus, to enumerate the full spectrum of alterations in the human cancer genome, and to facilitate the most efficacious grouping of tumors to identify cancer biomarkers and tailor therapeutic approaches, both genetic and epigenetic screens should be undertaken.


Introduction
It is now well established that loss of proper gene function in human cancer can occur through both genetic and epigenetic mechanisms [1,2]. The number of genes mutated in human tumor samples is being clarified. Recently, Sjöblom et al. [3] sequenced 13,023 genes in colorectal cancer (CRC) and breast cancer, and estimated an average of 14 significant mutations per tumor, suggesting that a relatively small number of genetic events may be sufficient to drive tumorigenesis. In contrast, the full spectrum of epigenetic alterations is not well delineated. The best-defined epigenetic alteration of cancer genes involves DNA hypermethylation of clustered CpG dinucleotides, or CpG islands, in promoter regions associated with the transcriptional inactivation of the affected genes [2]. These promoters are located proximal to nearly half of all genes [4] and are thought to remain primarily methylation free in normal somatic tissues. The exact number of such epigenetic lesions in any given tumor is not precisely known, although a growing number of screening approaches, none covering the whole genome efficiently, are identifying an increasing number of candidate genes [5][6][7][8][9][10][11][12][13]. Given the large number of potential target promoters present in the genome, we hypothesized that many more hypermethylated genes await discovery.
Herein, we describe a whole human transcriptome microarray screen to identify genes silenced by promoter hypermethylation in human CRC. The approach readily identifies candidate cancer genes in single tumors with a high efficiency of validation. By comparing the list of candidate hypermethylated genes with mutated genes recently identified in CRC [3], we establish key relationships between the altered tumor genome and the gene hypermethylome. Our studies provide a platform to understand how epigenetic and genetic alterations drive human tumorigenesis.

Developing the Whole Transcriptome Approach
Our first step towards a global identification of hypermethylation-dependent gene expression changes was made by comparing, in a genome-wide expression array-based approach, wild-type HCT116 CRC cells with isogenic partner cells carrying individual and combinatorial genetic deletions of two major human DNA methyltransferases (Figure 1A) [14].
Importantly, in the DNMT1 (−/−) DNMT3B (−/−) double knockout (DKO) HCT116 cells, which have virtually complete loss of global 5-methylcytosine, all previously individually examined hypermethylated genes lacking basal expression in wild-type cells undergo promoter demethylation with concomitant gene re-expression [10,14-16]. By stratifying genes according to altered signal intensity on a 44K Agilent Technologies array platform, we observe a unique spike of gene expression increases in the DKO cells when compared to the isogenic wild-type parental cells, or isogenic cell lines in which DNMT1 or DNMT3B have been individually deleted and which harbor minimal changes in DNA methylation (Figure 1B). This minimal change in the DNMT1 (−/−) cells may, in part, be due to recently identified alternative transcripts arising from the DNMT1 locus [17,18].
We tested our approach using a pharmacologic strategy based on our previous approach [10], but now markedly modified to provide whole-transcriptome coverage, to identify silenced hypermethylated genes in any cancer cell line. For densely hypermethylated and transcriptionally inactive genes, the DNA demethylating agent 5-aza-2′-deoxycytidine (DAC) has a well-established capacity to induce gene re-expression [19,20]. On the other hand, for these same genes, the class I and II histone deacetylase inhibitor, trichostatin A (TSA), will not alone induce re-expression [10,21]. We now use this lack of TSA response for such genes to provide a new informatics filter to identify the majority of DNA hypermethylated genes in cancer. After treatment of HCT116 cells with either DAC or TSA (Figure 1C), we identified a zone in which gene expression did not increase with TSA (<1.4-fold) and displayed no detectable expression in mock-treated cells. Within this zone, we observed a characteristic spike of DAC-induced gene expression that virtually completely encompasses the genes with increased expression in DKO cells (compare yellow spots in Figure 1D with blue spots in Figure 1B). This gene spike is absolutely dependent upon analysis of only genes that fail to respond to histone deacetylase inhibition, underscored by a cluster analysis that shows the close relationship between genes in DKO and DAC-treated cells with a separate grouping of gene-expression changes after TSA treatment alone or in single knockouts (Figure 1E). These data confirm previous studies covering much less of the genome, and using only treatment of cells with DAC and TSA together, in which genes with dense CpG islands that were re-expressed by TSA harbored only partial or no detectable hypermethylation [10,21].
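The informatics filter described above can be illustrated with a minimal Python sketch. The 1.4-fold TSA cutoff and the requirement for no detectable basal expression come from the text; the DAC cutoffs of 2-fold and 1.4-fold are the tier boundaries given in the validation section. The record type, field names, and function name are hypothetical and are not taken from the authors' analysis code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneResponse:
    """Hypothetical per-gene summary of the array data."""
    name: str
    basal_expressed: bool  # detectable signal in mock-treated cells
    tsa_fold: float        # fold change after TSA versus mock
    dac_fold: float        # fold change after DAC versus mock


def classify(gene: GeneResponse,
             tsa_cutoff: float = 1.4,
             next_cutoff: float = 1.4,
             top_cutoff: float = 2.0) -> Optional[str]:
    """Return 'top', 'next', or None for a gene outside the candidate tiers."""
    # Genes that respond to TSA or have basal expression are excluded.
    if gene.basal_expressed or gene.tsa_fold >= tsa_cutoff:
        return None
    # Within the TSA-nonresponsive zone, tiers are set by the DAC response.
    if gene.dac_fold >= top_cutoff:
        return "top"
    if gene.dac_fold >= next_cutoff:
        return "next"
    return None


# Hypothetical example values, for illustration only.
example = GeneResponse("geneA", basal_expressed=False, tsa_fold=1.0, dac_fold=3.2)
print(classify(example))
```

Keeping the cutoffs as parameters makes it easy to see how candidate lists would change if the thresholds were tightened or relaxed.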
Importantly, a similar spike of gene expression increases could be seen in five additional human CRC cell lines, SW480, CaCO2, RKO, HT29, and COLO320 (Figure 2A), as well as cell lines derived from lung, breast, ovary, kidney, and brain (unpublished data), confirming that this approach works universally in cancer cell lines and identifies overlapping gene sets (Figure 2C). However, it is important to note that, possibly because DAC incorporates into the DNA of dividing cells and our treatments were performed for only 96 h, sensitivity for detecting the gene increases in the pharmacological approach is reduced in HCT116 cells compared to that seen in DKO cells (Figure 1D). To address the sensitivity with which our new array approach identifies CpG island hypermethylated genes, we first examined 11 genes known to be hypermethylated, completely silenced, and re-expressed after DAC treatment in HCT116 cells (Figure 3A). All tested genes remained within the TSA-nonresponsive zone (Figure 3B), and the direction of expression changes correlated well in DAC-treated and DKO cells (Figure 3C). Importantly, for the DAC increase, five of the guide genes (45%) increased 2-fold or more and three more genes, or a total of 73%, increased 1.3-fold or more (Figure 3D). We estimate, then, that we can detect over 70% of DNA hypermethylated genes in a given cancer cell line, and we test this hypothesis in the studies directly below.

Figure 1. Approach for Identification of the Human Cancer Cell Hypermethylome in HCT116 CRC Cells
(A) RNA from the indicated cell lines was isolated, labeled, hybridized, scanned, and fluorescent spot intensities normalized by background subtraction and Loess transformation using Agilent Technologies 44K human microarrays. Parental wild-type HCT116 cells (WT) and isogenic knockout counterparts for DNA methyltransferase 1 (DNMT1−/−) or 3b (DNMT3B−/−) are compared in our study. DKO cells are doubly deficient for both DNMT1 and DNMT3B.
(B) Gene-expression changes in HCT116 cells with genetic disruption of various DNA methyltransferases. A 3-D scatter plot indicating the gene-expression levels in HCT116 cells with genetic disruption of DNMT1 (x-axis), DNMT3B (z-axis), and both DNMT1 and DNMT3B (DKO; y-axis) in fold scale. Individual gene-expression changes are in black with the average for three experiments (red spots) or from an individual experiment (blue spots) for those genes in DKO cells with greater than 4-fold expression change.
(C) HCT116 cells were treated with 300 nM TSA for 18 h or 5 μM DAC for 96 h and processed as described above.
(D) Gene-expression changes for HCT116 cells treated with TSA (x-axis) or DAC (y-axis) are plotted by fold change. Yellow spots indicate genes from DKO cells with 2-fold changes and above. Notice the loss of sensitivity when compared to gene-expression increases seen in DKO cells (80% of genes greater than 4-fold in the DKO cells now becomes greater than 1.3-fold in DAC-treated cells). Green spots indicate randomly selected genes verified to have complete promoter methylation in wild-type cells, re-expression in DKO cells and after DAC treatment, while red spots indicate selected genes that were identified as false positives (see Figures 4, 6, and 7 for validation results). Blue spots indicate the location of the 11 guide genes, previously shown to be hypermethylated and completely silenced in HCT116 cells, used in this study (see Figure 3 for description). A distinct group of genes, including five of 11 guide genes, displays increases of greater than 2-fold after DAC treatment but no increase after TSA treatment. These genes form the top tier of candidate hypermethylated genes as discussed in the text.

Author Summary
Loss of gene expression in association with aberrant accumulation of 5-methylcytosine in gene promoter CpG islands is a common feature of human cancer. Here, we describe a method to discover these genes that permits identification of hundreds of novel candidate cancer genes in any cancer cell line. We now estimate that as much as 5% of colon cancer genes may harbor aberrant gene hypermethylation and we term these the cancer "promoter CpG island DNA hypermethylome." Multiple mutated genes recently identified via cancer resequencing efforts are shown to be within this hypermethylome and to be more likely to undergo epigenetic inactivation than genetic alteration. Our approach allows derivation of new potential tumor biomarkers and potential pathways for therapeutic intervention. Importantly, our findings illustrate that efforts aimed at complete identification of the human cancer genome should include analyses of epigenetic, as well as genetic, changes.

Validating the Methylation Status of Candidate Genes Derived from the Screening Approach
Based on the sensitivity differences observed between DKO-induced and DAC-induced gene increases (compare Figure 1B and 1D; also Figure 3B and 3C) and the behavior of the guide genes in the array platform, we designated, within the TSA-negative zone, a top tier (2-fold increase or above) and a next tier of genes (increasing between 1.4-fold and 2-fold) to identify hypermethylated cancer genes (Figure 2B). Importantly, we introduced an additional filter for selecting genes from these zones based on their having no basal expression in untreated cells, since this full lack of transcription is characteristic of promoter CpG island methylated genes in cell culture. Indeed, based on these selection criteria, in HCT116 cells, 32 of 35 (91%, Figure 4) randomly chosen CpG island-containing genes spanning the top-tier response zone of 532 genes (Figure 5), and 31 of 48 such SW480 cell genes (65%, Figure 6) from among 318 top-tier genes, proved to be CpG hypermethylated as measured by methylation-specific PCR (MSP) [22], and silenced in the cell line of origin as measured by reverse transcriptase PCR (RT-PCR). We also examined the efficiency of discovery for hypermethylated genes in the next tier of DAC-treated HCT116 cells. Of the 1,190 genes identified in this region, 17 of 35 (49%) randomly selected genes containing a CpG island were hypermethylated with concordant gene silencing (Figure 7). Our verification rates then demonstrate around 65% efficiency of our approach, which is close to our original estimate and which is excellent compared to previous screens for identifying new cancer hypermethylated genes [6,23]. With this level of verified hypermethylation, we calculate that the hypermethylome in HCT116 cells consists of an estimated 1,067 genes and an estimated 579 genes for the SW480 cells (see Table S1 for a detailed description of calculations). The hypermethylome would be estimated to range from 532 genes in CaCO2 to 1,389 genes in RKO cells (Table S1).

Figure 2, panels B and C.
(B) Validation of the DNA hypermethylome. The characteristic spike of hypermethylated genes defined by treatment of cells with DAC or TSA consists of two tiers, with distinct features. The top tier of genes was identified as a zone in which gene expression did not increase with TSA (<1.4-fold) and displayed no detectable expression in wild-type cells, but increased greater than 2-fold with DAC treatment. The next tier of genes was identified as a cluster of genes for which expression changes of TSA and wild type were identical to those in the top tier, but increased between 1.4-fold and 2-fold with DAC treatment. Gene expression validation by RT-PCR and MSP indicated a validation frequency of 91% for top-tier genes in HCT116 cells, including genes that increased in DKO cells by greater than 2-fold. Next-tier genes in HCT116 cells were confirmed at a frequency of 49%, and in the SW480 top tier, with a frequency of 65%.
(C) Shared candidate hypermethylated genes in CRC cell lines. We identified a total of 5,906 unique genes in all six cell lines with expression changes falling within the criteria of top-tier or next-tier categories. Overlaps in gene expression changes among two, three, four, five, or six cell lines are indicated; these range from 1,414 genes shared among two cell lines to 78 genes that were shared among all six cell lines.
doi:10.1371/journal.pgen.0030157.g002
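The HCT116 hypermethylome estimate in the text can be reproduced with a short Python sketch. Table S1 is not reproduced here, so the formula is an assumption: each tier's candidate count is multiplied by its rounded validation rate and the products are summed. The function name is hypothetical.

```python
def estimate_hypermethylome(tiers):
    """Sum candidate counts weighted by validation rate.

    tiers: iterable of (candidate_gene_count, validation_rate) pairs.
    This weighting scheme is an assumption, not the documented Table S1 method.
    """
    return sum(count * rate for count, rate in tiers)


# HCT116 values from the text: 532 top-tier genes validated at 91%,
# and 1,190 next-tier genes validated at 49%.
hct116 = estimate_hypermethylome([(532, 0.91), (1190, 0.49)])
print(round(hct116))  # 1067
```

Under this assumption the result matches the reported 1,067 genes for HCT116. The SW480 figure cannot be checked the same way because the size of the SW480 next tier is not stated in the text.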
We next asked whether our top-tier and next-tier regions truly enriched for hypermethylated genes by examining a randomly selected subset of 22 control genes located outside these zones. These genes were located in the responsive TSA zones (Zones 1 and 2, Figure S1A) or below the threshold of DAC responsiveness in the TSA-nonresponsive zone (Zone 3, Figure S1A) in HCT116 cells. Of the tested genes, only 9% (2 of 22, Figure S1B) showed detectable methylation with concomitant gene silencing, confirming the specificity of our approach and validating the criteria we used to establish the top-tier and next-tier approach. We can then predict that for cancer cell lines, with use of our filters, ~90% of promoter CpG island DNA methylated genes lie in the TSA-nonresponsive zone.

Figure 3, panels B to D.
(B, C) Blue spots and gene names indicate the location of the 11 guide genes in a plot of TSA (x-axis) versus DAC (y-axis) gene expression changes on a log scale (B) or fold-change scale (C). Five of 11 guide genes, circled in green, display increases of greater than 2-fold after DAC treatment but no increase after TSA treatment, and these same genes have greater than 3-fold increases in DKO cells (green circle).
(D) Direct comparison of guide genes in DKO and DAC plots. A distinct group of five guide genes, indicated by a green circle, showing greater than 3-fold expression changes in DKO cells and greater than 2-fold in DAC-treated cells, define the upper tier of candidate hypermethylated genes as discussed in the text. Another three genes increased 1.3-fold, and three failed to increase with DAC treatment, allowing criteria for the next tier of gene expression to be established as described in the text.
doi:10.1371/journal.pgen.0030157.g003
A fundamental question in cell culture-based approaches is whether they identify genes that are targets for inactivation in primary tumors. To address this, 20 CpG island-containing genes from the verified gene lists were randomly selected from the HCT116 top tier (17 genes), HCT116 next tier (two genes), or SW480 top tier (one gene) and analyzed for methylation in a panel of CRC cell lines. All of the tested genes were hypermethylated in two or more cell lines (Figure 8). We then examined the status of these 20 genes in a panel of 20 to 61 primary colon cancers and 20 to 40 normal-appearing colon tissue samples obtained from cancer-free individuals. Most of the genes (65%) were completely unmethylated or rarely methylated in the normal colonic tissue samples, but were methylated in a vast majority (86%) of the primary tumors (Figure 8). Of the 20 genes analyzed, 13 genes (65%) satisfied criteria for "tumor-specific methylation" with high-frequency methylation in cell lines, low (<5%) or undetectable methylation in normal colon, and frequent methylation in primary tumor samples (Figure 8). The efficiency of our strategy suggests a discovery rate of approximately one in two for identification of hypermethylated genes in cell lines and approximately one in three for identification of cancer-specific hypermethylated genes. Our estimate of approximately 400 hypermethylated genes per primary tumor now can be matched with predictions of Costello et al. [5] for hypermethylation of CpG islands, based on screening with Restriction Landmark Genomic Scanning approaches.

Validating Potential Biologic Relevance of Newly Identified Genes
We next tested some parameters for biological significance of two of the genes harboring tumor-specific methylation for their likely importance in primary colon cancers. One, the neuralized homolog (Drosophila) (NEURL) gene, is located in a chromosome region with high deletion frequency in brain tumors [24], and its product has been identified as a ubiquitin ligase required for Notch ligand turnover [25][26][27]. Activation of this key developmental pathway influences cell-fate determination in flies and vertebrates [28,29] and activation of Notch, through unknown mechanisms, is thought to play an inhibitory role in normal differentiation during colorectal cancer [30]. The second gene, FOXL2, belongs to the forkhead domain-containing family of transcription factors implicated in diverse processes including establishing and maintaining differentiation programs [31]. Intriguingly, this gene is essential for proper ovarian development [32] and germline mutations in humans lead to a plethora of craniofacial anomalies and premature ovarian failure [33]. We find both of these genes to be frequently DNA hypermethylated in a panel of colorectal cell lines (five of nine cell lines for NEURL and seven of nine for FOXL2, Figure 9A and 9C), and bisulfite sequencing revealed methylation of all CpG residues in the central CpG island regions of both genes in HCT116 and RKO cell lines, with complete demethylation in DKO cells ( Figure 9B and 9D). For both genes, this hypermethylation perfectly correlated with loss of basal expression and ability to reexpress the genes with DAC treatment (Figure 9A and 9C). 
Importantly, promoter methylation of both genes, as assessed by bisulfite sequencing (Figure 9B and 9D) is absent in normal human colon or rectum, but frequent in primary colon cancers ( Figure 9E and 9F), suggesting that hypermethylation arose as a cancer-specific phenomenon, although slight methylation was observed at the FOXL2 locus in normal tissue from aged patients (unpublished data). Finally, the pattern for hypermethylation of the FOXL2 and NEURL genes in cell culture fit with a biology important to a subset of colon cancers. As many as one in eight colorectal cancers, predominantly those from the right side of the colon, harbor a defect in mismatch-repair capacity [34,35], primarily due, in nonfamilial cancers, to inactivation of MLH1 by epigenetic mechanisms [36]. Such tumors belong to a group with high frequency of hypermethylated gene promoters [37,38]. The hypermethylation of FOXL2 and, especially, NEURL, aggregate with these tumor types not only among the colon cancer cell lines (HCT116, DLD1, LoVo, RKO, and SW48), but also when analyzed in a series of primary human colon cancers (Fisher's exact test value of 0.024 for FOXL2 and 0.001 for NEURL, Figure 9G).
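For readers unfamiliar with the statistic, the Fisher's exact test used above for the association between hypermethylation and mismatch-repair status can be sketched in plain Python. This is a generic two-sided implementation for a 2x2 table, not the authors' code; the text does not say whether a one-sided or two-sided test was used, and the counts in the example are hypothetical because the underlying table is shown only in Figure 9G.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical counts: rows are mismatch-repair deficient / proficient tumors,
# columns are methylated / unmethylated for the gene of interest.
p_value = fisher_exact_two_sided(6, 2, 4, 16)
print(p_value)
```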
Initial in vitro studies suggest that both FOXL2 and NEURL might possess tumor-suppressor activity. When overexpressed in colon cancer cell lines, full-length FOXL2 and NEURL (Figure 10A and 10C) generate a 10-fold and 20-fold reduction, respectively, in colony growth of HCT116 cells (Figure 10C), with surviving clones having severely depleted size (Figure 10B), comparable to results obtained with the bona fide tumor suppressor p53 (Figure 10F). Similar results were seen in RKO and DLD1 cells (Figure 10D and 10E), both of which have complete gene silencing at the FOXL2 and NEURL loci. While the precise molecular mechanisms for the growth suppression remain to be determined, Notch signaling has recently been shown to play an important role in differentiation of intestinal crypt cells, where deletion of the Notch effector molecule RBPJκ or treatment with a highly selective γ-secretase inhibitor was found to be sufficient for conversion of crypt cells to goblet cells [28,29]. Similarly, the closely related FOXL2 transcription factor family member FOXL1 has recently been shown to play a role in epithelial-mesenchymal transition of the intestinal epithelium [39].

Comparison of Newly Identified DNA Hypermethylated Genes to Mutated Genes Identified from Sequencing of Cancer Genomes
While it is clear that genetic and epigenetic mechanisms are both important to initiation and progression of human tumorigenesis, the relative contributions of each of these alterations need to be clarified on a global basis. Studies of classic tumor suppressor genes such as VHL in renal cancer and MLH1 in colon cancer indicate that important cancer genes can have an incidence of inactivation by either genetic or epigenetic mechanisms [36,40]. However, a genome-wide analysis to query such relationships has not been performed.
In a recent genome-wide sequencing of cancer genes, Sjöblom et al. [3] observed that newly discovered gene mutations in colon and breast cancers generally had a low incidence of occurrence, with 90% of the genes identified harboring a mutation frequency of less than 10%. Furthermore, a typical patient's colon or breast tumor was estimated to have an average of only 14 mutations and there appeared to be little overlap between individual tumors for the newly discovered mutations [3]. These low frequencies raise the question whether alternative mechanisms might account for inactivation of these genes in additional tumors. Obviously, the much higher number of candidate hypermethylated genes we now identify in individual tumors suggests that this epigenetic change might provide an alternative inactivating route to mutations for many tumor suppressor genes. We now show that screening tumors for both genetic and epigenetic changes indicates that this is the case.
We first located the 189 newly identified, mutated cancer (CAN) genes, described by Sjöblom et al. [3], within the top and next tiers of our colorectal cancer cell line hypermethylome and found 56 genes present in these zones in one or more of the cell lines. Of these, 45 contained CpG islands. Twenty-six of these 45 genes (58%), similar to the verification rate for all candidate genes identified as discussed above, proved to be hypermethylated in at least one of the six cell lines, and were selected for further study. Importantly, exactly half (13 of 26 genes) of these genes were expressed at high levels (Figure 11A) and were not methylated in normal colon (Figure 11B) but were methylated in primary CRC tumors (Figure 11C), giving a frequency of 50% for identification of tumor-specific methylation when starting with genes harboring cell line methylation. We also randomly selected, for verification of methylation and expression status in cell lines, CAN genes that fell primarily in zone 3 of the microarray, that is, within the TSA-negative zone but below the 1.4-fold cutoff for stimulation by DAC. As seen earlier for other randomly selected genes in this region, these randomly selected CAN genes had a significantly reduced (four of 15, or 27%) frequency of methylation as compared to the 56 top-tier and next-tier CAN genes discussed above (Figure S1C). Interestingly, however, this rate is much more similar to that for the well-characterized hypermethylated guide genes (~30% as shown in Figure 3A-3C) than for the other randomly selected zone 3 genes (9%, compare Figure S1B and S1C), perhaps indicating the importance of epigenetic inactivation of these mutated genes. Indeed, relevant to this point, for the majority of the examined CAN genes within the hypermethylome region, the incidence of hypermethylation is strikingly higher than that for mutations (Figure 11D).
Thus, unlike the mutations in the individual genes, which are restricted to tumors from only a few patients, hypermethylation of the majority of the genes is a property shared among many tumors. These findings of both mutations and, alternatively, epigenetic silencing in these previously uncharacterized genes solidify their probable roles as tumor suppressor genes.
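The percentages quoted in this section follow directly from the gene counts given in the text, as the short Python check below shows. The helper name is hypothetical; all counts are taken from the paragraphs above and from the control-gene analysis in the validation section.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# CAN genes in the top and next tiers that contain a CpG island: 45.
# Of these, 26 were hypermethylated in at least one cell line.
print(percent(26, 45))  # 58

# Of those 26, 13 showed tumor-specific methylation.
print(percent(13, 26))  # 50

# Zone 3 CAN genes: 4 of 15 methylated; other zone 3 control genes: 2 of 22.
print(percent(4, 15))   # 27
print(percent(2, 22))   # 9
```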

Discussion
We describe a gene-expression approach with the capacity to define, for any human cancer type for which representative cell-culture lines are available, a substantial fraction of the cancer gene promoter CpG island DNA hypermethylome. Studies of these genes will contribute to understanding the molecular pathways driving tumorigenesis; provide useful new DNA hypermethylation biomarkers to monitor cancer risk assessment, early diagnosis, and prognosis; and permit better monitoring of gene reexpression during cancer prevention and/or therapeutic strategies [41].
Through use of our approach to analyze mutated genes identified by a genome-wide sequencing strategy, we document that many more epigenetically altered genes than genetically altered genes exist in any given tumor. The importance of this fact emerges in our finding that for newly discovered genes that are affected by both mechanisms, the incidence for hypermethylation of any given gene among colon cancers appears to be generally much higher than for mutations. Interestingly, many of the new genes found by Sjöblom et al. [3] harbored heterozygous mutations and it would be, thus, difficult to predict whether the genes were affected by activating or inactivating events from such data alone. As first suggested by Zardo et al. [42], our data may clarify, in initial screening studies, the latter category, as promoter DNA hypermethylation and gene silencing often affect genes independent of loss of heterozygosity frequency. Thus, discovery of genes targeted by hypermethylation as an inactivating event should help guide prioritization of genes to study in cancer gene resequencing efforts.
Finally, our data indicate that, in any given cancer type, one may markedly underestimate both the full range of gene alterations and associated abnormalities of cellular pathways by failing to screen for both genetic and epigenetic abnormalities. Our findings indicate that assessing both mechanisms for loss of gene function indicates far more sharing among individual colon tumors for pathway disruption than genetic analyses alone would predict. Optimal approaches to grouping of tumors according to molecular alterations in key pathways should, then, depend on defining both genetic and epigenetic gene changes. Thus, our findings should encourage any genome-wide cancer gene screening strategies to include finding DNA hypermethylated genes and prioritizing these to be sequenced for mutations as well as prioritizing newly discovered mutated genes to be studied for promoter methylation.

Materials and Methods
Cell culture and treatment. HCT116 cells and isogenic genetic knockout derivatives were maintained as previously described [14].
Microarray analysis. Total RNA was harvested from log-phase cells using TRIzol (Invitrogen) and the RNeasy kit (Qiagen, http://www1.qiagen.com/) according to the manufacturer's instructions, including a DNase digestion step. RNA was quantified using the NanoDrop ND-100 (http://www.nanodrop.com/) followed by quality assessment with the 2100 Bioanalyzer (Agilent Technologies, http://www.agilent.com/). RNA concentrations for individual samples were greater than 200 ng/μl, with 28S/18S ratios greater than 2.2 and RNA integrity of 10 (10 scored as the highest). Sample amplification and labeling procedures were carried out using the Low RNA Input Fluorescent Linear Amplification Kit (Agilent Technologies) according to the manufacturer's instructions. The labeled cRNA was purified using the RNeasy mini kit (Qiagen) and quantified. RNA spike-in controls (Agilent Technologies) were added to RNA samples before amplification. Samples (0.75 μg) labeled with Cy3 or Cy5 were mixed with control targets (Agilent Technologies), assembled on Oligo Microarray, hybridized, and processed according to the Agilent microarray protocol. Scanning was performed with the Agilent G2565BA microarray scanner using settings recommended by Agilent Technologies.
Data analysis. All arrays were subject to quality checks recommended by the manufacturer. Images were visually inspected for artifacts, and distributions of signal and background intensity of both red and green channels were examined to identify anomalous arrays. No irregularities were observed, and all arrays were retained and used. All calculations were performed using the R statistical computing platform [43] and packages from the Bioconductor bioinformatics software project [44][45][46]. The log ratio of red signal to green signal was calculated after background subtraction and Loess normalization as implemented in the limma package from Bioconductor [46]. Individual arrays were scaled to have the same interquartile range (75th percentile minus 25th percentile). Log-fold changes were averaged over dye-swap replicate microarrays to produce a single set of expression values for each condition. We have deposited primary array data in the GEO database at the National Center for Biotechnology Information (NCBI).
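The last two steps of this pipeline, interquartile-range scaling and dye-swap averaging, can be sketched in plain Python. The original analysis was done in R with limma, so this is an illustrative re-expression rather than the authors' code. It assumes the log ratios have already been background-corrected and Loess-normalized, that arrays are scaled to the mean of their interquartile ranges, and that a dye-swap replicate is combined by flipping its sign before averaging. All function names are hypothetical.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def scale_to_common_iqr(arrays, target=None):
    """Rescale each array of log ratios so that all share the same IQR.

    If no target is given, the mean IQR across arrays is used (an assumption).
    """
    iqrs = [iqr(a) for a in arrays]
    if target is None:
        target = sum(iqrs) / len(iqrs)
    return [[v * target / width for v in a] for a, width in zip(arrays, iqrs)]


def average_dye_swap(forward, swapped):
    """Average a forward array with its dye-swap replicate.

    The swapped array measures the opposite ratio, so its sign is flipped.
    """
    return [(f - s) / 2 for f, s in zip(forward, swapped)]


# Hypothetical log ratios for five probes on a dye-swap pair.
forward = [0.0, 1.0, 2.0, 3.0, 4.0]
swapped = [0.0, -2.0, -4.0, -6.0, -8.0]
scaled_forward, scaled_swapped = scale_to_common_iqr([forward, swapped])
combined = average_dye_swap(scaled_forward, scaled_swapped)
print(combined)  # [0.0, 1.5, 3.0, 4.5, 6.0]
```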
Methylation and gene expression analysis. RNA was isolated with TRIzol Reagent (Invitrogen) according to the manufacturer's instructions. For RT-PCR, 1 μg of total RNA was reverse transcribed using Ready-To-Go You-Prime First-Strand Beads (Amersham Biosciences, http://www.amersham.com/) with addition of random hexamers (0.2 μg per reaction). For RT-PCR primer design we used Primer3 (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi). For MSP analysis, DNA was extracted following a standard phenol-chloroform extraction method. Bisulfite modification of genomic DNA was carried out using the EZ DNA Methylation Kit (Zymo Research, http://www.zymoresearch.com/). Primer sequences specific to the unmethylated and methylated promoter sequences were designed using MSPPrimer (http://www.mspprimer.org). MSP was performed as previously described [22]. All PCR products (15 μl of 50-μl total volume for RT-PCR and 7.5 μl of 25-μl total volume for MSP) were loaded directly onto 2% agarose gels containing GelStar Nucleic Acid Gel Stain (Cambrex, http://www.cambrex.com/) and visualized under ultraviolet illumination. Primer sequences and conditions for MSP, bisulfite sequencing, and RT-PCR are available upon request from the authors.
Human tumor analysis. Formalin-fixed, paraffin-embedded tissues from primary CRCs were obtained from the archive of the Department of Pathology of the University Hospital Maastricht, Maastricht, The Netherlands, and Johns Hopkins University Hospital. Approval was obtained from the Medical Ethical Committees of the University of Maastricht and the University Hospital Maastricht and Johns Hopkins University Hospital. DNA was isolated using the Puregene DNA isolation kit (Gentra Systems, http://www1.qiagen.com/). FOXL2 and NEURL methylation was analyzed by nested MSP. MSI analysis was performed by analysis of the BAT-26 mononucleotide repeat. The primer sequences and PCR conditions for the BAT-26 mononucleotide repeat were used as described previously [47].
Colony Formation Assay. One million HCT116, RKO, or DLD1 cells were plated in six-well dishes (Falcon) and transfected with 5 μg of plasmid (pIRES-Neo3, Invitrogen) using Lipofectamine 2000 according to the manufacturer's instructions. Following a 24-h recovery period, selection in complete medium containing 4 μg/ml gentamycin (Invitrogen) was performed for 10 d. Staining, visualization, and counting of triplicate wells were performed as previously described [48].
The GEO series in which all ten arrays are linked may be found under accession number GSE4763.