DNA Methylation Analysis of Bone Marrow Cells at Diagnosis of Acute Lymphoblastic Leukemia and at Remission

To detect genes with CpG sites that display methylation patterns that are characteristic of acute lymphoblastic leukemia (ALL) cells, we compared the methylation patterns of cells taken at diagnosis from 20 patients with pediatric ALL to the methylation patterns in mononuclear cells from bone marrow of the same patients during remission and in non-leukemic control cells from bone marrow or blood. Using a custom-designed assay, we measured the methylation levels of 1,320 CpG sites in regulatory regions of 413 genes that were analyzed because they display allele-specific gene expression (ASE) in ALL cells. The rationale for our selection of CpG sites was that ASE could be the result of allele-specific methylation in the promoter regions of the genes. We found that the ALL cells had methylation profiles that allowed distinction between ALL cells and control cells. Using stringent criteria for calling differential methylation, we identified 28 CpG sites in 24 genes with recurrent differences in their methylation levels between ALL cells and control cells. Twenty of the differentially methylated genes were hypermethylated in the ALL cells, and as many as nine of them (AMICA1, CPNE7, CR1, DBC1, EYA4, LGALS8, RYR3, UQCRFS1, WDR35) have functions in cell signaling and/or apoptosis. The methylation levels of a subset of the genes were consistent with an inverse relationship with the mRNA expression levels in a large number of ALL cells from published data sets, supporting a potential biological effect of the methylation signatures and their application for diagnostic purposes.


Introduction
Acute lymphoblastic leukemia (ALL) is the most common childhood malignancy, accounting for 25% of all childhood cancers in developed countries. ALL originates from the malignant transformation of lymphocyte progenitor cells into leukemic cells in the B-cell and T-cell lineages [1]. However, most of the known large-scale genetic aberrations in ALL are not sufficient on their own to induce the disease [2], suggesting that other genetic or epigenetic alterations contribute to leukemic transformation.
In mammalian genomes, methylation of the C-residue in CpG dinucleotides plays an important role in regulating gene expression [3,4]. DNA methylation is maintained by DNA methyltransferases (DNMTs). Alterations in the expression of DNMTs in blood progenitor cells result in extensive changes in methylation patterns, which may lead to leukemogenesis [5]. Treatment with inhibitors of DNA methylation, such as 5-azacytidine, has therapeutic benefits in leukemia [6], indicating that the methylation changes are functionally important. In cancer, the regions near transcription start sites often show increased methylation levels, as opposed to an overall decrease in DNA methylation on the genome-wide level [7,8,9]. DNA hypermethylation in the promoters of putative tumor suppressor genes has been found to correlate with resistance against chemotherapy in ALL [10]. We and others have shown that the methylation levels of sets of genes have potential as prognostic markers for risk of relapse in pediatric ALL [11,12]. Moreover, two studies have suggested that minimal residual disease in leukemia patients can be detected by the methylation status of only a few genes [13,14]. Thus, epigenetic perturbation of DNA methylation can be a valuable source of information for understanding the biology of gene regulation, phenotypic diversity, and treatment outcome in pediatric ALL.
In a previous genome-wide survey of 8,000 genes in 197 bone marrow or blood samples from patients with pediatric ALL, we identified >400 genes that displayed allele-specific gene expression (ASE) [4]. The observed ASE indicates that the expression of these genes could be regulated by DNA methylation that silences or activates gene expression in an allele-specific manner. The methylation pattern of the genes with ASE allowed classification of ALL subtypes and stratification of patients into prognostic subgroups [11]. In the current study, we hypothesized that the selection of genes based on genome-wide ASE analysis would enrich for genes with functional CpG site methylation that could be involved in the pathogenesis of ALL. Our aim was to identify genes that display aberrant DNA methylation independently of cytogenetic ALL subtype for further mechanistic studies of ALL. We investigated how the methylation status of the 1,320 CpG sites in genes with ASE differs between ALL samples taken at diagnosis and matched bone marrow samples from the same patients during and after induction therapy, when the patients were in remission, and in control cells from bone marrow or blood of non-leukemic individuals.

Samples from patients and controls
Mononuclear cells were isolated from bone marrow aspirates or peripheral blood cells by 1.077 g/mL Ficoll-Isopaque (Pharmacia) density-gradient centrifugation from 63 samples. The samples consisted of 20 bone marrow samples taken at diagnosis of ALL, 30 follow-up bone marrow samples taken from the same patients during therapy, and 13 non-leukemic control samples, of which 11 were from bone marrow and two were from peripheral blood of children of the same age as the patients. The clinical and cytogenetic information for the patients is provided in Table 1. The patients were treated according to the ALL 2000 protocol of the Nordic Society of Pediatric Oncology (NOPHO) [15], in which no DNA-demethylating drugs are used. The proportion of leukemic cells was estimated in each sample by light microscopy in May-Grünwald-Giemsa-stained cytocentrifugate preparations. The proportion of leukemic blasts exceeded 90% in the ALL samples included in this study. The matched patient samples taken during therapy at days 29, 50 and 106 contained less than 5% leukemic blasts, indicating that the patients were in morphological remission. The non-leukemic control cells were obtained from sex- and age-matched pediatric patients with an initial suspicion of leukemia, in whom an ALL diagnosis was excluded by negative diagnostic tests and clinical follow-up (Table 1). DNA was extracted from cell pellets with the AllPrep DNA/RNA Mini Kit (Qiagen) or the QIAamp DNA Blood Mini Kit (Qiagen). The Regional Ethics Committee in Uppsala, Sweden approved the study, and the patients and/or their guardians provided written informed consent. The study was conducted in accordance with the Declaration of Helsinki.

DNA methylation analysis
A custom-designed panel was used to determine the methylation levels of 1,536 CpG sites located from 2 kb upstream to 1 kb downstream of the transcription start site of 416 genes [4]. Six hundred ng of genomic DNA was treated with sodium bisulfite (EZ-96 DNA Methylation Kit, Zymo Research) for subsequent genotyping by the GoldenGate Assay (Illumina Inc.). The methylation level of each CpG site is obtained from the genotyping assay as a β-value ranging from 0.0 to 1.0, where 0.0 corresponds to no methylation of either allele and 1.0 to complete methylation of both alleles of the analyzed genes. Genotyping and quality control were performed as previously described [11]. After quality filtering, 1,320 CpG sites distributed over 413 gene regions, with 1-10 CpG sites per gene, remained for analysis (Table S1). We previously reported that the concordance between the methylation levels determined by the GoldenGate Assay and by Sanger sequencing of bisulfite-converted DNA for five randomly selected CpG sites was 87% [11] (Figure S1). Moreover, the concordance between the methylation levels of 21 ALL samples run in replicate using the GoldenGate Assay was high, with a median site-wise Pearson correlation coefficient R = 0.88 for the 28 CpG sites highlighted in the present study (Table S2).
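As an illustration of the replicate concordance measure described above, the following minimal sketch computes a Pearson correlation coefficient per CpG site across samples and then takes the median over sites. It is written in Python for illustration only (the analyses in this study were performed in R), and the site names and β-values are hypothetical and not taken from Table S2.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def median_sitewise_r(run1, run2):
    """Median over CpG sites of the per-site correlation between two runs.

    run1 and run2 map a CpG site name to a list of beta-values, one per
    sample, with the samples in the same order in both runs.
    """
    return median(pearson_r(run1[site], run2[site]) for site in run1)


# Hypothetical beta-values: two CpG sites measured twice in four samples.
replicate_1 = {"site_A": [0.10, 0.45, 0.80, 0.92], "site_B": [0.05, 0.07, 0.60, 0.55]}
replicate_2 = {"site_A": [0.12, 0.40, 0.85, 0.90], "site_B": [0.04, 0.10, 0.58, 0.61]}
print(round(median_sitewise_r(replicate_1, replicate_2), 3))
```

With real data, each list would hold one β-value per replicated sample, and the dictionaries would contain one entry per CpG site of interest.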

Gene expression data
Genome-wide gene expression data was retrieved from two ALL datasets via the Oncomine tool (Compendia Bioscience). The first dataset contained expression data for 98 ALL patients and bone marrow cells from six healthy controls [16]. The second dataset contained expression data for 533 ALL patients and PBMCs from 74 healthy controls [17].

Statistical analyses
The similarity of individual methylation profiles was assessed using the Pearson correlation coefficient (R). Hierarchical clustering was performed with the R function "hclust", using one minus the correlation coefficient as the dissimilarity measure between individual samples and between individual CpG sites. The Wilcoxon Signed-Rank test was used to identify CpG sites with differences in methylation between the paired diagnostic and remission samples. The Wilcoxon Rank-Sum test was used to test for differentially methylated CpG sites between diagnostic BCP and T-ALL samples. Where indicated, P-values were adjusted for multiple testing with the Benjamini-Hochberg method. Friedman's test was used to identify CpG sites with differential methylation in serial bone marrow samples taken from the same individuals. All statistical analyses were performed in R. Gene lists were analyzed by Ingenuity Pathway Analysis (IPA) (Ingenuity® Systems). Pathway-associated P-values were calculated with Fisher's exact test. The P-value is based on the enrichment of differentially methylated genes compared to the 413 genes with ASE that were analyzed.
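The Benjamini-Hochberg adjustment mentioned above can be sketched in a few lines. The example below is a Python illustration under stated assumptions, not the code used in this study: the analyses were run in R, where p.adjust with method = "BH" performs the same step-up calculation, and the raw P-values shown are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values, ordered from smallest to largest P.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values
    # never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for four CpG sites.
raw = [0.0001, 0.04, 0.03, 0.5]
print(benjamini_hochberg(raw))
```

Each raw P-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone, so a site can never receive a smaller adjusted P-value than a site with a smaller raw P-value.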

Analysis of differential DNA methylation between diagnostic ALL samples, remission samples, and controls
To identify genes with differential DNA methylation, we compared the methylation levels of 1,320 CpG sites in mononuclear cells from bone marrow taken at the time of ALL diagnosis to bone marrow mononuclear cells from the same patients at day 29, 50 or 106 of therapy, when the patients were in remission, and to bone marrow and peripheral blood mononuclear cells from non-leukemic controls. The data for the 1,320 CpG sites from all samples are available in the Supporting Information (Table S1). We found that the methylation pattern across the 1,320 CpG sites in each of the bone marrow samples from ALL patients was distinct from those of the samples taken at remission and of the non-leukemic controls (Figure 1A). The methylation levels of each individual CpG site displayed low variability between samples, with a mean standard deviation (SD) of 0.045 across all the 1,320 CpG sites in the DNA samples taken at remission and in the DNA samples from the non-leukemic controls. In contrast, the methylation levels of the CpG sites displayed higher variability between samples (mean SD = 0.12 across the 1,320 CpG sites) in the ALL cells taken at diagnosis (Figure 1B). We did not detect any statistically significant differences (permuted Friedman's P<0.01 and Δβ>0.10) when the methylation levels of the DNA samples from five ALL patients collected at different time points during remission were compared group-wise (day 29, 50, 106). The small sample size in this analysis limits the power to detect statistically significant differences, so we cannot exclude the possibility that there might be differences in CpG site methylation.
Table 2. CpG sites with differential methylation between acute lymphoblastic leukemia cells and remission cells.
We applied stringent criteria for detecting CpG sites with differential methylation between the cells at ALL diagnosis and bone marrow cells at remission, by requiring an adjusted P-value<0.001 for the median difference in β-values (Δβ) between the two groups and a Δβ threshold of 0.30 for calling a CpG site as differentially methylated. This analysis identified 28 CpG sites in 24 genes with differential methylation between the cells taken at ALL diagnosis and bone marrow mononuclear cells at remission (Table 2). A large proportion (45-95%) of the individual sample pairs fulfilled the criterion of a Δβ-value>0.3 for the 28 CpG sites. Hierarchical clustering of the samples at diagnosis (n = 20), at remission (n = 30) and the non-leukemic control cells (n = 13) according to the methylation levels of the 28 differentially methylated CpG sites resulted in unequivocal separation between the ALL samples and the bone marrow samples at remission (Figure 2A), with the non-leukemic control samples clustering together with the samples taken at remission. The CpG sites displayed two distinct patterns of differential methylation. For 23 of the 28 CpG sites, exemplified by a CpG site in the WDR35 gene (Figure 2B), the methylation levels were higher in the ALL cells at diagnosis than in the bone marrow cells during remission (median Δβ = 0.66). We also identified five CpG sites with the opposite pattern, like FXYD2 (Figure 2C), with higher median methylation levels in the cells at remission (median Δβ = 0.55). Four of the genes with differential methylation according to the stringent criteria applied (COL6A2, EYA4, FXYD2, MYO3A) contained two differentially methylated CpG sites. The methylation levels (β-values) of the CpG sites in these genes were correlated (R>0.70) (Figure 3).
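The calling rule described above can be expressed as a short sketch. The Python example below (the study itself used R) classifies one CpG site from paired β-values at diagnosis and at remission, given an adjusted P-value computed elsewhere. Several details are assumptions made for illustration: the function name, the treatment of the 0.30 threshold as inclusive, the one-to-one pairing of samples, and all numeric values.

```python
from statistics import median

DELTA_BETA_CUTOFF = 0.30  # threshold on the median paired difference (from the text)
ADJ_P_CUTOFF = 0.001      # threshold on the adjusted P-value (from the text)


def call_site(diagnosis, remission, adjusted_p):
    """Classify one CpG site from paired beta-values.

    diagnosis and remission hold one beta-value per patient, in the same
    patient order. adjusted_p is the multiple-testing-adjusted P-value for
    this site and is assumed to have been computed separately.
    """
    if len(diagnosis) != len(remission) or not diagnosis:
        raise ValueError("need non-empty, equal-length paired lists")
    deltas = [d - r for d, r in zip(diagnosis, remission)]
    med = median(deltas)
    # Fraction of individual pairs whose own difference exceeds the cutoff.
    frac_pairs = sum(abs(x) > DELTA_BETA_CUTOFF for x in deltas) / len(deltas)
    if adjusted_p < ADJ_P_CUTOFF and abs(med) >= DELTA_BETA_CUTOFF:
        status = "hypermethylated" if med > 0 else "hypomethylated"
    else:
        status = "not called"
    return {"median_delta_beta": med, "fraction_pairs": frac_pairs, "status": status}


# Hypothetical paired beta-values for five patients at one CpG site.
diag = [0.90, 0.85, 0.70, 0.95, 0.40]
rem = [0.10, 0.15, 0.20, 0.12, 0.25]
print(call_site(diag, rem, adjusted_p=0.0005))
```

Requiring both a small adjusted P-value and a large median difference keeps sites that are consistently but only slightly shifted from being called. The per-pair fraction corresponds to the proportion of individual sample pairs that meet the Δβ criterion on their own.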
At less stringent criteria for calling differential methylation (P<0.05 and Δβ>0.2), the methylation status of 1-2 additional CpG sites in nine of the genes supported the corresponding hyper- or hypomethylation (Table S1).
The CpG site in the MYBPC2 gene was differentially methylated (Wilcoxon Rank-Sum test, P-value<0.001) between ALL cells of B-cell origin (BCP ALL, n = 16) and T-cell origin (T-ALL, n = 4), with hypomethylation in BCP ALL (median β-value = 0.04) and hypermethylation in T-ALL (median β-value = 0.75). The other 27 CpG sites did not display differential methylation between BCP and T-ALL samples (Table S2), indicating that the majority of the genes identified here based on their methylation profiles are characteristic of ALL cells, independently of immunophenotype.

Regulation of gene expression by DNA methylation
On a genome-wide scale there is an inverse relationship between DNA methylation in the vicinity of the TSS and mRNA expression [18]. To examine whether the differentially methylated CpG sites identified here had potential regulatory functions, we queried two published sets of mRNA expression data from ALL cells, with data for 98 and 533 ALL samples, respectively [16,17], for up- or down-regulation of the differentially methylated genes. In at least one of these datasets, the AMICA1, DBC1, CD300LF, CR1, SEC14L4 and TMEM2 genes identified in our study as hypermethylated were down-regulated, and the hypomethylated genes ACY3, FXYD2, and MYBPC2 were up-regulated, with at least 2-fold differences in expression levels between ALL cells and control bone marrow cells [16] or peripheral blood mononuclear cells from healthy individuals [17] (Table 2). The other genes identified in our differential methylation analysis did not meet the minimum criterion of 2-fold differential expression.
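The comparison between methylation status and expression described above amounts to a simple rule, sketched below in Python for illustration. The function names, the definition of fold change as expression in ALL cells divided by expression in controls, and the example values are assumptions and do not reproduce the Oncomine output.

```python
def inverse_relationship(methylation_status, fold_change, min_fold=2.0):
    """True when methylation and expression change in opposite directions.

    fold_change is assumed to be expression in ALL cells divided by
    expression in control cells, on a linear (not log) scale.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    if methylation_status == "hypermethylated":
        return fold_change <= 1.0 / min_fold  # at least 2-fold down-regulated
    if methylation_status == "hypomethylated":
        return fold_change >= min_fold        # at least 2-fold up-regulated
    raise ValueError("unknown methylation status")


def supported_in_any_dataset(methylation_status, fold_changes):
    """True when at least one expression dataset shows the inverse pattern."""
    return any(inverse_relationship(methylation_status, fc) for fc in fold_changes)


# Hypothetical fold changes for one hypermethylated gene in two datasets.
print(supported_in_any_dataset("hypermethylated", [0.9, 0.3]))
```

A gene is counted as supporting the inverse relationship if either dataset meets the 2-fold criterion, mirroring the "at least one dataset" condition used here.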

Biological roles for the genes with differential methylation
The 24 differentially methylated genes highlighted in our study (Table 2) were enriched (P<0.05) for functions such as cell-to-cell signaling and interaction (AMICA1, CR1, LGALS8, RYR3) and cell death/apoptosis (CR1, DBC1, EYA4, LGALS8, UQCRFS1) (Table 3). Among the differentially methylated genes, several have been previously identified as differentially methylated in cancer and are known to be involved in ALL. EYA4 is frequently hypermethylated and down-regulated in colon and esophageal cancers [19,20]. Expression of LGALS8 and UQCRFS1 is associated with relapse in T-ALL [21,22], and the COL6A2, DBC1 and RUNDC3B genes have been found to be hypermethylated and down-regulated in pediatric ALL samples [23,24,25]. The AMICA1 and FXYD2 genes are located near the breakpoint region of the MLL fusion gene on chromosome 11q23 and are potential fusion partners with the MLL gene in ALL cells [26,27]. In a recent study, the MYO3A and DBC1 genes were included as methylation markers in a 10-gene panel for detection of bladder cancer in urine samples [28], which is interesting in light of mounting evidence for generalized differentially methylated regions across different cancer types [29]. Besides DBC1, which is a suspected tumor suppressor gene [30], the precise functions on the molecular level of the other genes highlighted in our study have not yet been defined in ALL.
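The enrichment P-values reported above come from Fisher's exact test against the background of 413 analyzed genes. The Python sketch below shows one way such a P-value can be computed as a one-sided hypergeometric tail probability. It is an illustration only: the choice of a one-sided test and the number of background genes annotated to the function (40) are assumptions, since the underlying IPA annotation counts are not given here.

```python
from math import comb


def enrichment_p(background, annotated, selected, overlap):
    """One-sided Fisher's exact P-value, P(X >= overlap).

    background: number of genes analyzed
    annotated:  background genes annotated to the function
    selected:   differentially methylated genes
    overlap:    selected genes annotated to the function
    """
    if not (0 <= overlap <= min(annotated, selected)
            and max(annotated, selected) <= background):
        raise ValueError("inconsistent counts")
    total = comb(background, selected)
    upper = min(annotated, selected)
    # Sum the hypergeometric probabilities for overlaps at least as large
    # as the observed one.
    tail = sum(
        comb(annotated, k) * comb(background - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# 413 background genes and 24 selected genes are taken from the text; five
# selected genes carry the cell death/apoptosis annotation. The count of 40
# annotated background genes is hypothetical.
print(enrichment_p(background=413, annotated=40, selected=24, overlap=5))
```

Using the 413 analyzed genes as the background, rather than the whole genome, tests whether the differentially methylated genes are enriched for a function beyond what the ASE-based gene selection already contributes.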

Discussion
In our study of DNA methylation patterns in regulatory regions of 413 genes known to display ASE in ALL cells [4], we identified 24 genes with recurrent differential CpG site methylation that distinguished unequivocally between bone marrow cells from ALL patients and non-leukemic bone marrow cells. To control for possible inter-individual variation in DNA methylation patterns, we compared the ALL cells from each individual patient with "normal" mononuclear cells isolated from bone marrow of the same patient during treatment follow-up, when the patient was in remission. We also included bone marrow and blood cells from non-leukemic control individuals in the comparison. It should be noted that the diagnostic ALL samples contained ≥90% lymphoblasts, while the samples at remission and the samples from the non-leukemic control individuals consisted of mononuclear cells from all normal hematopoietic cell lineages, i.e. lymphoid, myeloid and erythroid progenitor cells at varying stages of differentiation. The methylation patterns of the bone marrow cells from the patients at remission and the non-leukemic controls were indistinguishable from one another, and clearly distinct from the methylation patterns in the ALL cells at diagnosis. We recognize that the exact proportions of the mononuclear cell types may have varied between the individual remission or control samples. Yet, the biological roles of the differentially methylated genes justify further exploration of these genes as diagnostic markers for ALL.
Table 3. Functions of genes with differential methylation between acute lymphoblastic leukemia cells and normal bone marrow cells.
Our hypothesis when selecting the 413 genes for methylation analysis based on ASE analysis was that hypermethylation of CpG sites in gene promoter regions may cause ASE by silencing the expression of one of the alleles of expressed genes, and that hypomethylation of one allele could allow expression of only one of the alleles of a gene. ASE can be one-directional, so that all individuals over-express the same allele, or bi-directional, so that either of the two alleles may be over-expressed in different individuals. The majority of the CpG sites in Table 2 displayed methylation differences with absolute Δβ-values near 0.5, which could reflect complete methylation or lack of methylation of a CpG site on one of the alleles of a gene in the individual ALL cells, as opposed to complete or no methylation of the corresponding CpG site in the normal cells. Of the 22 genes identified in the present study for which ASE data were available, 73% (16/22) displayed bi-directional ASE in ALL cells [4], indicating that stochastic methylation or de-methylation of either allele could cause ASE. Our study confirmed the ALL-specific hypermethylation of three genes (DBC1, RUNDC3B, and COL6A2) [23,24,25]. Eight of the differentially methylated CpG sites identified here have been included in subtype-specific classifiers for ALL (Table S2) [11]. Although the methylation levels for these sites differ between ALL subtypes, the methylation levels of 27 out of 28 of the sites in the BCP and T-ALL samples deviated from the bone marrow samples at remission in the same direction, indicating that most of these CpG sites reflect "global" ALL-specific changes independent of subtype. The CpG site that was differentially methylated between ALL immunophenotypes is located in the MYBPC2 gene and was previously shown to distinguish between BCP and T-ALL [11].
Furthermore, eight of the genes (COL6A2, EYA4, MYO3A, RUNDC3B, RYR3, SEC14L4, ZNF462, and ZNF502) were highlighted in our previous study as potential markers for clinical outcome in two subtypes of ALL [11]. Thus, it appears that the aberrant methylation in these genes was acquired in the ALL cells, which renders them potentially interesting targets for studying the molecular events that lead to ALL. According to pathway analysis, the genes identified here are enriched for important cellular functions like cell-to-cell signaling and interaction or apoptosis (P<0.05). The majority of the genes that we identified in our study are hypermethylated in the ALL cells compared to controls, and for 9 out of the 20 genes for which published mRNA expression data from ALL cells was available [16,17], the methylation levels determined in our study show evidence for an inverse relationship with gene expression.
We conclude that our candidate gene approach, based on an initial genome-wide survey of ASE in ALL cells, was a viable way to zoom in on genes with methylation signatures that are characteristic of ALL cells and that have plausible functions in the development of ALL. Whether the aberrant methylation patterns in ALL cells were acquired stochastically or are an epigenetic mark characteristic of the leukemia-initiating cell [31] will be a key question to address using new tools for genome-wide methylation analysis in future studies.
Figure S1. Boxplots showing validation of the GoldenGate Assay by Sanger sequencing. Bisulfite-converted DNA from eight ALL samples was PCR amplified and sequenced at five randomly chosen CpG sites in five genes (ZNF502 chr3:44,729,363, TNIK chr3:172,661,831, LOXHD1 chr18:42,435,264, NOTCH3 chr19:15,172,990, and NKAIN4 chr20:61,357,043). The methylation status of the C nucleotide in the CpG site as detected by Sanger sequencing (horizontal axis) is plotted against the β-values measured by the GoldenGate Assay (vertical axis). The data are from Milani et al. [11]. (PDF)