Genome-Wide DNA Copy Number Analysis of Acute Lymphoblastic Leukemia Identifies New Genetic Markers Associated with Clinical Outcome

Identifying additional genetic alterations associated with poor prognosis in acute lymphoblastic leukemia (ALL) is still a challenge. Aims: To characterize the presence of additional DNA copy number alterations (CNAs) in children and adults with ALL by whole-genome oligonucleotide array (aCGH) analysis, and to identify their associations with clinical features and outcome. Array-CGH was carried out in 265 newly diagnosed ALLs (142 children and 123 adults). The NimbleGen CGH 12x135K array (Roche) was used to analyze genetic gains and losses. CNAs were analyzed with GISTIC and aCGHweb software. Clinical and biological variables were analyzed. Three of the patients showed chromothripsis (cth6, cth14q and cth15q). CNAs were associated with age, phenotype, genetic subtype and overall survival (OS). In the whole cohort of children, the losses on 14q32.33 (p = 0.019) and 15q13.2 (p = 0.04) were related to shorter OS. In the group of children without good- or poor-risk cytogenetics, the gain on 1p36.11 was a prognostic marker independently associated with shorter OS. In adults, the gains on 19q13.2 (p = 0.001) and Xp21.1 (p = 0.029), and the loss of 17p (p = 0.014) were independent markers of poor prognosis with respect to OS. In summary, CNAs are frequent in ALL and are associated with clinical parameters and survival. Genome-wide DNA copy number analysis allows the identification of genetic markers that predict clinical outcome, suggesting that detection of these genetic lesions will be useful in the management of patients newly diagnosed with ALL.


Introduction
Acute lymphoblastic leukemia (ALL) is a malignancy of lymphoid progenitor cells characterized by marked heterogeneity at the molecular and clinical levels [1-3]. The WHO classification divides ALL according to the presence of genetic abnormalities, and it is well known that many of them are strong independent predictive factors of outcome [4]. Most adult patients are considered to be high-risk, with a disease-free survival rate of <40% at five years. In contrast, the cure rates are over 85% in pediatric ALLs [3,5]. However, nearly one quarter of children with ALL display high-risk clinical features, and those with treatment failure or relapse have a worse outcome. Several clinical and genetic factors are routinely used to stratify patients into different risk groups, including white blood cell count, age, minimal residual disease and genetic features. However, resistant disease is not restricted to the high-risk group, and incomplete knowledge of the molecular events responsible for leukemogenesis may contribute to therapy failure [6]. Therefore, one of the challenges in the study of ALL is to identify hidden genomic lesions that contribute to the classification, prognosis and choice of risk-adapted treatment [6].
Genome-wide analysis has shown that most patients with ALL have acquired somatically cooperating oncogenic lesions in the leukemic bone marrow cells, which target critical cellular pathways that contribute to leukemogenesis. These include alterations of genes regulating lymphoid development, tumor suppressors, apoptosis regulators, and oncogenes [6][7][8][9].
Microarray platforms allow the construction of high-resolution maps of genome-wide copy number alterations (CNAs), which are a hallmark of cancer and are important for understanding the mechanisms of disease and for identifying clinical biomarkers, for example, of recurrence risk and response to therapy [10-13]. Oligonucleotide arrays use short nucleotide probes, generally ranging from 20 to 100 nucleotides, that are either spotted or synthesized on microarrays. This approach enables very high-resolution genome analysis [9,13]. Recent studies using microarray methods have shown that a better understanding of genetic changes in leukemic cells is fundamental for a more accurate classification of ALL subtypes [14]. Nevertheless, about 30% of pediatric and 50% of adult ALL patients still lack defined genetic hallmarks of biological and clinical significance [15,16]. Therefore, the aim of the present study was to characterize the presence of additional copy number changes through a whole-genome oligonucleotide aCGH analysis in children and adults with ALL, and to identify associations of these alterations with biological and clinical features, and with outcome. The present study demonstrates that genome-wide DNA copy number analysis allows the identification of genetic markers that predict clinical outcome in pediatric and adult patients with ALL.

DNA isolation and oligonucleotide array comparative genomic hybridization
aCGH was performed on 265 samples of ALL at diagnosis. The genomic DNA was extracted from frozen bone marrow or fixed peripheral blood cell samples with a QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA) following the manufacturer's instructions. All samples were tested on an aCGH 12x135K array platform (Roche NimbleGen, Madison, WI, USA). Gender-matched human DNA was used as a reference (Promega, Madison, WI, USA). After labeling, slide preparation, hybridization, scanning and image analysis were performed according to the instructions in the NimbleGen Array User Guide. Segmentation analysis was performed using the CGHweb tool [22]. Significant regions in common between cases were assessed by Genomic Identification of Significant Targets in Cancer (GISTIC) analysis [23]. The statistical significance of the aberrations was displayed as the FDR (false-discovery rate) q values obtained for each region. The method accounts for multiple-hypothesis testing using the FDR framework and assigns a q value to each result, reflecting the probability that the event is due to chance [23]. Values of q<0.05 were considered to represent significant amplification and deletion peaks in pediatric and adult patients. The Database of Genomic Variants from Toronto (DGV, http://dgv.tcag.ca/dgv/app/home) was used to exclude DNA variations located in regions with defined copy number variations. Thus, all copy number changes with more than 50% overlap with respect to those reported in DGV were excluded.
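The DGV exclusion step can be illustrated with a minimal sketch. It assumes that CNAs and known variants are given as (chromosome, start, end) intervals and that overlap is measured against the union of the known variants on the same chromosome. The text does not state whether overlap was computed per variant or against the union, so that choice, the function names and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def overlap_fraction(cna: Region, variants: List[Region]) -> float:
    """Fraction of the CNA length covered by the union of known variants."""
    chrom, start, end = cna
    # Clip every overlapping same-chromosome variant to the CNA, sorted by start.
    clipped = sorted(
        (max(start, v_start), min(end, v_end))
        for v_chrom, v_start, v_end in variants
        if v_chrom == chrom and v_start < end and v_end > start
    )
    covered = 0
    cursor = start
    for seg_start, seg_end in clipped:
        seg_start = max(seg_start, cursor)  # skip bases already counted
        if seg_end > seg_start:
            covered += seg_end - seg_start
            cursor = seg_end
    return covered / (end - start)


def filter_cnas(cnas: List[Region], variants: List[Region],
                max_overlap: float = 0.5) -> List[Region]:
    """Keep CNAs whose overlap with known variants does not exceed max_overlap."""
    return [c for c in cnas if overlap_fraction(c, variants) <= max_overlap]


# Toy data (hypothetical coordinates, not taken from DGV or from the study).
known_variants = [("chr1", 100, 200), ("chr1", 150, 400)]
candidate_cnas = [
    ("chr1", 0, 500),    # 300 of 500 bases covered, excluded
    ("chr1", 300, 700),  # 100 of 400 bases covered, kept
    ("chr2", 100, 200),  # no known variant on chr2, kept
]
kept = filter_cnas(candidate_cnas, known_variants)
```

Measuring against the union avoids counting bases twice when database entries overlap each other, which is common in variant catalogs.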

Statistical methods
Continuous variables were summarized as their median and range; categorical variables were described as the frequency and percentage of subjects in each category. Differences between groups were compared by the Chi-square or Fisher's exact test (for categorical variables), and Student's independent samples t and Mann-Whitney U tests (for continuous variables), as appropriate. The Chi-square test or Fisher's exact test was used to identify significant associations between CNAs and prognostic factors in ALL. Gains and losses were handled separately. All tests were two-sided and values of p<0.05 were considered to be statistically significant.
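As an illustration of the categorical comparisons described above, the following sketch computes a two-sided Fisher's exact p value for a 2x2 table (CNA present or absent versus a clinical feature present or absent). The counts are invented for the example and the function name is hypothetical; the authors ran their tests in SPSS, not with this code.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = CNA present / absent,
# columns = leukocytosis present / absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

The exact test is the usual choice when expected cell counts are small, which is often the case for infrequent CNAs within a subgroup.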
The Kaplan-Meier method (log-rank test) was used to assess the relationship of CNAs with overall survival (OS) and event-free survival (EFS), with p<0.05 as the significance cutoff. In accordance with the PETHEMA group, OS was defined as the time from diagnosis to death or last follow-up, and EFS was defined as the time from diagnosis to first event, including death during induction therapy, failure to achieve remission, relapse at any site, death during remission, or the development of a second malignant neoplasm. Observations on patients without events were censored at the date of the last consultation [24,25]. Clinical and biological variables associated with worse prognosis in ALL and the most recurrent CNAs were considered in univariate analyses of OS. Cytogenetic risk and risk groups were established according to cytogenetic studies and frontline risk-adapted protocols, respectively. A Cox proportional hazards regression model was used to estimate the hazard ratios (HRs) and 95% confidence intervals (CIs) of risk factors in multivariate analysis to determine which variables were independently associated with OS. Analyses were done using SPSS, version 21.0 (IBM).
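The product-limit (Kaplan-Meier) estimate underlying the OS and EFS curves can be sketched as follows. This is a minimal illustration with invented follow-up times, not the SPSS procedure used by the authors; the function names are hypothetical, and the log-rank comparison and Cox model are not reproduced here.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up from diagnosis (any unit, e.g. months)
    events : 1 if the event was observed (death for OS), 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after t.
        at_risk -= leaving
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Step-function lookup: the last estimate at or before time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Invented follow-up in months; 1 = death, 0 = censored at last consultation.
months = [6, 12, 12, 30, 48, 60, 72]
died = [1, 1, 0, 0, 1, 0, 0]
os_curve = kaplan_meier(months, died)
five_year_os = survival_at(os_curve, 60)
```

Censored patients contribute to the risk set up to their last consultation and then drop out without lowering the curve, which is how the definition of censoring given above enters the estimate.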
The materials, procedures and statistical analyses are described in greater detail in the Supplementary Patients and Methods (S1 File).

Patient characteristics
Table 1 shows the characteristics of the patients with ALL included in this study. The median age was 16 years (range 0-84 years): 5 years (range 0-17 years) for the childhood patients (<18 years) [...]

Table 1 footnotes:
4 Includes patients with t(12;21)/ETV6-RUNX1 translocation (E/T) and hyperdiploid karyotype (HD). It should be noted that no adult had the E/T translocation.
5 Includes patients with t(9;22), t(v;11q23) and hypodiploidy.
6 Risk-group stratification was established according to PETHEMA protocols based on age, WBC and cytogenetic subgroup.
7 By flow cytometry.
8 Time of relapse criteria: very early: earlier than 18 months after initial diagnosis and less than 6 months after cessation of frontline treatment; early: more than 18 months after initial diagnosis, but less than 6 months after cessation of frontline treatment; late: more than 6 months after cessation of frontline treatment.
9 Probabilities are for comparisons between children and adult patients.

The patterns and frequencies of CNAs reproduced the genomic hallmarks typical of a cohort of children and adults with ALL (Fig 1A and 1B). Table B in S1 File shows pairwise comparisons of CNAs according to immunophenotypic, age and cytogenetic subgroups of ALL patients. These CNAs harbored key target genes involved in leukemogenesis, such as EBF1 [...]
Regions of significant recurrent amplification and deletion are shown in Fig 1C. In the whole cohort of children with ALL, there were 66 and 74 significant regions of gain and loss, respectively (Table C in S1 File), whereas in the whole cohort of adults with ALL, there were 81 and 79 regions of gain and loss, respectively (Table D in S1 File). S1 Fig shows the log2-ratio copy number heatmap and regions of significant recurrent amplification and deletion in childhood and adult B-ALL and T-ALL (q<0.05). In the childhood B-ALL cohort, there were 41 and 49 significant regions of gain and loss, respectively (Table E in S1 File), whereas in the adult B-ALL cohort, there were 55 and 44 regions of gain and loss, respectively (Table F in S1 File). Meanwhile, in the childhood T-ALL cohort, there were 11 significant regions each of gain and loss (Table G in S1 File), whereas in the adult T-ALL cohort, there were 11 and 5 regions of gain and loss, respectively (Table H in S1 File).
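For readers unfamiliar with the log2-ratio scale used in copy number heatmaps, the short sketch below shows the expected value for a segment given its copy number. It assumes a diploid reference and, optionally, a sample in which only a fraction of cells carry the alteration; the function and parameter names are hypothetical and are not part of the authors' pipeline.

```python
from math import log2


def expected_log2_ratio(tumor_copies: int, reference_copies: int = 2,
                        altered_fraction: float = 1.0) -> float:
    """Expected log2(test/reference) signal for one segment.

    altered_fraction is the share of cells carrying tumor_copies; the
    remaining cells are assumed to carry reference_copies.
    """
    mean_copies = (altered_fraction * tumor_copies
                   + (1 - altered_fraction) * reference_copies)
    if mean_copies == 0:
        # Homozygous deletion in a pure sample: the ratio is zero.
        return float("-inf")
    return log2(mean_copies / reference_copies)


single_copy_gain = expected_log2_ratio(3)   # log2(3/2), about 0.585
single_copy_loss = expected_log2_ratio(1)   # log2(1/2) = -1.0
diluted_loss = expected_log2_ratio(1, altered_fraction=0.5)  # log2(1.5/2)
```

The diluted case shows why losses in samples with a low blast percentage appear with log2 ratios closer to zero than the theoretical value of -1.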

Copy number alterations influence the survival of patients with ALL
Tables I and J in S1 File summarize the CNAs associated with shorter OS in each group of patients. In the whole cohort of children with ALL (n = 142), lesions associated with shorter OS were losses on 14q32.33 (p = 0.019) or 15q13.2 (p = 0.04). In particular, in the group of children without chromosomal abnormalities associated with good risk (t(12;21) translocation and hyperdiploidy) or poor risk (t(9;22), t(v;11q23) and hypodiploidy) (n = 82), the copy number changes associated with shorter OS were the gain on 1p36.11 (p = 0.036) and the losses on 6p25.3 (p = 0.032), 15q13.2 (p = 0.008), 16p13.11 (p = 0.021) or 17p13.1 (p = 0.027) (Table I in S1 File and Fig 2). Although the losses on 1p12 (p = 0.027) and 7p12.2 (p = 0.031) were associated with shorter EFS, there were no associations between CNAs and OS in children with B-ALL (Table K in S1 File). By contrast, the gain on Xq28 was associated with shorter OS in children with T-ALL (p = 0.008). There were no associations between CNAs and EFS in children with T-ALL (Table L in S1 File).
In addition, the present study confirmed in the whole cohort of children with ALL the well-known associations of particular clinical and biological variables with worse prognosis, such as the high-risk group (5-year OS, 66% vs. 92%, p<0.0001), poor-prognosis frontline therapy due to refractoriness or relapse events (5-year OS, 46% vs. 98%, p<0.0001) and age over 10 years (5-year OS, 65% vs. 93%, p<0.0001). Multivariate analysis of the group of children without good- or poor-risk cytogenetics showed that the gain on 1p36.11 continued to be a statistically significant [...] (Table 3). [...] (Table M in S1 File). By contrast, no significant associations between CNAs and OS or EFS were observed in adult T-ALL patients.
It should be noted that some CNAs were associated with both high-risk clinical characteristics and OS. Thus, in the whole cohort of children with ALL, the loss on 14q32.33 was associated with leukocytosis (p = 0.005) and OS (p = 0.019), although this alteration was not an independent risk factor in the multivariate analysis. Meanwhile, in the whole cohort of adults with ALL, more CNAs were associated with both high-risk clinical characteristics and OS. In particular, the gain on 19q13.2 was correlated with poor-risk cytogenetics (p = 0.036) and resistant ALL due to refractory and/or relapsed disease (p = 0.044). This alteration was also associated with shorter OS (p = 0.011) in the whole cohort of adults with ALL and was an independent prognostic factor in multivariate analysis (HR, 2.8; 95% CI: 1.5-5.3; p = 0.001). The loss on 3q26.32 was associated with older age (55 years) (p = 0.019) and reduced OS (p = 0.028) in adults with poor-risk cytogenetics, and was also of independent prognostic value (HR, 3.2; 95% CI: 1.2-8.3; p = 0.018) in these patients. Other alterations, such as the gain on 6p21.1 (p = 0.008) and the loss on 13q14.2 (p = 0.003), were associated with older age (55 years) and with OS (p = 0.013 and p = 0.002, respectively), but were not of independent prognostic value in adults with ALL.
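The reported hazard ratios, confidence intervals and p values can be checked for internal consistency with a short calculation. The sketch assumes that the intervals are Wald intervals, exp(beta ± 1.96·SE), as is usual for Cox regression output, and works from the rounded values printed above, so the recovered p values are approximate. The function name is hypothetical.

```python
from math import erfc, log, sqrt


def wald_from_hr(hr: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964):
    """Recover beta, its standard error, the Wald z and a two-sided p value
    from a hazard ratio and its 95% confidence interval."""
    beta = log(hr)
    # The interval is symmetric on the log scale: width = 2 * z_crit * SE.
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return beta, se, z, p


# Values as reported in the text (rounded to one decimal place).
_, _, z_19q, p_19q = wald_from_hr(2.8, 1.5, 5.3)   # gain on 19q13.2
_, _, z_3q, p_3q = wald_from_hr(3.2, 1.2, 8.3)     # loss on 3q26.32
```

Under these assumptions the recovered p values are close to 0.001 and 0.018, in line with the figures reported for the two alterations.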

Chromothripsis in ALL patients
Chromothripsis (cth) was observed in three of the 265 ALL patients. All three patients were adults (a 57-year-old woman, and two men, aged 25 and 54 years). Cth involved chromosomes 6 (B-ALL), 14q (T-ALL), and 15q (B-ALL), respectively (S2 Fig). There was an absolute, though not statistically significant, difference in the number of CNAs between the patients with and without cth (median of 26 CNAs per case vs. median of 7 CNAs per case, p = 0.126). It is of note that all patients had features associated with poor prognosis. The female B-ALL patient with cth(6) had a complex karyotype with isochromosome 9q, which involved losses in CDKN2A, JAK2 and PAX5 loci, and also had focal deletions of BTG1 and RB1 loci. She was treated with a high-risk protocol and received stem cell transplantation during first remission, remaining in complete remission (CR) until her death. The male T-ALL patient with cth(14q) had focal IKZF1 deletion and was treated with a high-risk protocol, attaining complete remission after induction. He did not experience any event and was alive in CR at last follow-up. Finally, the Down syndrome B-ALL patient had cth(15q) and died during induction. This patient had an altered karyotype and focal deletions in the IKZF1 and BTG1 genes.
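The text does not state the criteria by which chromothripsis was called from the aCGH profiles. Purely as an illustration, the sketch below flags a chromosome when its segmented copy-number states switch many times between a small number of states, a pattern commonly used as a screening heuristic. The thresholds (at least 10 switches, at most 3 distinct states), the function names and the toy profiles are assumptions, not the authors' method.

```python
from typing import List


def count_state_switches(segment_states: List[int]) -> int:
    """Number of copy-number state changes between consecutive segments."""
    return sum(1 for a, b in zip(segment_states, segment_states[1:]) if a != b)


def looks_like_chromothripsis(segment_states: List[int],
                              min_switches: int = 10,
                              max_states: int = 3) -> bool:
    """Heuristic screen: many oscillations restricted to few copy-number states."""
    distinct_states = len(set(segment_states))
    return (count_state_switches(segment_states) >= min_switches
            and distinct_states <= max_states)


# Toy segmented profiles along one chromosome (copy number per segment).
oscillating = [2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]    # 10 switches, 2 states
simple_deletion = [2, 2, 1, 1, 2]                   # 2 switches
stepwise = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]     # 11 switches, 4 states

flags = [looks_like_chromothripsis(p)
         for p in (oscillating, simple_deletion, stepwise)]
```

A screen of this kind only nominates candidates; confirming chromothripsis normally requires inspection of the profile and, ideally, breakpoint-level data.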

Discussion
The risk stratification of ALL is mainly based on genetic analyses. However, the presence of cryptic aberrations may contribute to therapy failures. Therefore, one of the current challenges in the study of ALL is to identify hidden genomic lesions that may be related to patient outcome [3,6]. High-resolution aCGH has been used in hematological malignancies to identify submicroscopic copy number gains and losses across the genome [26]. In this study, new genetic imbalances were observed in 91.7% of ALL patients and were related to the prognosis of childhood and adult ALL. Most of the patients with normal cytogenetics had CNAs by aCGH, indicating that, in most cases, ALL originates from cooperating genetic lesions, and that consequently the chromosomal classification of ALL is not representative of the whole leukemic clone [5-7,10,15,27-31]. Microarray analysis also revealed specific patterns of genomic lesions related to the immunophenotypic, cytogenetic and age subgroups, suggesting that the clinical heterogeneity observed in ALL could be explained, at least partially, by the presence of secondary chromosomal abnormalities [7,15,32-35]. In agreement with previous studies using array-CGH or SNP array methods [7,10,15,27,28,30-33,36-42], many small (<5 Mb) genetic lesions detected in children and adults with ALL harbored biologically and clinically relevant ALL-related genes, such as lymphoid transcription factors (PAX5, IKZF1 and EBF1), transcriptional regulators and coactivators (ETV6 and ERG), tumor suppressors (CDKN2A/B, RB1 and TP53), as well as putative regulators of apoptosis (BTG1) (Tables C-H in S1 File). This confirms that several pathways are deregulated in ALL [34,35].
In this study we have identified visible and cryptic imbalances associated with poor-risk clinical features and survival. In children, the gain on 1q32.1 was associated with leukocytosis and poor-risk cytogenetics. Duplications in 1q have been reported in B-cell precursor ALL and Burkitt lymphoma [43]. This abnormality has been associated with B-cell precursor ALL and seems to promote clonal evolution during the progression of hematological disorders [44]. Previous studies have indicated that, although the effect of the dup(1)(q32q21) on prognosis in ALL has not yet been defined, the early relapse seen in some cases might indicate that this alteration carries a dismal prognosis. It is important to note that the long arm of chromosome 1 is associated with high chromosomal instability in hematological neoplasias, probably because of specific chromatin properties of this gene-rich region. Furthermore, gene dosage effects might play a role in the specific effects of gains of 1q [44].
In our study, cryptic deletions of 14q32 were associated with leukocytosis and shorter OS in the whole cohort of children. Previous studies have reported that miRNA clusters are deleted in some ALL cases bearing cryptic deletions at 14q32 [45]. The downregulation of miRNA clusters may influence the expression level of target genes that modify critical cellular pathways, such as the B-cell lymphoid differentiation pathway (e.g., the BCL11A gene, a transcription factor involved in lymphoid differentiation, upstream of the transcription factors EBF1 and PAX5). Thus, the loss of heterozygosity on the 14q32/miRNA cluster may be another mechanism involved in lymphoid B-cell transformation and differentiation, and so could be used as a diagnostic marker and therapeutic target in subsets of ALL [45].
Losses on 15q13 were associated with significantly shorter OS in our whole cohort of children with ALL. This region harbors the leukemia-related gene TJP1 (ZO-1), which encodes a protein involved in signal transduction at cell-cell junctions. Although TJP1 (ZO-1) deletions are infrequent in leukemia, previous studies have detected hypermethylation of the TJP1 (ZO-1) gene promoter region in patients with newly diagnosed acute leukemia and with relapsed disease [46]. This status is closely correlated with the pathogenesis and progression of the disease, so the TJP1 (ZO-1) gene has been proposed as a clinical molecular marker of leukemia [46].
In adult ALL, we observed that the presence of deletions on 13q14 was associated with older age and shorter OS in the whole cohort of adults and in those adults without poor-risk cytogenetics [28]. Band 13q14 contains the RB1 gene. This tumor suppressor is rarely reported to be deleted in T-ALL. In contrast, deletion of RB1 has been detected in 30% of B-ALL and nearly 60% of B-CLL cases. Thus, the RB1 pathway is a potential target for therapy of ALL [47]. Furthermore, in our whole cohort of adults, the losses on 17p and gains on 19q13.2 and Xp21.1 were identified as risk factors independently associated with significantly shorter OS, the 17p deletion being the strongest predictor of poor survival in adults with ALL. The tumor suppressor gene TP53 is affected by 17p deletions and plays a crucial role in cell cycle regulation and apoptosis after DNA damage, and its role in tumorigenesis is well recognized in solid and hematological malignancies [48]. TP53 abnormalities have been associated with resistance to treatment and worse prognosis in several tumors. In ALL, TP53 gene abnormalities are important at relapse in both childhood and adult disease, in which they independently predict a high risk of treatment failure in a substantial number of patients [49]. The presence of TP53 alterations has been associated with a reduced response rate to induction therapy and correlated with shortened survival duration, even after successful reinduction therapy [50].
We also observed that the deletions in the 7p12.2 region were associated with shorter EFS in children with B-ALL and shorter OS in adults not included in the poor-risk cytogenetics group. This region harbors the IKZF1 gene, which encodes a zinc finger transcription factor that is required for the earliest stages of lymphoid lineage commitment, and is expressed in stem cells and multipotent progenitors; the loss of IKZF1 leads to an arrest at an even earlier stage of lymphoid development [51]. Previous studies in pediatric patients with ALL have shown that the genetic alteration of IKZF1 is associated with a very poor outcome in children with B-cell-progenitor ALL [51]. Likewise, studies in Philadelphia-negative adults have shown that IKZF1 deletions are associated with a worse 5-year OS in univariate analysis but are not an independent risk factor in multivariate analysis [52,53]. IKZF1 deletions are secondary events, so their effect on outcome is likely to depend not only on the therapy delivered, but also on the nature of the primary chromosomal abnormality. As in other studies, it was difficult to assess whether the 7p12.2 deletions contribute to the poor outcomes seen in adults with poor-risk cytogenetics, because the incidence of IKZF1 deletions is linked to the primary chromosomal abnormality (Ph+), which probably has a greater impact in this group of patients [52]. Larger studies are needed to determine whether deletion of the IKZF1 gene itself causes the poor outcome seen in these patients or if the effect is driven solely by the primary genetic event [52].
We established that the losses on 3q26.32 were associated with inferior OS in adults with ALL. Particularly, in the group of adults with poor cytogenetics risk, this loss was selected as a risk factor independently associated with shorter OS. The 3q26.32 region harbors the TBL1XR1 gene, which codes for an F-box-like protein responsible for regulating the nuclear hormone repressor complex stability. TBL1XR1 is focally deleted in pediatric ALL [32,33]. However, in our study this loss also significantly affected the adult group. Glucocorticoids (GCs) exert anti-leukemic effects through the induction of apoptosis and/or cell cycle arrest, and are therefore a central component in the treatment of lymphoid malignancies, particularly childhood ALL [54]. Previous studies have indicated that loss of TBL1XR1 is a driver of glucocorticoid resistance in ALL and that epigenetic therapy may have applications for restoring drug sensitivity at relapse [55].
In addition, high-resolution oligonucleotide aCGH revealed chromothripsis in three of the 265 ALL patients, which suggests that the overall incidence of this abnormality may be around 1% of ALL cases. This submicroscopic genetic abnormality is not detectable with standard cytogenetics or FISH. Chromothripsis is caused by a process whereby localized genomic regions are shattered and rearranged in a one-off catastrophic event in which the presence of juxtaposed gains and losses leads to a highly abnormal genetic architecture [56-66]. Chromothripsis has been reported in a few specific subgroups of ALL, such as early T-cell precursor ALL (ETP ALL) [67], and sporadic and rob(15;21)c-associated (constitutional Robertsonian translocation between chromosomes 15 and 21) iAMP21 B-ALL [68,69]. However, no cases with a high-hyperdiploid karyotype showed chromothripsis [70]. Several studies suggest that chromothripsis is associated with an aggressive malignant phenotype and rapid disease progression in the context of the respective tumor type [56]. In this study we reported some cases of this genetic chaos affecting different chromosomes, in both B-ALL and T-ALL. In these cases, chromothripsis was associated with multiple structural rearrangements, hidden microdeletions involving the IKZF1, BTG1 and RB1 genes, and broad deletions in CDKN2A and PAX5 due to the loss of chromosome 9p. However, there was no association between the presence of chromothripsis and an aggressive malignant phenotype and/or rapid disease progression. In fact, two patients were long-term survivors.
In summary, the present study demonstrates that genome-wide DNA copy number analysis allows the identification of genetic markers that predict clinical outcome, suggesting that detection of these genetic lesions will be useful for managing patients newly diagnosed with ALL. The comprehensive analysis of these lesions during diagnosis and treatment will provide new and relevant information about how these lesions may influence the course of the disease and its response to treatment.

Supporting Information
S1 File. Supplementary Patients and Methods. Patient characteristics, clinical status, cytogenetics, and aCGH analysis of the studied patients (Table A). Pairwise comparisons of CNAs according to immunophenotypic, age and cytogenetic subgroups of ALL patients (Table B). Regions of significant recurrent amplification and deletion in the whole cohort of children with ALL (n = 142) (q<0.05) (Table C). Regions of significant recurrent amplification and deletion in the whole cohort of adults with ALL (n = 123) (q<0.05) (Table D). Regions of significant recurrent amplification and deletion in the whole cohort of children with B-ALL (n = 115) (q<0.05) (Table E). Regions of significant recurrent amplification and deletion in the whole cohort of adults with B-ALL (n = 100) (q<0.05) (Table F). Regions of significant recurrent amplification and deletion in the whole cohort of children with T-ALL (n = 27) (q<0.05) (Table G). Regions of significant recurrent amplification and deletion in the whole cohort of adults with T-ALL (n = 23) (q<0.05) (Table H). CNAs associated with shorter OS in the groups of child patients with ALL (Table I). CNAs associated with shorter OS in the groups of adult patients with ALL (Table J). CNAs associated with shorter EFS in children with B-ALL (Table K). CNAs associated with shorter OS in children with T-ALL (Table L). CNAs associated with shorter OS and EFS in adults with B-ALL (Table M).