Genomic mutation profile in progressive chronic lymphocytic leukemia patients prior to first-line chemoimmunotherapy with FCR and rituximab maintenance (REM)

Chronic Lymphocytic Leukemia (CLL) is the most prevalent leukemia in Western countries and is notable for its variable clinical course. This variability is partly reflected by the mutational status of IGHV genes. Many CLL samples have been studied in recent years by next-generation sequencing. These studies have identified recurrent somatic mutations in NOTCH1, SF3B1, ATM, TP53, BIRC3 and other genes that play roles in cell cycle, DNA repair, RNA metabolism and splicing. In this study, we have taken a deep-targeted massive sequencing approach to analyze the impact of mutations in the most frequently mutated genes in patients with CLL enrolled in the REM (rituximab en mantenimiento) clinical trial. The mutational status of our patients with CLL, except for the TP53 gene, does not seem to affect the good results obtained with maintenance therapy with rituximab after front-line FCR treatment.

Introduction
Chronic Lymphocytic Leukemia (CLL) is the most prevalent leukemia in Western countries and is notable for its variable clinical course. This variability is partly reflected by the mutational status of IGHV genes, which defines two subgroups characterized by different clinical outcomes. IGHV-mutated status is associated with long-lasting stable disease and better prognosis, while the IGHV-unmutated genotype (U-IGHV) is associated with a more active and proliferative disease [1][2][3]. Many CLL samples have been studied in recent years by next-generation sequencing (NGS). These studies have identified recurrent somatic mutations in genes such as NOTCH1, SF3B1, ATM, TP53, BIRC3, and others, which play roles in cell cycle, DNA repair, RNA metabolism and splicing, inflammation, and the NOTCH and WNT signaling pathways [4][5][6]. Some of them have been found to have prognostic and/or predictive significance. Mutations in TP53, ATM, SF3B1 and NOTCH1 are associated with a significantly shorter time to first treatment and/or overall survival (OS) [4,7]. In general, patients with a more aggressive disease have higher mutation rates, and patients with shorter progression-free survival (PFS) harbor more mutations per megabase [7]. The standard first-line therapy for young, physically fit patients with CLL is chemoimmunotherapy (CIT) with the combination of fludarabine, cyclophosphamide and rituximab (FCR). Long-term results from three studies [8][9][10] have demonstrated a long-duration PFS and OS of nearly 12 years in the subset of patients with mutated IGHV and an absence of adverse genetic features (11q deletion [del(11q)] or 17p deletion [del(17p)]/TP53 mutation) after treatment with front-line FCR.
However, recently introduced targeted oral agents, including BTK and BCL2 inhibitors (ibrutinib, acalabrutinib and venetoclax), alone or in combination with monoclonal antibodies (rituximab or obinutuzumab), have demonstrated considerable efficacy in the front-line treatment of patients with CLL with U-IGHV and high-risk cytogenetic biomarkers (del(11q) and del(17p)/TP53 mutation) [11][12][13]. The prognostic impact of new recurrent mutations in patients with CLL suitable for front-line chemoimmunotherapy nevertheless remains unknown. Indeed, undetectable measurable residual disease (MRD) at the end of treatment is currently the most powerful predictor of clinical outcome related to favorable PFS and prolonged OS in CLL [8,14].
In this study, we have taken a deep-targeted massive sequencing approach to analyze the impact of mutations in the most frequently mutated genes in a prospectively selected group of patients with CLL with active progressive disease who require treatment. All patients were enrolled in the REM (Rituximab En Mantenimiento [Rituximab in Maintenance]) clinical trial, which consisted of rituximab maintenance for 36 months after achieving at least a partial clinical response to front-line FCR treatment [15].

Patients and samples
Seventy-one peripheral blood samples from treatment-naïve patients with CLL with progressive active disease were included in the present study. The patients were enrolled in the REM clinical trial. REM is a multicenter, non-randomized, prospective phase II clinical trial evaluating the overall response and PFS in patients with CLL with active progressive disease after first-line treatment with FCR, followed by rituximab maintenance every two months for three years in responding patients [15]. Samples were collected at the time of enrollment before treatment. Patient characteristics are summarized in Table 1.
The research project was approved by the Ethics Committee of Hospital Universitario Puerta de Hierro-Majadahonda and conducted following the Declaration of Helsinki. All patients gave their written informed consent for blood collection and the processing of biological analyses included in the present study. The REM study was registered as a clinical trial with the identifiers NCT00545714 and EudraCT 2007-002733-36. Peripheral blood mononuclear cells (PBMCs) were isolated from the samples using Ficoll (Rafer, Zaragoza, Spain). Tumor-cell purity, calculated from the CD19/CD5 ratio measured by FACS, ranged from 75% to 98%. DNA was extracted with DNAzol Genomic DNA Isolation Reagent (Molecular Research Center, Cincinnati, OH, USA) following the manufacturer's instructions. The quality and quantity of purified DNA were assessed by fluorimetry (Qubit, Invitrogen, Waltham, MA, USA) and gel electrophoresis.
Amplifications of the IGHV-diversity (D)-joining (J) segment were performed on genomic DNA using standard procedures and analyzed by Sanger sequencing according to ERIC recommendations [16]. IGHV sequences were considered mutated or unmutated using the conventional cut-off of 98% identity with the closest germline IGHV gene.
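The 98% identity cut-off described above can be expressed as a short rule. The following is a minimal sketch, not the code used in the study: the function name `classify_ighv` is hypothetical, and treating a sequence with exactly 98% identity as unmutated is an assumption based on the conventional reading of the cut-off.

```python
def classify_ighv(identity_percent: float, cutoff: float = 98.0) -> str:
    """Classify an IGHV sequence by percent identity to the closest germline gene.

    Assumption: identity >= cutoff is called unmutated (U-IGHV), and anything
    below the cutoff is called mutated.
    """
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity_percent must be between 0 and 100")
    return "unmutated" if identity_percent >= cutoff else "mutated"


# Illustrative values only; these are not patient data.
for identity in (100.0, 98.0, 96.5):
    print(identity, classify_ighv(identity))
```

Keeping the cut-off as a parameter makes it easy to see how borderline cases would be reclassified under a different threshold.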

Flow cytometry and MRD analysis
Samples were processed as previously described [15]. In summary, sequential bone marrow (BM) and peripheral blood (PB) samples were collected in tubes containing K3 EDTA as anticoagulant. BM samples were immediately diluted 1/1 (vol/vol) in phosphate-buffered saline (PBS). Whole BM and PB samples (approximately 2 × 10^6 cells in 100 μL per test) were stained and lysed using a direct immunofluorescence technique. The antibody combinations tested were CD22/CD23/CD19/CD5, CD81/CD22/CD19/CD5, CD20/CD38/CD19/CD5, CD20/CD79b/CD19/CD5 and sIgKappa/sIgLambda/CD19/CD5. Cells were acquired in two consecutive steps in order to increase the sensitivity of the analysis. First, 20,000 events corresponding to all nucleated cells were acquired. In the second step, acquisition was done through a "live gate" drawn on the SSC/CD19+ region in which B-lymphocytes are located. When no CLL cells were detected, 200,000 events were acquired; with a minimum of 20 events required, this gave a limit of detection of 0.01%. To ensure a lower limit of quantification of 0.01%, a minimum of 50 events was required. For ZAP70, CD38 and CD49d measurements, see García-Marco et al. 2019 [15].
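The relationship between the minimum number of CLL events and the total number of acquired events can be checked with simple arithmetic. The sketch below is illustrative only; the function names are hypothetical. The 200,000-event figure for the limit of detection is stated in the text, whereas the 500,000-event figure for the limit of quantification is derived here from the 50-event requirement and is not a number reported by the authors.

```python
import math
from fractions import Fraction


def total_events_required(min_cll_events: int, sensitivity: Fraction) -> int:
    """Total events to acquire so that min_cll_events equals the target sensitivity."""
    return math.ceil(min_cll_events / sensitivity)


def mrd_percent(cll_events: int, total_events: int) -> float:
    """MRD level as a percentage of all acquired events."""
    return 100.0 * cll_events / total_events


SENSITIVITY = Fraction(1, 10_000)  # 0.01%, i.e. one CLL cell in 10,000

print(total_events_required(20, SENSITIVITY))  # events needed for the limit of detection
print(total_events_required(50, SENSITIVITY))  # events needed for the limit of quantification
print(mrd_percent(20, 200_000))                # MRD level for 20 events in 200,000
```

Exact fractions are used instead of floating-point division so that the required event counts are not affected by rounding.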

Targeted massive sequencing
To select genes to be analyzed, we browsed the COSMIC and ICGC databases and reviewed previously published data on CLL (library designed in 2013) [17][18][19][20][21][22][23][24][25][26]. Based on their recurrence and prognostic/predictive capacity described in the literature and in our results, the following recurrently mutated genes were selected:  [27]. A library pool was constructed combining all the indexed libraries. Paired-end sequencing was performed on a MiSeq instrument. Of the analyzable target regions, 100% and 66.67% were covered by at least 5,000 and 10,000 reads, respectively (QC data in S1 Table in S1 File).

Data analysis and variant calling
Data were analyzed using two pipelines: i) MiSeq Reporter, in which alignment was performed using the Burrows-Wheeler Aligner (BWA) and variants were identified and annotated with the Genome Analysis Toolkit (GATK); and ii) SureCall 2.1.13 (Agilent Technologies, Santa Clara, CA, USA) software. The variant lists obtained were filtered in Excel and visualized with the Integrative Genomics Viewer (IGV) tool [28]. To identify putative somatic mutations, we filtered out variants not reaching 100x coverage and those of bad quality (based on the QC Score obtained from MiSeq Reporter). The minimum variant allele frequency (VAF), that is, the percentage of reads supporting the mutation out of the total number of reads at a given position, was set at 5% in the tumor DNA, with a minimum depth of around 200. Only those variants with an allele frequency greater than 20% were considered for validation. Biological impact predictions for detected variants were obtained from the Ensembl Variant Effect Predictor (VEP: http://www.ensembl.org/tools.html), together with SIFT and PolyPhen predictions of the effect of the mutations on protein function.
Matched non-tumoral samples were not available for most patients. The GATK annotates the SNPs available in dbSNP 132 (hg19) and the 1000 Genomes Project. Variants present in germline DNA or identified as SNPs were excluded from the candidate list. SNVs with a variant allele frequency ≥ 5% that were either not listed as a single nucleotide polymorphism, or listed but with a MAF < 0.01% (The Exome Aggregation Consortium, 1000 Genomes Project of the International Genome Sample Resource (IGSR), Single Nucleotide Polymorphism Database (dbSNP) v132 of the National Center for Biotechnology Information (NCBI)), were considered. Missense, frameshift, and nonsense mutations were selected.
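The filtering criteria described in this section can be summarized as a single predicate. The following is a minimal sketch under stated assumptions: the `Variant` record and its field names are hypothetical and do not correspond to the MiSeq Reporter, SureCall or GATK output formats, the thresholds are exposed as parameters because the text mentions both 100x and a depth of around 200, and the example values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Consequence classes retained in the text: missense, frameshift and nonsense.
SELECTED_CONSEQUENCES = {"missense", "frameshift", "nonsense"}


@dataclass
class Variant:
    """Hypothetical variant record; field names are assumptions for illustration."""
    gene: str
    depth: int                              # total reads at the position
    vaf: float                              # variant allele frequency as a fraction
    consequence: str
    population_maf: Optional[float] = None  # None if not listed as a polymorphism
    in_germline: bool = False


def passes_filters(v: Variant, min_depth: int = 100,
                   min_vaf: float = 0.05, max_maf: float = 0.0001) -> bool:
    """Return True if the variant is kept as a putative somatic mutation."""
    if v.depth < min_depth or v.vaf < min_vaf:
        return False
    if v.in_germline:
        return False
    # Listed polymorphisms are kept only when rare (MAF < 0.01%, i.e. 0.0001).
    if v.population_maf is not None and v.population_maf >= max_maf:
        return False
    return v.consequence in SELECTED_CONSEQUENCES


# Invented example records; not data from the study.
examples = [
    Variant("NOTCH1", depth=1500, vaf=0.32, consequence="frameshift"),
    Variant("SF3B1", depth=900, vaf=0.03, consequence="missense"),
    Variant("ATM", depth=1200, vaf=0.48, consequence="missense", population_maf=0.02),
]
kept = [v.gene for v in examples if passes_filters(v)]
print(kept)
```

Writing the criteria as one function makes each threshold explicit and lets the same rule be applied to the output of both pipelines.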

Validation of mutations by Sanger sequencing
A group of selected variants was chosen for validation by Sanger sequencing. Mutations to be validated were chosen according to the following criteria: VAF greater than 20%, not previously described as recurrent, and with sufficient available DNA. Some known variants were also validated. The validation primers (available upon request) were designed with the Primer3 web tool, and sequencing was performed with the BigDye Terminator v3.1 Cycle Sequencing kit and an ABI3730 DNA Analyzer (Life Technologies, Carlsbad, CA, USA).

Statistical analysis
The significance of bivariate relationships between factors was assessed using Pearson's chi-squared or Fisher's exact test; values of P < 0.05 were considered significant. Endpoints were PFS, OS and MRD status. OS was calculated from the date of sampling to the date of death or last follow-up, whichever came first. Time to progression was calculated from the date of first treatment to the date of clinical progression or death due to progression. Logistic regression was used to evaluate the association of genetic alterations with MRD. Univariate and multivariate Cox proportional hazard (PH) regression models were used to test the associations of mutations with outcomes. A manual backward selection strategy was used to obtain the final model, with variables being eliminated at a significance level of P > 0.05. The PH assumption was tested using Schoenfeld residuals [29]. Hazard ratios, with 95% confidence intervals, were estimated for each parameter. All calculations were performed using IBM SPSS Statistics 19 and STATA v14.1.
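For readers unfamiliar with Fisher's exact test, the sketch below shows how a two-sided P value is obtained for a 2x2 table, such as mutation status versus a cytogenetic feature. It is a didactic pure-Python illustration and not the SPSS or STATA procedure used in the study; the function name is hypothetical and the counts are invented rather than taken from the cohort.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # A small relative tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Invented counts: rows = mutated / wild-type, columns = feature present / absent.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```

Fisher's exact test is generally preferred over the chi-squared test when expected cell counts are small, which is common for genes mutated in only a few patients.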

Gene mutations and correlation with patients' cytogenetic and phenotypic features
Seventy-one peripheral blood samples from treatment-naïve patients with CLL enrolled in the REM clinical trial [15] with symptomatic, progressive disease were included in our analysis (Table 1). Samples were collected at the time of enrollment in the REM clinical trial (up to 28 days before the first cycle of FCR).
We analyzed the impact of mutations in 26 genes by targeted deep-sequencing as described in the "Methods" section. After sequencing, the median read depth within the regions of interest was 1485 reads/base. A total of 100 mutations were identified in 49/71 (69.0%) patients. Eighteen (25.3%) patients harbored one mutation, whereas 31 (43.6%) had multiple mutations. We did not detect any mutations in 22 patients (31%) (Fig 1 and S3 Table in S1 File).
Statistical analysis of the presence of gene mutations with CLL phenotypic or cytogenetic characteristics revealed some significant associations (S4 Table in S1 File). The presence of ≥ 2 gene mutations was associated with aggressive CLL features such as U-IGHV (P = 0.017), ZAP70 (P = 0.001) and CD49d expression (P = 0.005). As expected, patients with mutations in TP53 had a concurrent del(17p) (P < 0.001). Five of the six TP53-mutated samples were found in elderly patients (> 65 years; P = 0.001). Eighty-three percent of NOTCH1-mutated samples (10/12) showed concurrent mutations in other genes, and NOTCH1 mutations were associated with the expression of ZAP70 (P = 0.021) and CD49d (P = 0.001) and were more frequently found in U-IGHV cases (P = 0.006). It has been hypothesized that mutations in NOTCH1 regulate CD49d expression through involvement of the NFkB pathway, favoring drug resistance [30]. All XPO1-mutated samples were found in the U-IGHV group (P = 0.044). Finally, mutations in EGR2 showed a non-significant trend toward association with del(11q) (P = 0.065), and all EGR2-mutated samples expressed CD38 in more than 30% of CLL cells (although this was not statistically significant, P = 0.115). By contrast, mutant LRP1B was only detected in samples with a normal/del(13q) karyotype (P = 0.048). Therefore, mutations in NOTCH1 and XPO1 were enriched among cases with high-risk disease.

Association with clinical follow-up and response to therapy
To assess the prognostic significance of gene mutations in patients with CLL with an active progressive disease requiring treatment, we analyzed the clinical significance of these genetic alterations in terms of clinical response, PFS and OS, as well as their association with MRD (data available for 61 patients) at the end of treatment. Statistical analyses took all the mutations with VAF > 5% into account since we did not find any significant differences between clonal and subclonal mutations. We analyzed del(11q) together with ATM mutations (ATM/del(11q)) and del(17p) with TP53 mutations (TP53/del(17p)).
Achieving undetectable MRD remission is the most important predictor of PFS in patients treated with CIT, independent of clinical remission status and patients' pretreatment characteristics [8,14], and it is currently accepted by the European Medicines Agency (EMA) as a surrogate marker for PFS. In our series, undetectable MRD was significantly associated with prolonged PFS (HR: 6.049, P < 0.001) and OS (HR: 3.907, P = 0.044) (S1 Fig in S2 File), as was also shown for the whole REM series in the previous publication by García-Marco et al. [15], and in accordance with the criteria established in previous studies [14]. Therefore, we analyzed the correlation between gene mutations and MRD (in 61 patients with MRD data) by logistic regression analysis including cytogenetic abnormalities, IGHV status, number of mutations (0-1 vs. ≥ 2 mutations) and genes mutated in at least 3 samples (5% of cases): SF3B1, NOTCH1, XPO1, CSMD3, EGR2, POT1, FBXW7, NFKBIE, and PLEKHG5. Our results showed that, in addition to U-IGHV status (OR: 10.35, P = 0.004), NOTCH1 mutations (OR: 4.35, P = 0.046) were associated with detectable MRD and could therefore be used, together with U-IGHV status, as predictors of MRD detection.
Finally, we performed univariate Cox PH regressions with each of the following clinical variables: IGHV status, gender, age (> vs. ≤ 65 years), Binet stage (low vs. high risk), cytogenetic alterations and genes mutated in at least 5% of the patients: ATM/del(11q) (24 cases out of 71, 34%), SF3B1 (14 cases, 20%), NOTCH1 (12 cases, 17%), XPO1 (8 cases, 11%), TP53/del(17p) (6 cases, 8.5%), CSMD3 (5 cases, 7%), EGR2 (4 cases, 5.5%), and POT1 (4 cases, 5.5%). In the multivariate analyses of the variables that were significant in the univariate analyses (Table 2A and S2A Fig in S2 File), we found that TP53/del(17p) (HR: 12.843, P < 0.001) and EGR2 mutations (HR: 8.256; P = 0.002) increased the risk of progression after treatment. These findings suggest that EGR2 mutations could be an adverse prognostic biomarker in patients with CLL prospectively treated with FCR followed by rituximab maintenance and could be used to identify patients with poorer outcomes after standard CIT. These results are similar to those reported from the UK LRFCLL4 trial and CLL Research Consortium (CRC) [31], in which alterations in both genes were significantly associated with PFS. The CLL8 study [32] showed that TP53 and SF3B1 were the strongest adverse prognostic markers in patients with CLL receiving current-standard first-line therapy; however, EGR2 was associated with neither PFS nor OS [33].
For OS, in addition to Binet stage, only TP53mut/del(17p) was found to be an adverse prognostic marker (Table 2B and S2B Fig in S2 File), as also reported by other studies [4,31,32].
In conclusion, we have found that the mutation frequencies of several genes determined by next-generation sequencing, mainly SF3B1, NOTCH1, and ATM, are similar to those reported in series of patients with CLL requiring therapy.
Our results also show that, in cases that reach undetectable measurable residual disease at the 10^-4 level, the mutational status of patients with CLL does not seem to affect PFS compared to cases with no or few gene mutations after front-line FCR treatment followed by limited rituximab maintenance [15]. Mutations of most recurrent driver genes in CLL, except for the TP53 gene, do not seem to affect the sustained clinical response obtained with front-line FCR treatment followed by rituximab maintenance for three years in our cohort of patients.
Supporting information
S1 File. Supporting tables. This file contains S1 Table: Coverage and sequencing quality data for HaloPlex and hotspots. S2 Table: Hotspot custom primers for EGR2. S3 Table: Somatic variants from targeted resequencing. S4 Table: Clinical characteristics, cytogenetic abnormalities and molecular markers (REM series of 71 patients).