Germline ETV6 Mutations Confer Susceptibility to Acute Lymphoblastic Leukemia and Thrombocytopenia

Somatic mutations affecting ETV6 often occur in acute lymphoblastic leukemia (ALL), the most common childhood malignancy. The genetic factors that predispose to ALL remain poorly understood. Here we identify a novel germline ETV6 p.L349P mutation in a kindred affected by thrombocytopenia and ALL. A second ETV6 p.N385fs mutation was identified in an unrelated kindred characterized by thrombocytopenia, ALL and secondary myelodysplasia/acute myeloid leukemia. Leukemic cells from the proband in the second kindred showed deletion of wild type ETV6 with retention of the ETV6 p.N385fs allele. Enforced expression of the ETV6 mutants revealed normal transcript and protein levels, but impaired nuclear localization. Accordingly, these mutants exhibited significantly reduced ability to regulate the transcription of ETV6 target genes. Our findings highlight a novel role for ETV6 in leukemia predisposition.


Introduction
Acute leukemias comprise the most common form of pediatric cancer, among which acute lymphoblastic leukemia (ALL) makes up 80-85% of the cases [1,2]. It is well recognized that a proportion of affected children develop the disease due to an underlying predisposition. The currently recognized genes responsible for autosomal dominant transmission of childhood leukemia include TP53, CEBPA, PAX5 and GATA-2 [3][4][5][6][7][8]. Occasionally, acute leukemia presents in the context of thrombocytopenia. Consistent with this feature, several heritable thrombocytopenia syndromes are known to exist, some of which are associated with an increased incidence of leukemia. Genes associated with these syndromes include RUNX1, ANKRD26, GATA1, MPL, HOXA11 and RBM8A [9][10][11][12][13][14][15][16]. Despite the identification of these genes, there remain many cases for which the underlying mechanism is unexplained. In this study, we analyzed one large kindred and one parent-child trio, both affected by ALL and thrombocytopenia. By exome sequencing and by sequencing plausible candidate genes, such as those involved in B-lymphocyte development and differentiation, we identified germline mutations in the transcription factor ETV6 that co-segregated with disease in each kindred. Functional studies support a pathogenic role for the observed mutations, both of which affect the DNA binding domain. These findings are consistent with independent observations describing additional kindreds characterized by thrombocytopenia and predisposition to hematopoietic malignancy [17,18] and provide insights into the mechanisms of leukemia susceptibility and clinical phenotypes associated with germline ETV6 mutations [19].

Phenotypic features of the kindreds
As part of a collaborative study focusing on pedigree analysis and gene discovery in childhood leukemia, we identified a Polish/Moroccan kindred in which 10 individuals developed thrombocytopenia and 4 individuals developed thrombocytopenia and ALL (Kindred 1 in Fig 1A). In the 3 ALL cases in Kindred 1 for whom flow-cytometric data were available, all were of the pre-B-ALL subtype. In 3 cases with thrombocytopenia and no evidence of ALL, the mean corpuscular volume (MCV) was decreased in 1 case and normal in 2 others (Table 1). In 2 individuals with no evidence of hematologic abnormalities, there was a history of renal cell cancer and duodenal adenocarcinoma. A second unrelated Western European/Native American family was identified in which a child developed ALL followed by myelodysplastic syndrome and acute myeloid leukemia (AML). This child's mother, maternal aunt and maternal grandfather exhibited thrombocytopenia (Kindred 2 in Fig 1B). This patient was evaluated by a geneticist due to subtle dysmorphic features; however, clinical assessment did not suggest a known genetic syndrome, and microarray and karyotype did not reveal any large deletions, rearrangements or other structural chromosomal abnormalities.

Identification of germline ETV6 mutations
DNA from 16 individuals in Kindred 1 (9 individuals with thrombocytopenia and/or ALL and 7 unaffected individuals) was subjected to Sanger sequencing for all exons of a targeted panel of leukemia-associated genes (Methods). Co-segregation of identified variants was tested using an autosomal dominant mode of inheritance. Published demographic data and medical literature were manually reviewed for all variants observed. Only one variant, chr12:12,037,415 T>C, satisfied the criteria of segregation as well as rarity, as evidenced by its absence in public genomic databases such as dbSNP [20], 1000 Genomes [21], Exome Sequencing Project [22] and Exome Aggregation Consortium (http://exac.broadinstitute.org). This variant, identified in 9 out of 9 (100%) affected family members tested, represents a heterozygous missense c.T1046C mutation in ETV6 (NM_001987). One individual (generation 3, individual 13) with thrombocytopenia and leukemia was not tested. This nucleotide change is predicted to result in the substitution of proline for leucine at codon 349 (L349P; Fig 1A and Table 1). Seven out of 7 (100%) unaffected family members tested exhibited a wild type (WT) ETV6 sequence.
Fig 1 (caption fragment). Sequencing was performed on 9 individuals including the proband (arrow) affected with thrombocytopenia and/or ALL and 7 unaffected individuals as noted in Table 1. (b) In Kindred 2, clinical whole exome sequencing was performed on the proband (arrow) with ALL, MDS and AML, the mother with thrombocytopenia as well as the unaffected father. An ETV6 N385fs mutation was identified. In both kindreds, the ETV6 mutations segregated with disease.
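The segregation and rarity criteria applied to Kindred 1 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Genotype` record, the function names and the individual identifiers are hypothetical, and the rule assumes strict autosomal dominant co-segregation (every tested affected individual is a carrier and no tested unaffected individual is), which makes no allowance for reduced penetrance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genotype:
    individual: str   # hypothetical identifier
    affected: bool    # thrombocytopenia and/or ALL
    carrier: bool     # heterozygous for the variant under test


def cosegregates(genotypes):
    """True if carrier status matches affection status in every tested individual."""
    tested = list(genotypes)
    if not tested:
        return False
    return all(g.carrier == g.affected for g in tested)


def is_candidate(genotypes, seen_in_public_databases):
    """A variant is retained only if it co-segregates and is absent from public databases."""
    return cosegregates(genotypes) and not seen_in_public_databases


# Counts taken from the text: 9 affected carriers and 7 unaffected non-carriers.
kindred1 = (
    [Genotype(f"affected_{i}", affected=True, carrier=True) for i in range(1, 10)]
    + [Genotype(f"unaffected_{i}", affected=False, carrier=False) for i in range(1, 8)]
)

# Hypothetical counter-example: one unaffected relative also carries the variant.
non_segregating = kindred1[:-1] + [Genotype("unaffected_7", affected=False, carrier=True)]

print(is_candidate(kindred1, seen_in_public_databases=False))
print(is_candidate(non_segregating, seen_in_public_databases=False))
```

With the counts reported for Kindred 1, the first call retains the variant, while the hypothetical counter-example is rejected because an unaffected relative carries it.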
Fibroblast and lymphocyte DNA from the proband with ALL and parents in Kindred 2 were analyzed by clinical whole exome sequencing (Ambry Genetics, Aliso Viejo, CA, USA). The proband and his mother harbored a heterozygous deletion of 5 nucleotides (c.1153-5_1153_1delAACAG) within ETV6. This deletion is predicted to lead to a frameshift at codon 385 and truncation of the ETV6 protein at codon 389 (N385fs, Fig 1B and Table 1). Genomewide DNA copy alteration analysis using single nucleotide polymorphism microarrays of the diagnostic ALL sample from the proband in Kindred 2 revealed deletion of the wild type and retention of the mutant ETV6 allele, as well as deletions of IKZF1, PAX5, BTG1, and RB1.
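The codon numbers quoted for the two mutations follow from the coding-sequence positions, as the short Python check below illustrates. It assumes the usual convention that coding position 1 is the first base of the start codon; the helper name `codon_of` is hypothetical and is not part of any annotation tool.

```python
def codon_of(cdna_pos):
    """Map a 1-based coding position to (codon number, position within the codon)."""
    if cdna_pos < 1:
        raise ValueError("coding positions are 1-based")
    codon_number = (cdna_pos - 1) // 3 + 1
    position_in_codon = (cdna_pos - 1) % 3 + 1
    return codon_number, position_in_codon


# c.1046 (the missense change in Kindred 1) and c.1153 (the first coding
# position named in the Kindred 2 deletion), as given in the text.
for pos in (1046, 1153):
    codon, offset = codon_of(pos)
    print(f"c.{pos}: codon {codon}, base {offset} of 3")
```

Position c.1046 maps to the second base of codon 349 and c.1153 to the first base of codon 385, in line with the L349P and N385fs designations.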
Other than the 2 mutations in ETV6, there were no pathologic genetic mutations associated with ALL or thrombocytopenia that co-segregated with disease in either kindred. Both ETV6 variants were absent from the National Heart Lung Blood Institute (NHLBI) Exome Sequencing Project (ESP) (http://evs.gs.washington.edu/EVS/), Exome Aggregation Consortium (ExAC) (http://exac.broadinstitute.org/), and St. Jude Children's Research Hospital-Washington University Pediatric Cancer Genome Project (PCGP) databases [23]. The SIFT [24] and PolyPhen prediction tools suggest the mutations to be deleterious and probably damaging to protein function. To understand how these two mutations might influence protein function, we modeled their effect on the ETV6 protein structure. Both the L349P and the N385fs mutation are located in the ETS domain of ETV6 (Fig 2A). The L349P mutation is predicted to cause significant conformational changes in areas adjacent to the ETS domain by introducing a kink in the H2 α-helix, resulting in possible ETV6 protein misfolding. The N385fs mutation affects the ETS domain and is predicted to truncate ETV6 at a region involved in DNA interaction (Fig 2B).

Functional assessment of the ETV6 mutations
To evaluate the functional consequences of these mutations, we first assessed whether L349P and N385fs might impair transcriptional repression by ETV6. HeLa cells were transiently cotransfected with constructs encoding the WT or mutant ETV6, as well as constructs containing the PF4 or MMP3 promoters, which harbor multiple ETS binding sites and are natural ETV6 targets. We compared the results to those obtained using other recently described germline ETV6 variants, P214L, R369Q and R399C [17]. As expected, WT ETV6 repressed expression of both reporters (Fig 3A), while each of the ETV6 mutants exhibited significantly decreased repression. To further explore the effects of the ETV6 mutations, we analyzed the expression of EGR1 and TRAF1, genes that are normally upregulated by WT ETV6 [17]. Consistent with published reports, EGR1 and TRAF1 were upregulated 3-fold in cells transfected with WT ETV6. In contrast, the mutants induced minimal to no upregulation of either target gene (Fig 3C), and expression levels were significantly reduced compared to WT ETV6. In each of these assays, we observed comparable levels of WT and ETV6 mutant mRNA transcripts (Supporting Information 1). Thus, transcript stability appears to be unaffected by the ETV6 mutations.
To examine whether the L349P and N385fs mutations negatively impact translation or alter subcellular localization of the ETV6 protein, we performed cell fractionation assays and Western blotting of HeLa cells transiently transfected to express WT or mutant ETV6. Both mutant proteins were detectable by Western blotting, with a smaller product observed for the N385fs mutation. Both mutants were undetectable in the nucleus (Fig 4A), but were detected within the cytoplasmic fraction (Fig 4B). This is in contrast to the previously described mutants P214L, R369Q and R399C, which were detected in cytoplasmic as well as nuclear fractions. These patterns were quantitated and confirmed by measuring the nuclear to cytoplasmic ratio (Fig 4C).

Evaluation of the incidence of germline ETV6 mutations
Fusions involving ETV6 in leukemia have long been recognized [25][26][27]. Other mutation types, including single nucleotide variations, insertions, deletions, frame-shifts and non-sense alterations are also becoming increasingly evident in hematologic malignancies [17,18,28]. We performed additional sequence analysis on exons 5-8 of ETV6 in probands from 27 unrelated kindreds with a family history of ALL, but identified no mutations in this region of ETV6. To further characterize the spectrum of germline and somatic ETV6 mutations that contribute to childhood leukemia, we screened a cohort of 588 leukemia patients evaluated through the PCGP, a genomic sequencing effort involving pediatric cancers [4,28-42] (accession numbers EGAS00001000348, EGAS00001000654, EGAS00001000380, EGAS00001000253, EGAS00001000246, EGAS00001000447). Seventeen distinct somatic ETV6 variants and two rare germline variants were identified (V37M, R181H; Fig 2A). Both rare variants occurred in patients with B-ALL, but with no evidence for loss or mutation of the WT ETV6 allele within the leukemia samples. In one of these cases, there was a secondary vulvar squamous cell carcinoma. There was no family history of leukemia or thrombocytopenia in either of these cases. Luciferase assays performed on these variants showed no significant changes in transcriptional repression activity when compared to WT ETV6 (Fig 3B). We queried several public variant databases for the presence of these two variants. The 1000 Genomes Project has a total of 2,819 samples from the world's major populations. The current version of the Exome Sequencing Project (EVS/ESP) has a set of 2,203 African-American and 4,300 European-American unrelated individuals, totaling 6,503 samples. The Exome Aggregation Consortium (ExAC) has 60,706 unrelated individuals sequenced as part of various disease-specific and population genetic studies. 
In total, we have queried over 140,000 chromosomes. The V37M variant (chr12:11905459 G>A) was seen only in the ExAC data at an allele frequency of 1.649 x 10^-5 and the R181H variant (chr12:12022436 G>A) was found at an allele frequency of 1.071 x 10^-4 in ExAC. It was also found 2 times in NHLBI-ESP (AF = 1.162 x 10^-4) and assigned as rs150089916. While V37M was predicted in silico as benign by SIFT, R181H was classified as deleterious. Based on these preliminary findings, the clinical significance of these two additional rare germline variants remains to be determined, and at this time they are classified as variants of unknown significance.
Fig 3 (caption fragment). (a) [...] promoter constructs when contrasted with the WT ETV6 in the co-transfection experiment. The experiment was performed with 6 replicates for each condition and repeated 3 times. Statistical analysis was done using an unpaired t-test; the error bars show the Standard Error of Mean (SEM). (b) The effects of V37M and R181H germline mutations on ETV6 function were examined using a Dual Luciferase Reporter Assay. The experiment was performed with 6 replicates for each condition and repeated twice. Statistical analysis was done using an unpaired t-test; the error bars show the Standard Deviation (SD). (c) Quantitative PCR of ETV6 transcriptional targets EGR1 and TRAF1 showed reduced transcriptional abundance in the mutants when contrasted with the WT. The effect was most pronounced in the frameshift mutant. The experiment was performed in triplicate for each condition and repeated three times. Statistical analysis was done using an unpaired t-test; the error bars show the Standard Error of Mean (SEM).
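As a check on the figure of over 140,000 chromosomes queried, the database sizes quoted in the text can be summed and doubled for a diploid autosomal locus. The Python sketch below does only that arithmetic; it treats the three cohorts as non-overlapping, which is a simplifying assumption made here and not a statement from the source.

```python
# Cohort sizes (number of individuals) as quoted in the text.
cohort_sizes = {
    "1000 Genomes": 2819,
    "ESP": 2203 + 4300,   # African-American + European-American individuals
    "ExAC": 60706,
}

# ETV6 lies on chromosome 12, so each individual contributes two chromosomes.
CHROMOSOMES_PER_INDIVIDUAL = 2

total_individuals = sum(cohort_sizes.values())
total_chromosomes = total_individuals * CHROMOSOMES_PER_INDIVIDUAL

print(f"{total_individuals} individuals, {total_chromosomes} chromosomes")
```

Under that assumption the three cohorts together contribute 70,028 individuals, or 140,056 chromosomes.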

Discussion
ETV6 encodes an ETS family transcription factor that is frequently rearranged or fused with other genes in human leukemias of myeloid or lymphoid origin [28]. Also known as the TEL oncogene, ETV6 is a sequence specific transcriptional repressor, regulated by auto-inhibition and self-association [43,44]. Descriptions of ETV6 largely focus on the ETV6/RUNX1 fusion, which is a product of a t(12;21) chromosomal translocation, the most common genetic abnormality in pediatric ALL [25]. While somatic deletions or mutations in ETV6 are increasingly recognized in ALL, little is known regarding the impact of germline ETV6 mutations [17,28]. Here we extend the description of the clinical phenotype and functional effects associated with novel germline ETV6 L349P and ETV6 N385fs mutations, both of which reside in the highly conserved ETS DNA binding domain and co-segregate with disease in 2 unrelated kindreds affected by thrombocytopenia and ALL.
In both kindreds ETV6 mutations were inherited in an autosomal dominant manner with variable expression of thrombocytopenia and/or ALL. There was no evidence for parent of origin or sex-delimited expression, as males and females equally transmitted the putative predisposing alleles with associated phenotypes manifesting in daughters as well as sons. Interestingly, in addition to his leukemia, the proband in Kindred 2 exhibited craniofacial and musculoskeletal anomalies (anterior placement of the right ear, downward shaped mouth, joint hypermobility and CNS heterotopias seen on magnetic resonance imaging). No other obvious pathogenic variants were identified in this individual by whole exome sequencing. In addition to atypical physical features, the proband in Kindred 2 developed grade 3 myelosuppression following exposure to anti-metabolite therapy; this feature of chemotherapy hypersensitivity was shared by another patient with T-/myeloid mixed phenotype leukemia and a germline ETV6 mutation (P214L) [17]. In addition, two of the three individuals affected with ALL and harboring ETV6 mutations in the kindreds reported here required bone marrow transplantation, and 1 of the 3 expired from disease, in contrast to the 90% rate of cure with chemotherapy alone in more typical ALL. Whether germline ETV6 mutations might serve as markers for toxicity and outcome will require larger studies controlling for other prognostic variables.
In vitro studies revealed impaired function of the ETV6 mutants identified in both kindreds. While ETV6 L349P and N385fs exhibited normal mRNA levels, both mutations were associated with decreased transcriptional regulation (repression and activation). Structural modeling suggests that both ETV6 mutations would impair transcriptional activity by altering the conformation of the ETV6 protein or truncating it within the DNA binding domain. Interestingly, neither mutant localized to the nucleus. Although the precise mechanism for this behavior remains unclear, it seems likely that these two mutations affect intracellular transport. Consistent with the putative role of ETV6 as a tumor suppressor, examination of the diagnostic leukemia sample in the proband from Kindred 2 revealed retention of the mutant and deletion of the WT ETV6 allele. Our findings are in agreement with 2 recent reports describing additional ETV6 mutations, including R399C, R369Q [17] and R418G [18] in the ETS DNA binding domain and P214L [17,18], located in a serine-proline phosphorylation motif present in the internal linker domain. In the 3 reports of germline ETV6 mutations to date (including the current series), a mixed phenotype of thrombocytopenia and ALL is observed. An association with elevated MCV was not observed in 3 cases included here, which is in contrast to one of the other recent reports [18]. The 2 additional germline variants reported here in patients with ALL (V37M and R181H) did not impair transcriptional repression by ETV6. While this was expected given that these mutations are not located in or close to the ETS DNA binding domain, we cannot exclude that these variants impair ETV6 function on another functional level.
The discovery of mutations in ANKRD26, RUNX1, and the ETS family transcription factors has led to an increased understanding of the genetic basis of hereditary syndromes involving thrombocytopenia, red cell macrocytosis and leukemia [9,10,17,18] and of the pathways regulated by these genes [17,45]. Constitutional alterations in RUNX1 predispose individuals to thrombocytopenia and hematological malignancies, mainly myelodysplastic syndrome and AML, but also T-ALL [3,9,10,46,47]. Mutations in RUNX1 have been shown either to result in haploinsufficiency or to act in a dominant-negative manner, the latter resulting in an increased risk of hematological malignancies [48,49]. Inherited mutations in ANKRD26 [10,45], which is transcriptionally regulated by RUNX1, lead to a similar clinical phenotype, in which thrombocytopenia is often associated with AML and, in some cases, with chronic myelogenous leukemia, chronic lymphocytic leukemia and myelodysplastic syndrome [38]. However, there remain additional kindreds affected by thrombocytopenia and/or leukemia that do not demonstrate germline mutations of RUNX1 or ANKRD26. Our data suggest that at least a proportion of these cases result from ETV6 mutations. To date, it is not known whether the ETV6 pathway contributes to non-leukemic cancer phenotypes. We observed no pathogenic germline ETV6 mutations in children with cancers other than ALL in the PCGP. Therefore, the contribution of ETV6 mutations to solid tumor predisposition remains to be determined.
Improved understanding of the heritable nature of childhood cancers has important clinical implications pertaining to genetic counseling and testing of other family members, therapeutic decisions, donor selection for hematopoietic transplantation, and long-term monitoring for therapy-associated or second primary neoplasms [17,50,51]. Evaluation for germline alterations of ETV6 is therefore warranted in families with acute lymphoblastic leukemia, particularly when there is preceding evidence of thrombocytopenia.

Patients and controls
All individuals analyzed for purposes of our research were formally consented to Memorial Sloan Kettering Cancer Center's IRB approved research Protocol, Protocol #00-069, "Ascertainment of Families for Genetic Studies of Familial Lymphoproliferative Disorders", or St. Jude Children's Research Hospital IRB approved research Protocols NR14-132, "Case report of child with novel ETV6 mutation associated with development of leukemia" and/or NR14-162, "ETV6 germline variants in children with acute lymphoblastic leukemia", respectively. For Kindred 1, we included 9 affected and 7 unaffected individuals for sequencing. For Kindred 2, we included the proband with ALL, his mother with thrombocytopenia and his unaffected father. For both kindreds, the presence and subtype of leukemia were confirmed by review of pathology reports, while thrombocytopenia was confirmed by medical history.
Sequencing. DNA from the proband of Kindred 1 was collected by buccal swab and extracted using the buccal swab DNA isolation kit (Isohelix, Cat# DDK-50SK2). DNA from all other family members in Kindred 1 was extracted from saliva using the Oragene DNA extraction kit (DNA Genotek, Cat# OG-250). DNA sequencing of all exons of the leukemia associated genes PAX5, ETV6, HOXA11, CDKN2A, TAL1 and ERG was performed using Sanger sequencing. Clinical exome sequencing (Kindred 2) was performed by Ambry Genetics (Aliso Viejo, CA, USA). To this end, DNA libraries were prepared using 2 μg of blood derived DNA (Paired End DNA Sample preparation Kit; Illumina). DNA was fragmented and libraries prepared. Target enrichment was carried out utilizing the TruSeq Exome enrichment Kit, which targets 62 megabases of the human genome. Captured DNA libraries were PCR amplified using the supplied paired end PCR primers.
Variant assessment. For Kindred 2, sequence reads were aligned to the human genome reference GRCh37.1 using the Burrows-Wheeler Aligner (BWA) [52]. The resulting binary alignment (BAM) files were jointly called for single nucleotide variants and insertions/deletions (indels) using the Genome Analysis Toolkit (GATK) v.3.1 [53], while structural variations were detected using Clipping Reveals Structure (CREST) [37,54]. Variant level annotations were performed using in-silico tools, such as ANNOVAR [55]. These annotations were used to predict the effects of identified germline variants on gene function. Variants were manually reviewed against the medical literature and disease locus specific databases.
Cell culture and transfections. The HeLa cells used in this study (a gift from A. Ventura, MSKCC) were derived from a subculture of the HeLa cell line (ATCC Cat# CCL-2) and subsequently tested for mycoplasma before being used in experiments. These cells were cultured in Dulbecco's Modified Eagle Medium supplemented with 10% FBS, 1 mM L-Glutamine and 1% penicillin-streptomycin. Cell cultures were maintained in a humidified incubator at 37°C in 5% CO2. Transfections were carried out with FuGENE 6 transfection reagent (Promega) or Lipofectamine 2000 (Life Technologies) according to the manufacturer's instructions.
Luciferase assay. HeLa cells were co-transfected with pHAGE expression constructs, pGL3 reporter constructs and pCS2-Renilla luciferase construct and were harvested 48h after transfection using passive lysis buffer (Promega). Measurement of Firefly and Renilla luciferase expression levels was performed using the Dual-Luciferase Reporter Assay System (Promega) on a GloMax-96 Microplate Luminometer (Promega).
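The text does not spell out how the Firefly and Renilla readings were combined. A common way to analyze dual-luciferase data, sketched below in Python as an assumption and not as the authors' stated procedure, is to divide each well's Firefly signal by its Renilla signal and then express each condition relative to a control. All readings and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_ratios(firefly, renilla):
    """Per-well Firefly signal divided by the Renilla signal from the same well."""
    if len(firefly) != len(renilla):
        raise ValueError("need one Firefly and one Renilla reading per well")
    return [f / r for f, r in zip(firefly, renilla)]


def relative_activity(condition_ratios, control_ratios):
    """Mean normalized reporter activity of a condition relative to the control."""
    return mean(condition_ratios) / mean(control_ratios)


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


# Hypothetical luminometer readings (arbitrary units), three wells per condition.
control = normalized_ratios([900.0, 1100.0, 1000.0], [10.0, 10.0, 10.0])
wt_etv6 = normalized_ratios([280.0, 320.0, 300.0], [10.0, 10.0, 10.0])

print(f"WT ETV6 activity relative to control: {relative_activity(wt_etv6, control):.2f}")
print(f"SEM of WT ETV6 ratios: {sem(wt_etv6):.2f}")
```

In this made-up example a relative activity below 1 corresponds to repression of the reporter; a mutant with impaired repression would give a value closer to 1.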
Supporting Information
S1 Fig. Quantitative PCR analysis. Quantitative PCR analysis using cDNA derived from transiently transfected HeLa cells reveals comparable transcript levels for WT ETV6, the L349P and N385fs mutants identified in the MSKCC and SJCRH kindreds, as well as mutants described in a recent separate report (P214L, R369Q, R399C) [17]. (TIF)