Targeted Next-Generation Sequencing Revealed Novel Mutations in Chinese Ataxia Telangiectasia Patients: A Precision Medicine Perspective

Ataxia telangiectasia (AT) is an autosomal recessive disease characterized by progressive cerebellar ataxia, oculocutaneous telangiectasia and immunodeficiency due to mutations in the ATM gene. We performed targeted next-generation sequencing (NGS) on three unrelated probands and identified five disease-causing variants, including two pairs of heterozygous variants (FAT-1: c.4396C>T/p.R1466X, c.1608-2A>G; FAT-2: c.4412_4413insT/p.L1472Ffs*19, c.8824C>T/p.Q2942X) and one pair of homozygous variants (FAT-3: c.8110T>G/p.C2704G, Hom). With regard to precision medicine for rare genetic diseases, targeted NGS currently enables the rapid and cost-effective identification of causative mutations and is an updated molecular diagnostic tool that merits further optimization. This high-throughput data-based strategy would propel the development of precision diagnostic methods and establish a foundation for precision medicine.


Introduction
Ataxia telangiectasia (AT, MIM#208900), due to mutations in the ataxia telangiectasia mutated gene (ATM, MIM*600118), is an autosomal recessive disease characterized by progressive cerebellar ataxia, oculocutaneous telangiectasia and immunodeficiency, as well as elevated α-fetoprotein (AFP) serum levels, immunoglobulin deficiency and predisposition to cancers [1][2][3][4]. AT typically manifests in early childhood, and patients usually die in their twenties due to malignancies or respiratory failure. Since two Chinese AT patients were first described in our previous work, very few AT cases have been reported in China [5][6][7]. For such a rare genetic disease, the advent of precision medicine, which aims to generate individualized approaches for prevention, diagnosis and treatment, would provide broad insight into genetic diagnosis and counselling [8]. Targeted next-generation sequencing (NGS) currently allows the rapid and cost-effective identification of causative mutations and is an updated molecular diagnostic tool that merits further optimization. In this study, we performed targeted NGS on three unrelated AT patients whose diagnoses were confirmed by the identification of disease-causing variants, thereby illustrating the utility of NGS in precision medicine.

Patients
Three unrelated patients (two males, one female in FAT-1, FAT-2 and FAT-3 families respectively) suspected of having AT were recruited for the study. These patients underwent clinical investigations including neurologic examinations, laboratory tests and brain MRI evaluations. The primary clinical diagnosis for each individual was based mainly on the symptoms of progressive cerebellar ataxia, dysarthria and oculocutaneous telangiectasia, as well as signs of cerebellar atrophy on MRI, elevated serum α-fetoprotein, and altered immunoglobulin profiles (Fig 1, Table 1). Their parents were clinically unaffected, with no neurological involvement.

Ethics statement
The study was approved by the Ethics Committee of Xiangya Hospital of Central South University in China (equivalent to an Institutional Review Board) and was conducted according to the principles of the Declaration of Helsinki. Three affected individuals, their parents, and 500 unaffected Han Chinese individuals serving as healthy controls were recruited. Written informed consent was obtained from the 500 healthy controls and from the parents of the three probands, who also consented to the publication of these case details because the probands did not have the capacity to understand and sign written informed consent.

Targeted capture and next-generation sequencing
Genomic DNA was extracted from peripheral blood leukocytes by standard methods. Qualified genomic DNA was randomly fragmented by sonication (Covaris S2, Massachusetts, USA), yielding library fragments mainly between 200 and 250 bp. Purified DNA was then treated with T4 DNA polymerase, T4 polynucleotide kinase and the Klenow fragment of Escherichia coli DNA polymerase to fill in 5' overhangs and remove 3' overhangs. Terminal A residues were added by incubation with dATP and the Klenow fragment (3'-5' exo-) following standard Illumina protocols, and adapters were ligated to both ends of the resulting fragments. A customized 2.1M human capture array (Roche NimbleGen, Madison, WI) was designed to capture the exons, splice sites and adjacent intronic sequences of the ATM gene, and the captured fragments were sequenced as 90-bp paired-end reads on a HiSeq2000 instrument (Illumina, San Diego, CA). Sequence reads were mapped to the human reference genome (UCSC hg19) with the Burrows-Wheeler Aligner (BWA) for subsequent variant analysis [9]. SNPs and indels were called with GATK. Previously identified common variants (frequency > 1%) and synonymous substitutions were filtered out using public databases including dbSNP 142, HapMap samples, and the 1000 Genomes Project (http://www.1000genomes.org). Potential disease-causing variants were evaluated with the prediction tools SIFT, PolyPhen-2 and MutationTaster. All raw data are available from the NIH Short Read Archive under accession number SRP060492.
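
The filtering rule described above (discard variants with a population frequency above 1% and discard synonymous substitutions) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Variant` record, its field names and the example entries are hypothetical, and a real workflow would read annotated calls from the variant caller's output rather than from a hand-written list.

```python
from dataclasses import dataclass
from typing import List, Optional

COMMON_AF_CUTOFF = 0.01  # "frequency > 1%" from the text


@dataclass(frozen=True)
class Variant:
    # All fields are illustrative assumptions, not a real annotation schema.
    label: str                   # free-text identifier for the call
    consequence: str             # e.g. "missense", "nonsense", "synonymous"
    max_pop_af: Optional[float]  # highest frequency in any public database,
                                 # or None if the variant is absent from all


def is_candidate(v: Variant, af_cutoff: float = COMMON_AF_CUTOFF) -> bool:
    """Keep a variant only if it is neither synonymous nor common."""
    if v.consequence == "synonymous":
        return False
    if v.max_pop_af is not None and v.max_pop_af > af_cutoff:
        return False
    return True


def filter_candidates(variants: List[Variant]) -> List[Variant]:
    """Return the variants that survive both filters, in input order."""
    return [v for v in variants if is_candidate(v)]


# Hypothetical calls, invented purely to exercise the two filters.
calls = [
    Variant("rare_nonsense", "nonsense", 0.0001),
    Variant("common_missense", "missense", 0.12),
    Variant("rare_synonymous", "synonymous", 0.002),
    Variant("novel_missense", "missense", None),
]

kept = filter_candidates(calls)
print([v.label for v in kept])
```

Variants that pass this step would still need functional prediction and Sanger confirmation, as described in the text.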

PCR, RT-PCR and Sanger sequencing
PCR-based Sanger sequencing was performed to analyze the missense, nonsense, frameshift, and splicing mutations in the probands and their parents. Primer 5.0 was used to design primers for amplification of the target gene sites and their flanking sequences. The sequences were compared with the annotated ATM gene reference sequence (NM_000051) to confirm the candidate variants. Additionally, reverse transcription-PCR (RT-PCR) and Sanger sequencing were performed to verify the effect of the c.1608-2A>G splicing mutation detected in the proband of FAT-1 and his mother, using the following primers: forward: 5'-CACCTTCAGAAGTCACAGAATGA-3', reverse: 5'-GCCAATACTGGACTGGTGCT-3'.
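
When a candidate substitution is compared against the reference, a quick arithmetic check is whether its coding-level and protein-level names agree. The Python sketch below is an illustration only, not part of the study's workflow. It assumes the usual convention that coding position 1 is the A of the ATG start codon, and it uses the standard genetic code. It checks internal consistency only and does not look up NM_000051. The function names `codon_of` and `consistent_codons` are hypothetical.

```python
import re
from typing import List, Tuple

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_of(cds_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding position to (codon number, position in codon 1..3)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


def consistent_codons(c_name: str, p_name: str) -> List[str]:
    """List reference codons under which a substitution's c. and p. names agree.

    An empty list means the two names cannot describe the same change.
    """
    c = re.fullmatch(r"c\.(\d+)([ACGT])>([ACGT])", c_name)
    p = re.fullmatch(r"p\.([A-Z])(\d+)([A-Z*])", p_name)
    if c is None or p is None:
        raise ValueError("only simple substitutions in one-letter form are handled")
    pos, ref_base, alt_base = int(c.group(1)), c.group(2), c.group(3)
    ref_aa, aa_num, alt_aa = p.group(1), int(p.group(2)), p.group(3)
    if alt_aa == "X":  # the text writes a stop codon as X
        alt_aa = "*"
    codon_num, offset = codon_of(pos)
    if codon_num != aa_num:
        return []
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa or codon[offset - 1] != ref_base:
            continue
        mutated = codon[: offset - 1] + alt_base + codon[offset:]
        if CODON_TABLE[mutated] == alt_aa:
            hits.append(codon)
    return hits


for c_name, p_name in [
    ("c.4396C>T", "p.R1466X"),
    ("c.8824C>T", "p.Q2942X"),
    ("c.8110T>G", "p.C2704G"),
]:
    print(c_name, p_name, codon_of(int(c_name[2:-3])), consistent_codons(c_name, p_name))
```

For the three substitutions named in this study, the coding position falls on the first base of the reported codon in each case, and at least one reference codon is compatible with the stated amino acid change.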

Western blot analysis
Lymphoblastoid cell lines (LCLs) were lysed in RIPA buffer (50 mM Tris/pH 8.0, 150 mM NaCl, 1% NP-40, 2 mM EDTA, and protease inhibitor cocktail). Protein extracts were separated on a 6% SDS-polyacrylamide gel and transferred to nitrocellulose membranes. The membranes were blocked with 5% nonfat milk and incubated with anti-ATM (1:1000; Cell Signaling Technology, Boston, MA) and anti-β-actin antibodies. The anti-rabbit or anti-mouse secondary antibodies coupled to horseradish peroxidase were detected using an ECL kit (Millipore, Bedford, MA). Quantification was performed using ImageJ software.
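
Band quantification of this kind is commonly expressed relative to a loading control. The sketch below assumes that the β-actin signal served that purpose, which the text does not state explicitly. The function names and the intensity values are hypothetical and are not measurements from this study.

```python
def normalized_signal(target: float, loading_control: float) -> float:
    """Divide a target band intensity by its loading-control intensity."""
    if loading_control <= 0:
        raise ValueError("loading-control intensity must be positive")
    return target / loading_control


def relative_expression(sample: tuple, reference: tuple) -> float:
    """Express a sample's normalized signal as a fraction of the reference's.

    Each argument is a (target intensity, loading-control intensity) pair
    taken from the same lane.
    """
    ref = normalized_signal(*reference)
    if ref == 0:
        raise ValueError("reference lane has no target signal")
    return normalized_signal(*sample) / ref


# Invented densitometry readings in arbitrary units, for illustration only.
control_lane = (8000.0, 8000.0)  # (ATM band, beta-actin band)
patient_lane = (2000.0, 8000.0)

print(relative_expression(patient_lane, control_lane))
```

Normalizing within each lane before comparing lanes keeps differences in the amount of protein loaded from being mistaken for differences in ATM expression.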

Molecular analysis
Targeted next-generation sequencing of the ATM gene was performed on each proband and generated an average of 18.25 million mapped reads with 226.33-fold average coverage and 99.3% sufficient coverage. We detected more than 12 thousand variants in each patient: 10981 SNPs and 1843 indels in the FAT-1 proband, 13222 SNPs and 2472 indels in the FAT-2 proband, and 12426 SNPs and 2320 indels in the FAT-3 proband (Table 2). Initial variants were filtered using public databases (dbSNP 142, HapMap samples, and the 1000 Genomes Project) in combination with functional prediction. With confirmation by Sanger sequencing, we identified five disease-causing variants in the ATM gene in the three probands, including two pairs of heterozygous variants (FAT-1: c.4396C>T/p.R1466X, c.1608-2A>G; FAT-2: c.4412_4413insT/p.L1472Ffs*19, c.8824C>T/p.Q2942X) and one pair of homozygous variants (FAT-3: c.8110T>G/p.C2704G, Hom) (Fig 2, Table 3, S1 Fig), which segregated with the disease in the parents and were absent in 500 unaffected healthy controls. Notably, except for one nonsense mutation (c.4396C>T) reported previously [10,11], four of these mutations were novel, including splicing, frameshift, nonsense and missense mutations. The novel frameshift (c.4412_4413insT) and nonsense (c.8824C>T) mutations, as well as the missense mutation (c.8110T>G), were predicted to be damaging by SIFT, PolyPhen-2, or MutationTaster and resulted in amino acid substitutions in conserved

Functional investigation
Western blot analysis was performed on LCLs derived from the proband of FAT-3 and a normal control to investigate the effect of the homozygous c.8110T>G/p.C2704G mutation on ATM protein expression. Compared with wild type, the amount of ATM protein in the p.C2704G homozygous mutant was significantly decreased, suggesting that the c.8110T>G variant in the ATM gene might be deleterious (S2 Fig).

Discussion
AT is a multisystem autosomal recessive disorder caused by mutations in the ATM gene. To date, more than 780 mutations have been reported (http://www.hgmd.cf.ac.uk/ac/gene.php?gene=ATM), including missense, nonsense, splicing, small indels, large deletions, and duplications across the 66 exons of the ATM gene, without apparent hotspots. Most missense mutations responsible for AT lead to underexpression of the ATM protein [12]. In this study, we established accurate genetic diagnoses for three AT patients by combining typical clinical symptoms with the identification of five causative ATM variants via targeted next-generation sequencing. The novel mutations broaden the genotypic spectrum of the ATM gene, which should help clarify the relationship between genotype and phenotype in AT. Precision medicine is a set of new strategies that not only improves the prevention, diagnosis and treatment of common and rare diseases, but also enhances the application of computational and bioinformatic tools in clinical medicine. With regard to precision medicine for rare genetic diseases, arriving at a precise diagnosis is essential. Our previous report of AT patients relied on laborious direct Sanger sequencing, and such a costly evaluation would not facilitate the accurate diagnosis of rare Mendelian disorders [5,13]. Currently, this requirement of time and expense is a major challenge not only for clinicians seeking to make a clinical diagnosis efficiently but also for researchers seeking to develop prompt technical, statistical and bioinformatic innovations to guide clinical practice. As a promising tool for molecular detection, disease-specific NGS assays should be developed to achieve timely and precise diagnosis.
A high-throughput data-based strategy would propel the data mining of biomedical information including genomic, molecular and cellular parameters, which represent the "new frontier" for precision diagnosis and lay a firm foundation for precision medicine.
Supporting Information