A New Genetic Diagnostic for Enlarged Vestibular Aqueduct Based on Next-Generation Sequencing

Enlarged vestibular aqueduct (EVA) is one of the most common congenital inner ear malformations and accounts for 1–12% of sensorineural deafness in children and adolescents. Multiple genetic defects contribute to EVA; therefore, early molecular diagnosis is critical for EVA patients to ensure that the most effective treatment strategies are employed. This study explored a new genetic diagnostic method for EVA and applied it to the clinical diagnosis of EVA patients. Using next-generation sequencing technology, we set up a multiple polymerase chain reaction enrichment system for target regions of EVA pathogenic genes (SLC26A4, FOXI1, and KCNJ10). Forty-six EVA samples were sequenced with this system. Variants were detected in 87.0% (40/46) of cases, including three novel variants (SLC26A4 c.923_929del, c.1002-8C>G, and FOXI1 c.519C>A). Biallelic potentially pathogenic variants were detected in 27/46 patient samples, leading to a purported diagnostic rate of 59%. All results were verified by Sanger sequencing. Our target region capture system was validated to amplify and measure SLC26A4, FOXI1, and KCNJ10 in one reaction system. The results supplement the mutation spectrum of EVA. Thus, this strategy is an economical, rapid, accurate, and reliable method with many useful applications in the clinical diagnosis of EVA patients.


Introduction
Enlarged vestibular aqueduct (EVA; MIM 600791) is an autosomal recessive genetic disease causing congenital inner ear malformation that accounts for 1–12% of sensorineural deafness in children and adolescents [1]. EVA can be divided into syndromic EVA (mostly Pendred syndrome [PDS]; MIM 274600) and the more common nonsyndromic EVA (DFNB4; MIM 600791) depending on the presence of other inner ear malformations or diseases. The SLC26A4 (DFNB4; MIM 605646) gene encodes pendrin, which is expressed in the inner ear and is responsible for EVA symptoms. In many EVA patients, SLC26A4 screening has identified two disease-causing allele variants [2]. More than 300 variants in the SLC26A4 gene have been identified in patients with EVA or PDS (www.healthcare.uiowa.edu/labs/pendredandbor), and each ethnic population has a different and diverse variant spectrum with its own specific mutation hot spots [3,4]. The function of SLC26A4 has also been explored in pendrin knockout mice, which recapitulate the pathology observed in humans: profound deafness and strikingly bulged endolymphatic spaces of the inner ear [5]. Recent studies found that the SLC26A4 promoter contains a key transcriptional regulatory element that binds FOXI1 (MIM 601093), a transcriptional activator of the gene [6]. Additionally, double heterozygosity of SLC26A4 and KCNJ10 (MIM 602208) was identified in individuals with an EVA phenotype from two families, linking KCNJ10 variants with EVA [7].
Routine clinical examinations to diagnose EVA involve audiological tests (e.g., pure tone audiometry, acoustic immittance, auditory steady-state response, and auditory brainstem response) and temporal bone imaging (e.g., computed tomography and magnetic resonance imaging) to reveal an enlarged vestibular aqueduct or endolymphatic sac. EVA manifests clinically as fluctuating or progressive sensorineural hearing loss, ranging from mild to profound deafness [8], so most patients are clinically diagnosed when their hearing is already poor. Therefore, early clinical genetic diagnosis of EVA patients is critical to clarify the molecular etiology and to implement appropriate disease control and prevention measures, such as avoiding head trauma, colds, and noise exposure. Once patients with hearing loss are diagnosed with EVA, they can choose hearing aids or cochlear implantation as early as possible.
EVA molecular diagnoses have traditionally relied upon Sanger sequencing. More recently, variant detection systems using denaturing high-performance liquid chromatography (DHPLC) have been developed for PDS screening [9,10]. Array-based variant screening is also a rapid and efficient technique for detecting known variants. A recently developed genome-wide association approach to detect loci affecting PDS susceptibility used 597 genotyped sows with 62,163 single nucleotide polymorphisms (SNPs) [11]. We developed a microarray to detect 240 variants underlying syndromic and nonsyndromic sensorineural hearing loss, including 11 distinct variants in SLC26A4 [12]. However, all of these EVA genetic diagnosis strategies rely upon either full gene sequencing or testing for common variants in only the SLC26A4 gene. These approaches are not optimal, as they do not allow all EVA-associated genes to be examined simultaneously. Furthermore, these techniques are both time-consuming and costly, limiting their clinical application. Recent developments in next-generation sequencing (NGS) have widely expanded its use in scientific research and clinical fields, promoting the development of assays to rapidly and cost-effectively sequence all genes and noncoding regions of interest. NGS can provide more precise information about genetic causes of disease and recurrence risks, ultimately leading to better treatment [13,14]. Several NGS methods have been developed and clinically applied for hereditary hearing loss [14,15,16].
We developed a molecular diagnostic for EVA based on multiple polymerase chain reaction (PCR) targeted enrichment and NGS that covers three known EVA-associated genes. This genetic diagnostic can be extended to clinical practice and has the potential to advance simple genetic screening into genetic diagnosis.

Subjects
A cohort of 46 sporadic Chinese probands diagnosed with EVA was recruited between 2014 and 2016 from the Otolaryngology Department of Xiangya Hospital, Central South University. The cohort comprised 32 males and 14 females with an average age of 6 years (range, 1 to 26 years) (Table 1). A detailed medical history was available for each proband. Every participant was examined thoroughly, including systemic and specialized physical examination, electric otoscopy, and audiological assessment (pure tone audiometry, acoustic immittance, auditory steady-state response, and auditory brainstem response). High-resolution computed tomography (HRCT) scanning of the temporal bone and magnetic resonance hydrography (MRH) examination of the inner ear were performed on all patients. The inclusion criteria for EVA patients were as follows: HRCT showed a significant bone defect on the posterior border of the petrous bone; the width of the external opening of the vestibular aqueduct was more than 1.5 mm, or the width of the middle opening was more than 2 mm; and MRH showed an expanded endolymphatic sac. Syndromic features were not detected. The controls consisted of 100 unrelated healthy Chinese volunteers with normal hearing and without another genetic disease. All patients and controls were ethnically Chinese. Written informed consent was obtained from all participants or their parents (when participants were under 18 years old), and the research was approved by the Ethics Committee of Xiangya Hospital of Central South University and complies with the Code of Ethics of the World Medical Association [17].

Design of Captured Target Genome Regions and Multiple PCR Enrichment System
Genomic DNA was extracted from peripheral blood using standard phenol-chloroform protocols and stored at -20˚C. After purification and quality testing, multiple PCR enrichment was performed in accordance with the special reaction conditions developed in this study. The target genome regions of the three candidate genes (SLC26A4, FOXI1, and KCNJ10) were designed to include their promoter regions (~500 bp), 5' untranslated region (5' UTR), coding regions, splice sites (~8 bp), and 3' untranslated region (3' UTR) (Table 2). Thirty-nine primer pairs were designed using FastTarget Primer (V5.0.1) software developed by Genesky, with the most stringent conditions (no SNPs in the primer annealing region, amplicon length between 230 and 315 bp, GC content between 30 and 80%). These primers were synthesized and assigned to four multiplex PCR panels to amplify all the target regions of the three genes. The first-round enrichment amplification reactions were carried out on an ABI 2720 Thermal Cycler (Life Technologies Corporation, USA) with the following cycling program: 95˚C for 2 min; 11 cycles of 94˚C for 20 s, 63˚C (decreasing by 0.5˚C per cycle) for 40 s, and 72˚C for 1 min; 24 cycles of 94˚C for 20 s, 65˚C for 30 s, and 72˚C for 1 min; and 72˚C for 2 min. In the second round, the four multiple PCR reaction products from the first round were mixed and amplified with a pair of universal primers carrying an added index sequence so that different samples could be distinguished. The EVA sequencing library was constructed after the two rounds of amplification (Fig 1).
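The primer design constraints and the first-round cycling program can be written out explicitly. The following is a minimal Python sketch, not the FastTarget Primer software or the thermal cycler firmware: the function names are hypothetical, and the touchdown step is assumed to start at 63˚C in the first cycle and drop by 0.5˚C in each subsequent cycle.

```python
from typing import List, Tuple

Step = Tuple[float, int]  # (temperature in deg C, hold time in seconds)


def touchdown_program() -> List[List[Step]]:
    """Expand the first-round cycling program into an explicit list of cycles."""
    program: List[List[Step]] = [[(95.0, 120)]]  # initial denaturation, 2 min
    # 11 touchdown cycles; annealing is assumed to start at 63 C
    # and to decrease by 0.5 C per cycle (63.0, 62.5, ..., 58.0).
    for i in range(11):
        program.append([(94.0, 20), (63.0 - 0.5 * i, 40), (72.0, 60)])
    # 24 cycles with a fixed 65 C annealing temperature
    for _ in range(24):
        program.append([(94.0, 20), (65.0, 30), (72.0, 60)])
    program.append([(72.0, 120)])  # final extension, 2 min
    return program


def primer_pair_ok(amplicon_len: int, gc_percent: float,
                   snp_in_annealing_region: bool) -> bool:
    """Apply the three stated design constraints to one candidate primer pair."""
    return (not snp_in_annealing_region
            and 230 <= amplicon_len <= 315
            and 30.0 <= gc_percent <= 80.0)


program = touchdown_program()
annealing_temps = [cycle[1][0] for cycle in program[1:12]]
hold_seconds = sum(seconds for cycle in program for _, seconds in cycle)
```

Under these assumptions the last touchdown cycle anneals at 58˚C, and the programmed hold times add up to 4200 s (70 min), not counting ramp times.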

Next-Generation Sequencing
The PCR products of each sample were labeled with an 8-bp index, and the libraries of all samples were pooled. After cluster generation and hybridization of the sequencing primer, base incorporation was carried out on a MiSeq Benchtop Sequencer (Illumina, Inc, San Diego, CA) in one single lane following the manufacturer's standard cluster generation and sequencing protocols. A total of 608 sequencing cycles were run to generate paired-end reads comprising 300 bp at each end and 8 bp of the index tag. The average effective sequencing depth of every sample was 300×, and the sequencing depth of all bases was above 20×.
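The cycle count and the depth criteria can be checked with simple arithmetic. The sketch below is illustrative only: the helper names and the per-base depth values are hypothetical, and the 608 cycles are assumed to decompose into two 300-bp reads plus one 8-bp index read.

```python
from typing import Sequence

READ_LENGTH = 300   # bases read from each end of a fragment
INDEX_LENGTH = 8    # bases in the sample index tag

# Two paired-end reads plus one index read
total_cycles = 2 * READ_LENGTH + INDEX_LENGTH


def mean_depth(per_base_depth: Sequence[int]) -> float:
    """Average depth over the targeted bases of one sample."""
    return sum(per_base_depth) / len(per_base_depth)


def all_bases_covered(per_base_depth: Sequence[int], min_depth: int = 20) -> bool:
    """True when every targeted base is sequenced deeper than min_depth."""
    return all(depth > min_depth for depth in per_base_depth)


# Hypothetical per-base depths for a tiny target, for illustration only
example_depths = [310, 295, 42, 553]
```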

Variants Analysis
Sequencing reads were aligned to hg19 using the Burrows-Wheeler Aligner (BWA) [18]. SNV calling was performed using both the GATK and VarScan programs [19,20], and the called SNV data were then combined. The ANNOVAR program was used for SNV annotation [21]. The functional effect of non-synonymous SNVs was assessed with PolyPhen-2, SIFT, and MutationTaster [22,23,24]. Non-synonymous SNVs with a SIFT score of <0.05, a PolyPhen-2 score of >0.85, or a MutationTaster score of >0.85 were considered likely not benign.
To sort potentially deleterious variants from benign polymorphisms, Perl scripts were used to filter the SNVs against those in dbSNP135. Any SNV recorded in dbSNP135 with a minor allele frequency of ≥1% in the Chinese population of the 1000 Genomes database was considered a benign polymorphism and was therefore removed from subsequent analysis. We also tested all the variants for their allele frequency in the ExAC exome variant database (http://exac.broadinstitute.org/) to further support the pathogenicity of the novel variants.
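The score thresholds stated above amount to a simple any-of-three rule. A minimal Python sketch follows; it is not the authors' pipeline, the function name is hypothetical, and a missing score (None) is assumed to contribute no evidence.

```python
from typing import Optional


def is_predicted_damaging(sift: Optional[float],
                          polyphen2: Optional[float],
                          mutationtaster: Optional[float]) -> bool:
    """Flag a non-synonymous SNV when any one predictor crosses its threshold.

    Thresholds follow the text: SIFT < 0.05, PolyPhen-2 > 0.85,
    MutationTaster > 0.85. A score of None is treated as unavailable.
    """
    return ((sift is not None and sift < 0.05)
            or (polyphen2 is not None and polyphen2 > 0.85)
            or (mutationtaster is not None and mutationtaster > 0.85))
```

Because the rule is a logical OR, a variant is retained even if only one of the three tools predicts a damaging effect, which favors sensitivity over specificity.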
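The frequency filter can be sketched in a few lines of Python in place of the Perl scripts, which are not shown in the text. The sketch assumes the threshold is read as a minor allele frequency of at least 1%; the `Variant` record, its field names, and the example entries are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Variant:
    name: str                           # hypothetical identifier
    in_dbsnp: bool                      # recorded in dbSNP135
    maf_chinese_1000g: Optional[float]  # None when absent from 1000 Genomes


def is_benign_polymorphism(v: Variant, maf_threshold: float = 0.01) -> bool:
    """A variant is dropped only if it is in dbSNP AND common (MAF >= 1%)."""
    return (v.in_dbsnp
            and v.maf_chinese_1000g is not None
            and v.maf_chinese_1000g >= maf_threshold)


def keep_candidates(variants: Iterable[Variant]) -> List[Variant]:
    """Remove benign polymorphisms and keep everything else for follow-up."""
    return [v for v in variants if not is_benign_polymorphism(v)]


# Made-up records, for illustration only
calls = [
    Variant("common_snp", True, 0.12),
    Variant("rare_known", True, 0.002),
    Variant("unreported", False, None),
]
kept_names = [v.name for v in keep_candidates(calls)]
```

Note that a rare variant already present in dbSNP is kept, since the rule requires both conditions before a call is discarded.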

Sanger Sequencing
Variants selected and suspected to be pathogenic were confirmed by Sanger sequencing. Parental samples were used for segregation analysis of the sequence variants identified in the index proband via Sanger sequencing. In addition, 100 controls were sequenced for the variants detected to evaluate the population-wide incidence of the novel variants. Data were analyzed using DNASTAR software program (DNASTAR, Inc., Madison, Wisconsin, US).

Results
All probands were diagnosed with DFNB4. Possibly pathogenic gene variants were found in 40 of 46 cases (87%). Thirty-eight cases carried SLC26A4 variants and two cases carried FOXI1 variants. KCNJ10 gene variants were not detected. By analyzing the variant results for all available parental DNA, we found that 27 cases conformed to cosegregation principles, including 19 compound heterozygous, two homozygous, and six heterozygous cases (Table 3), leading to a purported diagnostic rate of 59%. We identified a total of 24 potentially pathogenic variants in these three genes (Table 4), including three novel variants (SLC26A4 c.923_929del, c.1002-8C>G, and FOXI1 c.519C>A), which were absent in 100 control subjects and not reported in dbSNP, the 1000 Genomes Project database, or the ExAC exome variant database. To determine whether these mutations were de novo, we also sequenced the parental DNA and found that the variant FOXI1 c.519C>A was not inherited from the parents. All three novel variants were uploaded to the Leiden Open Variation Database (http://www.lovd.nl/3.0/home).
Of these variants, 22 were SLC26A4 variants and two were FOXI1 variants. The 24 variants comprised 19 missense variants, two insertions, one deletion, and two splicing variants. In the SLC26A4 gene, 19 compound heterozygous variants (50%), nine heterozygous variants (23.7%), and three homozygous variants (7.9%) were detected. In addition, we detected seven double heterozygotes (18.4%): five patients whose parents' DNA was not available and two patients (05 and 15) who each carried two heterozygous variants, one of which was paternally inherited while the other was not inherited from either parent. We cannot be sure whether the two heterozygous mutations in these seven patients are located on one allele or on two alleles, which requires further analysis. Both FOXI1 gene variants were heterozygous. Sanger sequencing completely verified the NGS results, indicating that the NGS accuracy rate was 100% in our study.
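The reported proportions can be reproduced from the counts given in the text. The short Python sketch below is only an arithmetic check; it assumes that the SLC26A4 genotype percentages use the 38 SLC26A4-positive cases as the denominator, which is inferred from the numbers and not stated explicitly.

```python
def pct(count: int, total: int, digits: int = 1) -> float:
    """Percentage of count in total, rounded to the given number of digits."""
    return round(100.0 * count / total, digits)


COHORT = 46          # probands sequenced
SLC26A4_CASES = 38   # cases carrying SLC26A4 variants (assumed denominator)

detection_rate = pct(40, COHORT)    # cases with any candidate variant
diagnostic_rate = pct(27, COHORT)   # cases conforming to cosegregation
slc26a4_share = pct(SLC26A4_CASES, COHORT)

# Genotype counts among SLC26A4-positive cases, as listed in the text
genotype_counts = {
    "compound heterozygous": 19,
    "heterozygous": 9,
    "homozygous": 3,
    "double heterozygous": 7,
}
genotype_pct = {k: pct(n, SLC26A4_CASES) for k, n in genotype_counts.items()}
```

The four genotype counts sum to 38, which supports the assumed denominator, and 27/46 evaluates to 58.7%, which the text rounds to 59%.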

Discussion
EVA is an autosomal recessive hereditary disease with marked genetic heterogeneity, which complicates investigations into its molecular mechanism. Current studies suggest that SLC26A4 biallelic variants (compound heterozygous or homozygous) are the main cause of EVA and PDS. EVA in patients carrying SLC26A4 biallelic variants can usually be verified by imaging-based diagnosis [27]. In our research, 50% of cases had compound heterozygous variants, 23.7% had heterozygous variants, and 7.9% had homozygous variants, consistent with previous studies. EVA patients carrying SLC26A4 monoallelic variants might only be carriers. However, a considerable number of EVA patients carried SLC26A4 monoallelic variants or had no detectable SLC26A4 variants. Some authors argue that there are probably other undetected mutations in the promoter region or at a potential intronic splice site of the second SLC26A4 allele that were not examined in the present study, or that there might be a digenic pattern of inheritance involving a second gene [6]. In addition, some researchers have suggested that the interaction between genetic and environmental factors may play a role in the pathogenic process of EVA [42].
To date, three genes have been associated with EVA: SLC26A4, FOXI1, and KCNJ10. Variants in SLC26A4 reportedly account for about 50% of PDS and nonsyndromic EVA cases [43], while FOXI1 and KCNJ10 account for <1% of all cases [44]. Our results indicate that 82.6% (38/46) of patients had SLC26A4 variants, further confirming that SLC26A4 is the most common pathogenic gene in EVA. At the time of writing, more than three hundred SLC26A4 variants have been reported (http://www.hgmd.cf.ac.uk/ac/index.php). In this study, we identified two novel SLC26A4 mutations that were not reported in NCBI dbSNP, the 1000 Genomes Project database, or the ExAC exome variant database. One of them, a deletion (c.923_929del TAATTGC), was predicted to cause a frameshift and to produce a truncated protein through a premature stop codon. The region truncated by this variant is highly conserved among mammals and lies within the predicted SulP (high affinity sulphate transporter 1) domain, and the variant was predicted to be disease causing by MutationTaster. In addition, a splice site change (c.1002-8C>G) in SLC26A4 is reported for the first time in this study. The variant was predicted to cause aberrant splicing and was considered pathogenic by MutationTaster. mRNA studies have confirmed that SLC26A4 c.1002-4C>G contributes to PDS, revealing that this splice mutation results in a putative truncated protein [45]. Interestingly, the novel variant (c.1002-8C>G) discovered in our study is adjacent to the reported mutation (c.1002-4C>G); it may therefore impair the same functional region of SLC26A4 and lead to EVA by causing a frameshift and introducing a premature stop codon. Functional analyses are needed to confirm this.
Although FOXI1 and KCNJ10 have been confirmed to be related to EVA, most large-sample screening studies of these genes have failed to find specific pathogenic variants [46,47]. Two FOXI1 gene variants were detected in our work, and one KCNJ10 variant (c.812G>A) was detected in another study [48]. FOXI1 activates and regulates transcription of the SLC26A4 gene by binding to two sites, FBS1 and FBS2, in the SLC26A4 promoter region. A missense mutation (c.519C>A) of FOXI1 is reported for the first time in this study; it was absent in the parents and in 100 control subjects and is not reported in the dbSNP database. This variant lies within the conserved forkhead DNA-binding domain. Variants located in this domain have been shown to compromise the ability of FOXI1 to transactivate SLC26A4 expression and to be causally related to the disease phenotype in EVA patients [6]. Therefore, it is possible that the mutation discovered in our study impairs the ability of FOXI1 to activate SLC26A4 transcription. Additionally, the missense variant was predicted to be disease causing by MutationTaster, so a detrimental effect on the EVA phenotype remains a plausible hypothesis. A functional study will be performed for verification in the future.
Several methods have traditionally been used for deafness gene detection (e.g., Sanger sequencing, restriction enzyme fingerprinting-single strand conformation polymorphism analysis, restriction fragment length polymorphism, DHPLC, gene chip, and mass spectrometry). While each of these technologies has its advantages, they also tend to be time-consuming, tedious, costly, and overall not suitable for large-scale detection in clinical applications. EVA displays high genetic heterogeneity, and its genetic diagnosis involves multiple known and unknown loci. Thus, a good diagnosis requires simultaneous high-throughput detection of multiple gene variants. Since its introduction in 2005, NGS has revolutionized genomic research by providing more cost-effective, faster, and higher-throughput sequencing than traditional technologies [49,50]. Three main NGS platforms currently exist: Illumina/Solexa, Roche/454, and Life Technologies/SOLiD [51]. In this study, the target region capture system used multiple PCR enrichment with special reaction conditions in a PCR-based non-hybridization gene enrichment scheme. Multiple PCR enrichment technology can run 140-200 multiplex PCRs of 150-450 bp simultaneously. This system is easy to use and allows simple NGS library preparation. The capture range is small, and the amplified segments overlap to optimize cost and coverage uniformity. The method can be customized for individual research applications. The NGS-based targeted sequencing method developed in this study could directly achieve nearly complete coverage of all coding regions of the three EVA genes. Furthermore, our results demonstrated that this sequencing technology is highly sensitive and specific in detecting sequence variants in these EVA genes. We propose that this NGS-based screening strategy is an effective alternative method for identifying the multiple genetic causes of EVA and that it will improve the molecular diagnosis of EVA patients in clinical applications.