Linkage Study and Exome Sequencing Identify a BDP1 Mutation Associated with Hereditary Hearing Loss

Nonsyndromic Hereditary Hearing Loss is a common disorder accounting for at least 60% of prelingual deafness. GJB2 gene mutations, GJB6 deletion, and the A1555G mitochondrial mutation play a major role worldwide in causing deafness, but there is a high degree of genetic heterogeneity and many genes involved in deafness have not yet been identified. Therefore, there remains a need to search for new causative mutations. In this study, a combined strategy using both linkage analysis and sequencing identified a new mutation causing hearing loss. Linkage analysis identified a region of 40 Mb on chromosome 5q13 (LOD score 3.8) for which exome sequencing data revealed a mutation (c.7873 T>G leading to p.*2625Gluext*11) in the BDP1 gene (B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB) in patients from a consanguineous Qatari family of second degree, showing bilateral, post-lingual, sensorineural moderate to severe hearing impairment. The mutation disrupts the termination codon of the transcript resulting in an elongation of 11 residues of the BDP1 protein. This elongation does not contain any known motif and is not conserved across species. Immunohistochemistry studies carried out in the mouse inner ear showed Bdp1 expression within the endothelial cells in the stria vascularis, as well as in mesenchyme-derived cells surrounding the cochlear duct. The identification of the BDP1 mutation increases our knowledge of the molecular bases of Nonsyndromic Hereditary Hearing Loss and provides new opportunities for the diagnosis and treatment of this disease in the Qatari population.


Introduction
Hearing loss is the most common sensory deficit in humans. Roughly one child in a thousand is born with hearing impairment significant enough to compromise the development of normal language skills. Hearing loss can be caused by environmental factors, by genetic factors, or by a combination of both. Hereditary Hearing Loss (HHL) includes a broad range of disorders that affect infants, children and adults [1]. HHL can be conductive (involving the outer ear, the tympanic membrane or the middle ear) and/or sensorineural (involving the inner ear or the acoustic nerve) [2]. There are two main forms of HHL, Syndromic (SHHL) (about 15-30% of cases) and Nonsyndromic (NSHHL) (approximately 70%), and they can be transmitted with different patterns of inheritance, the most common being autosomal recessive (approx. 75-80% of all cases). In general, HHL with recessive inheritance shows pre-lingual or post-lingual onset of severe to profound hearing loss with all frequencies affected. In autosomal dominant forms, the phenotype is often less severe, the onset usually post-lingual and the severity ranging from moderate to severe [3]. The pathophysiology reflects the vast genetic and clinical heterogeneity, with many different loci and/or genes associated with auditory dysfunction [4]. According to the HHL homepage, more than 140 NSHHL loci have been mapped, and approximately 65 genes have been identified (see http://hereditaryhearingloss.org/). Based on the type of gene product, these genes can be categorized into several groups, such as those coding for proteins involved in the structure and function of hair cells, the auditory nerve, and virtually every structural element of the inner ear. As already reported in other studies, HHL, especially in Middle Eastern populations, is highly heterogeneous, both in the number of genes involved and in the number of alleles at each gene [5].
As regards the Qatari population, a recent study using high-density SNP arrays revealed three clusters consistent with Arabian origin, an eastern or Persian origin and individuals with African admixture [6]. A previous report on HHL in the Qatari population demonstrated a minor role for the GJB2 gene but no role for GJB6 or the A1555G mutation, strongly suggesting the presence of additional causative mutations [7]. Here, we report the identification of a gene, never described before as involved in HHL, by linkage study followed by exome sequencing carried out in a NSHHL Qatari family with second degree consanguinity.

Ethics Statement
Mice. Mouse studies were carried out in accordance with UK Home Office regulations and the UK Animals (Scientific Procedures) Act of 1986 (ASPA) under a UK Home Office licence and the study was approved by the Wellcome Trust Sanger Institute's Ethical Review Committee. Mice were culled using methods approved under this licence to minimize any possibility of suffering.
Human. Consent forms for clinical and genetic studies were signed by each participant and all research was conducted according to the ethical standards as defined by the Helsinki Declaration. The study was approved by the Institutional Review Board of Hamad Medical Corporation (Human subjects ethical compliance document approved 08/06/2009). The research project has been conducted within Qatar (Hamad Medical Corporation) with the strong technical support of the Italian research team that led the data analysis and writing of the manuscript.
Figure 1 (caption, partial). The downward slope indicates that high frequencies were more severely affected than low frequencies, and all patients show a moderate to severe hearing loss. In particular, II:3 is a female patient diagnosed with bilateral moderate to severe NSHHL mainly at High Frequency since she was 4 y.o. II:4 is a male patient diagnosed with bilateral moderate to severe NSHHL since he was 3 y.o. II:5 is a female patient diagnosed with bilateral moderate to severe NSHHL since the girl was 4 y.o. II:6 is a female patient diagnosed with bilateral moderate to severe NSHHL since she was 2 y.o. doi:10.1371/journal.pone.0080323.g001

Family Ascertainment and Clinical Diagnosis
A consanguineous family consisting of 8 family members (4 patients, 2 healthy siblings and their healthy parents) was selected for the analysis and included in the study (Figure 1A). Written informed consent was obtained for all study participants after approval from the Unit of Audiology at the Hamad Medical Hospital, Doha, Qatar. The family is characterized by a recessive pattern of inheritance. Affected subjects showed bilateral, sensorineural, early onset, post-lingual, progressive hearing impairment. Pure tone audiometry and otoscopy were performed for all 8 individuals by standard procedures. Individuals II:3, II:4, II:5, II:6 have a moderate to severe form of hearing impairment (Figure 1B). The reported age at which hearing impairment was first noticed was 4 years old (y.o.) for individual II:3, 3 y.o. for individual II:4, 4 y.o. for individual II:5 and 2 y.o. for individual II:6. In particular, a mild hearing loss was initially present involving medium and high frequencies. The progression of the disease led to a gradual involvement of low frequencies and to a moderate-severe clinical phenotype mainly affecting medium and high frequencies. The audiograms in Figure 1B

Genotyping and Linkage Analysis
Genomic DNA was extracted from 300 µl of peripheral whole blood, collected from all family members.
Genotyping was performed using the HumanCytoSNP-12 BeadChip 300K SNP array. Linkage was calculated with the Merlin software; in particular, Mendelian errors and unlikely recombinations were checked using the Merlin error check, Pedcheck and Pedstats [8], [9], [10]. Any SNPs flagged by these error-check procedures were set to missing for all family members. A parametric linkage analysis under a recessive model (risk allele frequency 0.00001 and complete penetrance) was performed using Merlin version 1.1.2. The resulting linkage region was then used to filter exome sequence data.
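As an illustration of how a linkage region can be derived from per-marker results and later used as a filter, the following minimal Python sketch extracts contiguous marker runs whose LOD score exceeds a threshold. It is not the Merlin workflow used in this study: the function name `lod_intervals`, the input layout (a position-sorted list of `(position, LOD)` pairs for one chromosome) and the toy values are all assumptions made for the example.

```python
from typing import List, Tuple


def lod_intervals(
    markers: List[Tuple[int, float]], threshold: float = 3.0
) -> List[Tuple[int, int]]:
    """Return (start, end) positions of contiguous marker runs with LOD > threshold.

    `markers` is a hypothetical list of (position, LOD) pairs for a single
    chromosome, already sorted by position.
    """
    intervals: List[Tuple[int, int]] = []
    start = end = None
    for position, lod in markers:
        if lod > threshold:
            if start is None:
                start = position  # open a new interval
            end = position        # extend the current interval
        elif start is not None:
            intervals.append((start, end))  # close the interval
            start = end = None
    if start is not None:
        intervals.append((start, end))      # interval reaching the last marker
    return intervals


# Toy data (not from the study): two peaks separated by a low-LOD marker.
toy_markers = [
    (100, 0.5), (200, 3.8), (300, 3.8), (400, 1.0),
    (500, 3.8), (600, 3.8), (700, -2.0),
]
regions = lod_intervals(toy_markers)
print(regions)
```

With real data, the marker positions would be genomic coordinates and the resulting intervals would delimit the region to which exome variants are restricted.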

Exome Sequencing and Data Analysis
Starting from 3 µg of genomic DNA, the exome of five family members (II:3, II:4, II:5, II:6 and I:2) was enriched using SureSelectXT Human All Exon V5 (Agilent Technologies, Inc.) and a whole exome fragment library was then constructed following the manufacturer's protocols (SureSelect Target Enrichment System for the Applied Biosystems SOLiD System-Version 2.0.1). The library quality was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc.) and the whole exome library was then single-end sequenced (see the manufacturer's protocols, Applied Biosystems SOLiD™ 4 System Templated Bead Preparation Guide and Instrument Operation Guide) on a SOLiD4 platform (Life Technologies).
The exome data analysis was limited to the genomic region with a linkage LOD score greater than 3.0. Single nucleotide variants (SNVs) and small insertions and deletions (INDELs) were called by SAMtools v0.1.18 [12] and filtered according to a recessive model of inheritance. Low quality variants were filtered out in accordance with the following exclusion criteria: a) sequencing depth coverage less than 20X; or b) sequencing quality score less than 20 (Q20); or c) variation detected on a single DNA strand. All SNVs and INDELs were annotated with ANNOVAR [13] using the RefSeq gene model [14], dbSNP ver137 [15] and the 1000 Genomes Project dataset [16].
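The filtering logic described above (restriction to the linkage region, the three quality exclusion criteria, and the recessive model) can be sketched in Python as follows. This is a hedged illustration, not the pipeline actually run: the `Call` record layout, the function names and the example call values are hypothetical, and the recessive rule assumed here is that all sequenced affected siblings are homozygous for the alternate allele while the sequenced parent (I:2) is heterozygous. The region coordinates are those reported for the two linkage peaks in the Results.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Linkage peaks on chromosome 5 as reported in the Results (LOD > 3.0).
LINKAGE_REGIONS: List[Tuple[str, int, int]] = [
    ("chr5", 67_761_853, 72_408_339),
    ("chr5", 73_777_378, 107_296_793),
]
AFFECTED = ["II:3", "II:4", "II:5", "II:6"]  # sequenced affected siblings
CARRIER_PARENTS = ["I:2"]                    # sequenced parent (assumed carrier)


@dataclass
class Call:
    """One sample's call at a variant site (hypothetical record layout)."""
    depth: int      # read depth at the site
    qual: float     # variant quality score
    fwd_alt: int    # alternate-allele reads on the forward strand
    rev_alt: int    # alternate-allele reads on the reverse strand
    genotype: str   # "0/1" heterozygous, "1/1" homozygous alternate


def in_linkage_region(chrom: str, pos: int) -> bool:
    return any(c == chrom and start <= pos <= end
               for c, start, end in LINKAGE_REGIONS)


def passes_quality(call: Call) -> bool:
    # Exclude: depth < 20X, quality < 20, or variant seen on a single strand.
    return (call.depth >= 20 and call.qual >= 20
            and call.fwd_alt > 0 and call.rev_alt > 0)


def fits_recessive_model(calls: Dict[str, Call]) -> bool:
    return (all(calls[s].genotype == "1/1" for s in AFFECTED)
            and all(calls[s].genotype == "0/1" for s in CARRIER_PARENTS))


def keep_variant(chrom: str, pos: int, calls: Dict[str, Call]) -> bool:
    return (in_linkage_region(chrom, pos)
            and all(passes_quality(c) for c in calls.values())
            and fits_recessive_model(calls))


# Hypothetical example calls (values invented for illustration).
good = {s: Call(45, 60.0, 20, 22, "1/1") for s in AFFECTED}
good["I:2"] = Call(40, 55.0, 10, 9, "0/1")
low_depth = dict(good)
low_depth["II:4"] = Call(12, 60.0, 6, 5, "1/1")

print(keep_variant("chr5", 70_000_000, good))       # inside first peak
print(keep_variant("chr5", 70_000_000, low_depth))  # fails the depth criterion
print(keep_variant("chr5", 73_000_000, good))       # between the two peaks
```

Requiring every sequenced sample to pass the quality criteria is a conservative choice made for this sketch; a per-site rather than per-sample threshold would be an equally plausible reading of the criteria.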

Sanger Sequencing
Sanger sequencing was carried out according to standard protocols to confirm SNVs from exome sequencing analysis. Moreover, the whole coding region and 100 nucleotides upstream and downstream of each exon (intron-exon boundaries) of the MARVELD2 gene were sequenced.

Expression studies in the Mouse Cochlea
For immunohistochemistry, wild-type mice from the C57BL/6J strain carrying an albino mutation (C57BL/6Brd-Tyr c-Brd) [17] at postnatal day five (P5) were culled and dissected in PBS before fixation for two days in 10% formalin at 4 °C, washing, dehydrating and embedding in paraffin wax. Embedded samples were cut into 8 µm thick sections along the sagittal plane. Immunohistochemistry was then carried out on slides using the Ventana Discovery machine with the manufacturer's reagents CC1 (cat. Sections covering the entire inner ear for at least three different mouse samples at P5 were stained, and the observed expression patterns were considered reliable only if present in all three samples. Stained sections were examined and images obtained using an AxioCam HRc camera mounted on a Zeiss microscope. Images were then processed in Photoshop CS5 extended.

Results
Previous molecular studies showed that the family under investigation was negative for the presence of mutations in the most common worldwide NSHHL genes (GJB2, GJB6 and A1555G mitochondrial mutation) [7], suggesting that this family should harbor a mutation in another HHL gene or in a new gene involved in the HL phenotype. Taking into account the power of linkage studies coupled with massively parallel sequencing technologies, we decided to apply these approaches to identify the causative mutation underlying NSHHL in this Qatari family. Linkage analysis revealed a LOD score of 3.8 (corresponding to the estimated maximum achievable LOD score) in a single region spanning approximately 40 Mb on chromosome 5q13 (see Figures 1A and 2A) and containing 270 genes. The region consists of two peaks, the first one ranging from rs853803 to rs623297 (chr5: 67,761,853-72,408,339) and the second one from rs12153302 to rs17371964 (chr5: 73,777,378-107,296,793). All the remaining regions were completely negative in terms of LOD score apart from one with a non-significant LOD score of 1.2 (Table S1 and Figure S1). Within the linkage region, the only gene already known to be involved in NSHHL, MARVELD2, was then analysed. Sanger sequencing of its coding regions plus intron-exon boundaries (~100 bp) revealed no pathogenic variants but only a common polymorphism (rs1185246, MAF = 0.477), suggesting that MARVELD2 was unlikely to be involved in the deafness in this family. Thus, the targeted region identified by linkage analysis (40 Mb) was analyzed by exome sequencing and a total of 8 Gb of sequence data per sample were produced, ensuring an adequate coverage of exons and of approximately 20 nucleotides upstream/downstream of the intron-exon boundaries for each gene. On average 89% of the targeted region had at least 42-fold coverage. After the exclusion of both PCR duplicates and low-quality reads, an average of 204,913 reads were mapped to the targeted region of each sample.
Each individual carried on average 195 high-quality genetic variants in this region and, after filtering according to a recessive model of inheritance, 50 exonic SNVs were detected. 44 out of the 50 SNVs have a minor allele frequency (MAF) >2% as reported in dbSNP (NCBI build 137, http://www.ncbi.nlm.nih.gov/projects/SNP/) and therefore were excluded. The remaining 6 SNVs have been further investigated as reported in Table 1. Four of these were excluded because they were synonymous variants. The remaining two genetic variants, in the ZNF366 and BDP1 genes, were respectively a non-synonymous and a stop-loss nucleotide substitution. Only the nucleotide variation c.7873 T>G (p.*2625Gluext*11) in the BDP1 gene was confirmed by Sanger sequencing (Figure 2B), and this mutation segregated with the hearing loss in the whole family (both parents and six siblings, Figure 1A). BDP1 (B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB) is located at 5q13.2. The c.7873 T>G (p.*2625Gluext*11) mutation appears to be a deleterious mutation leading to the disruption of the stop codon and introducing the codons for 11 additional amino acids (Figure 2C). Both parents were heterozygous and all the affected offspring were homozygous for the mutation, while the healthy siblings were heterozygous (WT/MUT) and homozygous (WT/WT) respectively. This mutation (rs199721728) was present in the heterozygous state with a MAF = 0.0007 in the ESP6500 database.
To identify any relevant homology of the extended protein with known motifs/domains, we used the SMART protein database (http://smart.embl-heidelberg.de/). SMART analysis of the full-length protein predicted the presence of a SANT domain (from residue 299 to 347, E score 1.52e-04) and some putative coiled coil and low complexity regions, but nothing related to the eleven amino acid extension sequence.
The BLAST protein algorithm (http://blast.ncbi.nlm.nih.gov/Blast.cgi) was also used to search for homologies of the tail, but did not detect any significant hit in Eukaryotes. However, a subset of 5-6 amino acids showed some non-significant homology with proteins of simple organisms such as Polysphondylium pallidum. The C-terminal extension contains residues of mixed features: five are bulky hydrophobic residues, of which three are the aromatic phenylalanine. The other six comprise four hydrophilic and two acidic amino acids. The overall hydrophobic character of this short stretch containing three aromatic residues suggests that the extension could modify the properties (e.g. fold, aggregation tendency and/or interactions) of the rest of the protein.
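The stop-loss mechanism described above, in which a T>G substitution at the first base of the termination codon creates a glutamate codon and translation continues to the next in-frame stop, can be illustrated with a short Python sketch. The nucleotide sequence below is a toy example invented for illustration and is not the BDP1 transcript; only the standard genetic code is assumed, and the function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> str:
    """Translate from the first base until the first in-frame stop codon."""
    protein = []
    for i in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy transcript: ATG GCT TAA (stop) followed by a short 3' region.
wild_type = "ATGGCT" + "TAA" + "TTCGGATAG"
stop_index = 6  # position of the T in the TAA stop codon

# T>G at the first base of the stop codon: TAA becomes GAA (Glu).
mutant = wild_type[:stop_index] + "G" + wild_type[stop_index + 1:]

wild_type_protein = translate(wild_type)
mutant_protein = translate(mutant)
extension = mutant_protein[len(wild_type_protein):]

print(wild_type_protein)  # protein ending at the original stop
print(mutant_protein)     # read-through protein
print(extension)          # residues added by the stop-loss
```

In this toy case the extension begins with glutamate (E) and ends at the next in-frame stop codon, mirroring the logic, though not the actual sequence, of the p.*2625Gluext*11 change.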
To investigate a possible cochlear role of BDP1, an expression study in the mouse inner ear at postnatal day five was carried out, demonstrating a strong pattern of expression in specific cell types in the cochlea. In particular, clear expression is present in the endothelial cells of the stria vascularis capillaries, and in mesenchyme-derived cells and surrounding extracellular matrix around the cochlear duct including the spiral ligament and basilar membrane (Figure 3A). To better understand the localization of this protein in the stria vascularis, we used two different markers: Kcnj10 [18] and laminin [19] (Figure 3C and Figure 3D respectively). Kcnj10 is a membrane channel normally expressed in the strial intermediate cells while laminin marks the basal lamina around strial blood vessels. Comparison of these labelling patterns suggested that Bdp1 was expressed in the endothelial cells (arrows in Figure 3B-D).

Discussion
As previously described [7], there is no major gene associated with HHL in the Qatari population. In this report, using a combined approach of linkage studies with massively parallel sequencing, we analyzed a second degree consanguineous family not linked to any known HHL gene. Our results suggested a new gene involved in HHL, named BDP1, a member of the TFIIIB complex. Transcription factor TFIIIB plays key roles in transcription by RNA polymerase III. Its three components (TBP, BRF1 and BDP1) participate in crucial molecular events that include RNA polymerase recruitment, formation of the open initiation complex [20] and recycling of transcription. It has been reported that TFIIIB assembled with certain deletion mutants of its BRF1 and BDP1 subunits is competent in pol III recruitment [21] but the resulting preinitiation complex does not open the promoter. The elongation of the BDP1 protein that we detected could affect the formation of the complex or the opening of the promoter in a similar way, leading to altered transcriptional activity of the protein. BDP1 expression has been reported in many different tissues (see UniGene/ESTProfile database), but in this case the abnormality of the protein apparently affects only the auditory system. We found expression of Bdp1 in specific cell types in the cochlea including endothelial cells of the blood vessels of the stria vascularis, further supporting a role for this gene in auditory function.
Little is known about the molecular architecture of BDP1 except the conservation of its segment of amino acids 299-347 that contains the SANT domain. This domain seems to be the major interaction interface between the TFIIIB components BDP1 and BRF1 and is the most highly conserved sequence in the protein.
Apart from this domain, located in the middle of the protein, BDP1 does not show any other specific domains. The predicted mutant BDP1 protein is characterized by an elongation of eleven amino acids, not conserved across species, resulting from a loss of the termination codon. Although the molecular structure of the normal BDP1 protein has not yet been solved, we can hypothesize that this extension might influence the structure of the protein, possibly leading to a failure to fold correctly, as has been described for other abnormal proteins [22]. The BDP1 gene maps within the hearing loss locus DFNB49, 2 Mb away from MARVELD2, a gene already known to be involved in hearing loss [23] but containing no predicted pathogenic mutations in this family. All MARVELD2 mutations so far described are located within the coding region or splice sites, which were included in our sequence analysis. The exclusion of any pathogenic variation in these regions, together with the finding of a predicted stop-loss mutation in BDP1, supports the involvement of the BDP1 gene mutation in the hearing loss phenotype. Conclusive proof of the involvement of the BDP1 gene in hearing loss will require the finding of additional mutations in other families associated with deafness or evidence from animal models with Bdp1 mutations, but these are not yet available.
In conclusion, the identification by a combined approach of linkage analysis and exome sequencing of the p.*2625Gluext*11 mutation of BDP1, a gene in which very few truncating mutations have been described (http://evs.gs.washington.edu/EVS/), its segregation within the large family and the specific cochlear expression pattern of the gene indicate a role for this gene in causing HHL. Moreover, as inherited hearing loss is a highly heterogeneous trait and there are many genes involved that have not yet been identified, the BDP1 gene could be involved in HHL in other populations as well as in the Qatari population. This finding also illustrates the importance of massively parallel sequencing technologies for disease gene identification combined with immunohistochemistry to add tissue-specific expression data. Additional biochemical and functional studies as well as an appropriate mouse model are required to understand better the molecular mechanisms underlying deafness associated with BDP1 mutations.
Figure S1. Genome-wide linkage LOD score. The genome-wide LOD score is shown, with the only significant region located on chromosome 5 and containing the BDP1 gene. The x-axis reports the chromosomal position and the y-axis the LOD score. (TIFF)