Expression and Replication Studies to Identify New Candidate Genes Involved in Normal Hearing Function

Considerable progress has been made in identifying deafness genes, but still little is known about the genetic basis of normal variation in hearing function. We recently carried out a Genome Wide Association Study (GWAS) of quantitative hearing traits in southern European populations and found several SNPs with suggestive but none with significant association. In the current study, we followed up these SNPs to investigate which of them might show a genuine association with auditory function using alternative approaches. Firstly, we generated a shortlist of 19 genes from the published GWAS results. Secondly, we carried out immunocytochemistry to examine expression of these 19 genes in the mouse inner ear. Twelve of them showed distinctive cochlear expression patterns. Four showed expression restricted to sensory hair cells (Csmd1, Arsg, Slc16a6 and Gabrg3), one only in marginal cells of the stria vascularis (Dclk1) while the others (Ptprd, Grm8, GlyBP, Evi5, Rimbp2, Ank2, Cdh13) in multiple cochlear cell types. In the third step, we tested these 12 genes for replication of association in an independent set of samples from the Caucasus and Central Asia. Nine out of them showed nominally significant association (p<0.05). In particular, 4 were replicated at the same SNP and with the same effect direction while the remaining 5 showed a significant association in a gene-based test. Finally, to look for genotype-phenotype relationship, the audiometric profiles of the three genotypes of the most strongly associated gene variants were analyzed. Seven out of the 9 replicated genes (CDH13, GRM8, ANK2, SLC16A6, ARSG, RIMBP2 and DCLK1) showed an audiometric pattern with differences between different genotypes further supporting their role in hearing function. These data demonstrate the usefulness of this multistep approach in providing new insights into the molecular basis of hearing and may suggest new targets for treatment and prevention of hearing impairment.


Introduction
The auditory system is one of the most complex senses in humans. The mammalian inner ear is an intricate structure functionally organized into auditory and vestibular components designed to transform mechanical energy into electrical stimuli, which will eventually be translated in the brain [1]. The hearing system is characterized by three structures: a) the outer ear, b) the middle ear, and c) the cochlea of the inner ear, that all play a role in hearing function. The cochlea contains sensory hair cells (HC), mechanoreceptors that initiate action potentials in response to sound, as well as surrounding supporting cells. All of these highly differentiated cells perform specialized functions, requiring the expression of unique combinations of genes [2].
Given the complexity of the hearing mechanism, it should come as no surprise that many genes are involved in hearing. So far, more than 140 loci associated with non-syndromic hearing loss have been mapped, and approximately 65 genes identified in humans (http://hereditaryhearingloss.org/). In mouse models more than 230 genes have been so far described (http:// hearingimpairment.jax.org/index.html) to cause inner ear malformations or dysfunction. Despite the identification of these genes involved in deafness, the molecular basis of variation of normal hearing function is still largely unknown. We recently reported results from a genome-wide association study (GWAS) on normal hearing function in humans, giving an insight into some candidate genes that may be involved in the trait [3]. Several audiometric measurements were tested as quantitative traits and hundreds of SNPs had suggestive p-values (,10 25 or less). Here we report our follow-up investigations of the SNPs with suggestive p values found by the previous GWAS [3]. The approach is based on four sequential steps: a) selection of a short list of candidate genes from the previous study, b) evaluation of their expression patterns in the mouse inner ear, c) genetic association replication of those genes clearly expressed in the cochlea in a new independent cohort from the Silk Road, d) examination of genotype/phenotype relationships for each of the replicated genes or SNPs.

Ethics Statement
Human. Consent forms for clinical and genetic studies were signed by each participant and all research was conducted according to the ethical standards defined by the Helsinki declaration. The study was approved by the Institutional Review Board of IRCCS Burlo Garofolo PROT CE/v-78.
Mice. Mouse studies were carried out in accordance with UK Home Office regulations and the UK Animals (Scientific

Sample Collection, Preparation and Genotyping of the Replication Cohort
We collected samples from 493 people aged .18 years old from several rural communities located along the Silk Road during the Marco Polo Scientific Expedition [4] (see Figure S1). Saliva samples were collected using the Oragene kit (DNA Genotek Inc.) and DNA extracted according to supplier's protocols. DNA samples were genotyped with the Illumina 700K platform. In order to perform a consistent replication study, the collection of genotype and phenotype data followed the same protocols as reported for the previously published work [3]. After standard quality control, all genotypes were checked to ensure that they were reported with the coordinates of the 1000 Genome Project (releases build 37) reference data and on the forward strand of the reference genome [5].

Imputation
Genotype imputation was conducted using SHAPEIT2 [6] for the phasing step and IMPUTE2 [7] for the imputation to the 1000 Genomes phase I v3 reference set [5]. After imputation SNPs with minor allele frequency (MAF) ,0.01 or imputation quality score (Info) ,0.4 were excluded from the statistical analyses.

Audiometric Evaluation
Audiometric tests and a careful clinical examination (i.e. psychological, neurological, cardiological, etc.) were carried out for each individual of the Silk Road cohort. Thresholds for different frequencies (0.25, 0.5, 1, 2, 4, 6, 8 kHz) were measured and the pure-tone averages of air-conduction thresholds (PTA at the lower 0.25, 0.5 and 1 kHz, middle 0.5, 1 and 2 kHz, and high frequencies 4, 8 kHz) were calculated. Furthermore the first three Principal Components (PC1, PC2 and PC3) for all the frequencies combined were computed in order to summarize the audiometric data. Single frequencies, PTAs and PCs were used to run the association studies. Although many different measurements were collected and analyzed, they were not independant; in fact PC1 alone accounted for approximately 80% of the total variance so we are not analyzing multiple phenotypes. Familial forms of inherited hearing loss as well as subjects affected by diabetes or other systemic disorders facilitating the development of hearing loss were excluded from the study.

Selection of Candidate Genes
From our earlier GWAS analysis, we started with a list of 683 SNPs that were associated with the hearing trait with a p value of 10 25 or lower. This number of SNPs was reduced to 84 by a) examining a region 250 kb upstream and downstream of each SNP and selecting the most significantly-associated SNP, removing the less significant SNPs in the region; b) removing SNPs that marked gene deserts, where there was no annotated gene in the region +/2250 kb; c) removing SNPs that marked only genes with unknown functions such as LOC or FAM designations; and d) removing SNPs that marked genes already known to be involved in diseases unrelated to hearing function. The 84 SNPs were then reduced to a total of 19 genes prioritized by the availability of antibodies for expression studies plus at least one of the following criteria a) the lowest p values of 10 27 ; b) genes belonging to gene families previously associated with hereditary hearing loss; c) expression in the inner ear from genome-wide approaches reported in databases including Eurexpress, MGI and the NCBI EST list; or d) genes with potential functional links to the genes selected by the above criteria occurring in the same in silico pathways. For each SNP the closest gene was selected, but in some cases, more than one gene was included based on linkage to a single SNP (see Table 1).

Expression Studies
Wild-type mice at postnatal day 4 or 5 (P4 and P5) from the albino C57BL/6J-Tyr c-Brd or pigmented C3HeB/FeJ inbred strains were used for all experiments. This age was selected because most genes known to be required for normal hearing are expressed by this stage but the bone is soft enough to avoid decalcification -a prolonged step that can affect antigen preservation. The heads of all samples were dissected in PBS before fixation for two days in Table 3. List of genes with specific labeling patterns in the cochlea with a summary of expression results. For immunofluorescence analysis of whole mounts, cochleae were dissected at P4-P5 and fixed in 4% paraformaldehyde for 2 hours. They were then washed in PBS, permeabilized in 1% PBS/ Triton-X-100 (PBT) and blocked with 10% sheep serum. Then, they were incubated with the goat polyclonal primary antibody anti-CSMD1 (Santa Cruz, sc-68280, 1:200) overnight at 4uC. After washes with PBT, samples were incubated with anti-goat Alexa Fluor 488 secondary antibody (Invitrogen, anti-goat, diluted 1:300) and Rhodamine/Phalloidin (Invitrogen, diluted 1:100). Samples were mounted in Prolong Gold Antifading reagent (Invitrogen). Images were acquired on an LSM 510 Meta confocal

Statistical Analysis
Genes showing robust expression patterns in the cochlea were further analyzed for replication. Association analysis of GWAStype was carried out in the Silk Road cohort using the GRAMMAR-Gamma method as implemented in the GenABEL suit [8] for genotyped SNPs and MixABEL [9] for imputed data. Data were adjusted to account for sex, age and genomic kinship as previously described [3] and different genetic models were tested (additive, recessive, dominant and overdominant). Moreover, in order to account for different population-specific effects, that are difficult to replicate at the SNP level, a gene-based test was performed as follows: 1) all the hearing traits were adjusted for covariates and genomic relatedness, using GRAMMAR-Gamma; 2) for each gene only intragenic SNPs were selected; 3) principal components (PC) of the genotypes coded as numerical values (0,1,2) were computed and only those explaining the largest amount of the genetic variance were taken into account (i.e. if N is the total number of intragenic SNPs then PCs explaining more than 1/N variance were chosen); 4) the selected PCs were used as multiple regressors for the corrected traits. This methodology was described and tested by Wang and Abbott [10]. Significance level considered was nominal significance (5%), which is usually accepted for replication studies, especially when generalizing findings to non-European populations as in our case [11]. Furthermore we did not correct for the different genetic models tested because these are not four independent tests [12].

Genotype-phenotype Correlations
The audiometric phenotype was compared in a subset of 2904 subjects from Italy, Caucasus and Central Asia (Silk Road cohort) by focusing on the three genotypes (AA, AB and BB) of each replicated top SNP to study any differences among them. Among these SNPs, the homozygotes for the minor allele group ranged from 34 to 642 people. After computing mean values and standard errors for each frequency (0.25, 0.5, 1, 2, 4, 8 kHz) and for each genotype, results were plotted to form three different audiometric curves. For each plot, the Y-axis represents age-adjusted threshold values measured in decibels sensation level (from 50 dB SL to 210 dB SL) and the X-axis represents frequency measured in kHz. The points represent the mean values for each genotype at each frequency and standard errors of the traits are shown. The profiles were then visually inspected.

Results
In this study, a list of 19 genes were selected from a list with suggestive significance recently reported in a GWAS meta-analysis of different quantitative hearing traits [3], for further investigation of their possible role in hearing function. Those with evident staining within the cochlea (expression step) underwent a replication study in an independent cohort from the Silk Road and, for the replicated ones, genotype-phenotype correlations were carried out (validation step) (Fig. 1). The list of 19 genes was selected following the criteria described in the methods. Details of each gene are given in Tables 1 and 2, including information about expression reported in several databases and descriptions of any mouse mutant phenotypes available. In order to define their expression pattern in the mouse cochlea, which has not been reported before, immunohistochemistry studies in mice at P4-P5 were performed. Results from these studies demonstrated specific cochlear expression patterns for 12 of them (63%) at this age. All 12 genes gave the same levels of staining from the apex to the base of the cochlea. Furthermore, in 9 cases clear labelling was also evident in the vestibular system.
A summary of the distribution of labeling is given in Table 3. According to the inner ear distribution we could divide the pattern of expression into three different categories.

1) Expression in Marginal Cells of the Stria Vascularis
Doublecortin-like kinase 1 (Dclk1) displays a strong pattern of expression in the marginal cells including projections towards the basal cells in the stria vascularis (Fig. 2 A,B). In the vestibular system, staining of Dclk1 could be seen in the dark cells adjacent to the cristae (data not shown).

2) Expression in the Hair Cells of the Organ of Corti
Arylsulfatase G (Arsg) shows striking specific expression at the top of sensory hair cells in the organ of Corti (Fig. 2 C, D). No staining in the vestibular system has been detected.
Solute carrier family 16, member 6 (Slc16a6) shows expression at the top of the outer hair cells in the organ of Corti (Fig. 2 E, F), but weak staining is also present in the hair cells and supporting cells of the maculae and cristae of the vestibular system (data not shown). The staining of this gene is variable between mice of different genetic backgrounds: it is very clear in outer hair cells of C3HeB/ FeJ mice (n = 4), while a more faint and less specific pattern of expression was detected in C57BL/6J-Tyr c-Brd mice (n = 4).  Gabrg3 is a member of the GABA receptor gene family, a group of proteins involved in the GABAergic neurotransmission of the mammalian central nervous system. It shows a striking specific expression in the outer and inner hair cells. In particular, the inner hair cells have the strongest staining (Fig. 2 G). Heavy expression was also noted in the hair cells in the cristae of the vestibular system (data not shown).
Csmd1, CUB and Sushi multiple domains 1, expression was noted at the top of outer and inner hair cells in the organ of Corti (Fig. 2 H). To better understand the precise localization of this expression, a confocal microscopy analysis of whole mount preparations was also performed. The expression is strongly localized in the stereocilia of the inner hair cells but faint staining is also present in the stereocilia of the outer hair cells (Fig. 2 I). Strong expression was also noted in the hair cells in the maculae and cristae of the vestibular system (data not shown).

3) Expression in Multiple Cell Types in the Cochlea
Protein tyrosine phosphatase, receptor type, D (Ptprd) is expressed in outer and inner hair cells, in the stria vascularis (more strongly in the marginal cells), in Kölliker's organ and the spiral ganglion (Fig. 3 A, B, C), and in vestibular hair cells and supporting cells (data not shown).
Cdh13, a member of the cadherin superfamily, is mainly expressed in Claudius cells, outer and inner hair cells, Deiters' cells and pillar cells, Kölliker's organ and interdental cells; staining was also noted in the spiral prominence and external sulcus cells (Fig. 3 D, E).
Ankyrin 2 is a member of the ankyrin family, a group of molecules that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton; Ank2 expression was mainly noted in Hensen's cells, Deiters' cells, pillar cells and in Reissner's membrane (Fig. 3 F, G).
Glutamate receptor, metabotropic 8 (Grm8) gave positive signals in the outer and inner hair cells, Claudius and Hensen's cells, Kölliker's organ and spiral ganglion neurons (Fig. 4 A, B, C, D). Staining of Grm8 was also noted in the root cells, spiral prominence and stria vascularis, and there was strong expression in vestibular hair cells, supporting cells and neurons (data not shown).
RIMS binding protein 2 (Rimbp2) expression was noted in outer and inner hair cells, the lateral edge of the tectorial membrane, root cells, spiral prominence, spiral ganglion and in the marginal and intermediate cells of the stria vascularis (Fig. 4 E, F, G, H). Heavy expression was also noted in the vestibular hair cells, supporting cells and neurons (data not shown).
Centrosomal protein 104 kDa (GlyBP/KIAA0562) expression could be seen in the outer and inner hair cells and in Hensen's cells; strong expression could also be seen in the spiral prominence and in the tectorial and basilar membranes (Fig. 5 A, B, C, D).
The expression of Evi5 was noted in outer and inner hair cells, Deiters' and pillar cells, cells of Kölliker's organ, spiral ganglion, root cell processes, in the spiral prominence and in the stria vascularis, and weak staining could also be seen in Claudius and Hensen's cells (Fig. 5 E, F, G). Strong staining was noted in hair cells, supporting cells and neural dendrites of the vestibular system (data not shown).
As regards the remaining genes, results were not conclusive, showing either no detectable labeling (Amz2, Rpl5 and Cmip), or widespread labeling that appears non-specific (Dffb, Fos). Two different antibodies were tested for Dffb, Amz2 and Cmip, but no specific pattern of labeling was seen with either. In addition, the RNA genes Snora66 and Snord21 were tested by in situ hybridisation using custom-designed Locked Nucleic Acid probes, but no labeling was seen in the ear.
These 12 genes expressed in specific patterns in the mouse cochlea were then included in a replication step by analyzing their genetic association using an independent cohort from the Silk Road, genetically and geographically very distant from the European population. A total of 9 genes (75%) were replicated; 4 genes were replicated at the SNP level (CSMD1, ANK2, CDH13, DCLK1) while an additional five (ARSG, EVI5, GRM8, RIMBP2, SLC16A6) were replicated at the gene level (see Table 4 and 5). In particular, the following SNPs were replicated: rs10091102-C (CSMD1, lowest p-value at PC1 = 0.022) and rs17045059-G (ANK2, lowest p-value at 4 kHz = 0.010) under an additive model; rs17195859-A (CDH13, lowest p-value at 6 kHz = 0.024) and rs9574464-G (DCLK1, lowest p-value = 0.008 at 0.25 kHz) under a dominant model. The strongest associations from the gene-based test were obtained for GRM8 at 4 kHz (p-value = 0.006), and EVI5 (p-value = 0.002) at 6 kHz.
Finally the 9 genes with positive replication data were investigated to check for the possible presence of genotype/ phenotype relationships by comparing the average audiometric curves. As shown in Fig. 6, seven genes (77%) (GRM8, ANK2, CDH13, SLC16A6, ARSG, RIMBP2 and DCLK1) showed a distinct audiometric pattern for each genotype, with differences at some specific frequencies and intensities. GRM8, ANK2 and CDH13, expressed in multiple cells type in the cochlea, showed a downward slope towards high frequencies for the homozygous BB individuals with a deterioration in all the thresholds compared with AA and AB genotypes (see Fig. 6A). Three other genes SLC16A6, ARSG and DCLK1 with a specific expression pattern (in the hair cells in the organ of Corti or marginal cells of the stria vascularis) displayed a slope in which the homozygous BB subjects present an improvement in thresholds (ranging from 0.5 to 8 dB) at all frequencies, in particular for SLC16A6 and ARSG genotypes. A slight improvement at all frequencies for the homozygous genotype AA was present for the RIMBP2 gene (see Fig. 6B). As shown in Fig. 6C, CSMD1 and EVI5 did not show any specific pattern of genotype/phenotype relationship.

Discussion
Despite recent progress, almost nothing is known about the molecular bases of variation of normal hearing, apart from genes identified as being directly involved in hereditary hearing loss and one gene (GRM7) recently described associated with age-related hearing loss (ARHL) in humans [13,14]. Here, we report a strategy based on a) selection of genes from a previous GWAS [3], b) evaluation of their expression in the mouse, c) genetic replication of those showing a specific inner ear expression pattern in different and distant populations and, d) genotype-phenotype relationship of the replicated genes. This strategy proved to be an effective filtering and discovery approach, allowing us to uncover Figure 6. Genotype-phenotype relationship. A, B, C) The figure displays genotype-phenotype relationship for genes with evident differences among the three profiles. The x-axis represents frequencies (kHz), while the y-axis represents sex and age adjusted hearing thresholds (dB SL) with standard deviations, three different colors represent the corresponding genotypes (AA = black, AB = red, BB = green) and the number of subjects for each genotype is reported in brackets. A) GRM8, ANK2 and CDH13 show a deterioration of all the threshold levels for the homozygous BB. B) SLC16A6, ARSG and DCLK1 display an improvement (ranging from 0.5 to 8 dB) at all frequencies for the homozygous BB. Similarly, RIMBP2 presents an improvement for the homozygous genotype AA. C) CSMD1 and EVI5 do not display any particular pattern. doi:10.1371/journal.pone.0085352.g006 novel weakly-associated genetic variants. However, we cannot discard the remaining genes which could also be interesting candidates.
After the selection of 19 genes (see Materials and Methods) with suggestive GWAS p-values plus other criteria described above, expression studies demonstrated that 12 of them showed a clear expression pattern in the mouse cochlea. Nine of these were successfully replicated in an independent cohort of people geographically and genetically distant from those used to perform the first GWAS analysis [3]. This finding is extremely relevant taking into consideration that in four cases the replication was obtained using the same SNPs thus strongly suggesting that the genetic effect observed is of general relevance to multiple human ethnic groups [15]. In line with the criteria used by other large consortia [16,17], we used a nominal significance of p,0.05 for replication using non-European populations.
Finally, the clinical step looking at genotype-phenotype relationships highlighted, in 7 cases, the presence of different audiometric profiles of the three genotypes. Two of them (ARSG, SLC16A6) showed striking expression at the top of sensory hair cells in the cochlea (SLC16A6 only at the top of outer hair cells), DCLK1 was expressed only in marginal cells of the stria vascularis, while the others (CDH13, GRM8, ANK2 and RIMBP2) were expressed more widely in multiple cell types in the cochlea. DCLK1 is involved in several different cellular processes, including neuronal migration in the developing brain and in maturation of the nervous system [18]. Dclk1 expression within the cochlea is localized to the marginal cells of the stria vascularis, a structure that is essential for the secretion of K + into the endolymph and for maintaining its associated endocochlear potential, a feature that enhances the electrochemical gradient across the top of hair cells, making them more sensitive. Dclk1 could be involved in the development of the extensive baso-lateral processes extended by the marginal cells. In this light, the hearing improvement present in homozygous BB subjects might be due to a variation in endolymph homeostasis.
SLC16A6 and ARSG, which are located in the same locus overlapping each other and were both replicated under the additive model, display a downward slope for the homozygous BB genotype very similar to that shown by DCLK1. SLC16A6 belongs to the solute carrier (Slc) family, whose members have already been associated with different forms of deafness [19], and ARSG is involved in hormone biosynthesis, modulation of cell signaling, and degradation of macromolecules. Both are expressed at the top of the hair cells, where the mechanical forces evoked by sound are translated into an electrical signal, so any variants in these genes could alter the sensitivity of hearing.
A genotype-phenotype difference, with a slight improvement of threshold in the homozygous AA subjects, is evident for RIMBP2; interestingly, it maps within the DFNA41 locus associated with dominantly-inherited progressive hearing loss [20]. RIMBP2 is a member of a family of proteins that act as binding partners of the presynaptic active zone proteins RIMs [21] as well as for voltagegated Ca 2+ -channels, such as CACNA1B and CACNA1D, the latter already known to be involved in deafness [22]. In this light, the strong staining of this protein in different cell types in the cochlea and the possible interaction with CACNA1D could indicate an important role in hearing function.
A similar audiometric slope for the three genotypes and in particular for the homozygous BB subjects is also evident for three genes expressed in multiple cell types in the cochlea: GRM8, ANK2 and CDH13 (this last one also replicated under a dominant model, while the first two were replicated under a standard additive model). These three genes belong to families whose members have been already reported as expressed in the hearing system. CDH13 belongs to the cadherin superfamily genes [23]. ANK2 is a member of the Ankyrin family of proteins with protein domains found in TRP channels which may be important for the mechanosensitive channel responsible for hearing [24]. GRM8 is a member, together with GRM7, of the group III metabotropic glutamate receptors (GRMs) which are neurotransmitter receptors [25]. Furthermore, GRM8 and GRM7 proteins show 87% homology and 76% identity to each other [26] and GRM7 has been recently reported to be involved in ARHL [13,14]. Consequently the strong expression of Grm8 in the mouse cochlea supports a role for this gene.
As regards CSMD1 and EVI5, despite their strong expression in the inner ear and their replication in the Silk Road cohort (under the additive model), they didn't show any specific genotypephenotype audiometric pattern. Previous analysis of Csmd1 mRNA expression by in situ hybridization and immunolabelling of neurons indicated that the primary sites of synthesis are the developing CNS and epithelial tissues [27], while we have shown that the staining is localized at the top of sensory hair cells, including in stereocilia. EVI5 encodes a protein which binds to alpha and gamma tubulin, essential components of microtubules of the inner hair cells [28,29].
As regards genes that were not replicated (GABRG3, PTPRD and GlyBP), their expression patterns suggest they are worthy of further investigation, and GABRG3 and PTPRD both belong to families previously implicated in auditory function [30][31][32].
In conclusion, our approach confirms the usefulness of a multistep method, combining several known techniques in order to further investigate the role of genes identified after a GWAS for hearing function, and increases our knowledge of the genes involved in normal hearing function that might also play a role in hearing loss including presbycusis.