Profiling of Olfactory Receptor Gene Expression in Whole Human Olfactory Mucosa

Olfactory perception is mediated by a large array of olfactory receptor genes. The human genome contains 851 olfactory receptor gene loci. More than 50% of the loci are annotated as nonfunctional due to frame-disrupting mutations. Furthermore haplotypic missense alleles can be nonfunctional resulting from substitution of key amino acids governing protein folding or interactions with signal transduction components. Beyond their role in odor recognition, functional olfactory receptors are also required for a proper targeting of olfactory neuron axons to their corresponding glomeruli in the olfactory bulb. Therefore, we anticipate that profiling of olfactory receptor gene expression in whole human olfactory mucosa and analysis in the human population of their expression should provide an opportunity to select the frequently expressed and potentially functional olfactory receptors in view of a systematic deorphanization. To address this issue, we designed a TaqMan Low Density Array (Applied Biosystems), containing probes for 356 predicted human olfactory receptor loci to investigate their expression in whole human olfactory mucosa tissues from 26 individuals (13 women, 13 men; aged from 39 to 81 years, with an average of 67±11 years for women and 63±12 years for men). Total RNA isolation, DNase treatment, RNA integrity evaluation and reverse transcription were performed for these 26 samples. Then 384 targeted genes (including endogenous control genes and reference genes specifically expressed in olfactory epithelium for normalization purpose) were analyzed using the same real-time reverse transcription PCR platform. On average, the expression of 273 human olfactory receptor genes was observed in the 26 selected whole human olfactory mucosa analyzed, of which 90 were expressed in all 26 individuals. Most of the olfactory receptors deorphanized to date on the basis of sensitivity to known odorant molecules, which are described in the literature, were found in the expressed olfactory receptors gene set.


Introduction
Analysis of published mammalian genomes indicates that olfactory receptor (OR) genes constitute by far the largest gene family. Initially, Buck and Axel identified this extremely large multigene family based on the observation that OR genes were expressed in olfactory epithelium of rat [1]. Later, other members of this family were identified by sequence homology with the first set of OR genes [2][3][4]. Currently it is accepted that the human genome contains 851 OR loci. More than 50% of the loci are annotated as nonfunctional due to frame-disrupting mutations, leaving approximately 400 potentially functional OR genes.
In the quest to develop industrial applications based on the use of human odorant receptors, ChemCom is committed to the systematic identification of ligands for these chemoreceptors. To fulfill this ambitious deorphanization project, considering the huge number of anticipated functional OR genes, it is mandatory to obtain clues about the involvement of the targeted ORs in the olfactory perception. Moreover, the expression of several predicted OR genes has been detected in non-olfactory tissues, suggesting that a subset of predicted OR genes could have functions unrelated to olfaction. Indeed, expression of OR transcripts has been described in various tissues, including testis and spermatozoa [19,[21][22][23][24][25][26], prostate [27][28][29][30], enterochromaffin cells [6], pulmonary neuroendocrine cells [31], brain [32][33][34][35], tongue [36][37][38], Table 1. Patient's data. erythroid cells [39], placenta [40], breast [41] and kidney [42]. In addition, systematic expression profiling of ORs in non-olfactory tissues using EST data, microarray or deep sequencing analysis [43][44][45] have shown that a large number of putative human OR genes are expressed in these tissues. The analysis of the entire olfactory subtranscriptome in a variety of different human tissues provides a list of several OR genes that are highly expressed in non-olfactory tissues [44]. At least some of these ORs could play a role in spermatozoa chemotactism [19], in muscle regeneration [46] or in blood pressure regulation [47]. Although, it cannot be excluded that OR may present double olfactory and non-olfactory functions; it remains possible that some members of the reported odorant receptors family could be solely non-olfactory G proteincoupled receptors.
Another issue pertaining to ORs deorphanization results from the significant allelic variation observed for human ORs. A recent data mining of the sequence repository of the 1000 Genomes Project, has estimated that the number of variants per OR locus is on average about ten. However, some variants may be nonfunctional missense haplotypic alleles [48]. Furthermore, as it has been demonstrated in mice that functional olfactory receptors are required for proper targeting of olfactory neuron axons to their corresponding glomeruli in the olfactory bulb [49], one may suppose that alleles of OR genes predominantly expressed in the olfactory epithelium correspond to functional haplotypes.
Taken together, a study of OR gene expression in the whole human olfactory mucosa (WHOM) provides an opportunity to define ORs specifically involved in olfaction, allowing choosing frequently expressed and potentially functional ORs for deorphanization campaigns. OR gene expression in WHOM has been seldomly studied, probably due to the difficult access to human material. Two publications have reported the characterization of the expression of the human OR gene family in 3 individuals only using DNA microarray and only in one individual using deep sequencing [45,50]. Therefore, we designed an innovative approach based on a TaqMan Low Density Array (TLDA) containing probes for 356 predicted OR loci to investigate more thoroughly the OR gene expression profile in human olfactory mucosa. Real-time reverse transcription PCR (qRT-PCR) is frequently used for gene expression quantification, at the transcriptional level, due to its reproducibility and sensitivity. The method has also become the preferred method for validating results obtained by other techniques, such as microarrays or deep sequencing.
Herein we present our data obtained using an innovative high throughput transcriptome profiling approach of human OR genes, in WHOM of much larger set of 26 individuals.

Patients and tissues specimens
WHOM were collected 27612 hours post-mortem from 26 individuals (13 women and 13 men; aged from 39 to 81 years, with an average of 67611 years for women and 63612 years for men). Most individuals were of European origin. For each patient, the clinical information is summarized in Table 1. Patients with a history of dysosmia or rhinologic diseases, including allergic rhinitis and chronic sinusitis, were excluded. We also investigated the history of smoking, associated with smell's disorder probably related to alterations of the olfactory mucosa [51,52]. Amongst the 26 subjects, 8 were smokers and the smoking status was unknown for 4 of them.
The WHOM was accurately dissected from the septum, the cribriform plate, the middle and the superior turbinates. As the boundaries between the olfactory and the respiratory epithelium are not clearly defined in humans [53], the septal mucosa was dissected up to the lower limit of the middle turbinate. A control tissue was taken from the mucosa of the inferior turbinate. The dissected tissue samples (about 3.565 cm from each side of the olfactory cleft mucosa) were collected, frozen in liquid nitrogen and stored immediately at 280uC.

Total RNA isolation
Frozen WHOM was crushed in liquid nitrogen. Total RNA was purified and treated with DNase using the RNeasy kit (Qiagen) according to manufacturer instructions. DNase treatment is mandatory as intron spanning primers is not possible due to lack of introns in the OR genes. Quantitative and qualitative assessment of RNA samples (pooled from each side) was performed by NanoDrop spectophotometry (Thermo Scientific) and by microfluidic analysis using a 2100 Bioanalyser (Agilent Technologies). This latter technique produces an electropherogram allowing the evaluation of the integrity of the 18S and 28S ribosomal RNAs ( Figure S1A and S1B) and the algorithm assigns a RNA integrity number (RIN) ranging from 1 to 10, where 10 corresponds to ideally intact RNA and 1 to highly degraded RNA (Table 1).

cDNA synthesis
Total RNA (1 mg per sample-loading port of the 48 PCR reaction channels) was used in the reverse transcription (RT) Quantiscript reaction (Qiagen), performed with a combination of oligo-dT primer and random hexamers following the manufacturer's protocol. Each RNA sample was additionally run on one port (feeding 48 PCR assays) of the TaqMan Low Density Array (TLDA) in the absence of reverse transcriptase (RT-) to assess its potential contamination by genomic DNA. The latter, resulted for all samples, in a borderline amplification for a small subset of the large panel of intronless genes tested. On average, 90% of the PCR yielded a quantification cycle (C q ) value labeled as undetermined or above 35 cycles. The remaining 10% gave an average C q of 34.161.1 indicating a potential low residual genomic DNA contamination.

TLDA design and preparation
A customized TLDA was designed in collaboration with Applied Biosystems. The design process for the assays is described in the White Paper TaqMan Gene Expression Assays from Applied Biosystems. The software TaqExpress was used for the design. Whenever possible, the assays were designed to amplify part of the gene coding sequence. The context sequence determines approximate assay position and the assay IDs allows retrieving the details from Applied Biosystems website (Table S1).

Real-time PCR with genomic DNA
One TLDA card was run with 150 ng (per port) of a pooled human genomic DNA from Clontech to evaluate the efficacy of the assays.
Real-time PCR with plasmid DNA One TLDA card was run with 30 pg (per port) of a pool of 30 OR coding plasmids cloned by ChemCom to evaluate the specificity of the assays. The receptors chosen to perform this experiment were spread throughout the different families of OR genes represented by an unrooted tree based on similarity of amino acid properties. One pg of each plasmid represents about 3000 molecules of specific plasmid per PCR.

TLDA analysis and Statistical analysis
The real-time PCR focuses on the exponential phase, where amplification doubles target templates, following the exponential amplification (2 n where n is the number of cycles). The real-time PCR instrument calculates a C q value representing the PCR cycle at which the reaction reaches a fluorescent intensity threshold above background. For C q calculation, the threshold was manually set at DRn = 0.1 for all samples and all targets (threshold set within the 2 n exponential amplification phase). The results were analyzed using the Sequence Detection Systems (SDS) version 2.4 and Qbase + software packages [55]. Determination of the optimal number of reference genes for the normalization of qPCR data was performed using the geNorm algorithm [56].
Association analyses were performed with R 2.14.1 [R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.Rproject.org.]. Significance Analysis of Microarray (SAM) [http:// www.pnas.org.gate1.inist.fr/content/98/9/5116.full] was performed using the samr package v2.0 Genes with q-value below 0.05 were considered significant. For each variable (i.e. age, sex and smoking status) SAM was performed to find the receptors presenting expressions individually associated with each variable. To assess whether the expressions of all receptors were globally associated with one of these variables, the sum of the square of scores of association (Pearson correlation coefficients for age, tscores for the two other variables) of all the receptors expressions was compared to a null-distribution of the sum of the scores of associations obtained after 10.000 permutations of the patient labels. Heatmap visualization was obtained with the heatmap.2 function within the gplots v2.10.1 package [gplots: Various R programming tools for plotting data (2011), Gregory R. Warnes, URL http://CRAN.R-project.org/package = gplots].

Results
A 384-customized TLDA was designed to investigate the gene expression of a large array of OR genes from 26 WHOM samples. All experiments were performed according to the MIQE (minimum information for publication of quantitative real-time PCR experiments) guidelines [57].

Analysis of RNA purity and integrity
The 260/280 and 260/230 OD ratios were measured for all RNA samples to assess respectively the purity of RNA with respect to protein contamination and residual organic solvent. All samples used showed a 260/280 and 260/230 OD ratios between 1.8 and 2.0, indicative of good quality RNA with minimal contaminations. RNA integrity was also assessed, and samples characterized by RIN (RNA integrity number) ranging from 5.9 to 9.0 were used, the average 6SD being 7.360.8 ( Figure S1A, S1B, Table 1). These RIN values are usually considered acceptable for qRT-PCR experiments [58]. No correlation between the RIN and the delay between death and sample collection was observed (Statistical analysis reveal a p value = 0.36 for a Pearson's correlation with a R 2 = 0.035).

Selection of candidate reference genes
Classical endogenous control genes, exhibiting minimal differential expression across different tissues (MRPL19, CASC3, POLR2A, CDKN1B, TBP, RPL30, PSMC4, YWHAZ, UBC, PPIA) were added to the TLDA, in order to perform a first technical normalization and to compare expression of genes from different tissues as for example WHOM and inferior turbinate. The structure of WHOM is often patchy and contains a significant proportion of respiratory epithelium [54,59]. Therefore, to compare expression of genes from different WHOM, assays for tissue specific olfactory epithelium reference genes were also added to the TLDA for a second biological normalization purpose. The six selected genes were CNGA2, GNAL, ADCY3, RIC8B, RTP1 and OBP2A&2B.
Expression profiling and stability analysis of candidate reference genes  (Table S1). Their stabilities were evaluated by the geNorm algorithm [56] and the geometric mean of 3 stably expressed classical endogenous reference genes (CASC3, PSMC4, CDKN1B) were selected to technically normalize the results. C q of the olfactory epithelium-specific reference genes (Ric8B, GNAL, RTP1, CNGA2, ADCY3, OBP) are shown in the box plots of Figure 1. The distribution of the olfactory epithelium-specific reference genes C q provides a global representation of the variation of reference gene expression as well as information on their relative abundance. More highly expressed genes are associated with lower C q . Average C q values ranged from 21.360.8 (ADCY3) to 30.661.1 (OBP, means 6SD, n = 26) ( Table S1). As suggested in Khan et al. [54], in view of obtaining a biologically relevant normalization, the geometric mean of the six specific reference genes provided a normalizing factor, rather than a factor from a single reference gene.

Real-time PCR with genomic DNA
One TLDA card was run with a pool of human genomic DNA to evaluate the assays (individual gene efficiency amplification). The C q average value of 351 detected OR genes is 24.560.8 (Table S2) suggesting similar amplification rates for all the genes, as expected as the number of targets is identical for all genes in a human genome. Furthermore 21 assays gave an expected undetermined C q values because they correspond either to reference genes or to 4 OR genes for which the assays are designed with intron spanning primers. These assays are identified by a suffix '_m' in the assayID. One out of 4 GAPDH assays gave a non-expected value of 36 and PPIA gave a non-expected value of 27 whereas these assays are designed with intron spanning primers. This reflects a slight non-significant amplification compared with results obtained on RNA (delta-C q of 18 for GAPDH and 9 for PPIA). Finally, one target (OR2A14) showed an abnormal C q value of 12.4, therefore this assays has not been taken into account for the qRT-PCR analysis.

Real-time PCR with plasmid DNA
One TLDA card was run with a pool of 30 OR coding plasmids to evaluate the specificity of the assays. The expected specific PCR amplification of the 30 targets gives an average C q value of 25.461.1, (mean 6SD, n = 30). However, we observe a nonspecific amplification for 15 additional receptors. For 11 of them, the average C q value is 32.561.8, (mean 6SD, n = 11) which reflects a delta-C q value of 7 as compared to specific amplification. In this case, the non-specific amplification is then negligible. For 4 of them, the average C q value is 25.060.8, (mean 6SD, n = 4) which reflects no difference as compared to specific amplification. These 4 couples of genes OR2L3 and OR2L8 (97% identity on the entire DNA sequence), OR52E6 and OR52E8 (90% identity), OR52I1 and OR52I2 (97% identity) and OR5D16 and OR5D18 (83% identity) cannot be discriminated by this TLDA card. Then for the 30 OR coding plasmids, 88% of the PCR amplifications are specific for the target.

Inter-run calibration
Three different experiments were conducted to run the 26 samples, to correct for possible run-to-run variation whenever all samples are not analyzed in the same run, identical sample have been tested in all runs. Figure 2 shows the correlation between C q values for all detected targets (ie. C q average ,35) from the same ARN sample in two different runs. C q values above 35 are not reliable because duplicates are not reproducible. The correlation coefficient reaches 0.83 and the intercept is 0.91 for the detected OR genes. The correlation coefficient reaches 0.99 for the olfactory epithelium-specific reference genes and for classical endogenous control genes.

Expression profiling of olfactory receptors genes in WHOM
On average, for the 26 samples, C q values computed from amplification plots of 355 OR genes range between 25.8 and 39.8 (Table S1). These results reflect a low expression of the OR genes compared to other genes involved in the olfactory cascade. One target (OR2A14) shows an abnormal amplification plot with a C q value of 16.667.3 (mean 6SD); this assays will not be taken into account for the analysis. On average, 62629 (mean 6SD) OR genes per sample gave an undetermined C q value which was arbitrarily assigned to 40 cycles to allow the calculation of an average C q values. 74634 (mean 6SD) OR genes per sample gave a C q value above 35 were considered as expressed at very low level or not expressed at all.
To make a more quantitative analysis, the C q values of each OR were converted into normalized relative quantities (NRQ) following the method previously described [55]. Briefly, we apply the delta-C q quantification model using the average C q obtained for all ORs in the 26 individuals as calibrator (here 32.7) which is transformed into relative quantities using the exponential function, so results are fully equivalent and thus only rescaled. Then the normalization of relative quantities was performed with the geometric mean of the multiple stably expressed classical endogenous reference genes (CASC3, PSMC4, CDKN1B) defined by the geNorm algorithm [56] and followed by the normalization with the geometric mean of the six reference genes specific for olfactory epithelium. Results obtained on genomic DNA and on specific OR coding plasmids allowed to calculate the approximate number of copies of target for 20 ng RNA engaged in the RT-PCR reaction (Table S3). For each individual, the detected OR gene number (.5 copies/ 20 ng RNA) was counted (Figure 3). This cut-off value corresponds to a C q value $35.3. On average, on the 355 OR gene targets, the detected OR gene number is 232628 for women and 238628 for men.
There is a substantial difference in the expressed OR gene repertoire of each of the samples. A set of 90 human OR genes were detected (.5 copies/20 ng RNA) in all tested individuals. Another set of 140 human OR genes were detected not in all tested individuals but in more than half of the population (in 13 individuals and more on 26) and a third set composed of 125 human OR genes were more rarely detected (in less than 13 individuals on 26) (Figure 4).
Globally, the OR gene expression was not associated with age (p value = 0.19), sex (p value = 0.23) or smoking (p value = 0.66, Pearson's correlation). Individually, 22 OR genes showed a decreased profile and 7 OR genes showed an increased profile related with age ( Figure 5). There is no significant association between individual OR gene expression and sex or smoking. Figure 6 shows OR genes ranked in function of their expression level, from the highest to lowest. It shows 273 (77%) human OR genes above the cut-off value of 5 copies/20 ng RNA. No significant enrichment in class I or class II ORs is observed in the expressed set. Indeed, 17.6% of OR genes expressed belong to class I while 15.2% of the OR genes tested belong to this class.
An inverse distribution is observed with potentially nonfunctional OR genes. Expression levels of 52 OR genes with mutations affecting positions in the consensus amino acid motifs specific for OR genes [60] were analyzed. These receptors, although regarded as intact OR genes, harbor a mutation affecting P or Y in the LHTPMY motif, or affecting M, R or the second A in the MAYDRYVAIC motif, or Y in the SY motif or finally, on H in the FSTCSSH motif. All known variants of these OR genes, correspond to a mutated haplotype of these highly conserved positions [48]. Presumably, these receptors are no longer functional (highlighted in blue in the Figure 6). We observed that 25% of non-expressed OR genes are potentially non-functional whereas 11.3% from the expressed set are potentially nonfunctional (p value = 0.002, Fisher's exact test). In addition, the average RNA copy numbers of the 47 deorphanized receptors (1746247) and of the 52 potentially non-functional ORs (486115) are significantly different (p value = 0.002, Student's t-Test, twotailed distribution, two-samples with unequal variance).
As shown in Table 2, the average RNA copy number varies drastically among the 47 reported deorphanized receptors. The most expressed OR gene corresponds to OR7C1 with an estimate average copy number of about 1108/20 ng RNA. A huge difference has also been noted in the expression of the four closely related paralogs, OR10G3, OR10G4, OR10G7 and OR10G9 that respond to ethyl vanillin and eugenol [5]. Indeed OR10G3 is well expressed in 25/26 WHOM samples with an average of 487 copies/20 ng RNA. OR10G4 and OR10G7 are moderately expressed (with an average of 29 and 13 copies/20 ng RNA respectively) and OR10G9 is detected only in 3 WHOM samples above the cut-off of 5 copies (with an average of 2 copies/20 ng RNA). We can count 19 ORs expressed by the 26 individuals among the 47 genes. In others words, 21% of the group of 90

Expression profiling of olfactory receptor genes in inferior turbinate sample
One sample of inferior turbinate (IT) has been tested in a TLDA and gives 210 undetermined C q values compared to 62 on average in WHOM tissues. This observation reflects the non-detection of OR gene expression, expected for a non-olfactory tissue. To make a more quantitative analysis, the C q values were converted into normalized relative quantities with classical endogenous stably expressed reference genes (CASC3, PSMC4, CDKN1B) defined by the geNorm algorithm. Figure 7 shows results obtained for OR gene expression ratio between WHOM and inferior turbinate. We observe 250 OR genes (70%) more expressed in WHOM than in inferior turbinate (ratio WHOM/IT$2) (Table S4). Some reference genes specific for olfactory sensory neurons (CNGA2 and Ric8B) are significantly expressed more in WHOM than in IT; RTP1 and ADCY3 are expressed a little more in WHOM than in IT while OBP and GNAL are detected in equivalent amount in both tissues.

Discussion
Although the OR gene family was discovered over 20 years ago by Buck and Axel, few data are available on their expression in human olfactory mucosa, contrasting with the recent significant increase of results on the genetic polymorphism of OR genes [48]. This probably reflects the difficulty to acquire human tissue and obtain good quality RNA from WHOM.
We report here the first extensive high throughput transcriptome profiling of OR gene expression directly by real-time reverse transcription PCR performed furthermore on WHOM from a relatively large population of 26 patients. Indeed, our study was focused on the expression of 356 predicted functional OR genes among the 851 OR loci scattered throughout the human genome.
Only a small percentage of the olfactory mucosa consists of olfactory sensory neurons. Moreover, the boundaries of the WHOM are unclear and the tissue is sometimes replaced by respiratory epithelium. Furthermore, the fraction of olfactory sensory neurons can vary significantly from one sample to another. Therefore, to allow OR gene expression comparisons between individual WHOM samples, normalization is mandatory. Consequently, we therefore normalized our expression data from the individual WHOM, with 6 so called tissue specific reference genes, that are expressed specifically by olfactory sensory neurons as previously described by Khan et al. [54].
Our results show that 77% of human intact OR gene repertoire are expressed with an average level above 5 copies/20 ng RNA. A set of 90 human OR messengers were detected in all tested individuals. In addition, 70% of the human OR gene repertoire were found more expressed in WHOM than in the inferior turbinate (ratio WHOM/IT $2). Along with the widespread genetic variation reported for human OR protein coding regions, that correlates to individual differences in odorous perception [8,9,13,14] and with genetic variations in auxiliary olfactory genes [50], this differential expression of ORs in the WHOM could  establish the basis to a well-documented inter-individual variation in olfactory sensitivity. Statistical analysis was performed for each clinical variable (i.e. age, sex and smoking status) to assess if receptor expressions were globally or individually associated. Our results indicate that OR gene expression is globally not associated with age, sex or smoking status. However, we are not able to detect if these clinical factors may reduce the absolute amount of olfactory sensory neurons. As we have normalized our data with specific references genes of olfactory sensory neurons, we have not taken into account the absolute amount of olfactory tissue. Therefore, the OR gene expression is described relatively to olfactory sensory neurons. Furthermore, at this point, our results cannot explain the decrease of olfactory performance related to age or smoking [51,52,61].
Nevertheless, our results show that individually, the expression of 22 OR genes seems to decrease significantly with age, and the expression of 7 OR genes seems to increase significantly. These results can be compared to those of Khan et al. [54] where the majority of OR gene expression (58.4%) in mice remained stable during aging, while 32.8% presented downward profiles, 7.2% upward profiles and 1.7% of convex or concave profiles. We found no correlation between individual OR gene expression and sex or smoking, although some clinical observations show that these two conditions may influence smell abilities. These differences in smell abilities may occur at another level of the olfactory system than the OR gene expression.
Prior to the present work, only two studies had focused on the expression of the human OR gene family. In these studies, DNA microarray [45] or deep sequencing [50] were used as experimental approaches. Latter report focused on accessory proteins and presented results in the supplementary data only for one human olfactory epithelium biopsy. Over the 261 intact ORs overlapping with our study, 174 OR genes were found to be expressed in this olfactory epithelium biopsy (threshold set at a FPKM$0.1) whereas we found 202 OR genes expressed in the WHOM (threshold set at a number of copies $5). 145 genes (i.e.72% of our expressed OR gene set) turned out to be common to both studies and 30 ORs are not expressed in WHOM according to both studies (Table S5). Therefore, our approach is in agreement with the previous study by Keydar et al. concerning expressed OR genes (p value = 0.0012, hypergeometric test) and for the expression levels of each OR genes (p value,0.001, Spearman's correlation) [50]. With respect to the Zhang et al. study [45], there are 319 intact ORs overlapping with our study. 202 OR genes were found to be more expressed in human olfactory epithelium than in other tissues by Zhang et al. (threshold set at a p value,0.01) whereas we found 225 OR genes preferentially expressed in WHOM (threshold set at a ratio WHOM/IT$2). Furthermore, 142 genes (i.e. 63% of our expressed OR gene set) turned out to be common to both studies and 34 ORs are not expressed in WHOM according to both studies (Table S6). The overlap of 63% is exactly what would be expected if the two datasets are completely random with respect to each other (p value = 0.3, hypergeometric test). Moreover, no correlation between the expression levels of each OR genes could be observed between the two studies (p value = 0.96, Spearman's correlation). It is noteworthy that the non-olfactory tissues used to estimate a difference in gene expression are not the same. We compared the expression in the WHOM to the one of inferior turbinate, while human liver, lung, kidney, heart and testis were used in the study by Zhang et al. The latter therefore must be taken with caution, as non-olfactory expression of ORs have been reported for these different tissues, and could therefore bias the comparative results they obtained. We tried to exclude the set of non-olfactory expressed ORs from our comparison. Even excluding these receptors, we could not reveal a correlation between the current data and the Zhang et al. data. Another source of discrepancy between both studies relies on the proportion of the olfactory epithelium used to determine OR expression. In the publication by Zhang et al., it is not clear whether the analyzed tissues cover the entire olfactory mucosa or only a determined anatomical section. This difference might be highly significant as the distribution of olfactory receptors is not homogeneous in the olfactory epithelium of rodents [62,63]. Importantly, though no data currently exist in humans. Another difficulty relies on the fact that the boundaries between the olfactory and the respiratory epithelium are not clearly defined in humans. Different publications report that the human olfactory epithelium is located on the nasal septum, the cribriform plate, the superior and the middle turbinate [53,64,65]. Consequently, this motivated us collecting all these anatomical regions. The mucosa of the nasal septum was dissected along the projection of the middle turbinate. Thus, our samples represent practically the totality of the olfactory mucosa.  Interestingly, we observe an enrichment of functional deorphanized receptors in the set of expressed OR genes and an enrichment of potentially non-functional receptors into the set of non-expressed OR genes. This corroborates the observations of Zhang et al. The latter reports that 80% of intact OR genes and 67% of OR pseudogenes were found to be expressed in WHOM and moreover intact ORs appear to be expressed at a higher level on average than OR pseudogenes.
Taken together, these observations support the hypothesis predicting that if a gene is expressed, it is more likely to be functional. Indeed, a non-functional OR can lead to a defective targeting of olfactory sensory neurons in the olfactory bulb and therefore reduces the survival of these neurons [66]. More precisely, the proper targeting seems more related to OR-derived cAMP signals rather than the OR ability to bind an odorant [67]. The relation between OR genes functionality and expression could be further explored by studying the variants of expressed OR genes. Indeed, for known ORs already deorphanized, both functional and non-functional haplotypes have been described; consequently, it would be worth determining whether expressed allelic variants correspond preferentially to functional haplotypes.
A systematic study of OR gene expression profiles, expressed in non-olfactory tissues using deep sequencing analysis has been recently reported and provides a list of highly expressed OR genes [44]. From this list, 32 intact OR genes are common with our study. We confirmed that the majority of the non-olfactory tissues expressed OR genes (28/32; 87%) are also expressed in WHOM (more than 5 copies/20 ng RNA). 24 out of 32 (75%) nonolfactory tissues expressed OR genes are expressed more in WHOM than in inferior turbinate (ratio WHOM/IT $2). The remaining 8 OR genes are detected similarly in both tissues (0.96, ratio WHOM/IT1 ,2). From this comparison, it does not seem that non-olfactory tissues expressed OR genes make up a separate group, with a putative non-olfactory function, that would clearly segregate from ORs expressed in the WHOM. However, for one particular receptor, OR2W1, we observed a very low expression level in the WHOM whereas it is well detected in pulmonary neuroendocrine cells [31]. Upon its deorphanization, this receptor was found have a broad spectrum of stimuli [15]. This OR is activated by more than 200 molecules and interacts with a large variety of chemical structures eliciting very different odors [Veithen et al., unpublished data]. Together, our results and those of Gu et al. [31] suggest a role for OR2W1 in the detection of volatile irritants in the human airways. Therefore, this receptor may offer an example of an OR family member that would actually not be only an olfactory mucosa odorant receptor.
Although our study represents the most extensive analysis of human OR expression in the olfactory mucosa, it does present some limitations. The considered population is relatively old. Indeed, because of the difficulty to obtain human material, samples from patients presenting characteristics that may affect the olfactory mucosa such as age and smoking, have not been discarded. However, these conditions do not appear to change the OR genes expression. On another hand, our study includes almost exclusively subjects of European origin and therefore does not explore the possible ethnic related variations of OR gene expression. Logically, therefore, we acknowledge that it would be worth extending the analysis to samples from other origins, if available.
Notwithstanding these points and since the majority of human olfactory receptors are not deorphanized, the information on the expression of OR genes in WHOM collected in this study offers an essential preliminary and lacking understanding that will allow focusing future research on frequently expressed and potentially functional olfactory receptors identified. Figure S1 Analysis of RNA integrity. A. Electropherogram showing the integrity of 9 human olfactory epithelium RNA samples (lanes 1 to 9). B. Example of profile showing a RIN of 8.5 (sample 8, RNA from a woman of 72 years old) and the integrity of the 18S and 28S ribosomal RNA. (TIF)