Implications for human odor sensing revealed from the statistics of odorant-receptor interactions

Binding of odorants to olfactory receptors (ORs) elicits downstream chemical and neural signals, which are further processed to odor perception in the brain. Recently, Mainland and colleagues have measured more than 500 pairs of odorant-OR interaction by a high-throughput screening assay method, opening a new avenue to understanding the principles of human odor coding. Here, using a recently developed minimal model for OR activation kinetics, we characterize the statistics of OR activation by odorants in terms of three empirical parameters: the half-maximum effective concentration EC50, the efficacy, and the basal activity. While the data size of odorants is still limited, the statistics offer meaningful information on the breadth and optimality of the tuning of human ORs to odorants, and allow us to relate the three parameters with the microscopic rate constants and binding affinities that define the OR activation kinetics. Despite the stochastic nature of the response expected at individual OR-odorant level, we assess that the confluence of signals in a neuron released from the multitude of ORs is effectively free of noise and deterministic with respect to changes in odorant concentration. Thus, setting a threshold to the fraction of activated OR copy number for neural spiking binarizes the electrophysiological signal of olfactory sensory neuron, thereby making an information theoretic approach a viable tool in studying the principles of odor perception.


Introduction
Olfaction, ubiquitous among all animals, is a key sensory process that is used to detect a vast number of chemicals in the external world [1,2]. From the perspective of molecular recognition, the physicochemical principle germane to the early layer of olfactory process is not significantly different from that of unicellular organisms' chemotactic response [3][4][5]. Similarly, olfactory signals are initiated upon the recognition of odorants by the receptors that are expressed in the nasal epithelium [6].
Physiological and biochemical studies of olfaction to date offer strong evidence that the majority of mammalian OR signalings are associated with G-protein dependent pathway [7]. Thus, except for a few cases [8][9][10], their activation mechanism is mostly related to that of Gprotein coupled receptors (GPCRs) [11][12][13][14][15]. Upon binding odorants, a set of ORs adopt their structures into active forms and catalyze G-proteins to initiate downstream signal cascades. Although the original input signal is modified passing through multiple layers of neural circuits [16,17], elucidating the information encoded to the pool of ORs is a key component for understanding the principles of odor sensing. In particular, the receptor code, or the responses of a repertoire of ORs encoding the chemical features of odorant(s), constitutes the first layer of dataset to be mapped by the olfactory process.
Unlike the conventional GPCRs that are deemed 'fine-tuned' to endogenous agonists such as hormones and neurotransmitters [14,[18][19][20], ORs tend to be 'broadly tuned' to multiple odorant types. Conversely, a single odorant can also be recognized by multiple ORs. In fact, there is increasing evidence indicating that even the conventional GPCRs can adopt multiple active states and yield different signaling pathways [21,22], replacing the old view that GPCR activation is described by the two-state conformational selection model [23]. The many-tomany interactions of ORs with odorants enable a limited number of ORs to efficiently encode a huge chemical space represented by the odorants and their mixtures [24,25]. The human olfactory system is known to employ N r % 330 OR subtypes [26], whereas the odors are made of an estimated pool of M ≳ 10 4 monomolecular odorants [27,28] and their m-component mixtures. To be able to discriminate potentially a vast number of natural odors (N r < M ( M m ), the olfactory system adopts a strategy based on combinatorial coding [29]. In a simplifying limit of ON/OFF switch-like response, a set of N r receptor types can, in principle, generate 2 N r % 2 330 ' 10 99 distinct receptor codes, which allow human neural systems to discriminate M m distinct natural odors for fairly large value of m. Traditionally, the odor perception has been interrogated by measuring the brain activity [30] or through psychophysical tests [31]; however the associative and nonlinear nature of neuronal codes [32] and genetic variation among human subjects [33] make it difficult to decipher a direct connection between odorants and the final odor perception. Recent developments of experimental techniques using high-throughput screening [34] have compiled dose-response data from a set of 535 interacting pairs of odorants and ORs. The combinatorial nature of odor coding was substantiated by the concrete, objective dataset demonstrating the responses of 304 human ORs against 89 odorants [34].
The aim of this work is to gain more quantitative insights into the physicochemical processes that take place in the early stage of odor sensing. More specifically, we aim to shed light on the statistics of odorant recognition and infer its relation with a cellular response of the corresponding olfactory receptor neuron (ORN). To this end, we adapt a minimal kinetic model for the activation of mammalian ORs [35], and quantitatively analyze a set of dose-response data for human ORs [34]. Analyzing the experimental data, we characterize the statistics of odorant-receptor interactions in terms of three empirical parameters-the half-maximum effective concentration (EC 50 ), efficacy, and basal activity-and obtain the distribution of these parameter values. In particular, we relate the three parameters with the rate constants and the binding affinities, and specify the condition to be met by the kinetic parameters of OR activity. Because the number of ORs expressed in each ORN is large (L % 2.5 × 10 4 [36]) and the signals from on average 10 4 (% 6 × 10 6 /350) ORNs [37] of the same type are converged to a glomerulus [38], the response of ORN can be best understood as an outcome that integrates stochastic OR signals over the population. Resorting to the law of large number and the concept of spike firing threshold, we propose that the early stage of olfactory neural signals can be binarized, making an information theoretic approach as a viable tool for studying the odor sensing.

Minimal kinetic model for OR activation
In order to model the OR activation we consider the following minimal kinetic scheme [35]: where the first and second lines represent binding/unbinding of the G-protein with the OR in the absence and presence of an odorant (O) in its binding site, respectively. k f G is the rate of Gprotein (GDP-bound heterotrimeric complex) binding to OR without odorant, and k f OG ðlÞ is the rate of G-protein binding when an odorant of the type λ is bound to the OR. C G is the intracellular concentration of G-protein. K O is the dissociation constant of the odorant from OR that can be related to the odorant-OR binding affinity (A) as are the dissociation constants of G-protein from OR, with and without odorant in the OR, respectively (see Fig 1a). In Scheme 1, the intracellular downstream signal cascade is initiated by the kinetic steps marked with a "−G", indicating a Gα GTP subunit released from a heterotrimeric G-protein complex. Even without an odorant in the binding pocket of OR, OR can relay weak downstream signal cascades which are defined as the basal activity. For an OR that binds a cognate odorant (OR O ), the binding of G-protein is expected to be further facilitated (k f OG ) k f G ), evoking a stronger downstream signal cascade (Fig 1a). Based on the minimal signaling scheme (Scheme 1), the OR activity S(C O ; λ), evoked by an odorant of the type λ, can be shown to be a hyperbolic function of the odorant concentration C O [35]: Because the activity of each receptor in Mainland et al.'s assays [34] was measured by means of the fluorescence emitted from cAMP-mediated luciferase activity, we assume here that the strength of cellular (ORN) response is proportional to the amount of Gα subunit released from the complex and consequently the cAMP-mediated gene expression. The proportionality constant A reflects all of these kinetic details. A schematic of olfactory signaling cascades. G olf , olfactory G-protein binding to and the subsequent release of Gα GTP subunit from the receptor via GDP/GTP exchange (OR G À ! À G OR) elicit downstream signal cascades: production of second messengers (cAMP) via stimulation of adnenylyl cyclase III, followed by depolarization of membrane potential via cAMP-induced opening of ion-channels [39,40]. The downstream signal is stronger (the thicker arrow in purple) when odorant is bound to the OR. b. The four-parameter model (Eq 3) to describe the dose-response data. Fitting the dose-response data using the Hill curve It is useful to cast the above model (Eq 2) into the following Hill equation: The three key parameters of the dose-response relationship (see Fig 1b), EC 50 [= log 10 (K 1/2 / [M])] where K 1/2 is the half-maximum concentration in molarity unit, the efficacy (E = B + δS max ), and the basal activity (B), are expressed in terms of the rate and dissociation constants defined in the reaction scheme for OR signaling (Scheme 1 and Eq 2): We have specified λ in the argument of parameters to make it explicit that the parameters depend on a specific odorant-OR pair. Note that for a given receptor, K 1/2 and E change with the odorant type λ, whereas B is a property of the receptor only. Eq 3 is equivalent to Eq 2 when the Hill coefficient is fixed to H = 1. Empirically, however, we find that the dose-response data can be best fitted by treating H as a free parameter.

A universal response curve for human olfactory receptors
We fitted the dose-response data of [34] to the Hill curve partly based on our kinetic model (Eq 3). Among 535 odorant-OR pairs in the dataset, 475 pairs showed activation (δS max > 0), and the rest showed deactivation (δS max < 0). Furthermore, 317 out of 535 pairs could be fitted to Eq 3 reliably with correlation coefficients greater than 0.9. Fig 2a-2d show examples of odorant-OR pairs. By using the parameter values fitted for respective odorant-OR pairs, and using the rescaled variablesĉ ðC O =K 1=2 Þ H and f δS/δS max = (S − B)/δS max we could collapse the dose-response data on a universal curve, f ¼ĉ=ð1 þĉÞ (Fig 2e). This justifies our use of Eq 3 for the analysis of the odorant-OR dataset.

Distribution of the empirical parameters over the ensemble of receptors
In order to gain a rough statistical estimate of the sensitivity of human ORs we examine the range of odorant concentration to which human ORs are responsive, and the variation in the magnitude of intra-cellular response elicited by OR activation. The odorant-OR interaction dataset [34] allowed us to obtain the distributions of the two key empirical parameters, the effective concentration EC 50 and the efficacy E, over the ensemble of receptors. First, we constructed the histogram of the effective concentrations of odorants from all the data of interacting odorant-OR pairs (see Fig 3a), and obtained the distribution ψ ens (EC 50 ) by normalizing the histogram, where the subscript "ens" indicates that the distribution was constructed from the entire ensemble of odorant-receptor pairs. The concentrations exhibit a fairly broad distribution, ranging from nM (EC 50 = −9) to M (EC 50 = 0), with an average hEC 50 i = −4.1 and a standard deviation of s EC 50 ¼ 1:8. This corresponds to a range of 1 μM < K 1/2 < 5 mM. Most of the odorant-OR pairs with strong affinities (EC 50 < −9) are deactivating pairs, in which the odorants act as the inverse-agonists of the respective ORs (Fig 3a). Compared to the proportion of all the odorant types tested by Mainland et al. [34], the odorants contributing to the deactivating pairs with strong affinities mostly belong to certain specific chemical types. They are mainly identified to be acidic (thioglycolic acid, isobutyric acid, 3-methyl-2-hexenoic acid), aromatic (cumarin, quinoline, 2-methoxy-4-methylphenol), acetate (n-amyl acetate, butyl acetate), steroid (androstenone), and ester (pentadecalactone, amyl butyrate) (see S1 Fig). Presently, the molecular cause for the inverse agonism by these strongly bound odorants is not clear, but it has been suggested that sidechains of GCPRs interacting with inverse agonist are generally more rigid [41].
We also determined odorant-specific differential response, ψ λ (log 10 C O ), defined as: This measures the change of cumulative signal from all N r receptor types (ΔS tot ) that can be elicited as the log-concentration of the given odorant λ increases by Δ(log 10 C O ). Under an assumption that the odorant λ activates a continuous spectrum of receptors, we calculated ψ λ (log 10 C O ) for each of the top five broadly-interacting odorants (cis-3-hexen-1-ol, granyl acetate, eugenol, eugenol acetate, butyric acid) that have a sufficiently large number of responsive OR partners (≳ 20). For these broadly-interacting odorants, ψ λ (log 10 C O )'s are sharply peaked around an odorant concentration of C Ã O % 100mM (Fig 3b). Incidentally, C Ã O % 100mM is comparable to the odorant concentration of * (10 − 500) × 10 14 molecules/mL used in the olfactometer measurement [43].
Next, we constructed the distribution of the efficacy, ρ(E), as a histogram over the ensemble of all odorant-OR pairs. We note that the maximum-entropy distribution of E, for an ensemble of odorant-OR pairs characterized with an average efficacy hEi, exhibits an exponential functional form ρ(E) * exp(−E/hEi), as was pointed out in [17] to explain the firing rate distribution of the glomeruli. In the case of Mainland et al.'s dataset, ρ(E) was characterized by a sum of two exponential distributions: with ϕ = 0.79, and with the two average efficacy values hE 1 i = 1.50 and hE 2 i = 10.06 (Fig 4a). This double-exponential distribution implies that the population of odorant-OR pairs are divided into two subgroups, one characterized with a low efficacy (79% of total population) and the other with a high efficacy (21% population). On the other hand, analysis on the relative amplitude of activation (δS max ) gives rise to qualitatively the same distribution ρ(δS max ) as Eq 8 with ϕ = 0.57, hδS 1, max i = 0.97, and hδS 2, max i = 5.77. Whether the spiking rates over the population of human ORNs (labeled by their OR subtypes) also constitute two exponential distribution is an interesting question that may be examined by carefully designed experimental measurements.
Finally, we report the distribution of Hill coefficients. While a majority of the data are indeed well described with H = 1, some are better fitted with larger values of H (Fig 4b). Within our two-state activation model, the phenomenological value of H > 1 could arise from multiple sources (See S1 Text for details): (i) a cooperative activation of the receptors, which could be linked to the recent studies on the effect of GPCR dimers or higher-order oligomers on the signaling [44][45][46][47][48][49]; (ii) the sensitivity amplified in the process of signal cascades accompanying covalent modifications, such as GTP hydrolysis and phosphorylation/dephosphorylation [50,51]. Given that G-protein pathway is associated with GTP hydrolysis, a scenario of the amplified sensitivity can be considered; (iii) inhomogeneity of the odorant-OR kinetics.
We note that an analytic expression for the activity can still be obtained for either of the foregoing three mechanisms, albeit more convoluted than the Hill equation (Eq 3), and be used to fit the data. Given the expression of the activity, its sensitivity with respect to the variation of control parameter (or the sharpness of transition) can be evaluated by using where f and θ are normalized response and the control parameter, respectively [51]; For the case of Eq 3, θ = (C O /K 1/2 ) and f = θ H /(1 + θ H ), and it is easy to confirm that n H = H.

Determination of microscopic rate and binding constants
From Eqs 4-6, we observe that all three kinetic parameters {E, B, K 1/2 } depend on the concentration of G-protein (C G ). By eliminating the C G -dependence from these expressions, we can relate the three parameters in pairs. For example, the efficacy E of an odorant-OR pair has a hyperbolic dependence on B: EðlÞ

Statistics of odorant-receptor interactions
Similarly, B is related to K 1/2 : and E to K 1/2 : In addition, an interesting triadic relation between the three parameters E, B, and K 1/2 define a quantity ω that can be expressed in terms of the microscopic rates k f OG , k f G , and K O : oðlÞ log 10 EðlÞ À log 10 B À log 10 K 1=2 ðlÞ Using Eqs 10-13, we can in principle determine microscopic quantities such as k f OG ðlÞ=k f G , K O (λ), Ak f G =ðK À 1 OG ðlÞ À K À 1 G Þ, and Ak f OG ðlÞ=ðK À 1 OG ðlÞ À K À 1 G Þ for a given pair of odorant and receptor types.
Here we report the odorant-specific estimates for some of these quantities, averaged over the observed spectrum of receptors that interact with a given odorant λ. For example, we obtained hk f OG ðlÞ=k f G i as the amplitude of the E(λ) versus B curve using Eq 10 (Fig 5a). It was then plugged into Eq 13 to estimate hlog 10 K O (λ)i from the knowledge of hω(λ)i, a value that can be obtained from the histogram over all interacting pairs involving this odorant λ (Fig 5b). The estimated average values for the top five broadly-interacting odorants are summarized in Table 1.

Statistics of odorant-receptor interactions
From the kinetic parameters determined here, we make three points that are noteworthy: for all the five odorants tested. This is consistent with our expectation that agonist (odorant) binding facilitates the accommodation of G-protein.

K O (λ)
, the threshold concentration of the odorant λ for binding the receptor, is greater than K 1/2 (λ), the effective threshold for the odorant to eventually elicit the signal (the release of Gα-subunit). Together with an augmented sensitivity (H > 1) discussed above, the relationship of K O > K 1/2 appears to be a natural outcome of signal cascade in biochemical reactions [50].
3. Eq 4 with the condition K O > K 1/2 suggests that K OG < K G ; the binding of G-proteins to a receptor is stronger when there is an odorant bound to the receptor. This is equivalent to , an inequality that can be derived by rearranging Eq 4. Therefore, it follows that This specifies a necessary condition to be satisfied, within our kinetic model, for the odorant-evoked activity. In other words, for the odorant-dependent activation mechanism to function properly, the difference in free energies of G-protein binding for the two cases with and without odorant in the binding site, ought to be greater than that between OR activation and odorant binding. The data reported in Table 1 indeed confirm our analysis.

Non-uniform sensitivity of ORs to odor concentration
Our analysis on Mainland et al.'s data [34] provides a new insight into the issue of odor sensitivity. Since qualitatively similar trend is observed in the histogram for the entire ensemble of odorant-OR pairs as shown in Fig 3a, we assume that it effectively represents a general sensitivity profile of any odorant against the pool of human ORs; in other words, we hypothesize that the variable log 10 C O displays the identical distribution ψ ens (log 10 C O ) as ψ ens (EC 50 ) though ψ ens was constructed as a distribution of EC 50 values. The range of threshold concentration of odorants, at which the olfactory system starts to respond, is finite. To be specific, Fig 3a suggests that R ¼ ðlog C O Þ max À ðlog C O Þ min % 2 Á ð2s EC 50 Þ Á log 10 % 16 along the natural logarithmic scale, where (log C O ) max and (log C O ) min denote, respectively, the maximum and  50 and s EC 50 is the standard deviation of the distribution ψ ens (EC 50 ). It has been argued previously that the minimal concentration change (ΔC O ) that gives rise to a detectable difference, of at least one bit of the receptor code, is dictated by the Weber ratio ΔC O /C O % R/N r % 0.042 [52]. This is to say that, in order for humans to be able to sense a change in odor strength, the concentration difference should exceed 4% in the concentration range between (log C O ) min and (log C O ) max . However, this estimate of the Weber ratio was obtained under an assumption that the threshold concentration of odorant is uniformly distributed over the entire range R [16,52]. As shown in Fig 3b, the differential response of receptors against each of the five broadlyinteracting odorants displays a non-uniform distribution, ψ λ (log 10 C O ), sharply peaked at C Ã O ðlÞ % 100 μM. the expected number of ORs that could recognize an odorant at a concentration C O , denoted n O , would satisfy Therefore, the condition for a minimal change in the odorant concentration to activate an additional OR type (Δn ! 1) can be expressed using the following inequality: The lower bound for the Weber ratio is minimized at the peak of the sensitivity curve , which in turn maximizes the sensitivity χ defined above. This implies that the olfactory system is most sensitive at C O % 100 μM. Using N r % 330 and s EC 50 % 1:5, the mini- ðlog10=N r Þ s EC 50 % 0:03; the detectable difference at the sensitivity peak can be as small as 3%.

Response of the OR cell is effectively binary
According to our analysis of the dose-response data, the sensitivity of OR response is maxi- On the other hand, there is another source of "sensitization" at the level of the ORN, arising from the fact that an ORN is activated when a sufficient number of its ORs are activated together. It is known that a monogenic OR expression (i.e., the expression of only one out of N r -OR types) is ensured in each ORN, regulated by a feedback mechanism [53][54][55][56]. Therefore, the cellular response of the ORN results from the collective action of L * 2.5 × 10 4 receptors of the same type [36]. Even when the activation of individual receptors is stochastic, the collective signal from L receptors is much "sharper" at the ensemble level, effectively eliminating the noise in the cellular response. Since the action potential of neuron is switched on and off around a threshold membrane potential (V y m ), which in turn can be related to a threshold value in the fraction of activated ORs (ℓ θ /L), the firing probability of the neuron is effectively binarized as: where Θ(z) = 1 for z > 0, and Θ(z) = 0 for z 0 (see S1 Text for the details of derivation). Because the condition L ) 1 eliminates the fluctuations in the copy number of activated . The effective binarization of electrophysiological signals in the OR cells projects the chemical representation of the odor (response at the sub-cellular level) onto an N r -bit digital signal (response at the neural level). This N r -bit information transmitted to the postsynaptic neurons is further processed through the brain circuits for various computational tasks [1,32,57]. Although synaptic transmission is still subject to noise, for example due to the stochasticity in the number of vesicles discharged at synapses [58], our argument for the binarization of ORN responses based on the law of large number and activation threshold can still be applied to information transmission that occurs in the upper brain.

The OR signal is almost fully exploited in odor perception
Putting together our observations so far, the maximal amount of olfactory information that can be encoded into the ensemble of OR cells is N r bits, corresponding to 2 N r distinct receptor codes. Therefore, the number of distinct odors O that can be discriminated by the human olfactory system is upper-bounded as O ≲ 2 N r . Provided that all natural odors could be represented as a mixture of non-overlapping, equal-intensity odorant components, the ability to discriminate any pair of m-component odorant mixtures without ambiguity is subject to the following condition: Using the estimated sizes for the human olfactory system, N r % 330 [26] and M % 10 4 [27,28], we obtain m max % 35 as the maximal size of such odorant mixtures. In other words, the space of all odor mixtures composed of more than m max % 35 odorants would exceed the capacity of the human olfactory system. Taking the analogy of the visual system with three receptors (R,G, B), one could further postulate an "olfactory white", where well-mixed odors with more than m max components, each of which elicits an equal intensity response, are indistinguishable from one another [59]. Remarkably, our rough estimate of m max % 35 from a simple informationtheoretic argument is in reasonable agreement with an experiment [59], where it was shown through an ingenious set of psychophysical tests that well-mixed odors of m max % 30 odorant components are indeed indistinguishable to humans.

Concluding remarks
Sensing of smell is ultimately a mapping from the molecular information space of odorants to patterns of neuronal activity, which are then perceived as particular kinds of odor in the brain. Understanding the nature of this mapping remains challenging due to factors such as the complexity in the molecular space of odorants, lack of information on the molecular recognition by ORs, and the difficulty of deciphering neuronal signal processing. The present work makes an important step forward in this direction by analyzing the sub-cellular process of olfactory sensing within the ORN cell, at a scale larger than the individual molecular interactions but smaller than the multi-cell signal. Employing a minimal kinetic model for odorant-OR interaction at single molecule scale, we quantified the statistics of interactions between odorants and human ORs, and discussed how the response of individual ORNs can be effectively binarized.
The quantitative insights provided in this work can lead to the next level of understanding of human olfaction at multicellular scale. While the analyses of OR responses in this study are limited to the response of single olfactory receptor at the individual or population level, each odorant is in general recognized by a finite number of multiple ORs. Thus, it is essential to study the combinatorial nature of odor signal processing in a more systematic way based on concrete data and bring the current understanding of olfaction to a systems level. More specifically, since Mainland et al.'s doseresponse data [34] offer such opportunity, we plan to quantify which set of ORs are sampled for a given odorant or odorant mixture and then generate the corresponding receptor code, and also address how discriminable these receptor codes are in the early layer of information processing in the human olfaction. In particular, our model will enable the prediction of receptor responses to mixtures of multiple odorants at possibly different concentrations, which is typical in natural odors. Given that the sensory world through olfactory process is only two synapses away from the cortical neurons [60], addressing these questions will provide better glimpses into the neurobiological principle of signal processing in the human brain.
Supporting information S1 Text. Derivations. We provide details for two observations that were mentioned in the main text: (i) deviation from the Michaelis-Menten kinetics, (ii) binarized cellular response to odor concentration. (PDF) S1 Table. Kinetic parameters determined for all interacting odorant-receptor pairs. A list of kinetic parameters determined by fitting the dose-response curve from each interacting odorant-receptor pair reported in the dataset of Mainland et al. [34] using our model. Table is Table. List of odorants tested in the measurement and their chemical types. A list of all odorants in the dataset of Mainland et al. [34] is provided, in a comma-separated values (csv) file with 7 columns. The first five columns are: odorant index, CAS registry number, odorant name, PubChem Compound Identification (CID) number, and SMILES; all as reported in the original dataset [34]. The sixth column shows the chemical type of the odorant. The last column indicates whether this odorant was counted as a strong deactivator in our analysis (S1  [34] (also see S2 Table). b. Odorants with strong affinities (EC 50 < −9) contributing to the deactivating odorant-OR pairs in Fig 3a. (EPS)