Computational Approaches for Decoding Select Odorant-Olfactory Receptor Interactions Using Mini-Virtual Screening

Olfactory receptors (ORs) belong to the class A G-Protein Coupled Receptor superfamily of proteins. Unlike G-Protein Coupled Receptors, ORs exhibit a combinatorial response to odors/ligands. ORs display an affinity towards a range of odor molecules rather than binding to a specific set of ligands and conversely a single odorant molecule may bind to a number of olfactory receptors with varying affinities. The diversity in odor recognition is linked to the highly variable transmembrane domains of these receptors. The purpose of this study is to decode the odor-olfactory receptor interactions using in silico docking studies. In this study, a ligand (odor molecules) dataset of 125 molecules was used to carry out in silico docking using the GLIDE docking tool (SCHRODINGER Inc Pvt LTD). Previous studies, with smaller datasets of ligands, have shown that orthologous olfactory receptors respond to similarly-tuned ligands, but are dramatically different in their efficacy and potency. Ligand docking results were applied on homologous pairs (with varying sequence identity) of ORs from human and mouse genomes and ligand binding residues and the ligand profile differed among such related olfactory receptor sequences. This study revealed that homologous sequences with high sequence identity need not bind to the same/ similar ligand with a given affinity. A ligand profile has been obtained for each of the 20 receptors in this analysis which will be useful for expression and mutation studies on these receptors.


Introduction
The sense of smell has been the least understood of all the five human senses known till recent times. The detection of odorants is essential for survival of an individual. The discriminatory power of olfactory receptors (ORs) is such that it can perceive thousands of volatile chemicals as having different odors. It is known that the olfactory system uses a combinatorial receptor coding scheme to decipher the odor molecules. One OR can recognize multiple odorants and one odorant is recognized by multiple ORs [1]. A slight structural change in the odorant or a change in the concentration of the odorant in the environment results in a change in the odorcode of these receptors.
Each mammalian olfactory receptor neuron encodes only one OR [2][3][4]. The axons of the neurons expressing the same olfactory receptor converge to one olfactory bulb, which then processes the information to the brain [5]. ORs are structurally similar to G-Protein Coupled Receptors (GPCRs) and contain seven transmembrane (TM) domains connected by loops. The functionally important residues are present on the transmembrane helices 2-7 [6][7][8]. In insects, the detection of odorants is performed by a smaller set of about sixty odorant receptors [9]. Due to the lack of X-ray crystal structures of olfactory receptors and the difficulties in heterologous expression of ORs, very few ORs have been "de-orphaned" i.e. associated with their ligands (odors). Odorant-OR binding studies are limited to a small number of ORs that can be tested at one condition. The number and mixture of odorants that can be used in a single study are also limited.
Odor molecules belong to a variety of chemical classes: from alcohols, aldehydes, ketones and carboxylic acids to sulphur-containing compounds and essential oils. The physicochemical descriptors of odor molecules play an important role in the prediction of odor response by the OR [10] [11]. Very identical OR sequences can have a structural bias for ligand specificity on the basis of the number of carbon atoms present in the ligands [12]. About 8000 odorants have been identified in food. KFO (Key Food Odorants) has identified about 400 odorants which have been characterized and this number approximately equals the number of ORs found in humans [13]. The response of mixture of odorants is neither the additive nor an average of its components [14]. Mixing some odorants lead to the emergence of novel perceptual qualities that were not present in each component, suggesting that odorant mixture interactions occurred at some levels in the olfactory system [15]. Odorant molecules in a mixture could act as an antagonist and hinder the response of the receptor to agonists. Thus, deciphering the complex coding mechanisms requires large scale analysis to compare and consolidate odorant-OR interaction across several receptors.
Molecular docking, an in silico approach, can be used to model the interaction between a small molecule and a protein at atomic levels. This method allows us to characterize the binding properties of the small molecule to the receptor and the discriminatory mechanisms, as well as helping to elucidate fundamental biological processes [16]. Docking involves two steps--predicting of binding conformation of the ligand, and predicting the binding affinity of the ligand to the receptor. Knowing the location of the binding site increases the efficiency of the docking tool. This information about the binding site can be obtained from experimental and mutational data. The earliest method of docking assumed a lock-and-key model for ligandreceptor interaction [17]. Since the functional protein is actively re-shaped, the induced fit theory of protein-ligand docking was used to induce flexibility to both receptor and ligand which would result in an accurate prediction of their interactions [18]. At a large scale, docking tools help analyze the interactions of receptors to a large set of ligands, and in scoring the best ligand out of the set. Several in silico docking tools have been developed in the recent past, which helps us analyse protein-ligand interactions [19][20][21][22][23][24][25][26].
One of the major challenges in the field of docking is handling the flexibility of protein receptors efficiently. Proteins are in constant motion between different conformational states with similar energies and this fact is still disregarded in many docking studies due to the large computational time required and the inherent limitations of such methods to sample alternate conformations accurately. The use of an ensemble of protein conformations as a starting point helps to sample various functional states of the receptor protein. The computational time for this approach scales linearly with the number of protein structures that constitute the ensemble [27]. Lack of imparting complete protein flexibility in docking approaches still remains a bottleneck in justifying the outcome of a docking analysis. The X-ray crystallographic structures reveal the buried surface area of a ligand as being between 70 to 100% and thus the binding site orientation can be greatly influenced by protein flexibility and solvation [28]. Inducing flexibility at the ligand binding site can lead to the sampling of a wide range of ligands, instead of discarding them at the initial stages of docking as non-binders. The scoring functions that accompany a docking tool might be simplified to compromise between speed and accuracy. Certain scoring functions tend to provide better scores for certain type of binding sites [29] [30]. This dependence of scoring function on the binding site should be properly weighted. GLIDE [31] appears to be one of the best docking suites and provides most consistent results with respect to diversity of binding site, ligand flexibility and overall sampling. Glidescore (gscore) is an effective scoring function and shows maximum accuracy when compared to other docking tools such as GOLD and ICM [32]. GLIDE provides an opportunity to minimize the receptor in the membrane environment before docking, which proves to be very helpful in the case of membrane bound systems such as ORs.
Computational docking approaches can be useful in understanding odor-receptor interactions, since very few ORs have been de-orphaned experimentally. Of the huge set of mammalian olfactory receptors, 400 in H. sapiens and 1000 in M. musculus, only~50 receptor-odorant interactions are known [8]. The known interactions are based on studies with a limited set of odorants and their mixtures. OR orthologs respond to similar odors with dramatic differences in efficacy and potency even if OR orthologs respond to similar set of odors more frequently than paralogs [7].
This study aims at building an odorant profile for a chosen set of mammalian ORs using a receptor dataset of ten human and mouse homologous pairs of ORs and 125 known odorants as the ligand data set. The analysis helped build an odorant profile for an OR and compare the odorant profiles across homologous ORs. We employed the induced-fit docking (IFD) protocol to obtain the binding energy scores of odorants to the ORs. The odorant profiles for single ORs have been obtained using a limited set of odor molecules earlier [33][34][35][36][37][38]. The earliest analysis was on mouse OR that responds to eugenol. One study reports the importance of residue Ser 113 as the most important residue required for ligand binding [35]. The analysis on the same OR under different experimental conditions, shows residue Phe 182 to be important in ligand binding. The mutation of this Phe residue results in a loss of response to eugenol [37]. We present a longitudinal study in this paper, where we have developed a reliable computational pipeline to study more than one OR against a large number of ligands. The methodology in this study has been standardised using the binding site information of the mouse eugenol receptor (mOR-EG) and validated using known experimental data, where possible.

Receptor Dataset
A subset of twenty mammalian olfactory receptors, out of the 100 ORs that were modeled using homology modeling protocol in our previous analysis [39], were used for the analysis to decipher the odorant profile of the twenty ORs. From phylogenetic analysis on human and mouse ORs, ORs are known to form ten distinct clusters [40] (B.Nagarathnam, Ph.D thesis, 2013) which can further be divided into smaller subclusters. One human olfactory receptor was selected from every cluster of human OR phylogeny for this analysis. Thus, ten human ORs obtained were aligned to the 338 mouse OR sequences [40]. The ten mouse ORs, which clustered very close to each of the ten human ORs used in this study, were selected for docking analysis, thereby leading to ten pairs of closely related OR sequences from the human and mouse OR repertoires which were used for unravelling the odorant profiles.

Ligand Dataset
One hundred and twenty-five odorant molecules were chosen for this study (Table 1). These molecules were selected from earlier studies which have proven them to be odorants that can elicit response from olfactory receptors by in-vitro or in-vivo analysis [41][42][43][44][45]. The odorants included mammalian and insect-specific odorant molecules. The molecules belonged to different chemical classes like alcohols, ketones, acids, aldehydes and sulphuric compounds. Known antagonists of ORs were also present in the collection of odorants. The three-dimensional coordinates of ligands were obtained from PubChem3D [46] and prepared for docking studies using the Ligprep suite of Schrodinger GLIDE software (Schrödinger Release 2013-1: LigPrep, version 2.6, Schrödinger, LLC, New York, NY, 2013).

Grouping of Ligands by Clustering
MOLPRINT2D [47] [48] and Tanimoto co-efficient [49] of the CANVAS module (Schrödinger Release 2013-1: version 2.6, Schrödinger, LLC, New York, NY, 2013) in Schrödinger software [50] [51] were used to cluster the ligands based on their molecular and chemical features. The molecular descriptor calculates numerical binary values such as log P, molecular weight, electronic and valence states, 3-D pharmacophore interactions and the distance between molecular fingerprints from their molecular features. The Tanimoto-coefficient calculates the chemical fingerprint of each odorant using fragment-based binary representation. Given two molecules, the method calculates similarity upto a given bond along a linear path. The branching points and cyclic patterns from each of the linear paths are then detected. Using a proprietary hashing method, a given bit number is set for each pattern. Fourteen known repellents [52] were included in the clustering for associating their relationship to known odorant molecules through clustering approach ( Table 1). The repellents were not used in the current docking analysis and could form a basis for a future study on comparing the binding patterns of odorants and repellents. The resulting ligand clusters were used to analyse the docking results.

Induced-Fit Docking
The induced-fit docking module of Schrodinger GLIDE software (Schrödinger Release 2013-1:, version 2.6, Schrödinger, LLC, New York, NY, 2013) was employed for docking 125 ligands to 10 pairs of closely related human and mouse olfactory receptors [23], [53]. The Schrödinger suite provides the opportunity to analyse GPCR-like membrane proteins in implicit and explicit membrane environments, thus mimicking the biological environment of these proteins. The homology models of ORs were energy-minimized in implicit membranes for further use in docking studies [39]. The docking protocol was standardized using the prior information on mouse eugenol receptors. The binding pocket of class A GPCRs are known from several studies [6], [54]. Table 2 shows the parameters that were employed to standardize the protocol of induced-fit docking for ORs using the mouse eugenol receptor and its ligand. The fifth parameter was chosen as the best, since it yields the best score for the known receptor-ligand complex. On the basis of this standardization, large-scale induced-fit docking was carried out using the grid made of all the residues in TM 3, 4, 5 and 7 in the upper half of the receptor in the membrane bi-layer which covers the known binding site of any given OR/GPCR protein (Fig 1). The side chains of residues, which are within 7Å of the initial ligand binding pocket, were given flexibility so as to induce conformational flexibility in the receptor. The residues in the receptor were scaled to 0.70 for Van der Waals interaction while for ligands it was scaled to 0.50 of the existing Van der Waals interaction scores. The XP scoring [31] of Schrödinger IFD module was used to score the final ligand-receptor complexes. The receptor-ligand pair with Table 1. List of odorants used in the docking analysis. The odorants have been classified based on their functional groups. Odorants which are known to bind to insect ORs are listed separately. There are two odorants specific to ORs from model organisms C. elegans (odr-10) and M. musculus (mOR-EG). Repellents were chosen for clustering along with the odorants (based on chemical similarity) to understand their similarity to the odorants.

Trans-anethol 637563
Cineol 10106 Estragol 8815 Safrol 5144 Citralva 1551246 Limonene 22311 the highest score (gscore) was selected to compare the best binding mode for the selected receptor pairs. All the 125 receptor-ligand poses for a given receptor were ranked in the descending order of gscore. The ligand profile for each olfactory receptor was used for comparison across ORs and validation of the protocol (Fig 2).

Molecular Dynamics (MD) Simulations of Mouse OR-EG
The methodology for molecular dynamics simulations is as follows. The mouse OR73 models, in both ligand-bound and unbound form, were energy-minimized in an implicit membrane environment until convergence. The energy-minimized structures were then used as the start point for MD simulations. The MD simulations were carried out using DESMOND module of the GLIDE software [55] for 20 ns using the OPLS_2005 force field in the presence of 1-palmitoyl-2-oleoylphosphatidylcholine (POPC) lipid bilayer and standard NPT conditions. The protein was solvated in an orthorhombic box with periodic boundary conditions by adding TIP3P water molecules. The initial equilibration was carried out using default protocol of restrained minimization followed by molecular dynamics simulations for 20 ns.

Optimization of Docking Protocol
Mouse eugenol receptor (MOR73/mOR-EG), isolated from olfactory receptor neuron was found to respond to eugenol, using calcium imaging studies, in heterologous cells, as well as in vivo studies [35]. The protein sequence of the receptor is available at NCBI (www.ncbi.nlm.nih. gov). Since it is a sufficiently well-characterised system, a homology model of the mOR-EG was built using the methodology described in our earlier analysis [39] [56] and used for evaluation GRID selected for Induced Fit Docking. Induced Fit docking protocol was standardized using the experimental data available on mouse ORs that responds to eugenol (mOR-EG). Different grid parameters and constraints were used to standardize the protocol as shown in Table 2. The use of the upper half of the receptor facing the extracellular milieu gave the best score for eugenol binding as compared to the other parameters. Thus similar grid parameters were used for all the IFD runs. The receptor TM helices 1-7 are coloured in VIBGYOR colour (Violet, Indigo, Blue, Green, Yellow, Orange and Red).  [35] [37]. Further, the mutation of the residue Phe 182 resulted in complete loss of response of receptor to eugenol [37]. Different grid parameters were used ( Table 2) to obtain a parameter which would yield highest score for binding of eugenol to the receptor and also include all the residues known to be involved in ligand binding in the binding pocket. In 90% of the poses, Ser 113 was not found to interact with the ligand, but was present at a distance of 6Å around the ligand. The receptor with different rotameric states of Ser 113 (S1 Fig) was used as the input for IFD, to check if change in rotameric state of the residue results in its interaction with ligands. However, changing the rotameric state of Ser 113 did not change its binding affinity to the ligand. Phe 182, however, was found to form H-bonds with more than 80% of the ligands including eugenol (Fig 3). All the other residues shown to be at the binding Induced Fit Docking Protocol. This figure represents the methodology followed for Induced Fit Docking. Ten pairs of human-mouse ORs were used as receptors and the 125 odorants as ligands and IFD was carried out using XP scoring. The odor profile for all the receptors obtained using IFD has been represented as heat map (Fig 8).
doi:10.1371/journal.pone.0131077.g002 site from earlier studies were similar in our evaluation too. The parameter 5 gave the best binding energy for eugenol and was thus selected for further large scale docking analysis (Table 2).
Olfactory Receptor 'Orthologs' with High Sequence Identity Do Not Share Similar Ligand Binding Profile Ten pairs of closely related human-mouse olfactory receptors were selected for docking analysis. The receptor pairs had varying sequence identity. The highest identity between a receptor  pair was 84%, while the lowest was 43% ( Table 3). The true orthologs (Pairs 2, 4, 5 and 7) have been marked with ' Ã ' (Table 3). This varying sequence identity in the dataset helped us analyse the possibility of whether highly similar OR sequences respond to similar ligands. The ligandbinding profiles for the first ten highest scoring ligands were compared for all OR pairs (Fig 4). It was observed that the OR pair with highest sequence identity (84%) has four common ligands out of ten best scoring ligands, while the OR pair with 72% and 76% sequence identity would respond to eight common ligands out of best ten scoring ligands. The ligand clusters were then analysed to check whether the ten high scoring ligands for the receptor pair with highest sequence identity belongs to the same cluster (Table 4), which wouldindicate that the response of receptors depends on chemical composition of the odorant and not on the odor emitted by the odorant. The receptors with highest sequence identity neither respond to common ligands nor to ligands belonging to similar clusters. This confirms that subtle changes at binding site compositions could result in differential odorant binding and odor detection. Such conclusions have been arrived at by several studies involving OR response to odorant under different circumstances. OR genetic polymorphism is known to alter function and, on an average, two individuals have functional differences at over 30%, suggesting that a given olfactory receptor with minor allelic variations across individuals of the same species could exhibit  difference in responses to similar ligands [57]. Eighty seven percent of human-primate orthologs and 94% of mouse-rat orthologs showed differences in receptor potency to an individual ligand [7]. Despite high overall sequence identity (of 84%), only four residues are identical at the binding site of OR pair 2, while other residues are different. This difference in the local chemical environment could explain the varied response to a given set of odors of two closely related ORs (Fig 5). The electrostatic surface representation of the binding site of two receptors clearly shows the variation in the local chemical environment which could lead to different ligand binding profiles for the two receptors (S2 Fig). This difference in binding profiles may not be reflected by marked differences in gscores between human and mouse OR homologous pairs. The distribution of gscores (maximum, minimum and spread of gscores) for all the OR

Ligand Dataset
The ligand dataset consists of 125 odorant molecules belonging to various chemical classes like alcohols, ketones, carboxylic acids, aldehydes and sulphur containing compounds (Fig 6). The ligands were clustered using the canvas module of Schrödinger software into 36 unique clusters. The clustering was analysed at merging distance ranging from 0.1 to 1.0 at regular intervals of 0.5 (Table 5). At each of the merging distances, the clusters were manually checked to confirm that ligands with similar features were clustered into a group. The clustering that resulted in maximum number of similar ligands in a given cluster was selected for further analysis. The merging distance of 0.85 yielded 36 clusters and was used for further studies. The cluster 33 had 55 aliphatic odorant members in it and it was further divided into 11 sub-clusters based on the number of carbon atoms. The number of ligands in each cluster is given in Table 6. Based on MOLPRINT2D, the ligands were classified based on their molecular weight, number of rotatable bonds, number of aromatic rings and number of hydrogen bond donors and acceptors. More than 60 of the ligands have a molecular weight between 100-150 Daltons. The ligands contain 1-11 rotatable bonds while 75% of the dataset contains aliphatic chains. Seventy odorants contain at least two hydrogen bond acceptor groups, while 80 ligands contain at least one hydrogen bond donor group (Fig 7). The ligand clusters were further used to compare odor-binding profiles of OR proteins under study. Binding of similar odorants or odorants belonging to the same clusters to a given OR will indicate common binding modes.

Induced Fit Docking
Induced fit docking of 10 homologous human-mouse OR pairs. One hundred and twenty five odorant molecules, as mentioned earlier, were docked to each of the twenty olfactory receptors individually using the IFD module of GLIDE Schrödinger software (Schrödinger Release 2013-1:, version 2.6, Schrödinger, LLC, New York, NY, 2013). Each IFD run takes upto six days on an I7 Linux machine with 2 processors. For each receptor, 125 or more complexes were generated based on the different tautomeric states of ligands. The table of energies has been reproduced as a heat map for visualization (Fig 8).
The average energy of OR binding to odorants is in the range of -4kcal/mol to -6kcal/mol. The average energy of interaction between human ORs and the odorants is -4.85kcal/mol, while for mouse ORs it is slightly higher, -5.09kcal/mol. The average difference in the binding of odorants between closely related human and mouse OR pairs was calculated. The pair with the highest sequence identity (Pair 2) has the minimum difference of average binding (Table 7) scores indicating overall similarity in binding mode between closely related OR sequences. The odorants with the highest scores for the 20 ORs belong to varied clusters of the ligand clustering data, perhaps since chemically similar odorants exhibit different odors and a given OR recognizes odorants based on shape or odor similarity.
Validation of docking protocol. In this section, we summarize the data on OR-odorant interactions known till date in the light of our computational study of OR-ligand modeling.
The mOR-EG receptor is known to respond to eugenol and compounds belonging to various chemical classes (vanillin-like compounds, polycyclic compounds and benzene derivatives etc.) [37]. The results using the IFD protocol mentioned above identifies similar vanillin-like, polycyclic and aromatic compounds (Helional, Ethyl-vanillin and Piperonyl acetone) to be high scoring ligands as compared to eugenol (Tables 8 and 9). Human OR1A1 (belonging to OR pair 2) responds to citronellol and helional even at lower concentrations when compared to aldehydes with 6-9 carbons atoms (Table 9) [58]. Among the different stereoisomers of citronellol, the receptor is more responsive to (-) citronellol than (+) citronellol. The hydrophobic binding pocket is very similar to the one observed in mOR-EG receptor. The TM 3, 4, 5, 6, and 7 are involved in interactions with the ligand. Gly 108, Asn 109 and Ser 112 are involved in interactions with the ligand and mutation of these residues results in a reduction of response to these odorants. These residues are found in the binding pocket of OR1A1, derived from the current OR-ligand docking protocol. Helional is the highest scoring ligand (-8.51kcal/mol) in this mini-virtual screening exercise, while (-) citronellol obtains a GLIDE score of -5.76kcal/mol, though the binding pockets for both ligands are similar in our analysis (Fig 9). (+) citronellol scores lower than the two above mentioned odorants. Comparing the residues at the binding pocket for the close homologue of OR1A1 in mouse (Fig 5), we observe that the four residues known to be important in ligand binding are Fig 8. Heat map of the odorant profile of 10 human-mouse OR pairs. X-axis shows the human-mouse OR pair and the sequence identity between human-mouse OR pairs. Y-axis indicates the number of 130 odorants used in this study. The heat map is obtained using the gscore (kcal/mol) of interaction of each ligand to the given receptor. The scores have been normalized between 0 to 1 as shown in the scale. The odorants for which experimental data are available (Steroids, Helional, Undecanal, Eugenol and Citronellol) have been marked with a red arrow. Insect ORs have been marked in a green rectangular box. The heat map has been generated using R software.
doi:10.1371/journal.pone.0131077.g008 common, while the rest of the binding pockets differ in the composition of residues. This variability at the functional site allows the closely related OR sequences to bind to myriad odorants.
Human OR1D2 is a receptor found in human spermatozoa [59]. It is known to respond to bourgeonal and is suppressed by undecanal ( Table 9). The OR1D2 receptor is evolutionarily related to the human receptor 7D4, that detects steroids such as androstenone and androstadienone. Point mutations in OR7D4 result in variations in response to the known odorants across different individuals [60]. It is reported that OR1D2 also responds to steroid hormones with lesser efficacy as compared to OR7D4 (Table 9). In the docking analysis, androstenone and androstadienone are observed as the best scoring ligands for 1D2, with a GLIDE score in the range of -10kcal/mol, while bourgeonal binds with a score of -4.48kcal/mol. The binding pockets remain similar for both the odorants. This study confirms the fact that by subtle changes at Table 7. Average difference in binding energies (kcal/mol) of odorants to 10 human-mouse OR pairs. The average binding energies of 125 odorants to each of the ORs were calculated and the difference in the average energy between each human-mouse OR pair has been reported. The OR pair 2 (with the highest sequence identity of 84%) has the minimum difference in binding energy.  Undecanal is found to bind better than many selected odorants and in the same binding pocket as the known ligands suggesting competitive inhibition.

OR pair
Eugenol and compounds structurally similar to it such as ethyl vanillin have been ranked in the top 10 best binding odorants to mOR-EG. (Please refer Table 8).
[8], [15] and [35].  The interaction between OR1A1 and the odorant (-) Citronellol is found to score better than the interaction between OR1A1 and (+) Citronellol in the current study. The review reports (-) Citronellol to be an agonist for OR1A1 while the other stereoisomer do not activate the OR (please see text for details). M71 OR (Mouse) acetophenone and benzaldehyde - [69] (Continued) the receptor binding site, the receptor can accommodate similar ligands. The difference in ligand binding scores could be because of the difference in the functional groups of the odorants (Fig 10).

Human ORs Aldehydes and Helional
In this review psychometric function test was used to show that Helional is the most potent Aldehyde at a low odorant concentration. In the current study we find that of all the aldehyde group of odorants, Helionalreceptor complexes have the highest average gscore (please refer Table 9). [62] OR7D4 and OR1D2 (Human) Bourgeonal, androstenone,androstadionone.
The human OR7D4 and OR1D2 and known to be evolutionarily related and found to be expressed ectopically in testis. Bourgeonal is known to be the endogenous ligand for these receptors, while they do respond to androstenone and other testicular odorants. In the current study androstenone is reported to be the highest scoring OR-odorant complex for 1D2. Bourgeonal is found to bind in the same binding pocket as androstenone but with a lower binding score. [60] OR1D2 (Human) Bourgeonal and Undecanal Bourgeonal is the known agonist for OR1D2 while undecanal is the antagonist. In the current study both bourgeonal and undecanal bind in the same binding pocket of the OR1D2. The OR-undecanal complex scores higher than the OR-bourgeonal complex indicating that undecanal may act as a competitive inhibitor. [59] doi:10.1371/journal.pone.0131077.t009 respond to bourgeonal, the agonist [15]. The gscore of undecanal interaction with the receptor 1D2 in this study is -5.2 kcal/mol, which is lower than binding affinity of the highest scoring pair (-10.73kcal/mol), but better than bourgeonal (-4.48Kcal/mol) which is the known agonist (inhibited by undecanal). The binding site of undecanal is same as the other high scoring ligands, suggesting competitive inhibition of these receptors by undecanal (Fig 11).
Aldehydes of varying carbon length show high response by human olfactory receptors (Table 9) [62]. Helional is the most potent aldehyde when compared to butanal, hexanal, heptanal, octanal, nonanal and decanal. Helional has the highest average docking score (-5.66kcal/ mol) for the twenty olfactory receptors under study by IFD. The average score for other aldehydes are as reported in Table 10. The highest score of helional is -12.26kcal/mol and it forms three H-bonds, one salt bridge and one pi-pi interaction (Fig 12), which results in the most stable interaction as compared to other ligands. Invariably, larger ligands would score better than eugenol due to higher extents of hydrophobic interactions.
Insect olfactory receptors are known to detect odorants with lesser numbers of carbon atoms (2-5 carbon atoms), while mammalian odor detect odorants with higher numbers of carbon atoms (5-12 carbon atoms) [42] [63]. The ligand dataset consisted of 25 insect OR specific odorants, comprising about one-fifth of the whole dataset of odorants. The average energy score of insect odorants upon docking to the 10 human-mouse pairs is -4.15 kcal/mol, as compared to mammalian specific odorants which have an average energy score of -5.18 kcal/mol. This confirms that mammalian specific odors form better interactions to mammalian ORs and thus have a high binding score. Similar studies on insect ORs require the availability of homology models for many insect ORs and their co-receptors. Due to the inverted topology and varying loop lengths, it is difficult to obtain high quality homology models for several insect ORs [64]. Such a study will, however, be very important in understanding how insect vectors detect human hosts using the sense of smell.

Antagonistic Activity of Odorants
Olfactory receptors exhibit a combinatorial code of response [1]. Odorant response varies when presented as single odorant and as a mixture of odorants. In a mixture, some odorants are known to antagonize the effect of other odorants and the response is the cumulative effect of all the odorants in the mixture. The antagonistic effect also depends on the neuron in which the OR is expressed. There has been no study to differentiate the perception of agonists and antagonists [15] as antagonists also bind to the ORs unlike the non-binders (which can be differentiated using the free energy of binding). The odorant which acts as an agonist for one OR could behave as an antagonist for another OR [41]. The response to antagonists may not necessarily lead to an inactive state of the receptor. It may result in a decreased response of the agonist and thus cannot be differentiated at the receptor expression levels. Antagonists tend to be structurally related to agonists. For example undecanal (an antagonist) is structurally similar to bourgeonal (an agonist) [41]. In nature, odorants exist as a mixture and very rarely as a single compound. Thus, in the docking studies to understand one to one OR-odorant relation, it becomes difficult to differentiate the antagonists from the agonist until one studies the activation of ORs using these ligands [65,66]. In this study, it is observed that undecanal, a proven   Table 10. Average gscore (kcal/mol) for interactions between aldehydes in the odorant dataset to the 10 human-mouse OR pairs. The average binding energy of each of the aldehyde to the 20 ORs was calculated. Helional is known to be the most potent aldehyde as compared to aldehydes with 5-10 carbon atoms.

Odorant (aldehyde)
Average gscore (kcal/mol) of binding to 10 human-mouse OR pairs antagonist for the human OR1D2, scores higher than the endogenous ligand, bourgeonal. It is known to inhibit the response of OR1D2 to bourgeonal by binding to the same ligand binding pocket as that of bourgeonal [41].

MD Simulations of Mouse OR-EG
MD simulations of the mouse OR-EG was performed as mentioned in methods. The energy drift of the ligand-bound form in the initial 10 ns is~-300kcal/mol while in the last 5 ns it is-100kcal/mol. This suggests that the receptor in the ligand-bound form remains in a stable state throughout the simulation, without huge differences in the energy of the system. Overall, the ligand-bound form has lower energy throughout the simulations as compared to the unbound form (S4 Fig). We find the ligand-binding pocket is made up mostly of hydrophobic residues and few polar residues that form H-bonds. Ser 113, which is shown to be important in ligand binding [35,37], is found to form a H-bond in about 35% of the overall simulation time (S5 and S6 Figs). We find that the residues at the binding site are spatially clustered and remain so throughout the MD simulation indicating that the ligand is bound firmly in a particular binding pocket and does not switch positions (S7 Fig).

Database of Olfactory Receptors-Access to Receptor-Ligand Complexes
The Database of Olfactory Receptors (DOR database) (http://caps.ncbs.res.in/DOR) [67] contains information on sequences, phylogenetic analysis and homology models of olfactory receptors from five eukaryotic organisms. Models of the Receptor-Ligand complexes for the 20 olfactory receptors with each of the 125 ligands have now been included in the DOR database. The files are available in the 'LIGAND DOCKING' tab of the database. The user can download a compressed 'tar file' for each olfactory receptor and its ligand complexes. The olfactory receptors are labelled as per their 'GI Ids' ( Table 3). The receptor-ligand complexes are in the PDB format and they are labelled based on the Pubchem code for each of the ligand used in the study ( Table 1). The availability of all the protein-ligand complexes in the public domain will be helpful for a wide range of analysis on these classes of proteins.

Conclusion
Previously, we had exploited distant relationships between ORs and GPCRs to arrive at threedimensional models of 100 ORs using tools like homology modelling [39]. Olfactory receptors are known to have a combinatorial response to odors and OR-ligand discrimination has been recorded in literature only for a few ORs through careful experiments. In this paper, we selected 20 ORs of both human and mouse origin, and used docking and virtual screening of 125 known ligands to arrive at OR-ligand profiles. To the best of our knowledge, this is the first longitudinal large-scale computational study using docking to arrive at OR-ligand profile. Further, docking scores that correlate well with OR-ligand affinities known from experiments have been obtained. Eugenol and eugenol-like ligands were recognised as top-ranking ones by the current docking protocol. We have shown the selective non-affinity of Drosophila OR-ligands by mammalian ORs. The current docking protocol and scores are sensitive even to identify better ligands between stereoisomers like (+) and (-) citronellol. Known ligands and inhibitors could be correctly identified for MOR73, human OR1A1 and 1D2 using docking scores. We are currently predicting a large number of OR-ligand pairs whose relative affinities are yet to be tested. Using a well-validated protocol, methods have been standardized to obtain an odorant profile, through mini-virtual screening, for a given olfactory receptor protein for a limited number of odorants. Olfactory receptors bind to myriad of odors and it is difficult to decode this complex combinatorial response process. Many OR-odorant profiles still remain undeciphered. In silico tools like homology modeling and induced fit docking provide us the advantage of inducing flexibility to both receptor and ligand. This creates a scenario very similar to the one that occurs biologically in a cell, wherein a receptor undergoes conformational changes to accommodate a given ligand. OR sequences exhibit great diversity. Homologous OR sequences do not respond similarly to a given set of odorants. A small change in residue composition at the binding site results in different odor profiles, which cannot be realised from the overall sequence identity of two ORs under question. The binding site and binding mode vary greatly across ORs. This helps the ORs recognize numerous odors in the environment. Another method that could be pursued is to introduce flexibility to the ORs using Molecular Dynamics (MD) simulations. The different conformations obtained for a receptor can then be used to identify a set of odorants that would bind to the receptor above a given energy threshold. Molecular Dynamics simulation study for a large data set of receptors (400-1000 mammalian ORs) is very computer intensive and time consuming. In this regard MD simulations were carried out for mouse OR-EG in the eugenol-bound and unbound form. We find that key interactions between this ligand and OR remain the same throughout 20 ns simulations.
ORs are known to be expressed in tissues other than oro-nasal cavity i.e. testis, lungs and pancreas [68]. Obtaining the odorant profile for such ORs will help us in understanding their role in the given tissue. ORs are known to be over-expressed in certain types of cancer and diabetes and it will be interesting to decode the function of such ORs that could then be used in pharmacological studies. From the current study, known ligands are observed to bind with a energy threshold greater than -4.5kcal/mol. This can be used as a cut-off to obtain ligand profile for a given odorant. Highly related OR pairs show least difference in average binding energy to the given set of odorants. This can be used to compare the odorant profiles of similar ORs, especially in cases where one of the OR has been de-orphaned. Induced fit docking protocol can thus be systematically used to understand the structural and functional divergence of olfactory receptor class of proteins.
Supporting Information The distribution of gscores for OR proteins and the odor molecules has been represented as a Box-Whisker plot (prepared using R-scripts). The plot represents the spread of gscores for 125 odor molecules against each of the 20 OR proteins. The gscores range from -4 to -6 kcal/mol for all the ORs under study. For the OR pair (Pair 2) with highest sequence identity (84%) the median value of gscore is equal. The circles outside the plot represent the outliers. The OR proteins are numbered based on the pair they belong to.