Conceived and designed the experiments: ADA JRB. Performed the experiments: ADA. Analyzed the data: JT AVM. Wrote the paper: ADA AVM.
The authors have declared that no competing interests exist.
Combinatorial sensor arrays, such as the olfactory system, can detect a large number of analytes using a relatively small number of receptors. However, the complex pattern of receptor responses to even a single analyte, coupled with the non-linearity of responses to mixtures of analytes, makes quantitative prediction of compound concentrations in a mixture a challenging task. Here we develop a physical model that explicitly takes receptor-ligand interactions into account, and apply it to infer concentrations of highly related sugar nucleotides from the output of four engineered G-protein-coupled receptors. We also derive design principles that enable accurate mixture discrimination with cross-specific sensor arrays. The optimal sensor parameters exhibit relatively weak dependence on component concentrations, making a single designed array useful for analyzing a sizable range of mixtures. The maximum number of mixture components that can be successfully discriminated is twice the number of sensors in the array. Finally, antagonistic receptor responses, well-known to play an important role in natural olfactory systems, prove to be essential for the accurate prediction of component concentrations.
Mammalian and insect olfactory systems are combinatorial in nature: instead of activating a single specialized receptor, each analyte evokes a complex pattern of responses across the receptor array. The advantage of such systems lies in their ability to detect a large number of analytes with a relatively small number of receptors. However, the complexity of array responses to mixtures of analytes makes quantitative prediction of component concentrations a challenging task. Here we show that combinatorial output from an array of four engineered G-protein-coupled receptors can be used to predict the concentration of each component in mixtures of highly related sugar nucleotides. We employ a physical model of ligand-receptor interactions and carry out Bayesian analysis of the array output. Furthermore, our
Mammalian and insect olfactory systems are capable of recognizing tens of thousands of odors – mostly organic compounds with diverse chemical structures and properties
The idea of combinatorial recognition has been adapted to artificial arrays in which multiple sensors with partially overlapping selectivities respond to a given analyte
Here we describe a physical model of receptor-ligand recognition that explicitly relates observed response patterns to component concentrations and receptor properties, making it easier to quantify mixture constituents. We use Bayesian inference to predict absolute concentrations of each ligand in arbitrary mixtures of uridine diphosphate (UDP) sugar nucleotides applied to a combinatorial array of four GPCRs. Furthermore, we develop a universal metric of receptor array performance, and use it to study the fundamental limits imposed on the accuracy of ligand recognition by the physics and biology of receptor-ligand interactions. Finally, we provide design guidelines for constructing cross-specific arrays optimized for mixture recognition, and demonstrate that inhibitory responses are essential for simultaneous detection of all components in a complex mixture.
Our sensor array comprises four engineered receptors (L-3, H-20, K-3 and 2211) with distinct but overlapping specificities for four types of nucleotide sugars: UDP-glucose (UDP-Glc), UDP-galactose (UDP-Gal), UDP-N-acetylglucosamine (UDP-GlcNAc) and UDP. The receptors were evolved
We start with the simplest case in which a receptor interacts with a single ligand. We assume that the observed signal in our receptor-bearing reporter strain is proportional to the probability that the receptor is bound by the ligand. This proportionality value,
Once all receptor-ligand interaction parameters have been determined through the analysis of single-ligand calibration experiments, we can proceed to interrogating mixtures of ligands with receptor arrays. In considering the response of receptor-bearing strains to ligand mixtures, we note that each ligand contributes to the overall receptor occupancy and that each receptor molecule on the cell surface activates the reporter with an efficacy specified by the ligand to which it is bound, which is often different for different ligands (
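The mixture response described above can be illustrated with a minimal sketch. It assumes a standard competitive-binding form in which every ligand competes for the same site and each bound state contributes its own efficacy; the function name `mixture_response`, the units (binding free energies in kT, concentrations in arbitrary units) and the example parameter values are assumptions made for illustration, not the calibrated values of the receptors studied here.

```python
import math

def mixture_response(conc, dG, efficacy, background=0.0):
    """Reporter signal of one receptor exposed to a ligand mixture.

    conc[i]     : concentration of ligand i (arbitrary units, assumed)
    dG[i]       : binding free energy of ligand i in kT
                  (assumed convention: more negative = tighter binding)
    efficacy[i] : signal produced per receptor bound by ligand i
    """
    # Statistical weight of each bound state relative to the empty receptor.
    weights = [c * math.exp(-g) for c, g in zip(conc, dG)]
    z = 1.0 + sum(weights)  # empty state plus all bound states
    occupancy = [w / z for w in weights]
    return background + sum(a * p for a, p in zip(efficacy, occupancy))

# Hypothetical receptor: ligand 0 is a full agonist, ligand 1 an antagonist
# that binds equally well but produces no signal.
dG = [-2.0, -2.0]
eff = [1.0, 0.0]
alone_0 = mixture_response([1.0, 0.0], dG, eff)
alone_1 = mixture_response([0.0, 1.0], dG, eff)
mixed = mixture_response([1.0, 1.0], dG, eff)
```

In this sketch the antagonist lowers the signal of the mixture below that of the agonist alone, because both ligands compete for the same receptor population. This is one simple way in which a mixture readout can fail to be a sum of single-ligand responses.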
We have tested our approach using a series of assays in which a known combination of ligands was applied to the receptor-bearing strains. As an initial test, we mixed equal proportions of two, three and four ligands in all possible combinations and predicted absolute ligand concentrations. We used a model in which four ligands interacted with four receptors, even if only one, two or three ligands were actually present in the mixture. As can be seen in
We used nested sampling of a four-receptor, four-ligand model to estimate means and standard deviations for the relative concentrations of all ligands in the mixture and the total ligand concentration at the
Our second test involved combining UDP-Glc and UDP-Gal in several unequal proportions and applying the resulting mixture to the four-receptor array (
We used nested sampling of a four-receptor, four-ligand model to estimate means and standard deviations for the relative concentrations
Increasing the number of receptors should improve prediction accuracy by providing additional information about the mixture. To see the extent of these improvements, we have used a variable number of receptors to infer component concentrations in six equal-proportion mixtures of two nucleotide sugars from
Shown on the log-scale are means and standard deviations for
As evident from the activation profile of each receptor in response to each ligand (
Our Bayesian approach estimates posterior probabilities for the concentration of each component in an arbitrary mixture. With sufficient data, variation of the posterior probability with model parameters is determined by the corresponding log-likelihood (Eq. (8)), which can be visualized as a multidimensional landscape. The global maximum on this landscape corresponds to the model that best describes the data, while the curvature at the maximum shows how sensitive the likelihood is to the change in each parameter. Narrow peaks result in precisely defined parameter values, whereas wide plateaus yield many nearly equivalent predictions and therefore sizable uncertainties in parameter estimates. Expanding the log-likelihood in the vicinity of its maximum yields a Hessian matrix (Eq. (9)), which contains information about standard deviation
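As a rough illustration of how curvature at the likelihood maximum translates into parameter uncertainties, the sketch below computes a finite-difference Hessian of a negative log-likelihood for two ligand concentrations and reads off approximate standard deviations from its inverse. The competitive-binding response, the Gaussian noise level `sigma`, the dilution series and all parameter values are assumptions chosen for illustration; the expressions in Eqs. (8) to (10) may differ in detail.

```python
import math

def response(conc, dG, eff):
    """Signal of one receptor (assumed competitive-binding form)."""
    w = [c * math.exp(-g) for c, g in zip(conc, dG)]
    z = 1.0 + sum(w)
    return sum(a * wi / z for a, wi in zip(eff, w))

def neg_log_like(theta, data, dG, eff, sigma):
    """Gaussian negative log-likelihood, up to an additive constant."""
    total = 0.0
    for dilution, observed in data:
        predicted = response([dilution * t for t in theta], dG, eff)
        total += (observed - predicted) ** 2 / (2.0 * sigma ** 2)
    return total

def hessian_2x2(f, theta, h=1e-3):
    """Central finite-difference Hessian of f at a two-parameter point."""
    def second(i, j):
        def at(si, sj):
            p = list(theta)
            p[i] += si * h
            p[j] += sj * h
            return f(p)
        return (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4.0 * h * h)
    h01 = second(0, 1)
    return [[second(0, 0), h01], [h01, second(1, 1)]]

# Hypothetical agonist/antagonist receptor and noise-free synthetic data.
dG, eff, sigma = [-1.0, -1.0], [1.0, 0.0], 0.05
true_conc = [1.0, 1.0]
dilutions = [0.1, 0.3, 1.0, 3.0, 10.0]
data = [(d, response([d * c for c in true_conc], dG, eff)) for d in dilutions]

H = hessian_2x2(lambda th: neg_log_like(th, data, dG, eff, sigma), true_conc)
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
# Diagonal of the inverse Hessian approximates the parameter variances.
std_errors = [math.sqrt(H[1][1] / det), math.sqrt(H[0][0] / det)]
```

A sharply curved maximum gives a large determinant and small `std_errors`; a flat direction in the landscape drives the determinant toward zero and the errors toward infinity.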
Hessian analysis relies on the quadratic expansion in the vicinity of the log-likelihood maximum and hence it is important to check how well it captures the behavior of the more general but computationally intensive nested sampling approach. To create a test case for which the answer is known, we have used Eq. (6) to generate synthetic data for 15 equal-proportion mixtures from
Not all receptors are equally good candidates for inclusion into biosensor arrays – for example, receptors with similar sets of efficacies and binding affinities should be less useful than receptors with more orthogonal binding and activation patterns. Here we make such qualitative insights precise by developing a Hessian approach to biosensor array design. That is, given a certain number of measurements with an array of fixed size (typically, a series in which the total concentration is changed step-by-step within a certain range), we wish to derive the optimal choice of receptor properties for deciphering the mixture. From the Hessian point of view, the best array will have the smallest errors in predicting component concentrations (Eq. (10)). Because each error is inversely proportional to the determinant of the Hessian, we maximize the determinant instead of minimizing the errors directly. Similarly to the prediction of constituent concentrations, the maximization is carried out by nested sampling
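To make the design metric concrete, the following sketch evaluates a Hessian determinant for two hypothetical single-receptor designs. It only scores given designs and does not perform the nested sampling maximization. The Hessian is built in a low-noise (Gauss-Newton) approximation from numerical derivatives of an assumed competitive-binding response; the function `hessian_det`, the dilution series and the parameter values are illustrative assumptions.

```python
import math

def response(conc, dG, eff):
    """Signal of one receptor (assumed competitive-binding form)."""
    w = [c * math.exp(-g) for c, g in zip(conc, dG)]
    z = 1.0 + sum(w)
    return sum(a * wi / z for a, wi in zip(eff, w))

def hessian_det(dG, eff, conc=(1.0, 1.0),
                dilutions=(0.1, 0.3, 1.0, 3.0, 10.0), sigma=0.05, h=1e-6):
    """Determinant of the 2x2 low-noise Hessian, sum_d J_d J_d^T / sigma^2."""
    H = [[0.0, 0.0], [0.0, 0.0]]
    for d in dilutions:
        grad = []
        for k in range(2):
            up, dn = list(conc), list(conc)
            up[k] += h
            dn[k] -= h
            # Central difference of the predicted signal w.r.t. concentration k.
            r_up = response([d * c for c in up], dG, eff)
            r_dn = response([d * c for c in dn], dG, eff)
            grad.append((r_up - r_dn) / (2.0 * h))
        for i in range(2):
            for j in range(2):
                H[i][j] += grad[i] * grad[j] / sigma ** 2
    return H[0][0] * H[1][1] - H[0][1] * H[1][0]

# Design 1: one agonist and one antagonist. Design 2: two identical agonists.
det_agonist_antagonist = hessian_det(dG=[-1.0, -1.0], eff=[1.0, 0.0])
det_two_agonists = hessian_det(dG=[-1.0, -1.0], eff=[1.0, 1.0])
```

With two indistinguishable agonists the signal depends only on the summed concentration, so the determinant vanishes and the individual concentrations cannot be resolved. The agonist/antagonist design yields a positive determinant under the same assumptions.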
To demonstrate our approach, we first optimize parameters of a single receptor discriminating a mixture of two ligands. By maximizing the determinant of the Hessian, in this case a
The fine-tuning of binding energies is not necessary if either the total concentration
The agonist-antagonist pattern observed in the one-receptor, two-ligand case plays the role of a basic building block when two or more receptors interact with multiple ligands: nested sampling maximization of the Hessian determinant with respect to binding energies
In the light of the observed agonist-antagonist behavior, it is not surprising to see that each receptor can identify concentrations of at most two ligands (
The patterns shown in
In general,
The agonist-antagonist rules described above create readout patterns that are not a simple sum of array responses to single-ligand binding. For one receptor optimized to discriminate two ligands (
The design guidelines described above can be used to predict which parameter changes lead to the most significant improvements in performance compared to our currently implemented array. Although we do not have direct experimental control over the values of
For the experimentally implemented four-receptor GPCR array, nested sampling errors are consistently larger when UDP is present in the mixture (
We have developed a Bayesian algorithm that allows determination of all the constituents in an unknown mixture from the output of a cross-specific sensor array. Our algorithm employs a physical picture of sensor-analyte interactions to model the non-linear relationship between ligand concentrations and the reporter response. After appropriate calibration of each sensor's response to each analyte of interest, the algorithm interprets the integrated output of the entire array and, with a sufficient number of variably tuned sensors, reliably returns the amount of each chemical in a complex mixture.
We also provide quantitative guidelines for designing optimal sets of sensors. Three general principles emerged from our computational and theoretical studies of array design. First, the optimal parameters of the sensors exhibit weak dependence on the relative amounts of compounds in a mixture. Thus a given set of optimal sensors will remain near-optimal through a sizable range of ligand concentrations. Nonetheless, analyzing a mixture where both compounds are present in roughly similar amounts is better accomplished with a set of sensors different from those fine-tuned to measure a small amount of one compound in the presence of a large excess of the other.
Second, the maximum number of ligands in a mixture whose levels can all be determined simultaneously is simply twice the number of sensors in the array. This linear relationship is different from the exponential relationship between ligands and receptors in olfactory systems
Third, the optimum design of receptors for the array demands that one of the ligands function as a strong agonist of a receptor and a second ligand as a strong antagonist of that receptor. Antagonists sharpen the discriminatory powers of the array by heightening the differences in the receptor response to individual compounds. As a result, a mixture of chemicals produces an array readout which is not a superposition of responses to individual ligands, and whose intensity pattern may be fine-tuned for maximum recognition through receptor-ligand binding energies. Accordingly, odors that function as antagonists to a subset of olfactory receptors could potentially increase the discriminatory power of the olfactory system, and in particular enable it to resolve mixtures that contain those odors. Recent analysis of olfactory receptors suggests that some odorants do possess antagonist activity
The L-3 mutant was isolated using a procedure similar to that previously employed with the H-20 and K-3 mutants
Our
Microtiter plate-based assays are often subject to edge- or plate-bias due to uneven heating or discrepancies in timing across a single plate or among plates. While no obvious plate effects were seen, it is very difficult to control for all possible variations in a single experiment. Due to the number of samples and the need to make efficient use of materials, each of the mixture experiments was split across two plates per receptor. In mixtures of equal proportions, samples containing UDP, UDP-Gal and UDP-GlcNAc but lacking UDP-Glc were on Plate 1, while all mixtures containing UDP-Glc were on Plate 2. In the UDP-Gal/UDP-Glc binary mixtures of unequal proportions, samples containing 90%, 80% or 60% UDP-Glc were on Plate 1, while samples containing 40%, 20% or 10% UDP-Glc were on Plate 2.
For each single ligand or combination of ligands, a series of measurements was performed at several values of the total concentration
For a single receptor interacting with a single ligand, we model the normalized reporter fluorescent intensity as:
We compute the log-likelihood of the data by assuming that fluorescence measurements are Gaussian-distributed around values from Eq. (1):
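A minimal sketch of these two steps follows, assuming a single-site binding curve for the response and independent Gaussian noise. The functional form, the parameter names `A` (amplitude), `b` (background), `dG` (binding free energy in kT) and `sigma` (noise level), and all numerical values are assumptions made for illustration and may differ from Eq. (1) and from the likelihood used here.

```python
import math

def single_ligand_response(conc, dG, A, b):
    """Assumed form: amplitude times binding probability, plus background."""
    w = conc * math.exp(-dG)  # statistical weight of the bound state
    return A * w / (1.0 + w) + b

def gaussian_log_likelihood(data, dG, A, b, sigma):
    """Log-likelihood of (concentration, intensity) pairs under Gaussian noise."""
    ll = 0.0
    for conc, observed in data:
        resid = observed - single_ligand_response(conc, dG, A, b)
        ll += -0.5 * math.log(2.0 * math.pi * sigma ** 2)
        ll -= resid ** 2 / (2.0 * sigma ** 2)
    return ll

# Noise-free synthetic calibration data from hypothetical true parameters.
true = dict(dG=-1.0, A=1.0, b=0.1)
data = [(c, single_ligand_response(c, **true)) for c in (0.1, 1.0, 10.0)]

ll_true = gaussian_log_likelihood(data, sigma=0.05, **true)
ll_off = gaussian_log_likelihood(data, dG=0.5, A=1.0, b=0.1, sigma=0.05)
```

Parameters that reproduce the data score a higher log-likelihood than shifted ones, which is the property any calibration fit exploits.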
The reporter response to a mixture of ligands is given by
The log-likelihood of the observed pattern of fluorescence intensities from multiple receptors interrogated by a mixture of ligands is given by
We estimate all posterior probabilities by nested sampling
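For readers unfamiliar with the method, the sketch below shows a deliberately simple form of nested sampling: live points are replaced by rejection sampling from the prior, and prior volumes shrink deterministically as exp(-i/N). It is not the implementation used in this work. The toy problem (inferring the fraction `f` of one ligand in a two-ligand mixture from a hypothetical agonist/antagonist receptor), the uniform prior, the noise level and all parameter values are assumptions made for illustration.

```python
import math
import random

def log_add(a, b):
    """Stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def nested_sampling(log_like, sample_prior, n_live=100, n_iter=600, seed=0):
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_ll = [log_like(x) for x in live]
    log_z, samples, log_x_prev = -math.inf, [], 0.0
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=lambda k: live_ll[k])
        log_x = -i / n_live  # expected log prior volume after i steps
        log_w = live_ll[worst] + math.log(math.exp(log_x_prev) - math.exp(log_x))
        log_z = log_add(log_z, log_w)
        samples.append((live[worst], log_w))
        threshold = live_ll[worst]
        while True:  # draw from the prior until the likelihood constraint holds
            cand = sample_prior(rng)
            cand_ll = log_like(cand)
            if cand_ll > threshold:
                break
        live[worst], live_ll[worst] = cand, cand_ll
        log_x_prev = log_x
    for x, ll in zip(live, live_ll):  # remaining live points share the volume
        log_w = ll + log_x_prev - math.log(n_live)
        log_z = log_add(log_z, log_w)
        samples.append((x, log_w))
    mean = sum(x * math.exp(lw - log_z) for x, lw in samples)
    var = sum((x - mean) ** 2 * math.exp(lw - log_z) for x, lw in samples)
    return log_z, mean, math.sqrt(var)

def response(conc, dG, eff):
    """Signal of one receptor (assumed competitive-binding form)."""
    w = [c * math.exp(-g) for c, g in zip(conc, dG)]
    z = 1.0 + sum(w)
    return sum(a * wi / z for a, wi in zip(eff, w))

# Toy data: true fraction 0.3 of the agonist, measured over a dilution series.
dG, eff, sigma, true_f = [-1.0, -1.0], [1.0, 0.0], 0.05, 0.3
data = [(d, response([d * true_f, d * (1.0 - true_f)], dG, eff))
        for d in (0.1, 0.3, 1.0, 3.0, 10.0)]

def log_like(f):
    ll = 0.0
    for d, obs in data:
        pred = response([d * f, d * (1.0 - f)], dG, eff)
        ll += -0.5 * math.log(2.0 * math.pi * sigma ** 2)
        ll -= (obs - pred) ** 2 / (2.0 * sigma ** 2)
    return ll

log_evidence, post_mean, post_std = nested_sampling(log_like, lambda rng: rng.random())
```

The weighted samples give both the evidence and posterior summaries such as `post_mean` and `post_std`. Rejection sampling from the prior becomes inefficient in higher dimensions, where practical implementations use constrained sampling schemes instead.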
The Hessian matrix in the low-noise limit can be written as (
JT wishes to thank Allan Haldane for technical advice.