Bayesian Estimation of the Active Concentration and Affinity Constants Using Surface Plasmon Resonance Technology

Surface plasmon resonance (SPR) has previously been employed to measure the active concentration of analyte in addition to the kinetic rate constants in molecular binding reactions. Those approaches, however, have a few restrictions. In this work, a Bayesian approach is developed to determine both active concentration and affinity constants using SPR technology. With the appropriate prior probabilities on the parameters and a derived likelihood function, a Markov Chain Monte Carlo (MCMC) algorithm is applied to compute the posterior probability densities of both the active concentration and kinetic rate constants based on the collected SPR data. Compared with previous approaches, ours exploits information from the duration of the process in its entirety, including both association and dissociation phases, under partial mass transport conditions; do not depend on calibration data; multiple injections of analyte at varying flow rates are not necessary. Finally the method is validated by analyzing both simulated and experimental datasets. A software package implementing our approach is developed with a user-friendly interface and made freely available.


Introduction
Surface plasmon resonance (SPR) was first applied to the study of molecular binding reactions in biology and chemistry in the 1980s [1][2][3]. The first commercial SPR instrument was introduced by BIACore in 1991 [4,5]. Since then, instruments using SPR have gained increasing popularity due to their high sensitivity and simple construction, and have become the accepted standard for measuring the kinetic rate constants for molecular binding interactions. SPR technology has many practical advantages: 1) it allows real-time detection of binding events; 2) no labeling is required; 3) it can generally be applied to binding reactions of many types, such as protein-protein, protein-peptide, protein-DNA, and protein-small molecule; 4) analysis can be carried out on colored or turbid samples without interference from absorption and scattering.
Since SPR measures function-binding between analyte and ligand-it is able to determine active concentration, which is not necessarily identical to the total concentration as measured, for example by optical density. Two approaches have been proposed. The more widely used approach is the "calibration-dependent" method [6][7][8]. Each quantification run, however, requires a new calibration curve, thereby increasing the time and total material cost. The other approach does not rely on explicit calibration. Christensen [9] developed the mathematical theory and computed the analytical solutions under the partial mass transport conditions for molecular interactions within a two-compartment model. This approach has proven to be useful in applications [8,10]. The method, however, has a few restrictions. It only exploits the initial binding phase of the sensorgram, requires several injections of analyte at various flow rates for each unknown sample, and assumes low noise levels on response signals in order to accurately determine the rates of change. These restrictions limit the applicability of this method. Sigmundsson et al. in 2005 [11] proposed a more general solution for the same process, and derived a complete analytical solution over the entire binding phase [9]. The solution involves Lambert's W function, a special function that is not available in many common statistical software packages. Moreover, no user-friendly software implementation has become available as a result of either effort.
In this work, we develop a Bayesian approach to estimating the active concentration and kinetic rate constants from SPR experimental data using the same two-compartment model and quasi-steady state approximation used by Christensen and Sigmundsson [9,11]. Bayesian statistical inference has been widely used in many biomedical applications and has been proved useful for accounting for multiple sources of uncertainty arising from experimental variation, instrumental noise, etc. [12][13][14][15][16]. In addition, it does not require point estimation or linearization of variations, either of which can lead to inconsistency in the estimation in nonlinear models [17,18]. Inference of model parameters is given by the posterior density conditional on observed data. Here, we supply prior probabilities on the parameters and derive a likelihood function which together with the SPR data themselves allow us to compute the posterior probability densities on the active concentration and kinetic rate constants. This approach exploits information from the duration of the process in its entirety, including both association and dissociation phases, under partial mass transport conditions. Compared with the two approaches mentioned above, ours does not need multiple injections of analyte at varying flow rates as long as the partial mass transport conditions are met. Finally our methods have been implemented in software and are freely available.

Theory
The theory for molecular binding under conditions of partial mass transport in a two-compartment model was worked out by Christenson [9] and is briefly recapitulated here. If A 0 is the analyte in the bulk flow, A s is the analyte on the biosensor surface, B is the ligand immobilized on the surface and AB is the analyte-ligand complex, the binding process in the SPR biosensor can be described by the subprocesses, and where k M is the mass transport coefficient, and k a and k d are the association and dissociation rate constants, respectively. Process (1) describes the mass transport of the analyte between the bulk flow (A 0 ) and surface (A s ); process (2) describes the binding and unbinding of analyte and ligand on the surface matrix. The mass action equations for these processes are where h diff is the characteristic height of the diffusion layer, given [19] approximately by D is the analyte diffusion coefficient, F is the bulk flow rate and h, w and l are the height, width and length of the flow chamber of the SPR system, respectively [19]. The diffusion coefficient D can be estimated by Stoke's law and Einstein-Sutherland equation [20] (See details in S1 Appendix).
Under the quasi-steady state condition (where d[As]/dt = 0 in Eq (3) is assumed) [21], the analyte concentration at the biosensor surface matrix can be approximated by: which describes the average of the concentration gradient that forms along the surface [21]. Combining Eqs (4) and (6)and rearranging give which has been evaluated and demonstrated to be a reasonable approximation for most conditions with the partial mass transport effect [9,11]. Under experimental conditions, the analyte bulk concentration [A 0 ] remains constant by a continuous injection of fresh analyte solution at a fixed flow rate. The amount of free ligand [B] decreases with time until steady-state is reached. Furthermore, the density of free ligand can be expressed through the conservation relation as where [B max ] is the density of total ligand (bound and unbound) on the surface and [AB] is the density of bound complex. The mass transport coefficient, k M , is approximately proportional to the cube root of the flow rate [22], and where l 1 and l 2 are the lengths to the start and end, respectively, of the detection area from inlet of the flow cell [22]. For a BIAcore system, these values are known. Therefore, if the molecular weight and the bulk flow rate of the analyte are also given, k M can be obtained. When l 1 and l 2 or other parameters (such as h, w or l) are not easily obtained for a biosensor system other than BIAcore instruments, k M can still be estimated empirically (see discussions in the Results section). The response signal, R, output by an SPR sensor is proportional to the amount of complex formed at the biosensor surface with an empirical factor given by where G is the response per mass per area for proteins with an approximate value of 1000 RUÁmm 2 /ng [23] and W M is the molecular weight of analyte. Therefore, Eq (7) now can be rewritten as where R max is the maximum value of the response signal when all the immobilized ligands have been fully bound into complexes, and k M is rescaled to have unit of RU/M/s instead of m/ s. The mass transport limitation for a specific analyte-ligand system is determined by both the amount of free immobilized ligand and the bulk flow rate of the analyte (k M ). A limit coefficient, k a [B]/k M , can be obtained from Eq (7) to identify conditions for mass transport-limited binding and for kinetic binding (see S1 Appendix for details). Practically, zero and full mass transport limitation are not easily obtained with a given analyte-ligand system, since to approach these extreme conditions it is necessary to use a wide range of ligand concentrations. On the other hand, it is relatively easy to obtain partial mass transportation limitation. In the next section we describe a Bayesian approach to determining the active concentration as well as the rate constants under the partial mass transport conditions of SPR responses.

Bayesian Inference
We use Bayesian methods to obtain the posterior distributions of active concentrations, A 0 , association/dissociation rate constants, k a and k d , the theoretical maximum response signal, R max , and the initial response level at the beginning of the dissociation phase, R 0 . As discussed in the previous section, the analyses are performed on SPR data assuming partial mass transport limitation conditions described by the differential equation Eq (12). The initial-value problem for Eq (12) does not have a closed-form solution, therefore we carry out inference on its numerical solution.
In addition to the process model, we must have a measurement, or statistical, model. The statistical model for the ith SPR sensorgram signal R i conditional on the parameter vector θ where the errors ε i are independent, identically-distributed Gaussian random variables with mean zero and variance σ 2 .
k a $ Nð5 Â 10 11 ; 10 12 Þ ð 15Þ The variances for the normal distributions are set so to make the prior distributions sufficiently diffuse/non-informative. Given the amount of observed data in a typical SPR experiments, the posterior distributions of these parameters are insensitive to modest variation in the prior distributions.
For estimation of the posterior density, we use the Markov Chain Monte Carlo (MCMC) method with the Gibbs sampler to generate samples from the posterior distributions of parameters as follows.
• Define the full conditional distributions [26] as One cycle of the Gibbs sampler is completed by drawing fy k g p k¼1 from these distributions successively updating the conditional variables Algorithm 1. define the initial values, y ð0Þ ¼ ðy 3. return values {θ (1) Following an equilibration period, the Markov chain approaches its stationary distribution and samples from the MCMC closely approximate samples from the posterior parameter distribution. Using these samples, the posterior means and independent 95% Bayesian credible intervals (BCI) from the posterior marginal densities are computed.
Combining the priors and observed data, the full conditional probability can be formulated as Because we are using the numerical solution to a dynamical system initial value problem, we employ the adaptive rejection with Metropolis sampling (ARMS) algorithm [27,28] with modifications [29], which allows one to draw samples from an arbitrary distribution. The overall inference processes have been summarized in Fig 1 flow chart.

Software Development
The software implementation of the proposed approach was developed in C# for the .NET framework. The executable is available for free distribution. The program has a simple and intuitive graphical interface. Fig 2 shows a screenshot of the interface and illustrates its various components. The input of the program is through text data files. Users can specify the initial values of the parameters through the interface. The results are then written to files with simulated posterior distributions, allowing further analysis such as MCMC diagnostics, hypothesis testing and the fitting of models for biological processes.
Because the proposed approach requires numerical solution of the initial value problem for differential equation Eq (12), the analysis is a computation-intensive task. Special optimization algorithms have been incorporated to make the software efficient, e.g. the forth-order Runge-Kutta algorithm for calculating numerical solution and the Nelder-Mead algorithm for finding maxima and minima of functions.

Surface plasmon resonance measurements
SPR measurements were carried out on a SensiQ Pioneer biosensor (SensiQ Technologies, Inc., OK). The CHOO2 two dimensional surface chip was used in the assay. The ligand CAII was immobilized via the common amine coupling chemistry. This involved activating the chip surface with freshly made 2mM EDC and 0.5mM NHS (as suggested by the manufacturer), injecting CAII in sodium acetate (pH 5.0) and then blocking excess reactive esters with ethanolamine. Subsequently, the analyte 4-CBS was injected with a concentration of 50μM at a flow rate of 50μl/min for 60 seconds and followed with a dissociation phase for another 60 seconds. The interaction was carried out in PBS with 0.05% TWEEN 20 (pH 7.0) (PBST) at 25˚C.

Data analysis
The SPR response data were first processed in SensiQ Qdat (version 2.2) software for background subtraction, and then were exported as text files for analyses using the software development in this work.

Results
We validated the method using simulated SPR response data, experimental SPR data generated in our laboratory, and publicly available experimental data.

Validation on Simulated Data
The proposed Bayesian approach was validated by analyzing simulated SPR response data. To simulated SPR data, a strategy similar to the one employed by Karlsson were adopted [30]. Eq (12) was used to simulate interaction data for reactions with k M = 3.15 x 10 7 RU/M/S. This k M value is similar to that in the Inogatran/Thrombin system [30]. The concentration of analyte was set to 30nM, the maximum immobilized ligand in terms of the response signal (R max ) was 31.5RU, k a values varied between 1x10 5 and 3 x 10 8 /M/s, k d values were adjusted so that the affinity, K D , was held constant at 1nM. The response curves in Fig 3 represent binding curves with differing kinetic rate constants, but constant K D . The Gaussian noise at a level of 1.5RUs, which was equivalent to that (1~3RUs) seen on Bio-Rad Proteon XPR36 biosensors but bigger than that on BIACORE and SensiQ systems, was added to the simulated data as specified by Eqs (13) and (14).
The simulated data were then analyzed using the proposed Bayesian method. The parameter values returned were summarized in Table 1, and the chain trace plots for one set of data (k a Ã R max /k M = 3.0) are shown in Fig 4. The proposed approach did not return accurate k a , k d or A 0 under the situations of little or no mass transport effect (case i and ii, where k a Ã R max /k M <1), although R max and noise level σ 2 could be determined correctly under the same conditions. As the mass transport effect started to dominate, k a , k d and A 0 as well as others (R max and noise level) could all be determined accurately (cases iii~vi, where k a Ã R max /k M was between 1~100). In case of very high mass transport effect conditions (case vii and viii, where k a Ã R max / k M >100), k a and k d estimations became inaccurate while the affinity constant K D and other parameters were still correctly determined. Similar behavior has been reported previously [30].

Validation on experimental data from our laboratory
The proposed approach was further tested with experimental data. The enzyme carbonic anhydrase (CAII) and its small molecule inhibitor 4-carboxybenzenesulfonamide (4-CBS) were chosen to be studied on a SensiQ Pioneer biosensor. In the experiment, about 3000RU of CAII was amine coupled to the chip surface as the ligand and then 4-CBS was injected at a flow rate of 50μl/min. This experiment setting resulted in a partial mass-transport limited condition for the interaction, and generated small but detectable responses (Fig 5). The data were then exported and analyzed to determine the active concentration and the kinetic constants of the analyte. First, the mass transport coefficient k M , which primarily depended on the diffusion coefficient, the dimensions of the flow cell and the flow rate, had to be determined. Under the current experimental condition, the diffusion coefficient D of the analyte, 4-CBS (molecular weight = 201.2 Dalton), was calculated to be 4.76x10 -10 m 2 /s [20] (Table A in S1 Appendix). Furthermore, the flow cell dimensions of the SensiQ Pioneer biosensor were given with a width  (12), (13) and (14) with no measurement error. In simulations R max was 31.5 RU, A 0 was 30nM, k a values varied between 1x10 5 and 3x10 8 /M/s, k d values were adjusted so that K D was constant 1nM and k M was 3.15x10 7 RU/M/s. The mass transport limiting coefficients (MTLC) of these responses varied between 0.1 (little/no mass transport effect) and 300 (high/full mass transport effect). (B), same as in A, but measurement error with a constant standard deviation of 1.5 RUs was added to the simulated data. The level of noise was chosen based on empirical data obtained from the Bio-Rad Proteon XPR36 biosensor in our lab through a variety of antibody-antigen interactions (1~3 RUs in our experiments).
doi:10.1371/journal.pone.0130812.g003 Table 1. Parameter values estimated by the proposed Bayesian method using the simulated SPR response Data.  The SPR response data were simulated according to Eqs (12), (13) and (14) with k M = 3.15x10 7 as in Fig 1. The parameters used in the simulation were specified in the top section of the table. Then the proposed approach was applied to estimate the parameters, which were listed in the bottom section in the table with the 95% Bayesian credible intervals (in square brackets). The parameters in the simulation ranged from little mass transport limited conditions (k a * R max /k M = 0.1) to high/full mass transport limited conditions (k a * R max /k M = 300). doi:10.1371/journal.pone.0130812.t001 Estimation of Active Concentration and Affinity 0.635mm, a height 0.05mm and a length 3.0mm by the manufacturer. Then with a flow rate of 50μl/min, k M was determined to be 1.03x10 7 RU/(Ms) according to Eqs (9) and (10) ( Table A in S1 Appendix). Finally, the kinetic parameters and the active analyte concentration were estimated by the proposed Bayesian approach and shown in Table 2. The original sensorgram and the fitted values were overlaid in Fig 5, which showed a very good fit. The trace plots of each parameter in the analysis were shown in Fig A in S1 Appendix. The data were analyzed as described in the text. The trace plots are shown for six parameters in the analysis of the data set with k a * R max /k M = 3.0. The returned parameters are summarized in Table 1.
doi:10.1371/journal.pone.0130812.g004 The active concentration A 0 of the analyte was determined in our analysis to be 2.0 ± 0.22 μM, which was about 4% of the total concentration (50μM of the analyte). This percentage of the active concentration in the total concentration was similar to that of the active phosphorylated GST-CEACAM1-4L protein determined by both SPR and other radioactivity based method [11]. Moreover, our analysis showed that the dissociation rate constant k d was (4.23 ± 0.12)×10 −2 s −1 , which is consistent with what has been reported in the literature ((3.65 ± 0.06)×10 −2 s −1 ) [31]. On the other hand, the estimated association rate constant k a , (4.67 ± 0.66)×10 5 M −1 s −1 , was about ten times higher than what was reported ((4.8 ± 0.2)×10 4 M −1 s −1 ) in an analysis that did not take into account the difference between active concentration and total concentration [31]. Thus, the discrepancy is not a surprise since the active concentration was more than ten times lower than the total concentration. When we used the total concentration of analyte (50μM) rather than the active concentration, we get a value for k a (1.4×10 4 M −1 s −1 ) that was similar to the previously-reported value. We argue that it is as more appropriate to use the active concentration under these circumstances, and that this comparison illustrates the importance of doing so.

Validation on publicly available data
In order to further validate the proposed approach, we analyzed two publicly available experimental data sets generated using other popular biosensors. The first dataset was from Castonguay et al. [32], in which the bone morphogenetic protein 10 (BMP10) interacted with immobilized human endoglin extracellular domain-Fc chimera (hEng ECD -FC) in a Biacore 3000 system. The second dataset was generated on a Proteon XPR36 biosensor from the work of Abdiche et al. [33], in which human calcitonin gene-related peptide-α (CGRPα) reacted with immobilized 4901 IgG. These two datasets were selected because they were generated under partial mass transport-limited conditions and the molecular weights of the analytes in them were known. With the latter information, the diffusion coefficient and then the mass transport coefficient could be derived from Eqs (5), (9) and (10) (also see details in S1 Appendix about the diffusion coefficient D). Therefore, the proposed approach could be applied to estimate the active concentration of analyte A 0 and rate constants k a and k d . The results are listed in Table 2, together with parameter estimates from the original reference papers for comparison.
In the first case, the SPR response data between BMP10 and hEng ECD -hFC were generated on a Biacore 3000 system, in which the detection area of the flow chamber has been very well documented. The parameters, such as the width, length and height of the measured SPR, are known [9,11] (Table A in S1 Appendix). The diffusion coefficient was approximated using the available molecular weight of the analyte BMP10 (see details in S1 Appendix). The mass transport coefficient k M was thus determined to be approximately 8x10 8 RU/M/s (Table A in S1 Appendix) and used to estimate the active analyte concentration and the kinetic rate constants ( Table 2 and Fig B in S1 Appendix). The fitted curve was overlaid with the raw sensorgrams in Fig 6A. In the second dataset, the interaction between hCGRPα and 4901 IgG was measured on a Proteon XPR36 system. In this case, the parameters for the detection flow chamber were not available. Similar parameters as in the Biacore systems, however, can still be assumed to approximate k M ; this has been verified empirically (data not shown). As shown in Table A in S1 Appendix, the mass transport coefficient k M between analyte hCGRPα and immobilized 4901 IgG was determined to be about 8.5x10 7 RU/M/s. Finally, the active concentration and rate constants were estimated ( Table 2 and Fig C in S1 Appendix) and the fitted curved were showed in Fig 6B. Overall, the fitted SPR responses based on the estimated parameters matched the observed sensorgrams very well in both cases. Table 2, our method obtained similar dissociation rate constant k d but different association rate constant k a compared with the ones estimated in the original reference papers in both tests. Again, this was reasonable, since k d estimation didn't depend on the actual analyte concentration and can be determined correctly without knowing it. On the other hand, k a estimation relied on the accurate information of the analyte concentration, more precisely the active concentration of the analyte. In the original papers, the total analyte concentrations were taken as a known input in the estimation. These values, however, were different from the active concentrations of analyte. In the first dataset, the active concentration was determined to be about 50% of the total concentration, and the difference between the two concentrations resulted in a higher and probably more accurate estimation of association constant k a . In the second dataset, the difference in k a estimation came in not only due to the difference between the active concentration in our approach and the total concentration in the original, but also due to the way in which the data analysis was carried out in the original paper. In the reference paper, a Langmuir model that assumed no mass transport effect was used. Under these conditions, one expects to underestimate the kinetic rate constants. The proposed Bayesian method assumed a model with non-negligible mass transport and was thus more likely to produce more accurate estimates.

As shown in
In conclusion, the proposed Bayesian approach in this work, assuming a partial mass transport-limited model and exploiting the activate analyte concentration, led to more accurate estimation of kinetic rate constants than was obtained using other methods

Software Availability
Project name: SPR_MCMC; Projection home page: https://sourceforge.net/projects/sprmcmc/; Operating systems: Windows; Programming Language: C#; License: Free for academic use. Three tests were carried out. In the first experiment, the interaction data between CAII and 4-CBS were studied on a SensiQ Pioneer biosensor under a partial mass-transport limited condition. The data were analyzed and the resulted kinetic parameters and active concentration. In the other two tests, two datasets were collected from the literature. Dataset #1 was from the work done by Castonguay et al. [32], in which BMP10 interacted with immobilized human Eng ECD -Fc in a Biacore 3000 system. Dataset #2 was from a different work done by Abdiche et al. [33], in which human CGRPα bound immobilized 4901 IgG in a Proteon XPR36 biosensor. The two datasets were then analyzed by the proposed approach to estimate the active concentration of analyte A 0 , rate constants k a and k d , and the dissociation constant K D . The returned parameters as well as the parameters estimated in the original reference papers were listed. The 95% Bayesian credible intervals for the estimation were also provided in the square brackets. The biosensor used to collect the experimental data was indicated in the parentheses.
doi:10.1371/journal.pone.0130812.t002 The kinetic analysis of BMP10 with hEnd ECD -hFC in BiaCore 3000 and hCGRPα with 4901 IgG in Proteon XPR36 by the proposed Bayesian approach. Two sets of data were extracted from literature. (A), the interaction between BMP10 and immobilized hEngECD-hFc was measured in a Biacore 3000 system [32]. (B), the analyte human CGRPα and immobilized 4901 IgG were studied in a Proteon XPR36 biosensor [33]. The values of mass transport coefficient k M were derived according to Eqs (9) and (10) for both cases. The two datasets were then analyzed by the proposed approach to estimate the active concentration of analyte A 0 , rate constants k a and k d , and the dissociation constant K D . The returned parameters were listed in Table 2. The raw sensorgrams (solid lines) and fitted curves (dashed lines) were overlaid. doi:10.1371/journal.pone.0130812.g006 Estimation of Active Concentration and Affinity