
Development of an Agent-Based Model (ABM) to Simulate the Immune System and Integration of a Regression Method to Estimate the Key ABM Parameters by Fitting the Experimental Data

Correction

31 May 2016: Tong X, Chen J, Miao H, Li T, Zhang L (2016) Correction: Development of an Agent-Based Model (ABM) to Simulate the Immune System and Integration of a Regression Method to Estimate the Key ABM Parameters by Fitting the Experimental Data. PLOS ONE 11(5): e0156823. https://doi.org/10.1371/journal.pone.0156823

Abstract

Agent-based models (ABM) and differential equation (DE) models are two commonly used methods for immune system simulation. However, it is difficult for an ABM to estimate key model parameters from experimental data, whereas a DE model cannot describe the complicated immune system in detail. To overcome these problems, we developed an integrated ABM regression model (IABMR). It combines the advantages of ABM and DE by employing an ABM to mimic the multi-scale immune system with its various cell types and phenotypes, and by using the input and output of the ABM to build a Loess regression for key parameter estimation. Next, we employed the greedy algorithm to estimate the key parameters of the ABM with respect to the same experimental data set, and used the ABM to describe a 3D immune system similar to previous studies that employed DE models. These results indicate that IABMR not only has the potential to simulate the immune system across scales, phenotypes and cell types, but can also infer the key parameters as accurately as a DE model. Therefore, this study developed a complex-system modeling framework that can simulate the complicated immune system in detail, like an ABM, while validating the reliability and efficiency of the model against experimental data, like a DE model.

Introduction

Currently, systems biologists employ agent-based models (ABM) [1–5] and differential equation (DE) models [6–9] to simulate the immune system. Detailed definitions of ABM and DE are given in the S1 File.

Recently, researchers have developed several ABMs for immune system simulation. For example, the Basic Immune Simulator (BIS) [10] is an ABM that can be used to study the interactions between cells of the innate and adaptive immune systems; it demonstrated that the degree of the initial innate response is a crucial determinant of an appropriate adaptive response [10]. Similarly, the ImmunoGrid project [11] aims to develop a natural-scale ABM of the human immune system that reflects both the diversity and the relative proportions of its molecules and cells; such a model would be of great value for specific applications in immunology [11].

ABM has several significant advantages. First, its natural representational formalism can be employed to denote a cell’s biological properties and behavior in detail [1]. Second, its flexible features can be employed to reflect the real complex dynamic environment [12]. However, it is difficult for ABM to incorporate experimental data, because ABM describes the system at the level of its constituent units but not at the top level [13].

DE models are broadly employed to approximate experimental data and predict the progression of the immune system. For example, researchers have applied them to influenza A virus (IAV) infection. Miao et al. [14] developed a differential equation model to describe the dynamic interactions among the components (i.e., epithelial cells, virus, CD8 CTLs, and antibody) in the lung. The model was used to quantify the immune responses and to estimate the key parameters of primary infection. Beyond IAV infection, DE models are also widely used for other viral infections, such as HIV in the study of Miao et al. [9]. Those researchers developed statistical estimation, model selection, and multi-model averaging methods for in vitro HIV viral fitness experiments using a set of nonlinear ordinary differential equations and addressed the parameter identifiability of the model [9].

The DE model has attracted a great deal of attention due to its great potential for solving complex nonlinear problems and its widespread use in various areas [15]. Compared to an ABM, a DE model can easily be used for optimization because only a few control parameters need to be estimated [15]. However, it has difficulty describing the details of biological systems, because a DE model falls short of constructing a biological model to a sufficient degree, especially when faced with the simulation of complex phenomena.

To integrate the advantages of these two commonly used models, we developed an integrated ABM regression model (IABMR) and employed the IAV data set [14] to evaluate its efficiency and accuracy. IABMR employs an ABM to represent each cell as an agent with three phenotypes (i.e., quiescence, proliferation and apoptosis). It then employs Loess regression to build a regression model based on the input and output of the ABM. The model's key parameters are optimized using the particle swarm optimization algorithm (PSO) [16–21]. The concept of PSO is illustrated in the S1 File.

Next, we employed the classical greedy algorithm [22–24] to optimize the ABM parameters and compared the efficiency of the ABM with the greedy algorithm against IABMR. The results demonstrated that IABMR not only describes the immune response at the cellular level through the cells' phenotypes, with great potential for investigating cell interactions and spatial information, but also overcomes the limitations of ABM in parameter estimation.

Methods

2.1. Using ABM to describe the immune system

To describe the dynamic interactions among the components (i.e., epithelial cells, infected epithelial cells and virus) in the lung and to quantify immune responses in primary infection, we use the state transition diagrams in Fig 1.

Fig 1. State transition diagrams of epithelial cells, infected epithelial cells and virus.

https://doi.org/10.1371/journal.pone.0141295.g001

An epithelial cell in the quiescent state (Epq) can transition to three other states, with the transition probabilities shown in Fig 1. Two of these states belong to the Ep cell; Epq and Epp are the two states of the Ep cell. The Epq cell can also differentiate into another cell type (Ep*) with its own transition probability. Once the Epq state transitions to the Epp state, the Epp cell in turn moves to one of its successor states with the corresponding probabilities in Fig 1.

With respect to the state transition diagram for epithelial cells (Fig 1, Epithelial cells), the state transition equations for epithelial cells are developed as follows. (1.1) (1.2) Here, V represents the infective viral titer, and the remaining term represents the number of cells that will divide into two cells. The case of an infected epithelial cell is shown in Fig 1 (Infected epithelial cells).

The Ep* state can transition to itself or to a dead state with the corresponding probabilities, as described by the following equations. (2.1) (2.2) Unlike the epithelial cell and the infected epithelial cell, the virus is too small to be described as a discrete variable. In Fig 1 (Virus), the virus is therefore described as a continuous variable with a percentage of dying (the Vd state) and a percentage of living, set as shown in Fig 1. Additionally, the virus is produced by infected epithelial cells at the rate πv.

The case of the virus can be described using the following equations.

(3.1)(3.2)

To simulate the process of cellular immunity among the epithelial cells, virus and infected epithelial cells, an agent-based model (ABM) is developed based on the diagrams and equations provided above. The parameters listed in Table 1 satisfy the following rules.

Table 1. Parameters and variables definitions for agent based model.

https://doi.org/10.1371/journal.pone.0141295.t001

(4.1)(4.2)(4.3)(4.4)(4.5)
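A single stochastic state transition of the kind diagrammed above can be sketched as follows. The state names follow Fig 1, but the probability parameter names (p_qq, p_qp, p_qi) are hypothetical placeholders, since the probability symbols appear only in the figure:

```cpp
#include <random>

// Hypothetical states for an epithelial-cell agent (cf. Fig 1).
enum class EpState { Quiescent, Proliferating, Infected, Dead };

// One stochastic state transition for a quiescent cell. The probability
// names (p_qq, p_qp, p_qi) are placeholders for the symbols in Fig 1;
// they must satisfy p_qq + p_qp + p_qi <= 1, with the remainder
// interpreted here as the probability of cell death.
EpState step_quiescent(double p_qq, double p_qp, double p_qi,
                       std::mt19937& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double r = u(rng);
    if (r < p_qq) return EpState::Quiescent;            // stay quiescent
    if (r < p_qq + p_qp) return EpState::Proliferating;  // enter Epp
    if (r < p_qq + p_qp + p_qi) return EpState::Infected; // become Ep*
    return EpState::Dead;
}
```

Each agent would apply such a draw once per simulation step; the full ABM iterates this over all cells and updates the continuous virus variable alongside.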

2.2. Parameter Estimation

To estimate the parameters in this study, a parameter vector space (H) is generated by the Sparse Grid method [25]; it consists of a set of parameter vectors, each with 4 dimensions. The Sparse Grid method chooses the most important points in a high-dimensional space to approximate a complicated surface [26–28].

In what follows, the input parameter of the ABM is denoted by a four-dimensional vector θ, whose components θk, k = 1,2,3,4 represent the four key transition parameters of the model. As reported in previous research [14], the input data θ are estimated as (6.2×10−9, 2.42×10−7, 5.98×10−2, 4.23×10−1), which we call the initial parameter θ0. In this study, we set the input parameter of the ABM in the region (0,2θ0) = {(θ1, θ2, θ3, θ4) ∈ R4, 0 ≤ θk ≤ 2θ0k, k = 1,2,3,4}. However, according to the rules of the Sparse Grid, each component of a parameter vector h ∈ H lies between 0 and 1. Therefore, we need to map the parameter vector space H generated by the Sparse Grid into the region (0,2θ0). The mapping function is θ = a + (b − a)∘h, applied component-wise, (5) where h is a parameter vector in the space H, a = 0 and b = 2θ0.
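The mapping of Eq 5 is a component-wise linear rescaling of each Sparse Grid point from [0,1]^4 onto (0, 2θ0). A minimal sketch, using the reported θ0 from [14]:

```cpp
#include <array>

// Map a Sparse Grid point h in [0,1]^4 onto the search region (a, b),
// component-wise: theta_k = a_k + (b_k - a_k) * h_k (Eq 5).
// Here a = 0 and b = 2 * theta0, where theta0 is the initial parameter
// estimate reported in the literature.
std::array<double, 4> map_to_region(const std::array<double, 4>& h,
                                    const std::array<double, 4>& theta0) {
    std::array<double, 4> theta{};
    for (int k = 0; k < 4; ++k)
        theta[k] = 2.0 * theta0[k] * h[k];  // a_k = 0, b_k = 2 * theta0_k
    return theta;
}
```

For example, the grid's midpoint h = (0.5, 0.5, 0.5, 0.5) maps exactly to θ0, the center of the search region.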

θ1 is employed as the input parameter set of the ABM to generate L sets of output data (G1), which represent the cell counts over 5 days. To capture the stochasticity of the ABM, we performed Lr replicates for each set of θ1. Next, θ1 and G1 are employed to develop a Loess regression [29–32] model M0.

In our model M0, the Loess regression is described in Eq 6: χ2(x) = Σi w((θ1i − x)/g) [G1i − (α + βθ1i)]2. (6) Here, w is a weighting function and θ1i is an input parameter of the ABM, where i denotes the i-th sampling point in the parameter vector space. The points θ1i in the neighborhood of x are weighted by w according to their distance to x. g is a key parameter called the "bandwidth" or "smoothing parameter" that determines how much of the data is used to fit each local polynomial. G1i is the output value of the ABM corresponding to the input θ1i. α and β are the two coefficients estimated by the least squares method [33], which minimizes the value of χ2 in Eq 6.
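One local fit of this kind can be sketched in one dimension as follows (the actual M0 is four-dimensional); the tricube weight used here is the function conventionally used by Loess and is an assumption, since the text does not specify w:

```cpp
#include <cmath>
#include <vector>

// Tricube weight conventionally used by loess:
// w(u) = (1 - |u|^3)^3 for |u| < 1, and 0 otherwise.
double tricube(double u) {
    double a = std::abs(u);
    return a < 1.0 ? std::pow(1.0 - a * a * a, 3) : 0.0;
}

// One local weighted linear fit at query point x (1-D sketch of Eq 6).
// theta1: ABM input samples; g1: corresponding ABM outputs; g: bandwidth.
// Returns alpha + beta * x, where (alpha, beta) minimize the weighted
// sum of squares chi^2 in closed form.
double loess_predict(const std::vector<double>& theta1,
                     const std::vector<double>& g1,
                     double x, double g) {
    double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
    for (size_t i = 0; i < theta1.size(); ++i) {
        double w = tricube((theta1[i] - x) / g);
        sw += w; swx += w * theta1[i]; swy += w * g1[i];
        swxx += w * theta1[i] * theta1[i];
        swxy += w * theta1[i] * g1[i];
    }
    double denom = sw * swxx - swx * swx;
    if (std::abs(denom) < 1e-12) return swy / sw;  // fall back to weighted mean
    double beta = (sw * swxy - swx * swy) / denom;
    double alpha = (swy - beta * swx) / sw;
    return alpha + beta * x;
}
```

Once fitted on (θ1, G1), such a surrogate replaces each expensive ABM run with a cheap prediction, which is what makes the PSO search below affordable.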

Next, the particle swarm optimization algorithm (PSO) [16] is employed to locate the optimal parameter by fitting the real experimental data. PSO [17–21] is illustrated in the S1 File in detail, and its key equations are described by Eqs 7.1 and 7.2.

vi ← w·vi + φp·rp·(pi − xi) + φg·rg·(pg − xi) (7.1)
xi ← xi + vi (7.2)
Here, rp and rg are uniform random numbers in (0,1), and φp and φg are the acceleration coefficients.

First, let S be the number of particles in the swarm. Then, initialize each particle's position with a uniformly distributed random vector xi ~ U(lb, ub), where lb and ub are the lower and upper boundaries of the search space; here (lb, ub) = (0, 2θ0). Each xi can thus be considered an input parameter. The particle's initial velocity is vi ~ U(−|ub − lb|, |ub − lb|). Here, w is a weight used to maintain the inertia of each particle. Let pi be the best known position of particle i and pg the best known position of the entire swarm. Then, Eq 8 is employed as the objective function for parameter estimation: fobj = Σj=1..m Σk=1..n (yi(tj) − V1(tj,k))2. (8) Here, m is the number of time points, n is the number of replicates at each time point, and V1 is the real experimental data over five days. yi is the predicted value from the Loess model based on the input value xi.

By using the PSO algorithm and the Loess model, we can minimize the objective function fobj to locate the local optimal parameter θ* in the region (0,2θ0).
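The PSO loop described above can be sketched as follows for a one-dimensional objective; the coefficient values (inertia w = 0.7, acceleration c1 = c2 = 1.5) are illustrative assumptions, and in our setting f would be the Loess prediction error of Eq 8:

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <random>
#include <vector>

// Minimal 1-D PSO sketch implementing Eqs 7.1 and 7.2.
// f: objective; [lb, ub]: search region; S: swarm size; iters: iterations.
double pso_minimize(const std::function<double(double)>& f,
                    double lb, double ub, int S, int iters, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> ux(lb, ub);           // initial positions
    std::uniform_real_distribution<double> uv(-(ub - lb), ub - lb); // initial velocities
    std::uniform_real_distribution<double> u01(0.0, 1.0);
    const double w = 0.7, c1 = 1.5, c2 = 1.5;  // assumed coefficients

    std::vector<double> x(S), v(S), p(S);  // positions, velocities, personal bests
    double pg = lb, fg = INFINITY;         // swarm best and its value
    for (int i = 0; i < S; ++i) {
        x[i] = ux(rng); v[i] = uv(rng); p[i] = x[i];
        if (f(x[i]) < fg) { fg = f(x[i]); pg = x[i]; }
    }
    for (int t = 0; t < iters; ++t)
        for (int i = 0; i < S; ++i) {
            v[i] = w * v[i] + c1 * u01(rng) * (p[i] - x[i])
                            + c2 * u01(rng) * (pg - x[i]);   // Eq 7.1
            x[i] = std::min(ub, std::max(lb, x[i] + v[i]));  // Eq 7.2, clamped
            if (f(x[i]) < f(p[i])) p[i] = x[i];
            if (f(x[i]) < fg) { fg = f(x[i]); pg = x[i]; }
        }
    return pg;
}
```

Because f is evaluated thousands of times during the search, this is only practical when f is the cheap Loess surrogate rather than a full ABM run.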

Next, we re-employed the mapping function (Eq 5) to map the parameter vector space H onto the region (0,2θ*), generating L sets of input parameters θ2 with n replicates for each set. These data are employed as input parameters of the ABM, yielding the output data G2 with m time points. Next, θ* is employed as the input of the ABM to generate a simulated experimental data set V2 with n replicates, which replaces V1 after random noise is added.

The normal distribution method (Eqs 9.1 and 9.2) [34] is used to add noise to each replicate of the V2 data set, producing the simulated experimental data set. (9.1) (9.2) N(0,αi) denotes a normal distribution with mean 0 and standard deviation αi.
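The noise step can be sketched as follows, assuming additive noise (the precise form of Eqs 9.1 and 9.2 is not recoverable from the text, but the stated N(0, αi) distribution is used):

```cpp
#include <random>
#include <vector>

// Add Gaussian noise to each replicate value of the simulated data V2
// (a sketch of Eqs 9.1 and 9.2, assuming additive noise with mean 0
// and standard deviation alpha, as stated in the text).
std::vector<double> add_noise(const std::vector<double>& v2,
                              double alpha, unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> noise(0.0, alpha);
    std::vector<double> out(v2.size());
    for (size_t j = 0; j < v2.size(); ++j)
        out[j] = v2[j] + noise(rng);  // one independent draw per replicate
    return out;
}
```

Repeating this for each noise level αi yields the family of simulated experimental data sets used to assess estimation accuracy below.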

Next, a new Loess regression model M1 is built from θ2 and G2 in the same way as M0. We then used PSO [35] to explore the local optimal parameter Estθi by fitting the simulated experimental data. Finally, we compute the average relative error (ARE) [9] for each Estθi using Eq 10: ARE = (100%/M) Σj |Estθi(j) − θi*| / θi*. (10) Here, M is the total number of ABM simulation runs for each sample. This parameter estimation process is illustrated in Fig 2.
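Assuming the standard definition of the average relative error used in [9] (the mean of |estimate − truth|/|truth| across runs, expressed as a percentage), Eq 10 can be sketched as:

```cpp
#include <cmath>
#include <vector>

// Average relative error over M simulation runs (sketch of Eq 10):
// ARE = (100% / M) * sum_j |est_j - truth| / |truth|,
// where est_j is the estimate from the j-th run and truth is the
// parameter value used to generate the simulated data.
double average_relative_error(const std::vector<double>& estimates,
                              double truth) {
    double sum = 0.0;
    for (double e : estimates)
        sum += std::abs(e - truth) / std::abs(truth);
    return 100.0 * sum / estimates.size();
}
```

For instance, estimates of 1.1 and 0.9 against a true value of 1.0 give an ARE of 10%.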

Results

The IABMR model is developed in the C++ and R programming languages and runs in a Linux environment.

3.1. Primary data for model fitting

We used real experimental data V1 [14] from mice infected with the H3N2 influenza virus A/X31 strain to fit the model. This study employs data from the initial preadaptive phase, 0 to 5 days post-infection. The real experimental data contain 6 samples, each with 13 time points. Detailed experimental data information is listed in Table 2. The initial key parameters of the ABM are also taken from the literature [14].

3.2. Obtain the sampling data using Sparse Grid function

We employed the "createIntegrationGrid" function of the R "SparseGrid" package to create three sampling data sets in the region (0, 1) (sample sizes: 41, 137 and 385; listed in S1–S3 Tables). These sampling data are then mapped to the input parameter sets of the ABM (θ1) by Eq 5. The values of θ1 are listed in S4–S6 Tables.

3.3. Estimate the parameter of ABM by fitting the real experimental data

To capture stochasticity, we ran the sampling data sets of size 41, 137 and 385 with 9, 9 and 6 replicates, and denote the resulting models 41×9, 137×9 and 385×6, respectively. The output data set G1 (S7–S9 Tables) of the ABM is obtained by inputting θ1. Eqs 7.1 and 7.2 are employed to explore the local optimal parameter θ* for each sampling data set, as listed in Table 3.

3.4. Generate the simulated experimental data by ABM

We obtain the output V2 of the ABM by inputting θ*. The simulated experimental data are developed from V2 via Eq 9 by adding three levels of noise (αi) chosen according to our previous study [36]. Part of the simulated experimental data is listed in S10–S12 Tables.

3.5. Average relative error computing

After fitting the model to the simulated experimental data using Eqs 7.1 and 7.2, we obtain the local optimal parameter Estθi. Then, Eq 10 is employed to compute the average relative error for each set of simulated experimental data. Here, we set the total number of ABM simulation runs to M = 100 and the three sample sizes to 5×3 (where 5 is the number of time points (m) and 3 the number of replicates (n)), 10×6 and 15×9. The ARE values for each sample size are listed in Tables 4–6.

Table 4. The summary table of ARE values for model 41×9.

https://doi.org/10.1371/journal.pone.0141295.t004

Table 5. The summary table of ARE values for model 137×9.

https://doi.org/10.1371/journal.pone.0141295.t005

Table 6. The summary table of ARE values for model 385×6.

https://doi.org/10.1371/journal.pone.0141295.t006

3.6. Evaluate the accuracy and efficiency of the IABMR model

To evaluate the accuracy and efficiency of the IABMR model in parameter estimation, we employed the greedy algorithm [22,37] with the ABM to estimate the parameters. Fig 3 compares their residual sums of squares (RSS). Here, RSS1 is the residual error of the greedy algorithm, while RSS2, RSS3 and RSS4 are the residual errors of the three sampling data sets from IABMR (models 41×9, 137×9 and 385×6).

3.7. Using IABMR to approximate primary data

Fig 4 illustrates that IABMR approximates the primary data about as well as the ODE model [14].

Discussion

In this work, we developed an agent-based model (ABM) to simulate influenza A virus (IAV) infection and integrated it with Loess regression to form an integrated ABM regression model (IABMR). This model can be employed to locate the key ABM parameters by fitting real experimental data.

By inheriting the advantages of ABM, IABMR is capable of mimicking a biological system in detail. Here, IABMR not only showed quantitative changes in the system but also simulated the phenotypic switch for each cell type. Unlike the previous well-developed ODE model [14], it can describe a multi-scale biological system in a very complicated external environment. Integrated with Loess regression [29], IABMR can employ classical numerical optimization methods such as the genetic algorithm [38,39] to estimate the key parameters of the model, which is much faster than the greedy algorithm [22–24] used with a plain ABM. These two theoretical advantages make IABMR an attractive tool for simulating biological systems compared to the ODE and ABM approaches.

The average relative error (ARE) is commonly employed to evaluate the parameter estimation capacity of statistical models; the smaller the ARE, the better the model's performance. Tables 4–6 show the ARE values of the four key probabilities of the IABMR under the control of two factors: the number of time points collected from the preadaptive phase and the level of noise added to the simulated experimental data.

Table 6 shows two trends of ARE under different noise levels and numbers of time points. First, the ARE values decrease when the number of time points increases from 5 to 15 at the same noise level; that is, at a given noise level the ARE decreases in the order 5×3 > 10×6 > 15×9, which indicates that more time points and replicates yield better parameter estimation accuracy.

Second, the ARE values increase when the noise level increases under the same number of time points, which demonstrates that parameter estimation accuracy is higher at smaller noise levels. For instance, in the case of sample 5×3 (Table 6), the ARE increases monotonically with the noise level.

The other three probabilities in the parameter vector show trends similar to those described above (Table 6).

Fig 3 compares the accuracy and parameter estimation speed of the IABMR and ABM models. IABMR is much faster than ABM at locating the key parameters: it takes at least 54,600 runs for the ABM with the greedy algorithm to make the RSS converge, but only 2,310 runs for IABMR with the largest parameter space. Additionally, the size of the parameter vector space has a strong impact on estimation accuracy; the larger the size, the more accurate the estimated results. As shown in Fig 3, model 41×9 has the greatest RSS and model 385×6 the least. Likewise, the trends of the ARE values in Tables 4 and 5 are not as clean as in Table 6. Lastly, Fig 4 demonstrates that the IABMR simulation results approximate the real experimental data as closely as the ODE model, which validates the efficiency and accuracy of IABMR.

In conclusion, this study developed the IABMR method to simulate detailed biological systems and locate their key parameters using classical numerical optimization methods. By integrating the advantages of both ABM and ODE models, it not only describes the complicated microenvironment of a biological system and cell behavior at multiple scales in detail, but also easily incorporates real experimental data. To evaluate the efficiency and accuracy of IABMR, we employed primary influenza infection data as a case study. The validation results demonstrated that IABMR can mimic the immune system on multiple levels like an ABM and approximate real experimental data like an ODE model, at a reasonable parameter estimation cost.

Supporting Information

S1 File. The introduction of ABM, DE and PSO.

https://doi.org/10.1371/journal.pone.0141295.s001

(PDF)

S1 Table. Sample size 41 generated by Sparse Grid.

https://doi.org/10.1371/journal.pone.0141295.s002

(PDF)

S2 Table. Sample size 137 generated by Sparse Grid.

https://doi.org/10.1371/journal.pone.0141295.s003

(PDF)

S3 Table. Sample size 385 generated by Sparse Grid.

https://doi.org/10.1371/journal.pone.0141295.s004

(PDF)

S4 Table. Input data of ABM mapped from sample size 41.

https://doi.org/10.1371/journal.pone.0141295.s005

(PDF)

S5 Table. Input data of ABM mapped from sample size 137.

https://doi.org/10.1371/journal.pone.0141295.s006

(PDF)

S6 Table. Input data of ABM mapped from sample size 385.

https://doi.org/10.1371/journal.pone.0141295.s007

(PDF)

Author Contributions

Conceived and designed the experiments: XMT JHC HYM TTL LZ. Performed the experiments: XMT JHC HYM TTL LZ. Analyzed the data: XMT JHC HYM TTL LZ. Contributed reagents/materials/analysis tools: XMT JHC HYM TTL LZ. Wrote the paper: XMT JHC HYM TTL LZ.

References

  1. Zhang L, Wang Z, Sagotsky JA, Deisboeck TS (2009) Multiscale agent-based cancer modeling. Journal of Mathematical Biology 58: 545–559. pmid:18787828
  2. Charles MM, Michael J (2006) Tutorial on agent-based modeling and simulation part 2: how to model with agents. pp. 73–83.
  3. Jacob C, Litorco J, Lee L (2004) Immunity through swarms: agent-based simulations of the human immune system. Artificial Immune Systems: Springer. pp. 400–412.
  4. Mansury Y, Diggory M, Deisboeck TS (2006) Evolutionary game theory in an agent-based brain tumor model: exploring the 'genotype–phenotype' link. Journal of Theoretical Biology 238: 146–156. pmid:16081108
  5. Siddiqa A, Niazi M, Mustafa F, Bokhari H, Hussain A, et al. (2009) A new hybrid agent-based modeling & simulation decision support system for breast cancer data analysis. IEEE. pp. 134–139.
  6. Øksendal B (2003) Stochastic Differential Equations: Springer.
  7. Jones DS, Plank M, Sleeman BD (2011) Differential Equations and Mathematical Biology: CRC Press.
  8. Miao H, Xia X, Perelson AS, Wu H (2011) On identifiability of nonlinear ODE models and applications in viral dynamics. SIAM Review 53: 3–39. pmid:21785515
  9. Miao H, Dykes C, Demeter LM, Wu H (2009) Differential equation modeling of HIV viral fitness experiments: model identification, model selection, and multimodel inference. Biometrics 65: 292–300. pmid:18510656
  10. Folcik VA, An GC, Orosz CG (2007) The Basic Immune Simulator: an agent-based model to study the interactions between innate and adaptive immunity. Theoretical Biology and Medical Modelling 4.
  11. Halling-Brown M, Pappalardo F, Rapin N, Zhang P, Alemani D, et al. (2010) ImmunoGrid: towards agent-based simulations of the human immune system at a natural scale. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 368: 2799–2815.
  12. Fachada N, Lopes V, Rosa A (2007) Agent-based modelling and simulation of the immune system: a review.
  13. Chiacchio F, Pennisi M, Russo G, Motta S, Pappalardo F (2014) Agent-based modeling of the immune system: NetLogo, a promising framework. BioMed Research International 2014.
  14. Miao H, Hollenbaugh JA, Zand MS, Holden-Wiltse J, Mosmann TR, et al. (2010) Quantifying the early immune response and adaptive immune response kinetics in mice infected with influenza A virus. Journal of Virology 84: 6687–6698. pmid:20410284
  15. Ho W-H, Chan AL-F (2011) Hybrid Taguchi-differential evolution algorithm for parameter estimation of differential equation models with application to HIV dynamics. Mathematical Problems in Engineering 2011: 14.
  16. Kennedy J (2010) Particle swarm optimization. Encyclopedia of Machine Learning: Springer. pp. 760–766.
  17. Kennedy J, Kennedy JF, Eberhart RC (2001) Swarm Intelligence: Morgan Kaufmann.
  18. Poli R (2007) An analysis of publications on particle swarm optimization applications. Essex, UK: Department of Computer Science, University of Essex.
  19. Poli R (2008) Analysis of the publications on the applications of particle swarm optimisation. Journal of Artificial Evolution and Applications 2008: 3.
  20. Clerc M (2012) Standard particle swarm optimisation.
  21. Pedersen MEH, Chipperfield AJ (2010) Simplifying particle swarm optimization. Applied Soft Computing 10: 618–628.
  22. Cormen TH, Leiserson CE, Rivest RL, Stein C (2001) Introduction to Algorithms: MIT Press.
  23. Bang-Jensen J, Gutin G, Yeo A (2004) When the greedy algorithm fails. Discrete Optimization 1: 121–127.
  24. Bendall G, Margot F (2006) Greedy-type resistance of combinatorial problems. Discrete Optimization 3: 288–298.
  25. Ypma J (2013) Introduction to SparseGrid.
  26. Heiss F, Winschel V (2008) Likelihood approximation by numerical integration on sparse grids. Journal of Econometrics 144: 62–80.
  27. Yserentant H (2005) Sparse grid spaces for the numerical solution of the electronic Schrödinger equation. Numerische Mathematik 101: 381–389.
  28. Nobile F, Tempone R, Webster CG (2008) A sparse grid stochastic collocation method for partial differential equations with random input data. SIAM Journal on Numerical Analysis 46: 2309–2345.
  29. Cleveland WS, Grosse E, Shyu WM (1992) Local regression models. Statistical Models in S: 309–376.
  30. Cleveland WS (1981) LOWESS: a program for smoothing scatterplots by robust locally weighted regression. American Statistician: 54.
  31. Cleveland WS (1979) Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74: 829–836.
  32. Cleveland WS, Devlin SJ (1988) Locally weighted regression: an approach to regression analysis by local fitting. Journal of the American Statistical Association 83: 596–610.
  33. Nocedal J, Wright SJ (2006) Least-Squares Problems: Springer.
  34. Weisstein EW (2002) CRC Concise Encyclopedia of Mathematics: CRC Press.
  35. Xie X-f, Zhang W-j, Yang Z-l (2003) Overview of particle swarm optimization. Control and Decision 18: 129–134.
  36. Miao H, Wu H, Xue H (2014) Generalized ordinary differential equation models. Journal of the American Statistical Association 109: 1672–1682. pmid:25544787
  37. Leiserson CE, Rivest RL, Stein C, Cormen TH (2001) Introduction to Algorithms: MIT Press.
  38. Mitchell M (1998) An Introduction to Genetic Algorithms: MIT Press.
  39. Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6: 182–197.