Mathematical modeling and quantitative analysis of HIV-1 Gag trafficking and polymerization

Gag, as the major structural protein of HIV-1, is necessary for the assembly of the HIV-1 sphere shell. An in-depth understanding of its trafficking and polymerization is important for gaining further insights into the mechanisms of HIV-1 replication and the design of antiviral drugs. We developed a mathematical model to simulate two biophysical processes, specifically Gag monomer and dimer transport in the cytoplasm and the polymerization of monomers to form a hexamer underneath the plasma membrane. Using experimental data, an optimization approach was utilized to identify the model parameters, and the identifiability and sensitivity of these parameters were then analyzed. Using our model, we analyzed the weight of the pathways involved in the polymerization reactions and concluded that the predominant pathways for the formation of a hexamer might be the polymerization of two monomers to form a dimer, the polymerization of a dimer and a monomer to form a trimer, and the polymerization of two trimers to form a hexamer. We then deduced that the dimer and trimer intermediates might be crucial in hexamer formation. We also explored four theoretical combined methods for Gag suppression, and hypothesized that the N-terminal glycine residue of the MA domain of Gag might be a promising drug target. This work serves as a guide for future theoretical and experimental efforts aiming to understand HIV-1 Gag trafficking and polymerization, and might help accelerate the efficiency of anti-AIDS drug design.

Introduction Gag protein (Gag) is the major structural polyprotein of HIV-1 and is synthesized in large amounts in the cytoplasm. Gag diffuses freely within the cytoplasm, hijacks the molecular motors, and moves along microtubules to the cytosolic side of the plasma membrane (PM) domain [1,2]. Underneath the PM, the immature HIV-1 Gag shell assembles in a radial arrangement. Gag is composed of six constitutive components: the N-terminal matrix (MA) domain, the capsid (CA) domain, the first spacer peptide (SP1), the nucleocapsid (NC) domain, a second spacer peptide (SP2) and the p6 (p6) domain. During the phase of HIV-1 maturation, Gag disconnects from the MA and reassembles to form the cone-shaped viral core. Therefore, Gag is necessary for HIV-1 replication, interfering with the trafficking and assembly of Gag has been a focus of research [3,4].
In the field of theoretical research, Liu et al. [5] first proposed a convection-diffusion equation model to explore the transport of Gag monomers in the cytoplasm. Based on the experimental finding of a monomer-dimer equilibrium in solution under certain biochemical conditions [6], Wang et al. [7] presented a model studying the transport of Gag monomers and trimers in the cytoplasm. These researchers analyzed the relationship between the timing of the initial appearance of HIV-1 capsid on the PM and the various model parameters. Sadre-Marandi et al. [8] and Liu et al. [9] simulated HIV-1 viral capsid assembly through dynamical systems.
A recent study on the events initiating HIV-1 Gag assembly was conducted by Kutluay et al. [10]. These researchers presented quantitative descriptions of monomers and multimers in the cytoplasm and PM, respectively, and demonstrated that only monomer and low-order multimers (e.g., dimer) of Gag were found in the cytoplasm, and that high-order multimers were formed only underneath the PM. In addition, these researchers studied two mutations of Gag: a mutated version of Gag-GFP that lacked the CTD of the CA of Gag (Gag-dCTD) and a mutation of the N-terminal glycine residue of the MA to alanine (Gag-G2A).
To the best of our knowledge, the reported models [5,7,[11][12][13] focus only on Gag trafficking in the cytoplasm and do not simultaneously consider the polymerization of Gag underneath the PM. Therefore, we developed a reaction-advection-diffusion equation model to describe the trafficking of two particles and the polymerization of various particles. In our model, we focus on the following aspects: • The transformation between monomers and dimers in the cytoplasm.
• The trafficking of monomers and dimers in the cytoplasm.
• The polymerization of monomers, dimers, trimers, tetramers, pentamers and hexamers of Gag underneath the PM.
We first estimated the parameters of the model based on experiment data [10] and assessed the robustness of the model. We subsequently applied this model to two mutation cases: Gag-dCTD and Gag-G2A. We also predicted the budding and release time of HIV-1 virus-like particles (VLPs). Using our model, we then analyzed the weight of the pathways involved in the polymerization reactions and deduced the key intermediates in hexamer formation. Moreover, we explored four theoretical combined methods for suppressing the Gag concentration and identified a promising drug target. This work will lead to a better understanding of the dynamics of Gag-Gag interaction and Gag trafficking, which are important in the emergence of HIV, and it might provide theoretical guidance for the design of antiretroviral drugs.

Mathematical model
This work aimed to assess the Gag trafficking in the cytoplasm and Gag polymerization underneath the PM. The schematic diagram used to develop the mathematical model is shown in Fig 1. Several assumptions were made to simplify the model: A1 The cytoplasm is an annulus [5,14]. A2 In the cytoplasm, monomers can aggregate into dimers, but can not form higher-order polymers [10,[15][16][17][18].
A3 Monomers and dimers are transported to the PM along microtubules by molecular motors (e.g., kinesin and dynein), and can also diffuse freely in the cytoplasm [5,11,13,14].
A4 Gag is synthesized by ribosomes attached to the endoplasmic reticulum (ER). A large amount of ER is located in the perinuclear region, and a slight amount of ER is found underneath the PM. We approximated that the density of ER decreased exponentially from the perinuclear region to the PM. In addition, newly synthesized Gag monomers are distributed throughout the cytoplasm, but concentrated in the perinuclear region [19]. Based on these assumptions, we assumed that the synthesis rate of Gag decreased (roughly) exponentially from the perinuclear region to the PM.
A5 Gag is observed on the PM within 5-10 minutes post-synthesis [20,21]. Typically, a period of 5-6 minutes is required to complete the assembly of a single VLP [22]. The budding and release time of a VLP is approximately 6 hours [22]. Furthermore, some researchers [23,24] have concluded that the hexamer is the building block of HIV-1. Therefore, we hypothesized that Gag monomers can only aggregate into dimers, trimers, tetramers, pentamers, and hexamers during the first 30 minuntes, and this assumption was mainly used to build the mathematical model (Eq (6)) in the boundary (PM).
A6 Gag polymers degrade with different degradation rates [21]. A7 Some molecular motors (e.g., kinesin) move unidirectionally from the microtubuleorganizing center to the cell periphery, whereas others (e.g., dynein) move toward the cell nucleus [25,26]. During the process of egress, the difference between the outward and inward speeds is denoted the velocity of egress [5,14].
In the cytoplasm (r N < r < r C ), the chemical reactions involving monomers and dimers can be described as follows: where Gag is a monomer and Gag 2 is a dimer. Definitions of the variables and symbols are provided in Table 1.
In rectangular coordinates, the transport velocity of monomer is v 1 = (v 1x , v 1y ), where v 1x and v 1y are the velocities along the x and y directions, respectively. For simplicity, we switched the problem to polar coordinates. Thus, the velocity of monomer along the radial direction is denoted by s 1 , and the angle between s 1 and the polar axis is denoted by θ 1 . Therefore, v 1 = (s 1 cos θ 1 , s 1 sin θ 1 ).
The total flux of monomer transportation includes both convective and diffusive transport: r Á (v 1 P 1 − D 1 rP 1 ). In polar coordinates, the above equation yields: The equation for total dimer flux has the same form. Based on the mass conservation law and Table 1. Nomenclature of the mathematical symbols.

Symbol Description
The monomer concentration at the radial position r and time t The off-rate constant of dimer The generation rate of monomer at the radial position r D 1 The diffusion coefficient of monomer mass action law [27], we obtain the following reaction-diffusion-transport equations: where t 2 (0, T), r 2 (r N , r C ).

Boundary and initial conditions
At the outer membranes of the nucleus (r = r N ), impermeable wall boundary conditions are considered as follows: Gag proteins gathered at the "Gag hotspots" underneath the PM, which has a thickness of approximately 20 nm [14]. This domain of "Gag hotspots" is considered a volume, and we set it as the boundary of our model, similarly to the strategy used in a previous study [28]. Therefore, the concentrations of polymers on the boundary reflect all the volume concentrations, which have the same units at P 1 and P 2 in the cytoplasm.
At the PM (r = r C ), monomer transport includes both convection and diffusion, and the same is true for dimer transport. However, underneath the PM, a myristoyl group of Gag can attach to the PM, resulting a weaker free diffusion of Gag compared with that in the cytoplasm. Therefore, we reduced the diffusion coefficient in the cytoplasm by k D to obtain the diffusion coefficient underneath the PM. Gag proteins are transported by molecular motors along microtubules, which are found throughout the cytoplasm, but are relatively rare underneath the PM. Therefore, the velocity of Gag loading to the PM is relatively small. We thus also reduced the velocity in the cytoplasm by k s to obtain the transport coefficient underneath the PM. Therefore, the monomer and dimer fluxes can be computed as follows: When the termination time T is approximately 30 minutes, Gag monomers can only aggregate into dimers, trimers, tetramers, pentamers, and hexamers based on assumption A5. Thus, the interactions among monomers, dimers, trimers, tetramers, pentamers, and hexamers underneath the PM (r = r C ) were studied, and all possible chemical reactions based on the step-growth polymerization [29] are the following: where Gag i is a polymer with i monomers, k ij is the on-rate constant, k 0 ij is the off-rate constant and d n is the degradation rate of n-mers.
By combining with the above mentioned chemical reactions and Eq (4), the following boundary conditions at the PM are obtained: The initial conditions are P i = 0, i = 1, 2, Á Á Á, 6, which are based on the experimental data [10].

A reduced model
The nine polymerization reactions (5) underneath the PM and one polymerization reaction (1) in the cytoplasm have 20 parameters, including k i,j , k 0 i;j , i j, i+j 6, i, j = 1, 2, Á Á Á, 6. To decrease the number of these parameters, we adopted the strategy described by Zlotnick et al. [30][31][32] in their study of the assembly kinetics of virus capsids.
Zlotnick et al. used a system of equations to simulate the sequential aggregation of free building blocks into virus capsids. To reduce the number of parameters, these researchers developed a formula [30,31] that mapped the on-rate constant to the off-rate constant. In our study, Gag proteins are aggregated to form a hexamer, and this process has a lot in common with virus capsid assembly. For example, the virus capsid and the subunit in the work conducted by Zlotnick et al. correspond to the hexamer and the low-order polymer serving as one of the two reactants in each polymerization reaction in our work, respectively.
For the polymerization reactions (5), the association constant K i+j of Gag i+j can be separated into two statistical components SI i,j and S i,j , and a non-statistical association constant K 0 iþj . These are related by the following function: where the statistical factor SI i,j describes the degeneracy of the incoming subunit. The second statistical factor S i,j can be treated as the ratio of two factors: the number of pathways for the formation of Gag i+j from Gag i and Gag j and the number of pathways for the dissociation of Gag i+j to Gag i and Gag j . K 0 iþj is a function of the number of contacts formed, i.e., where c i,j is the number of contacts of Gag i and Gag j , ΔG is the free energy associated with the formation of a contact, -2.72 kcal mol −1 , R is the gas constant, 1.987 cal deg −1 mol −1 , and T is the temperature in Kelvin, 298K. As determined by substituting Eq (8) into Eq (7), k 0 i,j can be given by the parameter k i,j based on the following function: As an example, the evaluation of k 0 1;5 is described. Owen et al. [23] found that Gag monomers could create a hexameric ring, which was believed to serve as the building block of HIV-1, thus the hexamer can be considered a hexagon from the perspective geometry. Then, i edges next to each other are removed from the hexagon, and we used these as the geometry of the imers of Gag. Fig 2 shows the reaction in which a monomer and a pentamer aggregate to form a hexamer. There is only one way to add a monomer to a pentamer to form a hexamer, and there are six ways to dissociate a hexamer to a monomer and a pentamer, thus S 1,5 = 1/6. The incoming subunit is the monomer, thus SI 1,5 = 1. Two contacts (c i,j = 2) are made in forming of Gag 6 , resulting in a free energy of 2ΔG. Thus, For the polymerization reactions (5), SI i,j , S i,j and c i,j are counted and these are listed in Table 2.
We only need to optimize the values of k i,j , because k 0 i;j can be obtained by the above-mentioned function (9). Therefore, the 20 parameters for nine polymerization reactions (5) underneath the PM and the single polymerization reaction (1) in the cytoplasm are cut by half. Combined with two proportionality coefficients k D , k s and the velocity s1 of Gag-G2A, this results in 13 parameters that needed to be optimized. Thus, the number of parameters to be optimized was substantially decreased.
The diffusion coefficient for an "average" (3 − 6 nm diameter) protein in the cytoplasm is 5 − 15 μm 2 /s [33]. The Gag monomer is a highly extended rod with a length of *20 nm and a width of 2 − 3 nm [5], resulting in an average mean of 11.25 nm between the length and width. According to the Stokes-Einstein equation, the diffusion coefficient is inversely proportional to the diameter. Therefore, we estimated the diffusion coefficient of Gag as *4 μm 2 /s. Similarly, the diffusion coefficient for a Gag dimer is approximately half of the corresponding value for a Gag monomer.
The velocities of the active transport of a monomer and a dimer are approximately equal to the velocity of the molecular motor (*1 μm/s [33]) in the cytoplasm.
Tritel et al. [34] found that 80% of Gag disappeared within 2 hours after synthesis. Therefore, we estimated that the degradation rate of a monomer was * 2.236 × 10 −4 /s, and the degradation rate of i-mers was thus *2.236 × 10 −4 /i /s. The above-described parameter values estimated by the experimentally measured data are listed in Table 3.

Numerical methods for solving mathematical model
We adopted the Crank-Nicolson method for discretizing the convection-diffusion-reaction equations to form nonlinear equations, and then used Newton's method to solve them.
First, we discretized the system of convection-diffusion-reaction equations using the Crank-Nicolson method. Let the time step and grid size of the radius be Δt and Δr, respectively. Then, the i − mers concentration is denoted by P k i;n ¼ P i ðr N þ nDr; kDtÞ. The derivatives of P k i;n with respect to t and r are discretized as follows: The system of convection-diffusion-reaction equations (Eqs 2, 3 and 6) were discretized according to the above-mentioned rules (10), and the resulting equation can be rewritten using vectors as where N is the grid number of the radius, (6N + 6) matrices that depend on the time step Δt, grid size Δr and radius r. F is the nonlinear part, which depends on r and X k+1 . At each step, Eq (11) is a nonlinear algebraic equation that can be solved using Newton's method. The following process is repeated until a sufficiently accurate value is reached. The numerical algorithms were implemented in MATLAB 2009b on a personal computer. To ensure numerical accuracy, a small time step Δt = 1 s and grid size Δr = 0.025 μm were used. The numerical solutions converged for Δt in the range from 0.5 to 36 s and Δr in the range from 0.0063 to 0.05 μm, respectively.

Identification of model parameters
Some parameters were determined from a variety of sources, as illustrated in Table 3, and the others needed to be obtained using an optimization method. For WT Gag, 12 free parameters needed to be optimized. In contrast, for Gag-G2A, only one parameter s needed to be optimized, and the other parameter values are equal to the corresponding values for WT Gag. Therefore, 13 parameters needed to be optimized by fitting to 16 experiment data points (eight data points for WT Gag and eight data points for Gag-G2A [10]). The flow chart of this process is shown in Fig 3. The advantage of the sequential scheme in Fig 3 is that the second object function will not be run until the first one meets the error criterion, and this process can reduce the program running time on a personal computer. If this program runs in a supercomputer with thousands of computers, other schemes for parallel computing, such as the weighted multi-objective scheme [35,36], would be more efficient.
Thirteen parameters need to be optimized using 16 experiment data points (eight data points for WT Gag and eight data points for Gag-G2A [10]): this is the inverse problem. In addition, the measured data is always inevitable mixed with noise. In numerical computation, this problem is often ill-posed. To decrease over-parametrization and guarantee numerical stability of this optimization problem, a regularization term using the Tikhonov regularization method is generally added (Eq 13) [37][38][39].
The idea of regularization is to add preference to a particular solution with desirable properties [38,[40][41][42]. In many cases, the solution is given preference with smaller norms, and this process is known as L 2 regulation. This regulation improves the conditioning of the problem, enabling a direct numerical solution. The form of regularization is given as: where θ is the parameter vector, θ 1 and θ 2 are the lower and upper bounds of θ, respectively. Y (θ) and Y (exp) are the calculation and the experimental data, respectively. k Á k 2 is the Euclidean norms. λkθk 2 is the regularization term, and λ is the weight coefficient, which is generally small and is set to 0.001. Because the model and the boundary conditions are nonlinear, intelligent optimization algorithms, such as Differential Evolution (DE) and Particle Swarm Optimization(PSO), are commonly used to obtain the parameter values. Here, we use the diversity-maintained differential evolution based on a gradient local search (DMGBDE) method proposed by Xie et al. 8. If the terminating condition is reached, exit loop.

Comparisons between the simulation results and experiment data
Kutluay et al. [10] used a chemical crosslinking approach to analyze the initiating events in HIV-1 assembly and genome packaging. In their experiment, 293T cells coexpressing WT Gag and HIV-1 RNA were crosslinked by treatment with EGS, a membrane-permeable crosslinker. After 30 minutes of incubation at room temperature, crosslinking was prevented by the addition of Tris-Cl. The cells were then analyzed through membrane flotation assays. Proteins from the PM and cytoplasmic fractions, including monomers, dimers, trimers, tetramers, pentamers, and hexamers, were precipitated, and their relative concentrations were obtained by western blotting. The values of the parameters were optimized and are shown in Table 4. The simulated absolute concentrations and experimentally measured relative concentrations of the polymers were normalized by dividing by the concentration of Gag monomer in the cytoplasm. As shown in Fig 4, the simulation results are consistent with the experimental data.
Mutation of the N-terminal glycine residue of MA of Gag to alanine (Gag-G2A) MA comprises the N-terminus of the Gag polyprotein, and it is responsible for targeting the Gag polyprotein to the PM. Therefore, mutation of the N-terminal glycine residue of MA to alanine (G2A) can reduce the attachment of a myristoyl group to Gag and impede its recruitment to the PM [44, 45]. In our model, the speed of Gag-G2A transport is slower compared with that of Gag WT. In addition, we assumed that k D and k s in Eq (6) for Gag-G2A were equal to the corresponding values for WT Gag. Therefore, we adjusted only one parameter s 1 to fit the experimental data [10]. The value of this parameter is 2.20 × 10 −11 μm/s, and its 95% confidence interval is [0, 0.09]. Because s1 is close to zero, we concluded that Gag-G2A might fail to hijack the molecular motor.
The comparison between the simulation values and the experimental data, which is shown in Fig 5, clearly demonstrates that the simulation and experimental results are similar.

Analysis of the identifiability of the parameters
In our study, we used eight data points for WT Gag and eight data points for Gag-G2A [10] to fit 13 parameters, including 10 polymerization coefficients k i,j , two proportionality coefficients k D and k s and the transfer speed s1 of Gag-G2A. The constraints for the 13 parameters are as follows: • All 10 polymerization coefficients are positive.
• The two proportionality coefficients are between 0 and 1.
• The transfer speed of Gag-G2A is less than 1 μm/s.
After the parameter values are optimized based on experimental data [10], we evaluated how well the model parameters were determined by these data. In 2009, Raue et al.
[46] proposed an approach to analyze the structural and practical identifiability of dynamical models by exploiting the profile likelihood, and this method has subsequently been widely applied in many fields, particularly the computational systems biology [47][48][49][50].
In this work, we used this technology [46] to analyze the identifiability of the parameters. First, finite sample confidence intervals for the parameters were estimated, and these are listed in Table 4. As shown, the confidence intervals of all of the parameters are within the bounds.
The profile likelihoods of all the parameters are shown in Fig 6. Specifically, the profile likelihoods for k11, k12, k23, k D and k s show a steep concave shape, indicating that the optimization route can rapidly reach the minimum. The profile likelihoods for k1, k15 and k24 also show a concave shape, however, the curves on the right side of the vertical dashed lines decrease slowly, indicating that their optimization routes might reach the minimum slowly. The profile likelihoods for k13, k14, k22, k33 and G2A − s1 have several local minima, hence, more iterations might be needed for the optimization route to jump out of and not get stuck at these local minima.
"Knockout" mutations of the CTD of the CA of Gag (Gag-dCTD) The CA is one of the four major domains of Gag and plays an important role in Gag multimerization and assembly at the PM. Furthermore, Gag: RNA binding is mediated by the CTD of the CA, which participates in Gag-Gag interactions. Thus, Gag-dCTD will show decreased onrate constants. Furthermore, because the CA and MA of Gag are bound to each other, Gag-dCTD might show slightly impaired CA function. The damaged CA domain will slightly decrease the transport velocity of Gag, thus, the transport coefficient of Gag-dCTD might be slightly slower than that for WT Gag. This assumption is also supported by experimental data [10] for Gag-dCTD. Taken together, these assumptions indicate that the parameters s 1 , s 2 , k i,j are decreased compared with the corresponding values for WT Gag. These limiting conditions were included in the process of parameter optimization. The values of these parameters are listed in Table 5, and the comparison between the simulation values and experimental data is shown in Fig 7. As shown, the numerical results agree with the experimental data.

Budding and release time of VLPs
According to various references [21,22,51, 52], a VLP buds and releases after *6 hours.
In 2004, Briggs et al. [3] reported that the diameter of a VLP was *145 nm, and Carlson et al.
[53] then found that a VLP was released with *2400±700 Gag proteins.  Table 4. Each parameter was varied over a wide range around its optimal value, and the remaining parameters were then refitted. All parameter values were log-transformed. Jouvenet et al. [22] observed cells over a period of 30-60 minutes starting 5-6 hours after transfection and found that 50-150 puncta per cell typically appeared during this period. The behavior of these puncta could result in their classification into two discrete classes: slowly appearing puncta and rapidly appearing/disappearing puncta. The slowly appearing puncta represented the majority of events (74%) observed at 5-6 hours after transfection, and were indistinguishable from areas of the PM. The rapidly appearing/disappearing puncta were indistinguishable from endosomes. Therefore, Jouvenet et al. believed that the slowly appearing puncta might represent genuine VLPs assembly events.
Nermut et al. showed some pictures of VLPs in the budding state [21], and these images showed that most Gag proteins gathered to VLPs. Thus, the total Gag protein in the budding  state at the PM can be estimated by the number of VLPs and Gag proteins per VLP. Taken together, these findings indicate that the threshold surface density of Gag at the PM can be computed as follows: where NG is the number of Gag proteins in a VLP, Np is the total puncta per cell in the budding state, p is the ratio of slowly appearing puncta, and R is the radius of a cell. When NG = 2400, Np = 50, p = 74% and R = 10, the threshold surface density is 1.17 × 10 2 ymol/um 2 . When Np is changed to 100 and 150, the threshold surface densities are 2.34 × 10 2 ymol/um 2 and 3.51 × 10 2 ymol/um 2 , respectively.
In our work, the surface density of Gag at the PM can be obtained by the following formula: where S Gag denotes the surface concentration of Gag, C Gagi denotes the volume concentration of Gagi underneath the PM, and H is the thickness of the concentrated domain of Gag underneath the PM, which is set to 0.13 um [5].
When the surface density of Gag is equal to the threshold surface density, the cumulative time is estimated as the budding and release time. Times of 3.51, 8.53 and 21.34 hours are predicted for threshold surface densities 1.17 × 10 2 , 2.34 × 10 2 and 3.51 × 10 2 ymol/um 2 , respectively. The predicted budding and release times agree with earlier findings to some degree [21,51,52], indicating that the total concentration of Gag underneath the PM is reasonable. These researchers also defined the elasticity of the ith output variable O i (P, T) with respect to the parameter P j , E i,j (T) as

Sensitivity and elasticity analysis
Elasticity is defined in terms of relative sensitivity and can describe the rate of change in the relative size of the output variables with respect to the relative size of the parameters. An elasticity analysis would thus yield more reliable results. This definition has an extraordinarily wide range of applications [8].
Based on the literatures [55, 56], the elasticity function E i,j (T) was estimated as follows: where ΔP j is a small perturbation of the parameter P j . The elasticity values for all parameters corresponding to eight outputs, specifically the monomer and dimer concentrations in the cytoplasm and the concentrations of each of the polymers at the PM, are shown in Fig 8. As shown in Fig 8, the elasticity values for the polymerization coefficients k11 and k12 are greater than those for the other polymerization coefficients. This findings indicates that perturbations of these two parameters can lead to relatively large changes in Gag polymer concentrations. Therefore, we can conclude that the corresponding two reactions, which involve the polymerization of two monomers to form a dimer and the polymerization of a monomer and a dimer to form a trimer, might be key reactions. If future drugs can decrease the values of k11 and k12, concentrations of Gag high-order polymers will be significantly reduced.
Among all of parameters, the highest elasticity value is found for k s , which measures the ability of Gag to land on the PM by active transport. Therefore, we can conclude that this process has a very significant impact on Gag polymers on the PM. As a result, suppression of the Gag concentration on the PM by reducing the ability of Gag to stay on the PM might be a good strategy.
The elasticity value of k D is found to be very small, which illustrates that the landing of Gag on the PM by diffusion has little impact on changes in the concentrations of Gag polymers on the PM. In addition, the elasticity value for the transport speed s1 of Gag-G2A is very small with a value close to zero. This finding further supports the hypothesis that Gag-G2A might hardly be able to hijack molecular motors to move to the PM, and as a result, Gag-G2A might be an important drug target.
The global elasticity function was defined as follows: where N is the total number of perturbations, and ΔP k is the simultaneous perturbations of all parameters during the k-th perturbation. The global elasticity function values for 10% perturbations of all parameters are shown in Fig 9. As shown in the figure, perturbations of all the parameters are not sensitive to the output, which indicates that the proposed model is reasonable and robust.

Analyzing the predominant pathways and the key intermediates in hexamer formation
The patterns underlying the formation of polymers constitute a very interesting topic [57,58]. Using our model, we analyzed the weights of the pathways for the tetramer, pentamer, and hexamer formation. Hexamer formation consists of three pathways: The rate equation for the hexamer concentration can be described as follows: dP 6 dt ¼ k 33 P 2 3 À k 0 33 P 6 þ k 24 P 2 P 4 À k 0 24 P 6 þ k 15 P 1 P 5 À k 0 15 P 6 À d 6 P 6 The three pathways increase the hexamer concentration based on the following rates: https://doi.org/10.1371/journal.pcbi.1005733.g009 k 33 P 2 3 À k 0 33 P 6 , k 24 P 2 P 4 − k 0 24 P 6 and k 15 P 1 P 5 − k 0 15 P 6 . The largest value among these rates corresponds to the predominant pathway. The values for these three pathways during the first 30 minutes are shown in Fig 10, and the results clearly show that the first pathway is the most important after 12 minutes. Therefore, the predominant pathway is 2Gag 3 Ð Gag 6 . We also explored the predominant pathways in pentamer and tetramer formation. As illustrated in Fig 11, the predominant pathway in pentamer formation is Gag 2 + Gag 3 Ð Gag 5 , and as shown in Fig 12, the predominant pathway in tetramer formation is Gag + Gag 3 Ð Gag 4 . We compared these three most important pathways and found that the trimer intermediate was needed in all three predominant pathways. Therefore, we conclude that the Gag trimer might be a key intermediate in hexamer formation.
Kutluay et al. [10] revealed that a Gag trimer could not be formed in the cytoplasm and that a Gag trimer on the PM could not return to the cytoplasm. Therefore, the formation of a trimer indicates that the Gag protein complex can now stay on the PM. As shown in Figs 4, 5 and 7, the relative concentrations of trimers, tetramers, pentamers and hexamers are similar to each other. Therefore, we could infer that the concentrations of these high-order polymers depend heavily on the tirmer concentration. In addition, we reduced the trimer concentration to the corresponding value for Gag-G2A by increasing its degradation. The concentrations of the various Gag polymers are listed in Table 6, and as shown, tetramer, pentamer and hexamer concentrations decrease markedly. However, the same finding was not obtained for reductions in the tetramer and pentamer concentrations. Therefore, we can conclude that trimer formation, namely Gag 1 + Gag 2 Ð Gag 3 might be a key pathway. As shown in Fig 10, the predominant pathway in direct hexmer formation is 2Gag 3 Ð Gag 6 . Taken together, the key pathways for the formation of a hexamer from a monomer might be 2Gag 1 Ð Gag 2 , Gag 1 + Gag 2 Ð Gag 3 and 2Gag 3 Ð Gag 6 , and the key intermediates in hexamer foramtion might be the dimer and trimer polymers.

Four theoretical combined methods for suppressing the Gag concentration
In the wake of developments in basic science, many of the most promising HIV drugs in clinical development do not target specific retroviral enzymes but rather act by interrupting the assembly of viral factors with host proteins [59]. For example, some agents that disrupt protein-protein interactions during the entry of HIV-1 are showing great clinical potential [59]. However, according to AIDSinfo, no clinically available drug can inhibit Gag transport and assembly (https://aidsinfo.nih.gov/drugs/Search/a-z/all), and the development of these agents is a daunting challenge. Because the current treatments for HIV-1 normally include the use of multiple drugs in an attempt to control this virus, we proposed and analyzed four theoretical combined methods for inhibiting Gag transport and assembly based on our model. These analyses might be helpful to the design of new anti-AIDS drug.
We took three approaches, specifically the degradation rates for Gag polymers, Gag-G2A and Gag-dCTD, into account and designed the following four theoretical combined methods:   C4: Increasing all degradation rates of polymers by 50%, setting s1 and s2 to the means of the corresponding values for WT Gag and Gag-G2A, and setting k i,j to the mean of the corresponding values for WT Gag and Gag-dCTD.
For these four theoretical combined methods, we computed the concentrations of Gag polymers in the cytoplasm and PM during the first 30 minutes, and the corresponding results are shown in Fig 13. As shown in the figures, the concentrations of high-order polymers (e.g., tetramer, pentamer and hexamer) are relatively lower with methods C1, C3 and C4. We also found that these three methods involved the Gag-G2A mutation. Therefore, we speculate that the N-terminal glycine residue of the MA of Gag might be a promising drug target.

Reducing the concentrations of higher-order polymers of Gag-G2A and WT Gag
Gag-G2A significantly decreases the concentrations of Gag higher-order polymers and thus might be a potential key drug target. To explore the mechanisms responsible for decreasing the concentrations of higher-order polymers of Gag-G2A, we first reduced the trimer concentration of WT Gag to 0.37 μm 3 /ymol by increasing the trimer degradation rate to yield the the corresponding concentration of Gag-G2A trimers. The concentrations of Gag polymers are shown in the fourth line in Table 6. Simelarly, we decreased the tetramer, pentamer and hexamer concentrations to the corresponding low concentrations for Gag-G2A, respectively, and the results are listed in Table 6. After reducing the trimer concentration in Table 6, we compared the tetramer, pentamer and hexamer concentrations with those of WT Gag. The concentrations of these higher-order polymers were all markedly decreased, particularly the hexamer concentration. As shown by the results, the tetramer, pentamer and hexamer concentrations are all very dependent on the trimer formation. However, the same does not hold true for the tetramer and pentamer polymers. Therefore, these results further support the conclusion that the Gag trimer might be a key intermediate and that trimer formation might be a key pathway.
In addition, we compared the tetramer, pentamer and hexamer concentrations obtained with Gag-G2A after reducing the trimer concentration listed in the fourth line in Table 6. The resulting concentrations were markedly higher than those found for Gag-G2A. Thus, Gag-G2A does not use decrease the polymerization coefficients to decrease the polymer concentrations.
Compared with Gag-G2A, the monomer and dimer concentrations for WT Gag in the cytoplasm are reduced by approximately a quarter and a half, respectively, and the monomer, dimer, tetramer, pentamer and hexamer concentrations for WT Gag on the PM are reduced by approximately 68%, 88%, 93%, 98%, 99% and 99%, respectively. On the PM, Gag-G2A reduces the monomer and dimer concentrations by reducing their active transport speeds, which results in decreases in the concentrations of higher-order polymers on the PM. In the cytoplasm, the low transport speeds of monomers and dimers for Gag-G2A and their very weak diffusions lead to the high monomer and dimer concentrations, and most of these monomers and dimers are gathered near the perinuclear area. However, due to the high degradation rates, these high concentrations near the perinuclear area show rapid reductions. Therefore, in the entire cytoplasm, these monomer and dimer concentrations are ultimately lower than those found for WT Gag.

Discussion
In this study, we developed a model to simulate the intracellular trafficking and polymerization of HIV-1 Gag protein. The model parameters were fitted using published experimental data [10]. The profile likelihoods of these parameters were used to show their identifiability, and an elasticity analysis of these parameters was used to show the robustness of this model. The model was able to predict the budding and release time of a VLP, and the results were in agreement with the findings of some previous studies [21,51,52]. Moreover, the model could also be applied to two mutated versions: Gag-dCTD and Gag-G2A. Using our model, we analyzed the weight of the pathways involved in the polymerization reactions, and concluded that the Gag dimer and trimer might be two key intermediates in hexamer formation. Moreover, we inferred that the three key pathways in the formation of a hexamer from a monomer might be the polymerization of two monomers to form a dimer, the polymerization of a monomer and a dimer to form a trimer, the polymerzation of two trimers to form a hexamer. We also explored four theoretical combined methods for suppressing the Gag concentration and concluded that the N-terminal glycine residue of the MA of Gag might be a proming drug target.
There is no denying that the presented modeling approach is merely an approximation to reality. However, it successfully provides a consistent and quantitative description of the transport and polymerization of Gag and lays a broad foundation for further developments. Future experimental and theoretical research is required to support the various assumptions employed in the model.
A number of important questions have not been fully addressed and need for further examination. For instance, there are two types of motor proteins: one conveys cargo to the nucleus, and the other conveys cargo to the PM. We consider only the average velocity of the transport of cargo to simplify the model. Thus, it is important to address the transport processes of these two types of motor proteins in future studies. In addition, a dynamical analysis [60] of HIV-1 trafficking process and a multilayer networks analysis [61] related to HIV-1 will also be investigated in future work.