A New Stochastic Model for Subgenomic Hepatitis C Virus Replication Considers Drug Resistant Mutants

As an RNA virus, hepatitis C virus (HCV) is able to rapidly acquire drug resistance, and for this reason the design of effective anti-HCV drugs is a real challenge. HCV subgenomic replicon-containing cells are widely used for experimental studies of the HCV genome replication mechanisms, for drug testing in vitro and in studies of HCV drug resistance. The NS3/4A protease is essential for virus replication and is therefore one of the most attractive targets for developing specific antiviral agents against HCV. We have developed a stochastic model of subgenomic HCV replicon replication, in which the emergence and selection of drug resistant mutant viral RNAs in replicon cells is taken into account. Incorporation into the model of key NS3 protease mutations leading to resistance to BILN-2061 (A156T, D168V, R155Q), VX-950 (A156S, A156T, T54A) and SCH 503034 (A156T, A156S, T54A) inhibitors allows us to describe the long-term dynamics of viral RNA suppression for various inhibitor concentrations. We showed theoretically that the observable difference between the viral RNA kinetics for different inhibitor concentrations can be explained by differences in the replication rate and inhibitor sensitivity of the mutant RNAs. Pre-existing mutants of the NS3 protease contribute more significantly to the appearance of new resistant mutants during treatment with inhibitors than the wild-type replicon. The model can be used to interpret the results of anti-HCV drug testing on replicon systems, as well as to estimate the efficacy of potential drugs and predict optimal schemes of their usage.


Formal problem statement
Let the vector v(t, q) = (v_0(t, q), ..., v_{K-1}(t, q))^T describe the state of the system at time t. The dimension of v equals the number of state variables K. Let the vector q = (q_0, ..., q_{I-1})^T be the vector of parameters with dimension I.
The dynamics of the system are described by a system of first-order ordinary differential equations with respect to the independent variable t, together with initial conditions:

dv/dt = f(v, q), v(0, q) = v_0. (1.1)

The model parameters are estimated by fitting the model output to the experimental data. This is performed by minimizing a combined objective function equal to the sum of criteria that measure the deviation of the model from the desired behavior. We used the sum of squared differences between the data and the model output as the main criterion:

F(q) = Σ_{i=1}^{J} Σ_{k=0}^{K-1} (v_k(t_i, q) - y_k(t_i))^2 + Σ_{l=1}^{L} ζ_l P_l, (1.2)

where y(t_i) = (y_0(t_i), ..., y_{K-1}(t_i))^T, i = 1, ..., J, denote the J independent experimental observations, and P_l, l = 1, ..., L, are additional criteria to be satisfied with weight coefficients ζ_l. Constraints in the form of inequalities can be imposed on a subset of the parameters:

a_i ≤ q_i ≤ b_i, i ∈ I_l. (1.3)
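As an illustration, the main least-squares criterion can be computed as follows. This is a minimal sketch: the function name and the flat row-major array layout are assumptions of this example, not part of the published framework.

```c
#include <stddef.h>

/* Sum of squared residuals between model output and data.
 * model[i*K + k] holds state variable k at observation time t_i;
 * data is laid out the same way; J observations, K variables.
 * Names and layout are illustrative. */
double sum_squared_residuals(const double *model, const double *data,
                             size_t J, size_t K)
{
    double s = 0.0;
    for (size_t i = 0; i < J; ++i)
        for (size_t k = 0; k < K; ++k) {
            double r = model[i * K + k] - data[i * K + k];
            s += r * r;
        }
    return s;
}
```

The weighted penalty terms ζ_l P_l would simply be added to this sum before the result is passed to the optimizer.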

Trigonometric transformation of constraints
Let us introduce new parameters u_i:

q_i = a_i + (b_i - a_i)(sin(η u_i) + 1)/2, i ∈ I_l, (1.4)

where the scaling coefficient η is chosen experimentally. Another transformation can also be used:

q_i = a_i + (b_i - a_i) sin²(η u_i), i ∈ I_l. (1.5)

Consequently, the parameters q_i, i ∈ I_l, in (1.1) are substituted with their transformations (1.4) or (1.5), and the minimization procedure is applied to determine the unconstrained parameters u_i.

Differential Evolution Entirely Parallel method
Differential Evolution (DE) is a stochastic iterative optimization technique proposed by Storn and Price [5]. DE is an effective method for the minimization of diverse and complex objective functions. Its power is based on the fact that under appropriate conditions it attains the global extremum of the functional; its weaknesses are high computational demand and dependence on control variables, which provides a motivation to parallelize DE. Previous work in this area has produced a number of methods that perform well on certain particular problems, but not in general applications.
DE starts from a set of randomly generated parameter vectors q_i, i = 1, ..., NP. The set is called the population, and the vectors are called individuals. The population at each iteration is referred to as a generation, and the population size NP is fixed. The name of the method comes from the fact that differences between members of the current population are used to generate offspring (see Fig. 1).

Calculation of trial vectors
The first trial vector is calculated as

v = q_{r1} + S (q_{r2} - q_{r3}),

where q_r denotes a member of the current generation g, S is a predefined scaling constant, and r1, r2, r3 are distinct random indices of members of the population. The second trial vector is calculated using the "trigonometric mutation rule" [6].
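The first mutation step can be sketched in C as follows; the array representation and function name are assumptions of this example.

```c
#include <stddef.h>

/* DE/rand/1 mutation: trial = q_r1 + S * (q_r2 - q_r3),
 * with r1, r2, r3 distinct random population indices.
 * pop[r][j] is parameter j of individual r; dim is the number of
 * parameters. Names are illustrative. */
void mutate_rand1(double *trial, double **pop,
                  size_t r1, size_t r2, size_t r3,
                  double S, size_t dim)
{
    for (size_t j = 0; j < dim; ++j)
        trial[j] = pop[r1][j] + S * (pop[r2][j] - pop[r3][j]);
}
```

The caller is responsible for drawing r1, r2, r3 distinct from each other and from the index of the parent being perturbed.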
The third trial vector is defined as follows: the components with indices n, (n+1) mod I, ..., (n+L-1) mod I are taken from the mutated vector, while the remaining components are copied from the parent. Here n is a randomly chosen index, x mod y is the remainder of the division of x by y, and L is determined by Pr(L = a) = p^a, where p is the probability of crossover.
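This exponential crossover can be sketched as follows; the run is extended with probability p at each step, which realizes the geometric distribution of the run length L described above. Function and variable names are assumptions of this example.

```c
#include <stdlib.h>
#include <stddef.h>

/* Exponential crossover: starting at a random index n, copy a run of
 * consecutive components (indices taken modulo dim) from the mutated
 * vector into a copy of the parent; the run is extended with
 * probability p, giving a geometrically distributed run length. */
void crossover_exp(double *trial, const double *parent,
                   const double *mutant, size_t dim, double p)
{
    for (size_t j = 0; j < dim; ++j)
        trial[j] = parent[j];
    size_t n = (size_t)(rand() % (int)dim); /* random start index */
    size_t j = n;
    do {
        trial[j] = mutant[j];               /* copy one component */
        j = (j + 1) % dim;
    } while (j != n && (double)rand() / RAND_MAX < p);
}
```

With p = 0 exactly one component is inherited from the mutated vector; as p approaches 1 the trial vector approaches the mutated vector itself.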
The process is illustrated in Fig. 2. Figure 2: Construction of the trial vector

Preserving population diversity
An efficient adaptive scheme for the selection of the internal parameters S and p, based on control of the population diversity, was proposed in [7]. The variance of each parameter j = 0, ..., I - 1 is computed across the population, and S and p are adjusted so that the change of this variance between successive generations is governed by a new control parameter γ.

Being an evolutionary algorithm, DE can be easily parallelized, since each member of the population is evaluated individually. In the method reported earlier [1,2,3] the whole population was divided into branches. Information exchange between branches allowed the best member of one branch to replace the oldest member of another branch every Π iterations. It turned out that in biological applications the value of Π was considerably small, making branches an unnecessary complication.
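The population diversity that the adaptive scheme monitors can be measured as the per-parameter variance across individuals; a minimal sketch follows (helper name and array layout are assumptions of this example).

```c
#include <stddef.h>

/* Variance of parameter j across the population of size NP.
 * pop[i][j] is parameter j of individual i. This is the quantity
 * whose evolution the adaptive scheme of [7] keeps under control. */
double param_variance(double **pop, size_t NP, size_t j)
{
    double mean = 0.0;
    for (size_t i = 0; i < NP; ++i)
        mean += pop[i][j];
    mean /= (double)NP;

    double var = 0.0;
    for (size_t i = 0; i < NP; ++i) {
        double d = pop[i][j] - mean;
        var += d * d;
    }
    return var / (double)NP;
}
```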
In the latest version the branches are eliminated, and several of the oldest members of the population are replaced by the same number of individuals having the best values of the objective function. The age of an individual in our approach is defined as the number of iterations the individual has survived without changes. The number of seeding individuals Ψ and the seeding interval Θ, measured in iterations, are new parameters of the algorithm. The fact that a certain parameter set has not been updated during several iterations indicates that this set corresponds to a local minimum of the objective function. As we seek the global minimum, such a parameter set can be deleted from the population, and the parameter set corresponding to the minimal objective function value found so far is copied in its place.
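The seeding step described above can be sketched as follows, assuming a simple array representation of the population; all names are illustrative and the selection is kept naive for clarity.

```c
#include <stddef.h>

/* Seeding step (sketch): replace the Psi oldest individuals with
 * copies of the individuals that currently have the best objective
 * values. age[i] counts iterations individual i survived unchanged;
 * fval[i] is its objective value; dim is the parameter count. */
void seed_oldest(double **pop, double *fval, int *age,
                 size_t NP, size_t dim, size_t Psi)
{
    for (size_t s = 0; s < Psi; ++s) {
        size_t oldest = 0, best = 0;
        for (size_t i = 1; i < NP; ++i) {
            if (age[i] > age[oldest]) oldest = i;
            if (fval[i] < fval[best]) best = i;
        }
        if (oldest == best)
            continue;                    /* nothing to replace */
        for (size_t j = 0; j < dim; ++j)
            pop[oldest][j] = pop[best][j];
        fval[oldest] = fval[best];
        age[oldest] = 0;                 /* replacement starts fresh */
    }
}
```

In the algorithm this routine would be invoked once every Θ iterations.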
One of the parameters of the algorithm determines the number of parallel threads used to calculate the objective function. We utilized the Thread Pool API from the GLib project [8] and constructed a pool with the defined number of worker threads. The calculation of the objective function for each trial vector is pushed to an asynchronous queue and starts as soon as a thread becomes available. The thread synchronization condition is that the objective function must be calculated once for each individual in the population on each iteration. In order to increase the robustness of the procedure we have implemented a new selection rule for DE (Algorithm 1). An offspring is accepted into the new generation according to several different criteria. The offspring replaces its parent if the value of the combined objective function for the offspring's parameter set is less than that of the parent. In the opposite case the additional criteria are checked: the offspring replaces its parent if the value of some criterion is better and a randomly selected value is less than the predefined parameter for this criterion.
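The acceptance logic of this selection rule can be sketched as follows; the function signature and the per-criterion acceptance probabilities are assumptions of this example, not the exact form of Algorithm 1.

```c
#include <stdlib.h>
#include <stddef.h>

/* Selection rule sketch: accept the offspring if its combined
 * objective is lower; otherwise accept it with probability
 * accept_p[l] whenever it improves additional criterion l.
 * Returns 1 to accept, 0 to keep the parent. Names illustrative. */
int accept_offspring(double f_child, double f_parent,
                     const double *crit_child, const double *crit_parent,
                     const double *accept_p, size_t L)
{
    if (f_child < f_parent)
        return 1;
    for (size_t l = 0; l < L; ++l)
        if (crit_child[l] < crit_parent[l] &&
            (double)rand() / RAND_MAX < accept_p[l])
            return 1;
    return 0;
}
```

The randomized acceptance lets an offspring that worsens the combined objective but improves a secondary criterion occasionally survive, which helps the search escape local minima.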

Implementation
The new algorithm was implemented in the C programming language as a software framework with an interface that allows a user to express the objective function using different computer systems widely used in biomedical applications, such as Octave, R or KNIME. The control parameters of the algorithm are defined in a data file in INI format. The framework is operated via a simple command-line interface. The Enhanced DEEP method can be embedded in new software. The developed software is available on the project site [4].
Runs were performed with different combinations of parameters on the cluster (1980 Intel Xeon processors) at the Joint Supercomputer Center of the Russian Academy of Sciences, Moscow. The ensemble of parameter sets is presented in Tables 1 and 2. The scatter plot (Fig. 5) shows low variability for the majority of the parameters.