Drug-induced resistance evolution necessitates less aggressive treatment

Increasing body of experimental evidence suggests that anticancer and antimicrobial therapies may themselves promote the acquisition of drug resistance by increasing mutability. The successful control of evolving populations requires that such biological costs of control are identified, quantified and included to the evolutionarily informed treatment protocol. Here we identify, characterise and exploit a trade-off between decreasing the target population size and generating a surplus of treatment-induced rescue mutations. We show that the probability of cure is maximized at an intermediate dosage, below the drug concentration yielding maximal population decay, suggesting that treatment outcomes may in some cases be substantially improved by less aggressive treatment strategies. We also provide a general analytical relationship that implicitly links growth rate, pharmacodynamics and dose-dependent mutation rate to an optimal control law. Our results highlight the important, but often neglected, role of fundamental eco-evolutionary costs of control. These costs can often lead to situations, where decreasing the cumulative drug dosage may be preferable even when the objective of the treatment is elimination, and not containment. Taken together, our results thus add to the ongoing criticism of the standard practice of administering aggressive, high-dose therapies and motivate further experimental and clinical investigation of the mutagenicity and other hidden collateral costs of therapies.

1 1+( u h ) k is fixing the pharmacodynamics of the drug. The problem can be solved with two alternative methods based either on Pontryagin's minimum principle or Hamilton-Jacobi-Bellman equation. The Pontryagin's minimum principle is based on the variational approach and proceeds by defining the Hamiltonian as H = L(t, S, u) + λ 1 f + λ 2 g, where L(t, S, u) := S(t, u(t))µ(u) is the Lagrangian cost functional and the multipliers (costate variables) satisfy the following equations: with boundary conditions λ 1 (T ) = 0 = λ 2 (T ).
The optimal control strategy u opt (t) is found by studying the function whose roots, should they exist, determine the singular controls. (If no roots exists, then the optimal control reduces to so-called bang-bang controls.) Using the parameters given in Table 1 (in main text), two separate roots appear; the first root can be excluded using the Legendre-Clebsch condition (H uu < 0) [1]. The candidate optimal control u * (S, R, λ 1 , λ 2 ) must then be iterated using e.g. the Forward-Backward-Sweep Method [2], where the state variables are first solved forward in time using dynamics (2) and then the multipliers are solved backwards in time using dynamics (4) with the updated variables. The numerical solution to the problem together with the trajectories and multipliers are showed in S1 Fig.

Time-independent control laws from Hamilton-Jacobi-Bellman equation
Notice that since the Lagrangian cost functional L(t, S, u) does not depend on the number of resistant cells R, we need not explicitly track them. This is because we assume resistant cells are initially so rare that the competitive interactions on the sensitive cell growth is negligible (note that the opposite is not true). If this assumption would be violated, then the cost functional itself would lose its usefulness and the object of the treatment should rather focus in the containment and management of the already present resistance and not its de novo emergence. Consequently, we can concentrate our analysis on the single state variable S and assume that its dynamics is independent of R.
To gain further insight, we then solved the same problem using the alternative method based on Hamilton-Jacobi-Bellman (HJB) equation. This approach relies on solving the cost-to-go function J(t, S) from the partial differential equation: with boundary condition J(T, S) = 0.
By recording the minimizing control at each point, the HJB approach gives a control map from which the optimal control can be read for any conceivable state. The HJB equation above is for the deterministic version of the control problem that corresponds to the Pontryagin methods described above. We also solved the stochastic version of the problem, and found that demographic stochasticity plays very little role here. The resulting control map is showed in S2 Fig. We notice from S2 Fig. that the optimal control is virtually independent of time, except at the very end of the control period, when the control is discontinued. Closer inspection of this reveals that this is in fact a biologically irrelevant boundary effect created by the fixed end-time corresponding to stopping therapy and letting the target population grow again. Consequently, by letting the end-time T to be sufficiently large so that the sensitive population is almost surely eliminated before the end, then the optimal treatment becomes time-independent and we can solve for a closed-loop control u(S) which has only feedback from the current population size. Such stationary solution can be obtained analytically for the deterministic HJB by setting − ∂ t J(t, S) = min u∈U L(t, S, u) + f (t, S, u)∂ S J(t, S) = 0 (10) and carrying out the minimisation by formally differentiating the terms inside the brackets with respect to u. This minimization yields condition which can be substituted back to the HJB equation. The stationary profile for the problem then reads which gives an implicit equation for the optimal control. Furthermore, as the time-dependence realizes only via the state and control variables, solving the implicit equation yields a time-independent control law u(S). Indeed, by substituting the Lagrangian cost functional and the sensitive dynamics, we get Notice that here we assumed nothing about the precise functional form of the dose-dependent mutation rate µ(u), the pharmacodynamics d(u) or the density-dependent growth model r(S) except that these are all differentiable functions with respect to u. Furthermore, notice that as the state variables S cancel out, the only remaining dependence of the state variable happens via the assumed density dependent growth model, where we assumed that r(N ) ≈ r(S). Thus, we have now derived the control law equation, where the only technical modelling assumption we have made is that of the log-kill hypothesis, where the control leverage depends only of the drug concentration and specifically does not depend on the population size.
The analytically derived stationary profile cannot be directly applied to the discounted problem, because of its explicit time-dependence. However, the same stationary profile can be obtained numerically more simply if the optimal time-dependent control eliminates the target population without allowing it to grow in between. Then S(t) is a monotonically decreasing function and hence there exists an inverse function S −1 : [0, 1] → [0, T ] which gives the time at which the population was at any given size. Now, if indeed time-independent control law u(S) does exist, it must be unique and thus the optimal control for some population size S must satisfy u(S ) = u opt (t ) where S(t ) = S . Then the stationary profile can be obtained using the inverse function as u(S ) = u opt (S −1 (S )).
Applying this inverse function method agrees with the analytically derived stationary profile and can be used also to the discounted problem (see S3 Fig.).
Here we have focused on the problem of determining the optimal way of eliminating the target population, with respect to two biologically meaningful objectives. The derived optimal treatment strategies were calculated with respect to the simple constraint that there is some maximum tolerated dose u MTD , which may be below the drug concentration where the pharmocodynamical effect is assumed to have plateaued. Even though the cumulative drug concentration was not constrained, we did consider that the constant MTD would generally be lower than u max (where the death rate saturates) when demonstrating the effect sizes compared to the optimal treatment. Such constraints can also be easily incorporated to the presented optimal control framework by appending the Hamiltonian with an additional state variable, which enforces the isoperimetric constraint (see e.g. [3], [4]).