How Can Viral Dynamics Models Inform Endpoint Measures in Clinical Trials of Therapies for Acute Viral Infections?

Acute viral infections pose many practical challenges for the accurate assessment of the impact of novel therapies on viral growth and decay. Using the example of influenza A, we illustrate how the measurement of infection-related quantities that determine the dynamics of viral load within the human host can inform investigators about the course and severity of infection and the efficacy of a novel treatment. We estimated the values of key infection-related quantities that determine the course of natural infection from viral load data, using Markov Chain Monte Carlo methods. The data were placebo group viral load measurements collected during volunteer challenge studies, conducted by Roche, as part of the oseltamivir trials. We calculated the values of the quantities for each patient and the correlations between the quantities, symptom severity and body temperature. The greatest variation among individuals occurred in the viral load peak and area under the viral load curve. Total symptom severity correlated positively with the basic reproductive number. The most sensitive endpoint for therapeutic trials that aim to cure patients is the duration of infection. We suggest laboratory experiments to obtain more precise estimates of virological quantities that can supplement clinical endpoint measurements.


Introduction
According to a 2014 report, the cost of developing a new pharmaceutical has increased by 145% since 2003 [1]. A major part of this cost increase is due to losses incurred on drug candidates that fail during clinical development [2]. Failure may be due to adverse events or an inability to detect beneficial effects on the target disease. The reasons why clinical trials fail to demonstrate efficacy of potential novel treatments are many and complex. Failure is often due to the choice of inadequate endpoints or to poor trial design. Good endpoints in clinical trials should fulfill the following two key criteria: i) they should reflect the action of the treatment on the underlying cause of disease, and ii) they should be reliably quantifiable.
Acute viral infections pose particular challenges for any assessment of the impact of novel therapies. They typically have a short incubation period (a few days) and rapid onset of clinical symptoms. Consequently, patients usually present at clinics after the viral load peak has passed and the infection is already in the decline phase. Only a few measurements of viral load or symptoms can be taken before the infection terminates due to the immunological response of the human host. Using the example of influenza A, we discuss how a quantitative analysis of viral population growth within the patient can help investigators to define more reliable endpoints and improve the assessment of the efficacy of potential novel treatments.
Symptom scores, although typically required by regulatory agencies as clinical endpoints, are seldom reliably quantifiable and often do not reflect the action of treatment on the pathogenic agent. For example, symptoms may be measured by asking patients how they are feeling, and therefore are at risk of being subjective and at best semi-quantitative [3]. Moreover, symptoms may not accurately reflect the pathologic process targeted by a specific treatment. In the case of influenza A, respiratory and systemic symptoms are not only caused by viral tissue damage, but also by the immune response to the viral infection itself. In severely ill patients with multiple morbidity-inducing conditions, symptom scores may not give any indication of the successful action of a treatment. Influenza treatment may clear the viral infection, but symptoms may persist due to secondary bacterial pneumonia [4].
The most suitable endpoints that best reflect the effect of a treatment on acute viral infections should be related to viral load. We chose influenza A as an example, because the recent controversy about the efficacy of established antiviral treatments against influenza A has reignited interest in improved endpoint measurements for the assessment of antiviral treatments and suitable statistical techniques to compare them across groups [5][6][7][8]. Since several pharmaceutical companies are working on novel treatments against influenza A including immunotherapeutic approaches [9], our analysis can inform future trials of potential new influenza treatments. The models are most useful for the planning of phase I studies, but more general conclusions can also be drawn for phase II and III studies.

Materials and Methods
In prior research we analysed the properties of a simple differential equation model of viral load dynamics in influenza A infection and a number of measurable infection-related quantities that can be derived from this model (Fig 1) [10]. The model was developed from a standard model [11] by assuming that the viral dynamics are much faster than those of infected cells. As we do not have quantitative information on the dynamics of infected cells, this is a sensible assumption that allows us to reduce the number of unknown parameters in the system. The model is general enough to be applicable to other acute viral infections. The analytical properties of this model and the derivation of the infection-related quantities have been discussed elsewhere [10]. In brief, the dynamics of the model are given by the following equations:
$$\frac{dT}{dt} = -\beta T V, \qquad \frac{dV}{dt} = r T V - \gamma V,$$
where T is the number of epithelial target cells susceptible to viral infection, V is the viral load (measured in TCID50/ml), β is the infection rate of target cells, r is the virus production rate, and γ is the virus death or clearance rate, which encompasses the action of specific and non-specific immune mechanisms. The infection-related quantities that can be derived from this model and their interpretations are listed in Table 1. The analytical expressions of the quantities are given in S1 Table (details in [10]). We fitted the model to viral load data of nine patients from the placebo groups of volunteer challenge studies conducted by Roche as part of the original oseltamivir trials, using Markov Chain Monte Carlo (MCMC) methods [12]. All volunteers were screened for HI titre to exclude pre-existing immunity against the challenge strain. We estimated the parameter values for each patient. See S1 File for a more detailed description of the data and fitting procedure. We used the parameter estimates to determine the infection-related quantities for each patient.
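To make the link between parameters and infection-related quantities concrete, the following is a minimal numerical sketch. It assumes the two-equation target-cell-limited form dT/dt = -βTV, dV/dt = rTV - γV (the exact model and the analytical expressions are given in [10] and S1 Table). The parameter values, the scaling of T0 to 1, the function names `simulate` and `summarise`, and the expression R0 = r·T0/γ (which follows from the assumed form) are illustrative assumptions, not the fitted estimates or the authors' code.

```python
def simulate(beta, r, gamma, T0, V0, t_end=10.0, dt=0.001):
    """Integrate dT/dt = -beta*T*V, dV/dt = r*T*V - gamma*V with classical RK4."""

    def deriv(T, V):
        return -beta * T * V, r * T * V - gamma * V

    times, Ts, Vs = [0.0], [T0], [V0]
    T, V = T0, V0
    for step in range(1, int(round(t_end / dt)) + 1):
        k1T, k1V = deriv(T, V)
        k2T, k2V = deriv(T + 0.5 * dt * k1T, V + 0.5 * dt * k1V)
        k3T, k3V = deriv(T + 0.5 * dt * k2T, V + 0.5 * dt * k2V)
        k4T, k4V = deriv(T + dt * k3T, V + dt * k3V)
        T += dt * (k1T + 2 * k2T + 2 * k3T + k4T) / 6.0
        V += dt * (k1V + 2 * k2V + 2 * k3V + k4V) / 6.0
        times.append(step * dt)
        Ts.append(T)
        Vs.append(V)
    return times, Ts, Vs


def summarise(times, Ts, Vs):
    """Derive a few infection-related quantities from a simulated trajectory."""
    i_peak = max(range(len(Vs)), key=Vs.__getitem__)
    # Trapezoidal rule for the area under the viral load curve.
    auc = sum(0.5 * (Vs[i] + Vs[i + 1]) * (times[i + 1] - times[i])
              for i in range(len(Vs) - 1))
    return {
        "peak_viral_load": Vs[i_peak],
        "time_to_peak": times[i_peak],
        "viral_load_auc": auc,
        "fraction_dead_cells": 1.0 - Ts[-1] / Ts[0],
    }


# Illustrative values only (not the fitted estimates); T0 is scaled to 1.
beta, r, gamma, T0, V0 = 1e-4, 12.0, 3.0, 1.0, 1.0
times, Ts, Vs = simulate(beta, r, gamma, T0, V0)
quantities = summarise(times, Ts, Vs)
# Under the assumed form, the virus can grow only if r*T0 > gamma.
quantities["R0"] = r * T0 / gamma

for name, value in quantities.items():
    print(f"{name}: {value:.4g}")
```

A fixed-step integrator is used here only to keep the sketch dependency-free; an adaptive ODE solver would be the more usual choice in practice.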
We assessed the variability of the quantities among patients by calculating the coefficients of variation (CV).
In addition to viral load measurements, the data also included temperature measurements and total symptom scores (Jackson score [13]) for each patient at different time points. We calculated pairwise correlations (Pearson's r) between individual viral load and temperature measurements over time, and between individual viral load and symptom score measurements over time. We also calculated pairwise correlations between each of the nine infection-related quantities, and between the infection-related quantities and the area under the curve (AUC) of the total symptom scores and the temperature area under the curves. The latter two quantities measure the total symptom severity and temperature increase over the entire course of infection (the total symptom score AUC is the AUC over all Jackson scores at each measured time point). We determined the statistical significance of the correlations with two different correction methods for multiple comparison (11 comparisons in total), the Bonferroni method and the Benjamini-Hochberg method [14].
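The summary statistics named above can be sketched in a few lines. This is a generic illustration rather than the authors' analysis code: the function names are hypothetical, the coefficient of variation is assumed to use the sample standard deviation, and the p-values in the example are made up (the study itself corrected for 11 comparisons). The corrections are returned as adjusted p-values, which are compared directly with the chosen significance level.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for illustration only.
example_p = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(example_p))
print(benjamini_hochberg(example_p))
```

Because the Benjamini-Hochberg adjustment is never larger than the Bonferroni adjustment, it flags at least as many correlations as significant, which is why it is described as the less stringent method.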
Written informed consent was obtained from each participant in a form approved by the institutional review boards of the University of Virginia, Charlottesville, and the University of Rochester, Rochester, NY, and subjects were compensated for participation. The review boards of the named institutions approved the study in which the data we used were collected.

Results
Compared to the 95% credible intervals of individual parameter estimates, the variation among patients in the infection-related quantities shown in Table 2 is relatively small. This is not surprising considering that the volunteers in the original challenge study were all young, healthy adults from a narrow age band (18-27 years). The greatest variation among individuals occurs in peak viral load (CV = 1.26) and viral load area under the curve (AUC) (CV = 1.19) (see S3 Table for all coefficients of variation). Temperature correlates significantly with viral load (r = 0.328, p = 0.0025). Total symptom scores also correlate with viral load (r = 0.233, p = 0.03385). When we determined the correlations between infection-related quantities using the Bonferroni correction, the significant correlations included: R0 and the duration of infection (r = 0.947, p_Bonferroni = 5.902 × 10^-3), viral load AUC and peak viral load (r = 0.9994, p_Bonferroni = 7.068 × 10^-10), viral load AUC and late viral decay rate (r = 0.9509, p_Bonferroni = 4.514 × 10^-3), time to peak viral load and generation time (r = 0.9644, p_Bonferroni = 1.487 × 10^-3), and time to peak viral load and initial viral growth rate (r = -0.9459, p_Bonferroni = 3.384 × 10^-3) (see Table 3 for all correlations).
Even though the association is not statistically significant, we observe that Patients 3, 4 and 5 have the highest fraction of dead cells and also the highest symptom score AUCs (Table 2, Fig 3).

Discussion
Our parameter estimates are of similar magnitude to previous estimates from studies using comparable models (for example [11]). As the parameters in these models tend to be highly correlated, the uncertainty around individual parameter estimates is generally large. Our analysis is limited by the small sample size, although this is not unusual in studies fitting viral dynamics models.
Our estimates of the basic reproductive number (R0) tend to be lower than previously reported results [11,15]. Some of these published estimates were derived from tissue culture experiments, where, in the absence of any immune response, R0 is likely to be higher. The viral strains used also differed in the various studies (influenza A Texas/36/91 (H1N1) in our study, Hong Kong/123/77 (H1N1) in [11], and Albany/1/98 (H3N2) in [15]). As some viral strains may disseminate faster through infected tissue than others, infection by such strains may be associated with higher R0 values. Moreover, some of the earlier studies consider a latent infection stage, which can increase estimates of R0. We did not consider a latent stage, as the data did not contain enough information to justify additional parameters. In our analysis, peak viral load and viral load AUC are the most variable infection-related quantities. They were also strongly correlated, indicating that viral load AUC is mainly determined by peak viral load. High variability in AUC is thought to reflect high variability in infectiousness among individuals.
Table 3. Pairwise correlations between infection-related quantities, between infection-related quantities and temperature area under the curve, and between infection-related quantities and symptom score area under the curve. Correlations that are significant using the Bonferroni correction for multiple comparisons are indicated by an asterisk. Correlations that are significant using the less stringent Benjamini-Hochberg correction are indicated by a diamond. Values below the diagonal show Pearson's correlation coefficient, values above the diagonal show uncorrected p-values for each correlation. AUC_S: total symptom score area under the curve. AUC_T: temperature area under the curve. R0: basic reproductive number. AUC_V: viral load area under the curve. FDC: fraction of dead cells at end of infection. t_peak: time to peak viral load. V_peak: peak viral load.
We found that viral load correlates significantly with body temperature and total symptom scores, but the correlations were not very strong. This may indicate that, although viral load is ultimately responsible for infection-related illness, other processes, for example the immune response, cause symptoms. Alternatively, the reporting of symptoms may be rather subjective and therefore difficult to quantify accurately.
We found that R0 has a significant positive correlation with the duration of infection and the total symptom score AUC. Consequently, R0 may be interpreted as an indicator of disease severity.
Viral load AUC positively correlates with peak viral load and with the late viral decay rate. Peak viral load correlates positively with the late viral decay rate. This suggests that infections with a high peak viral load tend to decline rapidly after the peak (intense, but short infection), whereas infections with a low peak viral load tend to decline more slowly (mild, but extended infection).
The fraction of dead cells at the end of infection correlates positively with the infection duration. This means that, according to our model, longer infections cause more severe tissue damage. In addition, we observed that the patients with the highest fraction of dead cells at the end of infection have the greatest total symptom score AUCs. A possible interpretation may be that symptoms are at least partly caused by tissue damage. The inter-individual variability in the fraction of dead cells, however, is very low (CV = 0.05788, see S3 Table).
Both time to peak viral load and generation time were negatively correlated with initial viral growth. This means that an infection caused by slowly reproducing virus spreads more slowly and has a later peak than an infection caused by faster reproducing virus.
With a bigger sample size, more correlations between infection-related quantities may be found to be significant. For example, in coronavirus infection, a shorter incubation time is associated with more severe disease [16].
Clinical researchers conducting trials of candidate therapies for acute viral infections should be most interested in how infection-related measures change upon treatment. In Figs 4 and 5 we show schematically what changes to expect in different infection-related quantities following treatment (for a thorough sensitivity analysis see [10]).
We consider treatments that act on the infection rate (in the case of influenza A, for example, amantadines [17]), the viral production rate (neuraminidase inhibitors against influenza A [18]), and the viral clearance rate (monoclonal antibodies against influenza A are thought to act in this way [19]). We consider treatments that are given on days 1, 2 and 3 post infection.
As we derived in [10], treatments that act on viral growth parameters (infection rate and viral production rate) are most effective when they are given before the time of peak viral load (day 1). They can greatly reduce peak viral load, shorten the duration of infection and reduce the viral load AUC (Fig 4A and 4B). In contrast, after the peak (days 2 and 3), a small to negligible effect on the duration of infection and the viral load AUC is observed (Fig 5A, 5B, 5D and 5E). Our results agree with those from a similar analysis by Kamal et al. [20].
Since patients with acute viral infections typically present at clinics after the time of peak viral load, our model suggests that it will be difficult to detect the impact of a treatment that acts on the infection rate or viral production rate. This may be one of the reasons why different studies have shown ambiguous results on the efficacy of neuraminidase inhibitors for the treatment of influenza A [5][6][7]. According to our model, when treatments that act on the virus clearance rate are given before the time of peak viral load, they can slow down viral growth and reduce the viral load peak and AUC (Fig 4C). Shortly after the peak, their impact on viral decay and infection duration could be much stronger than that of treatments acting on viral growth parameters (Fig 5C and 5F).
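The dependence of treatment impact on timing can be explored with a small simulation. The sketch below is not the analysis behind Figs 4 and 5. It assumes the target-cell-limited form dT/dt = -βTV, dV/dt = rTV - γV and represents treatment as a step change in r or γ at a chosen time. The parameter values, the 90% reduction in r, the doubling of γ, the detection limit and the function name `run` are all hypothetical choices, and with these values the viral load peaks at roughly day 1.4.

```python
def run(treat_time=None, r_factor=1.0, gamma_factor=1.0,
        beta=1e-4, r=12.0, gamma=3.0, T=1.0, V=1.0,
        t_end=10.0, dt=0.0005, detection_limit=1.0):
    """Forward-Euler integration; returns (viral load AUC, duration above the limit)."""
    auc, duration = 0.0, 0.0
    for step in range(int(round(t_end / dt))):
        t = step * dt
        treated = treat_time is not None and t >= treat_time
        r_now = r * r_factor if treated else r
        gamma_now = gamma * gamma_factor if treated else gamma
        dT = -beta * T * V
        dV = r_now * T * V - gamma_now * V
        auc += V * dt                      # rectangle rule for the AUC
        if V >= detection_limit:
            duration = t + dt              # last time the virus is detectable
        T += dt * dT
        V += dt * dV
    return auc, duration


auc_none, dur_none = run()
print(f"untreated: AUC = {auc_none:.3g}, duration = {dur_none:.2f} days")

# Treatment acting on the viral production rate, started on days 1, 2 and 3.
for day in (1.0, 2.0, 3.0):
    auc, _ = run(treat_time=day, r_factor=0.1)
    print(f"r reduced from day {day:.0f}: AUC = {auc:.3g}")

# Post-peak comparison (day 2): production-rate versus clearance-rate treatment.
_, dur_r = run(treat_time=2.0, r_factor=0.1)
_, dur_gamma = run(treat_time=2.0, gamma_factor=2.0)
print(f"day-2 treatment: duration {dur_r:.2f} days (r) vs {dur_gamma:.2f} days (gamma)")
```

With these illustrative values, reducing r before the peak lowers the AUC substantially, whereas the same reduction on day 3 changes it very little, and a post-peak increase in γ shortens the detectable period more than a post-peak reduction in r. The quantitative results depend entirely on the assumed parameters.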
Our predictions strongly depend on the assumption that the infection is target-cell-limited. In infections that are not target-cell-limited and where immunity plays a greater role in reducing viral load, treatments acting on viral growth parameters are expected to have a much stronger effect after the viral load peak.
The choice of endpoint in a clinical trial depends in part on the therapeutic goal. If the goal is to reduce transmission of an infectious agent, the viral load AUC, a measure of infectiousness, is a sensible endpoint. If the goal is to clear infection in a patient, other endpoints may be more useful. Based on our analyses, we make the following recommendations on clinical protocols for the assessment of treatments for acute viral infections with the goal to clear infection in a patient. Our general conclusions hold for all trial phases, but some of our suggestions concerning practical implementation may only be applicable in phase I studies.
When a treatment is given before the time of peak viral load, the viral load AUC is commonly used as a virological endpoint. One of the reasons for this is that the AUC is affected by both the peak viral load and the duration of infection, and therefore contains more information about the course of infection than either of these quantities on its own. On this reasoning, the viral load AUC would be expected to be the most sensitive indicator of the efficacy of a treatment.
According to our findings, however, the viral load AUC is the infection-related quantity that varies most among individuals. Due to the high natural variability among untreated patients, it should be more difficult to detect a statistically significant difference between treated and control groups. The same consideration holds for the peak viral load which is also highly variable. Conversely, the duration of infection varies much less among individuals (CV = 0.2059). Therefore, besides the reduction in AUC and peak viral load, the duration of infection should be considered as an endpoint.
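The effect of between-patient variability on detectability can be illustrated with a textbook sample size approximation for comparing two group means, n per group ≈ 2 (z_{1-α/2} + z_{power})² (CV / δ)², where δ is the relative difference to be detected. This is a rough sketch, not part of the original analysis: it uses the CVs reported here, but the 30% relative effect is a hypothetical placeholder, a real treatment would not change every endpoint by the same relative amount, and the normal approximation is crude for quantities with CV above 1 (which are usually analysed on a log scale).

```python
from math import ceil
from statistics import NormalDist


def n_per_group(cv, relative_effect, alpha=0.05, power=0.8):
    """Normal-approximation sample size per arm for a two-sided comparison of means."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(2 * (z * cv / relative_effect) ** 2)


# CVs as reported in the text; the 30% relative effect is an assumption.
n_auc = n_per_group(cv=1.19, relative_effect=0.3)
n_duration = n_per_group(cv=0.2059, relative_effect=0.3)
print(f"viral load AUC:        about {n_auc} patients per group")
print(f"duration of infection: about {n_duration} patients per group")
```

Because the required sample size scales with the square of the CV, an endpoint that is roughly six times less variable needs on the order of thirty times fewer patients to detect the same relative change.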
When a treatment is given after the peak viral load, the AUC is not a very sensitive marker of impact, because the reduction in the total area will be relatively smaller (see Fig 5). The increase in the late viral decay rate and the shortening of the infection duration should be used as alternative endpoints. The duration of infection should be a more sensitive efficacy measure compared to the increase in the late viral decay rate, because the latter is more variable among individuals (CV = 0.9402).
The measured duration of infection will depend on the number of measurements taken and the sensitivity of the viral load assay. It is also easy to miss the peak viral load, if the measurement intervals are spaced too far apart. This will affect the estimate of the viral load AUC. We recommend taking at least two to three measurements per day, if possible (mainly in phase I studies). This is partly because the reduction in infection duration in treated individuals is typically not more than one day (Fig 5). If only one measurement per day is taken, it will be difficult to resolve the differences in infection duration between treated and untreated groups. The same rationale holds for measuring the viral decay rate.
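The resolution argument can be made concrete with a toy calculation. The sketch assumes that samples are taken at fixed intervals starting at the time of infection and that the measured duration is the time of the last sample at which virus is still detectable. The two end times (140 h and 122 h) and the function name are hypothetical.

```python
def measured_duration(true_end_hour, interval_hours):
    """Time of the last scheduled sample at which virus is still detectable."""
    return (true_end_hour // interval_hours) * interval_hours


# Hypothetical end of detectable shedding (hours post infection), 18 h apart.
untreated_end, treated_end = 140, 122

for interval in (24, 8):
    diff = (measured_duration(untreated_end, interval)
            - measured_duration(treated_end, interval))
    print(f"sampling every {interval} h: observed difference = {diff} h")
# sampling every 24 h: observed difference = 0 h
# sampling every 8 h: observed difference = 16 h
```

In this example a true difference of 18 hours disappears entirely under daily sampling, whereas three samples per day recover most of it.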
Taking more frequent viral load measurements is essential for fitting mechanistic models of acute viral infections to data. In order to fit the model we introduced above, one needs at least five data points taken at different times [21]. In particular, parameter estimation requires early data points before the time of the peak. Such data can normally only be collected in volunteer challenge studies and possibly prospective cohort studies (for example [22]).
The lower the measurement error and the higher the sensitivity of an assay, the more accurate the data. Determining the distribution of the measurement error of a given assay can improve statistical inference. This can be achieved by running replicate measurements at each step of the assay on the same sample and from different samples taken from the same patient at one time point. In the case of influenza A, the most common assays are RT-qPCR and TCID50 [23,24]. Information on the error distribution of the assay can be incorporated in the model fitting procedure to obtain more reliable estimates of infection-related quantities.
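One possible way to carry replicate information into model fitting is sketched below. It assumes, purely for illustration, that assay error is Gaussian on the log10 titre scale with a constant standard deviation. The text does not specify the error model, and the function names and replicate values are hypothetical. The standard deviation is pooled across replicate sets and then used in the likelihood that a fitting routine such as MCMC would evaluate.

```python
from math import log, pi


def pooled_sd(replicate_sets):
    """Pooled standard deviation across sets of replicate log10 titres."""
    sum_sq, dof = 0.0, 0
    for reps in replicate_sets:
        m = sum(reps) / len(reps)
        sum_sq += sum((x - m) ** 2 for x in reps)
        dof += len(reps) - 1
    return (sum_sq / dof) ** 0.5


def log_likelihood(observed_log10, predicted_log10, sd):
    """Gaussian log-likelihood of observed log10 titres given model predictions."""
    return sum(-0.5 * log(2 * pi * sd ** 2) - (o - p) ** 2 / (2 * sd ** 2)
               for o, p in zip(observed_log10, predicted_log10))


# Made-up replicate measurements (log10 TCID50/ml) from two samples.
sd = pooled_sd([[4.1, 4.3], [2.0, 2.4]])
observed = [1.5, 3.8, 4.6, 3.1, 1.9]
print(log_likelihood(observed, [1.4, 3.9, 4.5, 3.0, 2.0], sd))
```

Fixing the error standard deviation from replicates in this way removes one free parameter from the fit. Measurements below the detection limit would need separate handling, for example as censored observations.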
We suggest a number of laboratory experiments that could help to improve estimates of model parameters and disease markers. Parameters that can be measured independently can be set to a fixed value or allow the definition of better prior distributions during the model fitting procedure, so that the remaining free parameters can be estimated with greater accuracy.
For example, the decay rate of influenza A virus has been measured previously in micro-carrier culture [25]. Viral culture medium, however, does not contain any components of the immune system. Consequently, the measured viral decay rate is much lower than inside the human host (provided he or she is not immunosuppressed). It should be possible to culture virus and add serum taken at different times post infection from previously infected individuals to the culture medium. This would give independent and more realistic estimates of the virus decay rate in its natural environment.
The virus production rate per cell can be estimated by determining the total amount of virus produced in a culture system and dividing it by the total number of host cells in the system [25]. The virus production rate may be affected by the innate immune response, as may the virus clearance rate. It would likely be viral strain- and cell type-specific and would have to be measured for all strains against which a treatment was to be tested and for several respiratory epithelial cell types. The half-life of cells infected by different strains can also be measured using common apoptosis and cytotoxicity markers [26,27].
These experiments may be neither cheap nor simple. Animal models may prove useful to supplement and validate data obtained from tissue culture systems. In particular, it may be possible to obtain data on the dynamics of infected cells from serial autopsies of sacrificed animals. The more components of the virus life cycle and the human immune response can be measured independently, the more complex and detailed the models of viral infection that can be built, and hence the better the assessment of therapy impact. Compared to the cost of failed clinical trials, the cost and effort of additional laboratory experiments should be worthwhile.