Figure 1.
Simulations of viral genomic diversification.
The evolution of (A) viral diversity, dG, (B) divergence, dS, and (C) average fitness, f, with generations predicted by our simulations for different population sizes, C. Each cell is assumed to be infected with M = 3 virions. Error bars represent standard deviations.
Figure 2.
Estimation of Ne from comparisons with data from patients.
Sum of squares of the errors (SSE) between data from patients [36] and our predictions of viral diversity, dG, and divergence, dS, for different values of the population size, C, (Fig. 1) and the viral generation time, τ, shown for each of the nine patients. C and τ that yield the lowest SSE provide the best fit to the data. The best-fit value of C yields Ne (Table 1).
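The grid search summarized in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the input layout (`sims` mapping each C to simulated curves), the linear interpolation between generations, and the equal weighting of dG and dS errors are all hypothetical. Patient sampling times in days are converted to generations by dividing by τ.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Linear interpolation of the curve (xs, ys) at x; clamps outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)
    x0, x1 = xs[j - 1], xs[j]
    y0, y1 = ys[j - 1], ys[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def sse(times_days, dG_obs, dS_obs, gens, dG_sim, dS_sim, tau):
    """Sum of squared errors for one (C, tau) pair.

    Assumption: dG and dS errors are weighted equally.
    """
    total = 0.0
    for t, g_obs, s_obs in zip(times_days, dG_obs, dS_obs):
        g = t / tau  # days -> generations
        total += (interp(g, gens, dG_sim) - g_obs) ** 2
        total += (interp(g, gens, dS_sim) - s_obs) ** 2
    return total


def best_fit(times_days, dG_obs, dS_obs, sims, taus):
    """Scan all (C, tau) pairs and return (lowest SSE, C, tau).

    sims: hypothetical dict mapping C -> (gens, dG_sim, dS_sim).
    """
    best = None
    for C, (gens, dG_sim, dS_sim) in sims.items():
        for tau in taus:
            err = sse(times_days, dG_obs, dS_obs, gens, dG_sim, dS_sim, tau)
            if best is None or err < best[0]:
                best = (err, C, tau)
    return best
```

In this sketch, the C returned by `best_fit` plays the role of the best-fit population size that the caption identifies with Ne.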
Figure 3.
Fits of our simulations to data from patients.
Best-fit predictions of our simulations (solid lines) presented with experimental data (symbols) of the evolution of viral diversity, dG, (cyan) and divergence, dS, (purple) for each patient. Each cell is assumed to be infected with M = 3 virions in our simulations. The values of Ne (cells) and τ (days) employed for the predictions are indicated.
Table 1.
Best-fit parameter estimates and the disease progression time.
Figure 4.
Correlations with disease progression.
Correlation of (A) Ne and (B) τ with the disease progression time, defined as the time from seroconversion until the CD4+ T cell count falls to 200 cells/µL [36]. Symbols represent estimates obtained from our simulations with the frequency of multiple infections, M, fixed at 3 (circles) or drawn from a distribution based on a viral dynamics model (triangles) (see text). Linear fits (lines) to the data yield Pearson correlation coefficients of (A) 0.91 (circles) and 0.74 (triangles) and (B) 0.88 (circles) and 0.75 (triangles). Note that the x-axis in (A) is plotted on a logarithmic scale.
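For reference, a Pearson correlation coefficient of the kind quoted in this caption can be computed as in the sketch below. The function name and the `log_x` option are hypothetical; whether the coefficients in panel (A) were computed on log-transformed values is an assumption here, suggested only by the logarithmic x-axis.

```python
import math


def pearson_r(x, y, log_x=False):
    """Pearson correlation coefficient between x and y.

    log_x=True correlates log10(x) with y (an assumed option for data
    plotted on a logarithmic x-axis).
    """
    if log_x:
        x = [math.log10(v) for v in x]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```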
Figure 5.
Simulations of viral genomic diversification with a low frequency of multiple infections.
The evolution of (A) viral diversity, dG, (B) divergence, dS, and (C) average fitness, f, with generations predicted by our simulations for different population sizes, C. Each cell is assumed to be infected with M virions drawn from a distribution based on a viral dynamics model (see text). Error bars represent standard deviations.
Figure 6.
Estimation of Ne from comparisons with data from patients.
Sum of squares of the errors (SSE) between data from patients [36] and our predictions of viral diversity, dG, and divergence, dS, for different values of the population size, C, (Fig. 5) and the viral generation time, τ, shown for each of the nine patients. C and τ that yield the lowest SSE provide the best fit to the data. The best-fit value of C yields Ne (Table 1).
Figure 7.
Fits of our simulations to data from patients.
Best-fit predictions of our simulations (solid lines) presented with experimental data (symbols) of the evolution of viral diversity, dG, (cyan) and divergence, dS, (purple) for each patient. Each cell is assumed to be infected with M virions drawn from a distribution based on a viral dynamics model (see text). The values of Ne (cells) and τ (days) employed for the predictions are indicated.
Figure 8.
Simulations of viral genomic diversification with a multiplicative fitness landscape and comparisons with patient data.
The evolution of viral diversity, dG, with generations predicted by our simulations (lines) for different population sizes C = 200 (solid) and 10000 (dashed) with a multiplicative fitness landscape (see text) with s = 0.01 (pink) and 0.001 (cyan). Each cell is assumed to be infected with M virions drawn from a distribution based on a viral dynamics model (see text). Different symbols are data from nine different patients [36] shown also in Figs. 3 and 7.
Figure 9.
Estimation of Ne using the linkage disequilibrium test.
The frequency of the least abundant haplotype in a two-locus/two-allele model determined from our simulations (solid symbols) and by Rouzine and Coffin [15] (open symbols) as functions of the population size, C, assuming neutral evolution (purple), evolution with selection (cyan), and evolution with selection and recombination where the number of infections per cell is constant at 3 (blue), or follows a distribution determined from a viral dynamics model (see text) with ki = k0 (green) or ki = 0.7ki-1 (orange). Error bars represent standard deviations. Values of C at which predictions from simulations match experimental estimates [15] of the least abundant haplotype frequency (black line) yield Ne. 95% confidence limits on the experimental data are also shown (dotted line).
Figure 10.
The relative fitness, fi, of genomes as a function of their Hamming distances from the fittest sequence, diFL, obtained from experimental observations [37] (symbols) modified to account for the ratio of synonymous and non-synonymous substitutions (Methods) and predicted (black line) by the fitted fitness landscape with the best-fit parameters fmin = 0.24, d50L = 30, and n = 3 obtained upon ignoring outliers (open symbols). Multiplicative fitness landscapes with s = 0.001 (cyan) and 0.01 (pink) are also shown.
Table 2.
Estimates of synonymous and non-synonymous substitution rates.