Taking Multiple Infections of Cells and Recombination into Account Leads to Small Within-Host Effective-Population-Size Estimates of HIV-1

doi:10.1371/journal.pone.0014531

Figure 1.

Simulations of viral genomic diversification.

The evolution of (A) viral diversity, d_G, (B) divergence, d_S, and (C) average fitness, f, over generations, as predicted by our simulations for different population sizes, C. Each cell is assumed to be infected with M = 3 virions. Error bars represent standard deviations.
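The two sequence statistics plotted here can be illustrated with a minimal sketch. It assumes the common definitions, which the caption itself does not spell out: d_G as the mean pairwise Hamming distance per site within a sample, and d_S as the mean Hamming distance per site from the founder sequence. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def diversity(genomes):
    """Assumed d_G: mean pairwise Hamming distance per site."""
    length = len(genomes[0])
    pairs = list(combinations(genomes, 2))
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * length)

def divergence(genomes, founder):
    """Assumed d_S: mean Hamming distance per site from the founder."""
    length = len(founder)
    return sum(hamming(g, founder) for g in genomes) / (len(genomes) * length)

# Toy sample, not patient data.
founder = "AAAA"
sample = ["AAAA", "AAAT", "AATT"]
d_G = diversity(sample)
d_S = divergence(sample, founder)
```

Tracking both quantities at every generation of a simulation would give curves of the kind shown in panels (A) and (B).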

Figure 2.

Estimation of N_e from comparisons with data from patients.

Sum of squares of the errors (SSE) between data from patients [36] and our predictions of viral diversity, d_G, and divergence, d_S, for different values of the population size, C (Fig. 1), and the viral generation time, τ, shown for each of the nine patients. The values of C and τ that yield the lowest SSE provide the best fit to the data. The best-fit value of C yields N_e (Table 1).
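The fitting procedure described in this caption amounts to a grid search over C and τ. The sketch below is illustrative only: it assumes that predictions are stored per generation, that a patient time point t (days) maps to generation round(t/τ), and that the d_G and d_S errors are weighted equally. The names `sse` and `best_fit` and all numbers are hypothetical, not taken from the paper.

```python
def sse(pred, data, tau):
    """Sum of squared errors for one (C, tau) pair.

    pred: {"d_G": [...], "d_S": [...]}, indexed by generation.
    data: list of (t_days, d_G_observed, d_S_observed).
    """
    total = 0.0
    last = len(pred["d_G"]) - 1
    for t, dg_obs, ds_obs in data:
        g = min(round(t / tau), last)  # assumed nearest-generation lookup
        total += (pred["d_G"][g] - dg_obs) ** 2
        total += (pred["d_S"][g] - ds_obs) ** 2
    return total

def best_fit(predictions, data, taus):
    """Return (SSE, C, tau) with the lowest SSE over the grid."""
    return min(
        (sse(pred, data, tau), C, tau)
        for C, pred in predictions.items()
        for tau in taus
    )

# Toy predictions for two population sizes (not simulation output).
predictions = {
    100: {"d_G": [0.001 * g for g in range(100)],
          "d_S": [0.002 * g for g in range(100)]},
    1000: {"d_G": [0.002 * g for g in range(100)],
           "d_S": [0.001 * g for g in range(100)]},
}
# Toy "patient" data: (days, d_G, d_S).
data = [(20, 0.02, 0.01), (40, 0.04, 0.02), (60, 0.06, 0.03)]
err, C_best, tau_best = best_fit(predictions, data, taus=[1.0, 2.0, 4.0])
```

In this toy case the search recovers C = 1000 and τ = 2 days, the pair from which the data were constructed.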

Figure 3.

Fits of our simulations to data from patients.

Best-fit predictions of our simulations (solid lines) presented with experimental data (symbols) of the evolution of viral diversity, d_G (cyan), and divergence, d_S (purple), for each patient. Each cell is assumed to be infected with M = 3 virions in our simulations. The values of N_e (cells) and τ (days) employed for the predictions are indicated.

Table 1.

Best-fit parameter estimates and the disease progression time.

Figure 4.

Correlations with disease progression.

Correlation of (A) N_e and (B) τ with the disease progression time, defined as the time from seroconversion until the CD4⁺ T cell count falls to 200 cells/µL [36]. Symbols represent data obtained from our simulations with the frequency of multiple infections, M, fixed at 3 (circles) or drawn from a distribution based on a viral dynamics model (triangles) (see text). Linear fits (lines) to the data yield Pearson correlation coefficients of (A) 0.91 (circles) and 0.74 (triangles) and (B) 0.88 (circles) and 0.75 (triangles). Note that the x-axis in (A) is plotted on a logarithmic scale.
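For reference, a Pearson correlation coefficient can be computed as in the sketch below. Whether the coefficients above were obtained on raw or log-transformed N_e is not stated in the caption, so the log transform shown here is an assumption suggested by the logarithmic axis, and the numbers are toy values, not the patient estimates.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for illustration only.
n_e = [100, 1000, 10000]          # effective population sizes (cells)
progression = [2.0, 4.0, 6.0]     # progression times (arbitrary units)

r_raw = pearson(n_e, progression)
r_log = pearson([math.log10(v) for v in n_e], progression)
```

With these toy values the relationship is exactly linear in log10(N_e), so `r_log` is 1 while `r_raw` is smaller, which shows why the choice of scale matters when reporting the coefficient.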

Figure 5.

Simulations of viral genomic diversification with a low frequency of multiple infections.

The evolution of (A) viral diversity, d_G, (B) divergence, d_S, and (C) average fitness, f, over generations, as predicted by our simulations for different population sizes, C. Each cell is assumed to be infected with M virions drawn from a distribution based on a viral dynamics model (see text). Error bars represent standard deviations.

Figure 6.

Estimation of N_e from comparisons with data from patients.

Sum of squares of the errors (SSE) between data from patients [36] and our predictions of viral diversity, d_G, and divergence, d_S, for different values of the population size, C (Fig. 5), and the viral generation time, τ, shown for each of the nine patients. The values of C and τ that yield the lowest SSE provide the best fit to the data. The best-fit value of C yields N_e (Table 1).

Figure 7.

Fits of our simulations to data from patients.

Best-fit predictions of our simulations (solid lines) presented with experimental data (symbols) of the evolution of viral diversity, d_G (cyan), and divergence, d_S (purple), for each patient. Each cell is assumed to be infected with M virions drawn from a distribution based on a viral dynamics model (see text). The values of N_e (cells) and τ (days) employed for the predictions are indicated.

Figure 8.

Simulations of viral genomic diversification with a multiplicative fitness landscape and comparisons with patient data.

The evolution of viral diversity, d_G, over generations, as predicted by our simulations (lines) for population sizes C = 200 (solid) and 10000 (dashed) with a multiplicative fitness landscape (see text) with s = 0.01 (pink) and 0.001 (cyan). Each cell is assumed to be infected with M virions drawn from a distribution based on a viral dynamics model (see text). Symbols represent data from the nine patients [36], also shown in Figs. 3 and 7.

Figure 9.

Estimation of N_e using the linkage disequilibrium test.

The frequency of the least abundant haplotype in a two-locus/two-allele model determined from our simulations (solid symbols) and by Rouzine and Coffin [15] (open symbols) as functions of the population size, C, assuming neutral evolution (purple), evolution with selection (cyan), and evolution with selection and recombination where the number of infections per cell is constant at 3 (blue), or follows a distribution determined from a viral dynamics model (see text) with k_i = k₀ (green) or k_i = 0.7k_{i-1} (orange). Error bars represent standard deviations. Values of C at which predictions from simulations match experimental estimates [15] of the least abundant haplotype frequency (black line) yield N_e. 95% confidence limits on the experimental data are also shown (dotted line).
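The statistic and the matching step described in this caption can be sketched as follows. This is only an illustration under stated assumptions: haplotypes are encoded as two-character strings over the alleles "0" and "1", and the value of C at which a simulated curve meets the experimental frequency is located by linear interpolation in log10(C). The function names and all numbers are hypothetical.

```python
from collections import Counter
import math

HAPLOTYPES = ("00", "01", "10", "11")

def least_abundant_frequency(sample):
    """Frequency of the rarest of the four two-locus haplotypes."""
    counts = Counter(sample)
    return min(counts.get(h, 0) for h in HAPLOTYPES) / len(sample)

def crossing_C(curve, target):
    """Population size at which a simulated curve meets the target.

    curve: list of (C, frequency) pairs sorted by C.
    Returns None if the curve never crosses the target.
    """
    for (c0, f0), (c1, f1) in zip(curve, curve[1:]):
        if f0 != f1 and (f0 - target) * (f1 - target) <= 0:
            w = (target - f0) / (f1 - f0)
            log_c = math.log10(c0) + w * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    return None

# Toy sample of ten genomes (not patient data).
sample = ["00"] * 5 + ["01"] * 3 + ["10"] * 1 + ["11"] * 1
freq = least_abundant_frequency(sample)

# Toy simulated curve and target frequency.
curve = [(100, 0.0), (10000, 0.2)]
n_e_estimate = crossing_C(curve, target=0.1)
```

Applying `crossing_C` to the confidence limits instead of the central estimate would give a corresponding range for N_e.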

Figure 10.

Fitness landscape.

The relative fitness, f_i, of genomes as a function of their Hamming distances from the fittest sequence, d_iFL, obtained from experimental observations [37] (symbols), modified to account for the ratio of synonymous and non-synonymous substitutions (Methods), and predicted (black line) by the fitted fitness landscape equation with the best-fit parameters f_min = 0.24, d₅₀L = 30, and n = 3, obtained upon ignoring outliers (open symbols). Multiplicative fitness landscapes with s = 0.001 (cyan) and 0.01 (pink) are also shown.
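The functional forms of the two landscapes are not shown in this caption. The sketch below therefore assumes forms that are consistent with the named parameters but are not confirmed by the text here: a Hill-type decline from 1 to f_min with half-maximal distance d₅₀L and exponent n, and a multiplicative landscape in which each mutation reduces fitness by a factor (1 − s). Both function names are hypothetical.

```python
def hill_fitness(d, f_min=0.24, d50=30.0, n=3):
    """Assumed Hill-type landscape.

    d is the Hamming distance from the fittest sequence (d_iF * L).
    Fitness falls from 1 at d = 0 toward f_min at large d, reaching
    the midpoint (1 + f_min) / 2 at d = d50.
    """
    return 1.0 - (1.0 - f_min) * d ** n / (d50 ** n + d ** n)

def multiplicative_fitness(d, s):
    """Assumed multiplicative landscape: (1 - s) per mutation."""
    return (1.0 - s) ** d

# Evaluate both landscapes at a few distances.
distances = [0, 10, 30, 100]
hill_curve = [hill_fitness(d) for d in distances]
mult_weak = [multiplicative_fitness(d, 0.001) for d in distances]
mult_strong = [multiplicative_fitness(d, 0.01) for d in distances]
```

Under these assumed forms, the Hill-type curve is nearly flat close to the fittest sequence and then drops steeply around d = 30, whereas the multiplicative curves decay geometrically from the first mutation.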

Table 2.

Estimates of synonymous and non-synonymous substitution rates.
