
NMD conceived and designed the experiments. GWS performed the experiments. GWS and NMD analyzed the data and contributed reagents/materials/analysis tools. NMD wrote the paper.

The authors have declared that no competing interests exist.

The ability to accelerate the accumulation of favorable combinations of mutations renders recombination a potent force underlying the emergence of forms of HIV that escape multi-drug therapy and specific host immune responses. We present a mathematical model that describes the dynamics of the emergence of recombinant forms of HIV following infection with diverse viral genomes. Mimicking recent in vitro experiments, we consider target cells simultaneously exposed to two distinct, homozygous viral populations and construct dynamical equations that predict the time evolution of populations of uninfected, singly infected, and doubly infected cells, and homozygous, heterozygous, and recombinant viruses. Model predictions capture several recent experimental observations quantitatively and provide insights into the role of recombination in HIV dynamics. From analyses of data from single-round infection experiments with our description of the probability with which recombination accumulates distinct mutations present on the two genomic strands in a virion, we estimate that ∼8 recombinational strand transfer events occur on average (95% confidence interval: 6–10) during reverse transcription of HIV in T cells. Model predictions of virus and cell dynamics describe the time evolution and the relative prevalence of various infected cell subpopulations following the onset of infection observed experimentally. Remarkably, model predictions are in quantitative agreement with the experimental scaling relationship that the percentage of cells infected with recombinant genomes is proportional to the percentage of cells coinfected with the two genomes employed at the onset of infection. Our model thus presents an accurate description of the influence of recombination on HIV dynamics in vitro. 
When distinctions between different viral genomes are ignored, our model reduces to the standard model of viral dynamics, which successfully predicts viral load changes in HIV patients undergoing therapy. Our model may thus serve as a useful framework to predict the emergence of multi-drug-resistant forms of HIV in infected individuals.

During the reverse transcription of HIV in an infected cell, the viral enzyme reverse
transcriptase switches templates frequently from one genomic RNA strand of a virion
to the other, yielding a recombinant proviral DNA that is a mosaic of the two parent
genomes. If one strand contains a mutation that confers upon HIV resistance to one
administered drug and the other strand resistance to another drug, recombination may
bring the two mutations together and give rise to progeny genomes resistant to both
of those drugs.

Remarkable insights into HIV recombination emerge from recent in vitro experiments,
in which target cells were simultaneously exposed to two kinds of reporter viruses,
and cells infected with recombinant proviruses detected [^{+} T cells or macrophages, for
instance—does not influence the recombination rate. In contrast, Levy et
al. argue that subtle virus–cell interactions cause recombination to occur
at different rates in different types of cells [

Standard models of viral dynamics, which successfully describe short-term (a few
weeks) viral load changes in patients undergoing therapy, are predicated on the
infection of individual cells by single virions and ignore recombination.

Fraser presents a detailed model of HIV dynamics considering up to three infections
of cells, mutation, recombination, fitness selection, and different dependencies of
the frequency of multiple infections of cells on the viral load [

In a more recent study, Althaus and Bonhoeffer extend the description of Bretscher et
al. [^{4}–10^{5} [

Currently available models thus make valuable predictions of the influence of
recombination on HIV dynamics and the emergence of drug resistance in infected
individuals undergoing therapy. The predictions, however, are diverse and have not
been compared with available experimental data [

One limitation of currently available models lies in the approximate descriptions of
the dynamics of multiple infections of cells employed. For example, the frequency
with which cells are doubly infected is assumed either to be constant
[

The latter model, however, does not distinguish between different viral genomes that infect cells and thereby precludes a description of recombination. Thus, for instance, the dynamics of the emergence of recombinant genomes and the origins of the second scaling relationship observed by Levy et al.—that the percentage of cells infected by recombinant genomes is linearly proportional to the percentage of cells coinfected with both the reporter genomes employed for infection—remain to be elucidated.

The analysis of Dixit and Perelson [

Whether currently available models of HIV dynamics that include infections by
distinct viral genomes and recombination validate the above arguments and predict
the observed scaling remains unclear. Dixit and Perelson predict the existence of
the above scaling relationships under certain parameter regimes and, importantly,
that the scaling relationships may depend on the length of time following the onset
of infection [

In this work, we develop a detailed model of HIV dynamics that considers multiple infections of cells by distinct viral genomes and describes recombination. Our model captures several recent in vitro experimental findings quantitatively and provides key insights into the mechanisms underlying the emergence of recombinant forms of HIV. At the same time, our model is consistent with the standard model of viral dynamics, which successfully captures viral load changes in patients following the onset of antiretroviral therapy, and may therefore be extended to describe HIV dynamics with recombination in vivo.

We consider in vitro experiments where a population of uninfected
CD4^{+} cells, T, is exposed to two populations, V_{11} and
V_{22}, of homozygous virions containing genomes 1
and 2, respectively. The genomes 1 and 2 are assumed to be distinct at two
nucleotide positions, l_{1} and l_{2}, a distance l apart: genome 1
carries a mutation at l_{1} and genome 2 at l_{2}. Coinfection of cells
by both kinds of virions gives rise to heterozygous virions, V_{12}, which
contain a copy each of genomes 1 and 2.

Viral genomes 1 and 2 employed at the onset of infection (A) and the four genomes resulting from the recombination of genomes 1 and 2 (B).

Infection of target cells by the virions V_{12} yields
two kinds of “recombinant” genomes depending on the template
switching events during reverse transcription. When the mutations at l_{1} and l_{2} are both
included in the resulting proviral DNA, a recombinant genome that we denote as
genome 3 is formed. When both the mutations are excluded, the other recombinant,
genome 4, results. When one of the two mutations is included but not the other,
genomes 1 and 2 are recovered. Thus, four kinds of viral genomes, 1, 2, 3, and
4, eventually infect cells.

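We distinguish infected cells by the proviral genomes they contain. We denote by
T_{i} cells that contain a single provirus i: cells T_{1} contain a single
provirus 1, cells T_{2} contain a single provirus 2, and so on. We
denote by T_{ij} cells that contain two proviruses, i and j: cells T_{11}
contain two copies of provirus 1, and cells T_{12} contain a copy
of provirus 1 and a copy of provirus 2. Because cells
T_{ij} and T_{ji} are indistinguishable, ten kinds of doubly infected cells
exist: T_{11}, T_{12}, T_{13}, T_{14}, T_{22}, T_{23}, T_{24}, T_{33},
T_{34}, and T_{44}. Extending the description, cells T_{ijk} contain three
proviruses, i, j, and k.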

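Random assortment of viral RNA produced in infected cells gives rise to ten
different viral populations, which we denote V_{ij}; for instance, virions V_{34} contain a
copy each of genomes 3 and 4. Cells infected with a single kind of provirus,
T_{i} and T_{ii}, produce homozygous virions V_{ii}, whereas coinfected cells T_{ij}
produce the homozygous virions V_{ii} and V_{jj} and the heterozygous virions V_{ij}.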
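The bookkeeping above can be checked with a short sketch. It enumerates the unordered provirus combinations that label multiply infected cells, and the virion types a cell can release. The function names are hypothetical, and the 1/4, 1/4, 1/2 split for coinfected cells is an assumption (it presumes that both genomic RNAs are equally abundant and pair at random); neither is taken from the authors' implementation.

```python
from itertools import combinations_with_replacement

GENOMES = (1, 2, 3, 4)  # the two infecting genomes and the two recombinants


def unordered_types(n_copies):
    """All unordered selections of n_copies genomes.

    Order is irrelevant because, e.g., T_12 and T_21 label the same cells.
    """
    return list(combinations_with_replacement(GENOMES, n_copies))


def assortment(i, j):
    """Fractions of virion types released by a cell carrying proviruses i and j.

    ASSUMPTION: both RNAs are equally abundant and are packaged in random pairs.
    """
    if i == j:
        return {(i, i): 1.0}
    lo, hi = min(i, j), max(i, j)
    return {(lo, lo): 0.25, (hi, hi): 0.25, (lo, hi): 0.5}


doubly_infected = unordered_types(2)  # labels of T_ij (and of virions V_ij)
triply_infected = unordered_types(3)  # labels of T_ijk

print(len(doubly_infected), "doubly infected cell types:", doubly_infected)
print(len(triply_infected), "triply infected cell types")
print("virions from T_12:", assortment(1, 2))
```

The same enumeration with two copies gives the labels of the ten viral populations, since each virion also carries an unordered pair of genomes.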

Below, we write equations to describe changes in the various cell and viral populations following the onset of infection.

The in vitro dynamics of the uninfected cell population is governed by
[_{0} is the
second-order infection rate of uninfected cells, and _{0}.

The singly infected cell subpopulations are determined by the integral
equations: _{0}_{jh}_{jh}_{i}_{0}_{i}_{jh}_{jh}_{0}_{i}_{jh}

The doubly infected cell subpopulations are determined in an analogous
manner:

For cells coinfected with two different kinds of proviruses, we write

To evaluate the conditional probabilities _{0} is the infection rate of
an uninfected cell and _{d} is the timescale of
CD4 down-modulation. Three viral genes, _{d}) and is extended to include the influence
of

Assuming that the cell, following its first infection with provirus

Alternatively, a cell first infected with provirus

Substituting

We next determine the probability
_{i}_{ii}

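Finally, we consider virions that contain two distinct genomes, taking V_{12} as
the example. Recall that genome 1 carries a mutation at l_{1} and
genome 2 at l_{2} and that l_{2} − l_{1} = l. The probability that reverse
transcription begins on genome 1 at l_{1},
i.e., the probability that the mutation on genome 1 is included in the
resulting provirus, is 1/2. Given that the mutation on 1 is included, the
mutation on 2 will be excluded if an even number of crossovers occurs
between l_{1} and l_{2}.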

Let

Similarly, genome 2 results from genomes 1 and 2 if the mutation on genome 1
is excluded and an even number of crossovers occurs between
l_{1} and l_{2} so
that the mutation on 2 is included. It follows that
R_{2}(12) = R_{1}(12).

Genome 3 results from genomes 1 and 2 if the mutation on genome 1 is included
and an odd number of crossovers occurs between
l_{1} and l_{2} so that
the mutation on genome 2 is also included. Following the above arguments, we
find that R_{3}(12) = (1/2)exp(−ρl)sinh(ρl) and that R_{4}(12) =
R_{3}(12).
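A minimal numerical sketch of these recombination probabilities follows. It assumes that the number of crossovers between l_{1} and l_{2} is Poisson distributed with mean ρl, which is consistent with the exp(−ρl)cosh(ρl) and exp(−ρl)sinh(ρl) forms that appear later in the text for even and odd crossover counts. The function names are hypothetical.

```python
import math


def p_even(rho, l):
    """Probability of an even number of crossovers over a separation l."""
    return math.exp(-rho * l) * math.cosh(rho * l)


def p_odd(rho, l):
    """Probability of an odd number of crossovers over a separation l."""
    return math.exp(-rho * l) * math.sinh(rho * l)


def p_even_by_summation(rho, l, terms=40):
    """Cross-check: sum the Poisson probabilities of 0, 2, 4, ... crossovers."""
    mean = rho * l
    return sum(math.exp(-mean) * mean**n / math.factorial(n)
               for n in range(0, terms, 2))


def recombination_probs(rho, l):
    """R_i(12) for i = 1..4: the starting strand is chosen with probability 1/2."""
    r_parental = 0.5 * p_even(rho, l)     # genomes 1 and 2 (one mutation kept)
    r_recombinant = 0.5 * p_odd(rho, l)   # genomes 3 and 4 (both or neither)
    return {1: r_parental, 2: r_parental, 3: r_recombinant, 4: r_recombinant}


probs = recombination_probs(rho=8.3e-4, l=408)
print(probs, "sum =", sum(probs.values()))
```

With ρ = 8.3 × 10^{−4} crossovers per position and l = 408, the four probabilities sum to one, and R_{4}(12) approaches its limit of 1/4 only when ρl becomes large.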

Similarly, when

Finally, we write equations for the time evolution of the various viral
populations: _{i}_{ii}_{ii}_{ij}_{ii}_{jj}_{ij}_{11} and _{22}
alone, i.e., _{11} =
_{22} =
_{0} at

We solve ^{−1} and ^{−1}; the death rate of infected cells,
^{−1}; the
viral burst size, ^{−1}.
We let an initial target cell population, _{0}
= 10^{6}, be exposed to two equal viral populations,
_{11} = _{22}
= _{0}, which we vary over the experimental
range, 2_{0} = 10^{6} to
10^{10} [_{0}, the timescale of CD4 down-modulation,
_{d}, and the recombination rate,

In _{0} =
10^{8}, _{0} = 2
× 10^{−10} d^{−1},
_{d} = 0.28 d, ρ
= 8.3 × 10^{−4} crossovers per
position (see below), and ^{*}_{0}^{*}^{*}^{*}^{*}

The time evolution of the number of uninfected cells,
^{*}_{0} = 10^{6}; the
initial viral load, 2_{0} =
10^{8}; the birth and death rates of uninfected cells,
^{−1} and ^{−1}; the death rate of
infected cells, ^{−1}; the viral burst size,
^{−1}; the infection rate constant of uninfected
cells, _{0} = 2 ×
10^{−10} d^{−1}; the CD4
down-modulation timescale, _{d}
= 0.28 d; the recombination rate,
^{−4} crossovers per position; and the
separation between the mutations on genomes 1 and 2,

Because V_{11} and V_{22}
alone are employed at the onset of infection, their numbers are larger than
those of other viral subpopulations. When target cells are abundant, CD4
down-modulation ensures that singly infected cells occur more frequently
than doubly infected cells. Thus, during the first phase of the dynamics
following the onset of infection, cells singly infected with the infecting
genomes, i.e., T_{1} and T_{2}, are the most prevalent. Note that because
V_{11} = V_{22} = V_{0} at the time of infection, at all
subsequent times T_{1} = T_{2}. Next in prevalence are cells infected
twice with genome 1 and/or 2. Because coinfection by genomes 1 and 2 is
twice as likely as double infection by either 1 or 2, cells
T_{12} are more prevalent than T_{11} (= T_{22}). The population of
heterozygous virions, V_{12}, increases because of viral production
from the coinfected cells T_{12}.

The time evolution of the various singly (solid lines) and doubly
(dashed lines) infected cell (left panels) and homozygous (solid
lines) and heterozygous (dashed lines) viral subpopulations (right
panels) following the onset of infection. Note that
T_{1} = T_{2}, T_{11} = T_{22}, T_{3} = T_{4}, T_{33} = T_{44},
T_{13} = T_{23} = T_{14} = T_{24}, V_{11} = V_{22}, V_{33} = V_{44}, and
V_{13} = V_{23} = V_{14} = V_{24}. The parameter values
employed are the same as those in _{d} = 2.8 d in (C) and (D)
and ^{−3} crossovers per position in (E) and
(F).

Infections by V_{12} give rise to cells
T_{3} and T_{4},
infected singly with the recombinant genomes, which in turn produce virions
V_{33} and V_{44},
respectively. Coinfection by genomes 1 and 3 yields cells
T_{13}, whose numbers are larger than those of
the doubly infected cells T_{33} (=
T_{44}) because of the small population of
V_{33} compared to
V_{11}. Again, because coinfection by genomes 3 and
4 is twice as likely as double infection by either 3 or 4, cells
T_{34} are larger in number than
T_{33}. Yet, homozygous virions
V_{33} are more prevalent than heterozygous
virions V_{34} because cells
T_{3}, T_{33}, and
T_{34} produce V_{33},
whereas cells T_{34} alone produce
V_{34}.

In the second dynamical phase, infected cell subpopulations decline because
of cell death at rate

Changes in the initial viral load, 2_{0}, or the
infection rate, _{0}, do not alter the dynamics
above qualitatively (unpublished data) [_{d}, does not
influence the overall dynamics in _{d} then alter the distribution of infected cells
into various multiply infected cell subpopulations but do not alter the
total population of infected cells, ^{*}_{d} = 2.8 d. A higher value of
_{d} implies slower CD4 down-modulation, which
renders infected cells susceptible to further infections for longer
durations and hence increases the relative prevalence of multiply infected
cells. Accordingly, we find that doubly infected and coinfected cell
subpopulations are higher in _{d} (

Interestingly, the recombination rate, _{1} and _{2}.
Consequently, an increase in ^{−3}
crossovers per position. Note that the numbers of cells singly and doubly
infected with genomes 1 and/or 2 are identical to those in _{33} (=
_{44}) is higher in

We examine next whether the above dynamics captures the scaling relationships
between the different infected cell subpopulations observed experimentally
[_{12}
=
100_{12}^{*}^{*}^{*}/^{*}_{12} is
proportional to (^{*}^{2}. The scaling
behavior is observed over the entire period of infection (_{12} versus
(^{*}^{2} for different viral
loads are superimposed, in agreement with the robust scaling observed in
experiments [

Parametric plots of (A) the percentage of cells coinfected with
genomes 1 and 2, _{12}, versus the total
percentage of infected cells, ^{*}_{4}, versus
_{12}, obtained by solving _{0} = 10^{6}
(green), 10^{7} (cyan), 10^{8} (blue),
10^{9} (purple), and 10^{10} (red). The dashed lines
are scaling patterns predicted by _{0} = 10^{6} (green)
and 10^{7} (cyan).

In _{12}.
Interestingly, we find two scaling regimes. When
_{12} is small, _{4} is
proportional to (_{12})^{2}, and the
parametric plots are distinct for different values of
_{0}. For larger values of
_{12}, _{4} is
linearly proportional to _{12} and independent of
_{0}. Thus, the parametric plots in the latter
regime are again superimposed, as observed in experiments [

We explain the origins of the above scaling regimes by considering two
limiting scenarios in our model (_{d}, and when changes
in viral and cell numbers are small, we find that _{eq}, we obtain _{1} as the mean
rate of the second infection of singly infected cells. As we show in

Remarkably, the scaling _{12} and
^{*}_{eq} (_{1}_{0}. We notice
thus that a transition from the small time (_{d}) scaling, _{eq}) scaling, ^{*}_{1}
= 1.4 × 10^{−10}
d^{−1} captures the long-time scaling for all initial
viral loads considered. On the other hand, the scaling between
_{4} and _{12} for
_{d},
_{4} and
_{12}, _{eq} is independent of the initial
viral load.

For very large viral loads (2_{0} ≥
10^{10}) and/or infection rates (_{0}
≥ 2 × 10^{−9} d^{−1};
unpublished data), we find that rapid infection and the consequent death of
infected cells preempts the establishment of pseudo steady state between
viral production and clearance in the first phase of infection, so that the
linear scaling relationship between _{4} and
_{12} is not observed (

Available in vitro experiments, where cells are simultaneously exposed to two
distinct kinds of viral genomes, may be segregated into two categories. First,
single-round infection experiments employ replication-incompetent (heterozygous)
virions to infect cells, and measure the fraction of cells that contain
recombinant proviral genomes [

We consider single-round infection experiments, where target cells are
exposed to a mixed viral population comprising homozygous virions,
V_{11} and V_{22}, and
heterozygous virions, V_{12}, in the proportions
1/4, 1/4, and 1/2, respectively. Small viral loads are employed so that
multiple infections of cells are rare. Following infection, cells in which
recombinant proviruses result are identified. Rhodes et al. [_{max}, attained at arbitrarily large
separations and/or recombination rates (see below). We reproduce the
experimental data of Rhodes et al. in

(A) The ratio of the percentage of cells infected with the
recombinant 4, _{max}, as a function of the
separation, ^{−4} crossovers per position.

(B) The percentage of GFP^{+} cells as a function of
the crossover frequency,

We estimate the percentage of cells infected with genome 4 in the experiments
of Rhodes et al. as follows. We recognize that cells infected with
heterozygous virions V_{12} alone may possess the
recombinant provirus 4. With the above distribution of the viral
subpopulations, the probability that an infection is due to a heterozygous
virion is 1/2. Following infection by a heterozygous virion, the probability
that recombination yields genome 4 is R_{4}(12). The fraction, f, of infected
cells that contain genome 4 is therefore f = (1/2)R_{4}(12) =
(1/4)exp(−ρl)sinh(ρl), which attains its maximum value, f_{max}, of
1/8 (or 12.5%) as ρl → ∞. (In this limit, many crossovers occur between l_{1} and
l_{2}; the mutations at l_{1} and l_{2} are
then selected independently, each with a probability 1/2, so that
R_{4}(12) → 1/4.) Thus, according to
our model, f/f_{max} = [(1/2)R_{4}(12)] / (1/8), which upon combining with
the expression for R_{4}(12) gives f/f_{max} = 1 − exp(−2ρl).
We fit predictions of _{max} versus _{i}^{−4} crossovers per position indicates that
^{−4} crossovers per position in our calculations above.
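The dependence of the recombinant yield on marker separation can be sketched as follows. The sketch uses the model relation f/f_{max} = [(1/2)R_{4}(12)]/(1/8) with R_{4}(12) = (1/2)exp(−ρl)sinh(ρl), which reduces to 1 − exp(−2ρl). The function names and the separations in the loop are hypothetical and are not the data of Rhodes et al.; the genome length of 10^{4} nucleotides is the approximate figure quoted in this paper.

```python
import math

RHO = 8.3e-4            # crossovers per position (estimate quoted in the text)
GENOME_LENGTH = 1.0e4   # approximate HIV genome length in nucleotides


def f_over_fmax(rho, l):
    """Recombinant yield relative to its large-separation maximum."""
    r4 = 0.5 * math.exp(-rho * l) * math.sinh(rho * l)  # R_4(12)
    return (0.5 * r4) / (1.0 / 8.0)


def rho_from_ratio(ratio, l):
    """Invert f/f_max = 1 - exp(-2*rho*l) for rho at a single separation l."""
    return -math.log(1.0 - ratio) / (2.0 * l)


# Illustrative separations only (hypothetical values, not experimental data).
for l in (100, 300, 600):
    print(l, round(f_over_fmax(RHO, l), 3))

crossovers_per_genome = RHO * GENOME_LENGTH
print("expected crossovers per genome:", crossovers_per_genome)
```

A real estimate of ρ would fit this curve to all measured separations at once, for instance by least squares, rather than inverting a single point.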

Levy et al. [

To compare the observations of Levy et al. with our model predictions, we let
genome 1 be the virus with the CFP gene carrying the critical CFP mutation at
l_{1} = 201 and genome 2 the
virus with the YFP gene carrying the critical YFP mutation at
l_{2} = 609, so that
l = l_{2} − l_{1} = 408. Genome 4 then includes those genomes that contain
neither of the mutations at l_{1} and
l_{2}, and also those genomes that contain both the
mutations at l_{1} and
l_{2} and the contents of genome 1 from positions
440 to 500. Accordingly, genome 3 includes those genomes that carry both the
mutations at l_{1} and
l_{2} but not all of the contents of genome 1 from
positions 440 to 500. With these definitions of genomes 3 and 4, we
recalculate the recombination probabilities R_{3}(12) and R_{4}(12). The first
contribution to R_{4}(12) arises from recombination events that include
neither of the mutations at l_{1} and l_{2} and is
given by (1/2)exp(−ρl)sinh(ρl). The second is the contribution to R_{4}(12)
that arises from recombination events
that include both the mutations at l_{1} and
l_{2} and the contents of genome 1 from
positions 440 to 500. The latter contribution is determined as follows. The
probability that reverse transcription begins on genome 1 at position
l_{1} = 201 is 1/2. Given that the
mutation at l_{1} is included, reverse
transcriptase would be on genome 1 at position 440 if an even number of
crossovers occurred between positions l_{1} and
440, which happens with the probability
exp(−ρl_{a})cosh(ρl_{a}),
where l_{a} = 440 − l_{1}. For the contents of genome 1 between
positions 440 and 500 to be included in the resulting provirus, no
crossovers must occur between positions 440 and 500, the probability of
which is exp(−60ρ). Finally, the mutation at
l_{2} = 609 on genome 2 is included
if an odd number of crossovers occurs between positions 500 and
l_{2}, which happens with the probability
exp(−ρl_{b})sinh(ρl_{b}),
where l_{b} = l_{2} − 500. Multiplying the latter
probabilities and recognizing that l_{a} +
l_{b} + 60 = l, we obtain (1/2)exp(−ρl)cosh(ρl_{a})sinh(ρl_{b}), the second
contribution to R_{4}(12) above. Similarly, we find that
R_{3}(12) = (1/2)exp(−ρl)[sinh(ρl) − cosh(ρl_{a})sinh(ρl_{b})].

We determine the fraction of infected cells that fluoresce following exposure
of cells to homozygous CFP and YFP virions and heterozygous CFP/YFP virions
in the proportions 1/4, 1/4, and 1/2, respectively, as follows. (Fluorescent
cells are detected in the experiments as infected.) When single infections
of cells predominate, half of the infections are due to homozygous virions,
which cause cells to fluoresce regardless of recombination. The other half
of the infections, which are due to heterozygous virions, induce
fluorescence when recombination yields genome 1, 2, or 4. Levy et al. ignore
GFP^{+} cells in their estimate of the total fraction
of infected cells, which is therefore (1/2) + (1/2)(R_{1}(12) +
R_{2}(12)). The experimentally determined
fraction, f_{g}, of infected cells that are
GFP^{+} is therefore
(1/2)R_{4}(12)/[(1/2) +
(1/2)(R_{1}(12) + R_{2}(12))],
which simplifies to f_{g} = R_{4}(12)/[2 − R_{3}(12) − R_{4}(12)], where
R_{3}(12) and R_{4}(12) are determined using
the expressions derived above.
Levy et al. [^{+} T cells. We compare these
percentages with our prediction of _{g} and
estimate the recombination rate in the respective cell types (^{+} T cells. Direct sequence analysis from Jurkat T
cells showed a mean crossover frequency of 7.5 (range 3–13)
[^{+} T cells is again in excellent agreement with
the estimate for Jurkat T cells and that from the data of Rhodes et al.
[

With macrophages, Levy et al. [^{+}. We
note that _{g} defined in _{1}
and _{2} but lowers the probability that no
crossovers occur between the positions 440 and 500 on genome 1. As a result,
the second contribution to _{4}(12) in _{g} increases
(_{g} = 0 when
^{+} cells observed with macrophages is thus higher
than the maximum value of _{g} predicted by our
model. We note that a higher percentage of GFP^{+} cells
than the theoretical maximum of ∼21% may result if cells
are multiply infected, which we ignore in our description of single-round
infection experiments. Indeed, Levy et al. [

We next compare our predictions with the dynamical and scaling patterns that
Levy et al. [^{6} CD4^{+}
T cells and detected the total percentage of cells infected (i.e., that
fluoresced), ^{*}_{12},
and the percentage of cells that were GFP^{+},
_{4}, with time following the onset of
infection. The quantities evolved in two distinct phases—an
initial rise and a subsequent fall. Our model captures the two-phase
dynamics qualitatively, as we demonstrate in _{12} and ^{*}_{4} and
_{12}.

Model predictions (thick lines) obtained by solving ^{+}/CFP^{+}) and the
total percentage of infected cells, and (B) the percentage of
GFP^{+} cells and the percentage of coinfected
cells. The different symbols represent experiments conducted with
cells from different donors [_{d} = 2.8 d in (A) and
^{−3} crossovers per position in (B). The thin
black line in (A) is the experimental best-fit line [

In _{12} versus
^{*}_{4}
versus _{12} for the initial viral load
2_{0} = 10^{8} and with the
parameters employed in _{3}(12) and
_{4}(12) by _{3}
and _{33}, in our count of the total number of
infected cells, ^{*}_{i}^{+} genomes (see above) present in the experiments.
Based on the relative magnitudes of the two contributions to
_{4}(12) in ^{−4} crossovers per position, the values of the two
terms are ∼0.12 and ∼0.03, respectively), we expect,
however, that a majority of the GFP^{+} genomes are those
that contain neither of the mutations on genomes 1 and 2, i.e., as shown in

We find that our model captures the quadratic scaling,
_{12} ∼
(^{*}^{2}, qualitatively. Our model
predicts that for small values of ^{*}^{*}_{1}_{0} <
1, the parametric plot of _{12} versus
^{*}_{12} at large values of
^{*}^{*}^{*}

Quantitatively, our model underpredicts the percentage of coinfected cells
_{12} compared to the experiments: the
experimental proportionality constant relating
_{12} and ^{*}_{ns}, of
non-susceptible cells in culture. The percentage of infected cells,
^{*}^{*}/^{*}_{ns}), where
^{*}_{12}, becomes
100_{12}^{*}_{ns}). The resulting
proportionality constant, _{ns} = 0) by the factor 1 +
_{ns}/(^{*}
+ ^{*}/(^{*}
+ _{ns}) ≈ 1/5. Thus, the factor
above, 1 +
_{ns}/^{*} ≈ 5,
explains at least in part the difference between the experimental
proportionality constant and that derived from our model. Further,
uncertainties exist in our knowledge of the CD4 down-modulation timescale,
_{d}, of the cells in culture [_{d} may enhance the frequency of multiple
infections and increase _{12} for a given value of
^{*}_{d} = 2.8 d appear to be in
better agreement with the experimental scaling between
_{12} and ^{*}_{d} = 2.8 d
implies that _{1} ≈
_{0} throughout, so that several assumptions
underlying the scaling relations in ^{*}_{12} and
^{*}^{*}_{d} and observed
in the experimental data. Nonetheless, quantitative comparison with the
experimental scaling between _{12} and
^{*}
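The effect of non-susceptible cells on the two scaling relationships can be illustrated with a small sketch. All cell counts below are hypothetical, chosen only so that susceptible cells make up one fifth of the culture, as in the ≈ 1/5 ratio quoted above; the function name is also hypothetical.

```python
def scaling_constants(T, T_star, T12, T4, T_ns):
    """Return the quadratic and linear proportionality constants.

    Percentages are measured against all cells in culture, including
    the non-susceptible population T_ns.
    """
    total = T + T_star + T_ns
    pct_infected = 100.0 * T_star / total
    pct_coinfected = 100.0 * T12 / total
    pct_recombinant = 100.0 * T4 / total
    quadratic = pct_coinfected / pct_infected**2  # coinfected % vs (infected %)^2
    linear = pct_recombinant / pct_coinfected     # recombinant % vs coinfected %
    return quadratic, linear


# HYPOTHETICAL counts for illustration only.
T, T_star, T12, T4 = 8.0e5, 2.0e5, 1.0e4, 5.0e2

quad_without, lin_without = scaling_constants(T, T_star, T12, T4, T_ns=0.0)

# Susceptible cells are one fifth of the culture: T_ns = 4 (T + T*).
T_ns = 4.0 * (T + T_star)
quad_with, lin_with = scaling_constants(T, T_star, T12, T4, T_ns)

print("quadratic constant ratio:", quad_with / quad_without)
print("linear constant ratio:", lin_with / lin_without)
```

The quadratic constant grows by the factor 1 + T_{ns}/(T + T^{*}), here 5, whereas the ratio of recombinant to coinfected percentages is unchanged because the common denominator cancels.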

The presence of non-susceptible cells, however, does not influence the linear
scaling relationship between _{12} and
_{4}. Given that _{4}/_{12} ≈
_{ns}. Our model predicts that
for small _{12}, _{4} is
proportional to (_{12})^{2} and for large
_{12}, _{4} is
proportional to _{12} (_{12}
is not observed [_{12} that may lie below experimental
detection limits (_{12} at the transition increases but the
quadratic scaling is short-lived. Thus, for larger viral loads, the
transition to the linear scaling regime appears to occur before the first
measurement following the onset of infection is made (at

We find remarkably that our model quantitatively captures the experimental
linear scaling between _{4} and
_{12} (_{12}, the model is in excellent agreement
with the data. Interestingly, the same recombination rate
(_{12}, the model slightly
underpredicts the experimental data, possibly because of the increased
likelihood of more than two infections of cells, which we ignore.
Nonetheless, the quantitative agreement between model predictions and the
experimental scaling relationship and the consistency of the predictions
with the recombination rate estimated from independent single-round
infection assays indicate that our model accurately captures the underlying
dynamics of recombination during HIV infection.

The emergence of recombinant forms of HIV that are resistant to multiple drugs often
underlies the failure of current antiretroviral therapies for HIV infection. Yet,
the dynamics of the emergence of recombinant genomes in individuals infected with
HIV remains poorly understood. Current models of HIV dynamics are unable to explain
available experimental data of the frequency of occurrence and the time evolution of
recombinant HIV genomes quantitatively. We developed a model that describes the
dynamics of the emergence of recombinant forms of HIV and quantitatively captures
key experimental observations. Mimicking recent experiments [

Model predictions are in agreement with the T cell dynamics observed in vitro. Levy
et al. [_{4}, is a small
fraction of the percentage of coinfected cells, _{12},
which in turn is a small fraction of the total percentage of cells infected,
^{*}_{12} is proportional to
(^{*}^{2} and that
_{4} is proportional to _{12},
independent of the initial viral load and the time following the onset of infection.
Our model predicts both these scaling patterns and that the patterns are independent
of the initial viral load and the time following the onset of infection.
Quantitative comparison between our model predictions and the experimental scaling
relationship between _{12} and
(^{*}^{2} is precluded by the poorly
characterized dynamics of cells not susceptible to infection by HIV that may be
present in the experimental cultures. We showed, however, that the presence of
non-susceptible cells does not influence the linear scaling relationship between
_{4} and _{12}. Indeed, our
model predictions are in quantitative agreement with the experimental scaling
relationship between _{4} and
_{12}. The quantitative agreement indicates that our model
captures the underlying dynamics of HIV recombination accurately.

Our model also captures data from single-round infection experiments on the frequency
of the accumulation by recombination of distinct mutations present on the two RNA
strands within a virion. From comparisons of model predictions with the experiments
of Rhodes et al. [^{4}
nucleotides. This number is in agreement with independent estimates from direct
sequence analysis by Levy et al. [^{+} T cells. Whereas the crossover frequency in HeLa
cells is lower, the frequency in the other two cell types is in agreement with the
estimate obtained from our analysis of the experiments of Rhodes et al.
[_{4} and
_{12} described above is also consistent with a
recombination rate of ∼8 crossovers per ∼10^{4} nucleotides.

The power law scaling that the number of doubly infected cells is proportional to the
square of the total number of infected cells is also predicted by the model of HIV
dynamics with multiple infections developed by Dixit and Perelson [^{2}) [

If the distinction between different viral genomes is ignored, our model reduces to
the model of HIV dynamics with multiple infections developed by Dixit and Perelson
[

Several advances of our model are essential, however, to predict the emergence of
recombinant genomes in vivo. First, that infected splenocytes from two HIV patients
harbored 3–4 proviruses per cell on average [

We non-dimensionalize

We solve the dimensionless _{m}_{m}

We derive below the scaling relationships mentioned in _{d}, because
_{0}, we write the
dynamics of singly infected cells as _{1} by the infection of uninfected
cells, and the second and third terms are the losses of
_{1} due to further infections and cell death. At the
start of infection, the dominant viral populations are
_{11} and _{22} (_{1}. Further, because
_{1} is small (_{1}, are negligible. _{0} and _{11}
≈ _{0} and integrate _{1}(0) = 0. With assumptions similar to
those employed in obtaining _{1}
= _{2} and _{11}
= _{22} ≈
_{0}. Substituting for _{1}
from _{12}(0)
= 0, we get _{12}, is given by _{12} are heterozygous. We ignore viral clearance
because _{12} is expected to be small. Substituting
for _{12} from _{12}(0) = 0, we obtain
_{4}, is then given by _{12}
≫ _{44} ≫
_{14}, most of the cells
_{4} are formed due to infection by
_{12} followed by recombination. Substituting for
_{12} from _{4}(0) = 0, we find

Because the total infected cell population comprises largely cells
_{1} and _{2}, which in
turn are significantly smaller in number than uninfected cells (_{12} ∼
(^{*}^{2} and
_{4} ∼
(_{12})^{2} hold.

We next consider times longer than the timescale over which viral production and
clearance reach pseudo steady state, _{eq}.
The magnitudes of the viral subpopulations still follow
_{11} = _{22}
≫ _{12} ≫
_{44} (_{1} = _{2}
≫ _{12} ≫ _{4}
(_{1} be the
“mean” infection rate of singly infected cells. We note that
_{1} is a function of the CD4 down-modulation
timescale, _{d}. If _{d} is
large, for instance, then _{1} ≈
_{0}. Applying the pseudo steady state
approximation for the viral populations yields _{11} from _{1}, which is expected in the
initial stages of infection, we integrate _{1} when _{eq}. Note that in _{eq}. Substituting
for _{11} and _{1} in _{eq} and recognizing that
^{*}_{1} following the establishment of pseudo steady
state (_{4} infected with
recombinants is _{eq}), the scaling laws _{12}
∼ (^{*}^{2} and
_{4} ∼ _{12}
hold.

CFP: cyan fluorescent protein

GFP: green fluorescent protein

YFP: yellow fluorescent protein