Central Nervous System Compartmentalization of HIV-1 Subtype C Variants Early and Late in Infection in Young Children

HIV-1 subtype B replication in the CNS can occur in CD4+ T cells or macrophages/microglia in adults. However, little is known about CNS infection in children or the ability of subtype C HIV-1 to evolve macrophage-tropic variants. In this study, we examined HIV-1 variants in ART-naïve children aged three years or younger to determine viral genotypes and phenotypes associated with HIV-1 subtype C pediatric CNS infection. We examined HIV-1 subtype C populations in blood and CSF of 43 Malawian children with neurodevelopmental delay or acute neurological symptoms. Using single genome amplification (SGA) and phylogenetic analysis of the full-length env gene, we defined four states: equilibrated virus in blood and CSF (n = 20, 47%), intermediate compartmentalization (n = 11, 25%), and two distinct types of compartmentalized CSF virus (n = 12, 28%). Older age and a higher CSF/blood viral load ratio were associated with compartmentalization, consistent with independent replication in the CNS. Cell tropism was assessed using pseudotyped reporter viruses to enter a cell line on which CD4 and CCR5 receptor expression can be differentially induced. In a subset of compartmentalized cases (n = 2, 17%), the CNS virus was able to infect cells with low CD4 surface expression, a hallmark of macrophage-tropic viruses, and intermediate compartmentalization early was associated with an intermediate CD4 entry phenotype. Transmission of multiple variants was observed for 5 children; in several cases, one variant was sequestered within the CNS, consistent with early stochastic colonization of the CNS by virus. Thus we hypothesize two pathways to compartmentalization: early stochastic sequestration in the CNS of one of multiple variants transmitted from mother to child, and emergence of compartmentalized variants later in infection, on average at age 13.5 months, and becoming fully apparent in the CSF by age 18 months. 
Overall, compartmentalized viral replication in the CNS occurred in half of children by year three.


Introduction
Human immunodeficiency virus type 1 (HIV-1) infection of the central nervous system (CNS) can occur shortly after transmission, and compartmentalized HIV-1 variants, genetically distinct from the virus in the blood, can be detected in the cerebrospinal fluid (CSF) in some individuals throughout the course of infection. Detection of compartmentalized CSF viral populations during primary infection suggests that compartmentalization can occur early in the absence of overt neurological symptoms [1]. Extensive compartmentalization of HIV-1 has been shown to be a strong indicator of HIV-1 neuropathogenesis contributing to HIV-1-associated dementia (HAD) [2,3,4,5], while intermediate levels of compartmentalization are associated with either an asymptomatic state or less severe forms of HIV-1-associated neurological disease [1,3]. Although a longitudinal link has not been made, these results raise the possibility that early detection of compartmentalized CSF variants may identify subjects with a higher risk of developing HIV-1-associated neurological complications.
Compartmentalized HIV-1 subtype B CNS populations can be either CCR5 (R5)-using T cell-tropic or macrophage-tropic [6,7,8,9,10]. Macrophage-tropic HIV-1 variants are characterized by the ability to infect cells with low CD4 surface expression [11,12,13], are poorly represented in the blood [9], are not transmitted [14,15], and decay slowly following initiation of highly active antiretroviral therapy (HAART) [16,17], unlike the rapid decay of virus replicating in activated CD4+ T cells within the blood [16,18,19]. While extensive research has been conducted on HIV-1 subtype B CNS infections, little is known about CNS compartmentalization of HIV-1 subtype C (the most common subtype worldwide) or the ability of HIV-1 subtype C to evolve to use low levels of CD4 for entry. Previous studies have reported significant differences in the subtype B and C envelope glycoproteins and suggested that subtype C may be less neuropathogenic than subtype B [20,21,22]. Additional studies on HIV-1 subtype C CNS infections are needed to improve our knowledge on HIV-1 subtype C pathogenesis, CNS compartmentalization and entry tropism.
Information on the genetic and phenotypic characteristics of HIV-1 within the CNS of infants and young children is scarce. HIV-1 CNS disease is often an AIDS-defining illness in children [23,24,25], and this implies that early infection of the CNS may be important in the pathogenesis of HIV-1 infection in infants [25]. Understanding the dynamics of pediatric HIV-1 CNS replication is thus of critical importance. We examined HIV-1 variants in 43 ART-naïve Malawian children aged three years or younger to determine the viral genotypes and phenotypes associated with HIV-1 subtype C pediatric CNS infection. We observed intermediate, minor compartmentalization in 25% and distinct, compartmentalized CSF variants in 28% of children. Older age and a higher CSF/blood viral load ratio were associated with CNS compartmentalization, with genetic evidence suggesting outgrowth of the compartmentalized variant starting at around 13 months of age. Transmission of multiple variants had occurred in 5 children, of which 4 had one variant sequestered within the CNS, consistent with early stochastic colonization of the CNS by the virus. Finally, we showed that genetically compartmentalized R5 virus with the ability to infect cells with low CD4 surface expression, a hallmark of macrophage-tropic viruses, can evolve in HIV-1 subtype C CNS infection in young children, although this was not common in the first three years of life. Overall we found that by 3 years of age, 50% of children infected with HIV-1 subtype C had virus independently replicating in the CNS.

Study design
We examined viral populations in paired peripheral blood and CSF samples collected from 43 HIV-infected Malawian children presenting with either neurodevelopmental delay or acute neurologic symptoms (neurodevelopmental delay will be examined in a separate study). Subjects were infected with HIV-1 subtype C, were ART naïve, and ranged in age from 3 to 35 months. Blood viral loads ranged from 4,344 to >80,000,000 copies/mL and CSF viral loads ranged from 56 to 4,745,000 copies/mL. The virological characteristics for each subject are summarized in Table 1. Infection is assumed to be by a vertical route, but the fractions infected prepartum, intrapartum, or postpartum are not known. To assess HIV-1 genetic compartmentalization, cDNA templates were generated from extracted blood and CSF viral RNA and used in single genome amplification (SGA) of the full-length viral env gene [26,27,28,29,30]. A mean of 19 amplicons was analyzed per sample. The sequence of the entire env gene was determined for each amplicon and phylogenetic analysis was completed.
Compartmentalization was assessed visually and statistically using the Slatkin-Maddison test [31]. CNS compartmentalization was defined by a Slatkin-Maddison P value <0.05 and a genetically distinct CSF population with a bootstrap value ≥40; we used a relatively low bootstrap value in this exploratory analysis because of the overall low diversity of the viral population in these children. Intermediate populations were defined by a Slatkin-Maddison P value >0.05 but with visual evidence of a minor CSF subpopulation of ≥4 CSF amplicons and a bootstrap value ≥40. Equilibrated populations were defined by a Slatkin-Maddison P value >0.05 and no evidence of a minor or major CSF population (Table 1).
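The three definitions above can be written as a small decision rule. The following is only a sketch: the function name, argument names, and the "needs visual review" fallback are illustrative assumptions, not part of the study's pipeline, and the visual assessment of tree structure that the study relied on cannot be captured by such a rule. A P value of exactly 0.05 is not covered by the definitions and is treated here as non-significant.

```python
from typing import Optional

def classify_subject(sm_p_value: float,
                     csf_clade_bootstrap: Optional[float] = None,
                     csf_clade_size: int = 0) -> str:
    """Classify a blood/CSF population from summary values (hypothetical helper).

    sm_p_value          -- Slatkin-Maddison P value
    csf_clade_bootstrap -- bootstrap support of a distinct CSF clade, or None if no such clade
    csf_clade_size      -- number of CSF amplicons in that clade
    """
    # A distinct CSF population counts only if its bootstrap value is >= 40.
    supported_clade = csf_clade_bootstrap is not None and csf_clade_bootstrap >= 40

    if sm_p_value < 0.05 and supported_clade:
        return "compartmentalized"
    if sm_p_value >= 0.05 and supported_clade and csf_clade_size >= 4:
        return "intermediate"
    if sm_p_value >= 0.05 and not supported_clade:
        return "equilibrated"
    # Cases the written definitions leave open (for example, a significant P value
    # without a supported CSF clade) are left to visual inspection of the tree.
    return "needs visual review"

print(classify_subject(0.001, csf_clade_bootstrap=85, csf_clade_size=10))
print(classify_subject(0.30, csf_clade_bootstrap=60, csf_clade_size=4))
print(classify_subject(0.30))
```

The fallback category reflects the fact that some subjects required visual assessment of the neighbor-joining tree in addition to the P value.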
Phylogenetic trees were also examined for clonal amplification. Clonally amplified lineages were defined as having short branch lengths in the neighbor-joining phylogenetic tree with bootstrap values ≥99 and a clade of ≥3 variants. These lineages signify the recent amplification of identical or nearly identical variants.

Relationship of viral populations in blood and CSF
Twenty out of 43 subjects (47%) had equilibrated viral populations in their blood and CSF (Table 1 and Figure 1A). For these subjects, sequences from the two compartments were well mixed and the CSF sequences were not genetically distinct from those of the blood. Two of these subjects did, however, display evidence of minor clonal amplification in the CSF.
In 11 out of 43 subjects (26%), an intermediate condition existed where the peripheral blood and CSF HIV-1 populations were not uniformly equilibrated and contained a minor CSF subpopulation (Table 1 and Figure 1B). Six of these intermediate subjects displayed evidence of clonal amplification. We hypothesize that the minor CSF population may indicate a precursor population within the CNS with the potential to expand into a compartmentalized population as infection progresses (see relationship with age below).
Significant genetic compartmentalization was detected between the blood and CSF populations in 12 out of 43 subjects (28%) (Table 1 and Figure 1C), indicating the presence of an independent, autonomously replicating viral population within the CNS. In these subjects, the virus in the CSF was genetically distinct from the virus in the blood. In 5 compartmentalized subjects, evidence of clonal amplification was observed. Therefore, within the first three years of HIV-1 subtype C pediatric infection, significant genetic compartmentalization can be observed.

Author Summary
Genetically compartmentalized human immunodeficiency virus type 1 (HIV-1) subtype B populations can be variably detected in the cerebrospinal fluid (CSF) of adults. Compartmentalization is indicative of local CNS replication, and late in disease is linked to HIV-associated dementia (HAD). Compartmentalized viral populations can comprise either CCR5-using T cell-tropic or macrophage-tropic virus. Little is known about CNS infection in children or the ability of subtype C HIV-1 to evolve macrophage-tropic variants. We examined viral populations in the blood and CSF of HIV-1 subtype C-infected children. We found an intermediate level of compartmentalization in about half of the children under 18 months of age. About 50% of children older than 18 months had clearly compartmentalized virus in the CSF/CNS, and in some cases CSF virus evolved a low CD4 entry phenotype. In some of the children two variants were transmitted from the mother. In several of these cases one of the transmitted viruses was replicating in the CNS while the other was found predominantly in the blood/periphery. Our results suggest that compartmentalized CSF/CNS populations can be detected in up to 50% of children by year three, either established early in the infection or through sequestration of a transmitted variant within the CNS.

Table 1 footnotes:
e P values used to measure genetic compartmentalization between the blood plasma and CSF HIV-1 populations were obtained using the Slatkin-Maddison test for gene flow between populations [31]. A P value <0.05 indicated statistically significant genetic compartmentalization.
f P value <0.05 was obtained for subject 3040, but visual assessment of the neighbor-joining tree structure indicated the presence of an additional HIV-1 population within the blood plasma that was largely absent from the CSF.

Compartmentalization is significantly related to older age and a higher CSF/blood viral load ratio
Relationships between subject characteristics and compartmentalization were assessed using the Mann-Whitney test. Older children were more likely to have compartmentalized CSF variants when compared to equilibrated (P = 0.05) and intermediate subjects (P = 0.005) (Figure 2A). As the majority of vertical transmission occurs early, either in utero, at delivery or during the first months of breast feeding [32], older age provides a longer period of time for viral variants to become established within the CSF. No relationship was observed between the blood and CSF viral loads and subject classifications (Figure S1). However, a higher CSF/blood viral load ratio was significantly related to compartmentalization when compared to equilibrated (P = 0.02) or intermediate subjects (P = 0.001) (Figure 2B), consistent with the occurrence of local replication and expansion of viral populations within the CNS independent from the peripheral blood. Thus, compartmentalization most often appears subsequent to transmission and is associated with a higher CSF/blood viral load ratio, representing virus produced locally in the CNS.
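As an illustration of the comparison described above, the sketch below computes a CSF/blood viral load ratio and a two-sided Mann-Whitney test between two groups. The ratios in the example are made-up values, not study data, and the exact permutation P value is an assumption made for simplicity; the text does not state whether an exact or asymptotic P value was used.

```python
from itertools import combinations

def viral_load_ratio(csf_copies_per_ml, blood_copies_per_ml):
    """CSF/blood viral load ratio for one subject."""
    return csf_copies_per_ml / blood_copies_per_ml

def mann_whitney_u(x, y):
    """U statistic for x: count of pairs with x_i > y_j, ties counted as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def exact_two_sided_p(x, y):
    """Exact two-sided P value by enumerating every relabeling of the pooled values.

    Only practical for small groups, since the number of relabelings grows quickly.
    """
    pooled = list(x) + list(y)
    n, total_n = len(x), len(x) + len(y)
    center = len(x) * len(y) / 2.0
    observed = abs(mann_whitney_u(x, y) - center)
    hits = total = 0
    for idx in combinations(range(total_n), n):
        chosen = set(idx)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in range(total_n) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(gx, gy) - center) >= observed - 1e-12:
            hits += 1
    return hits / total

# Hypothetical CSF/blood ratios for two small groups (illustrative only).
compartmentalized = [0.5, 0.2, 0.9, 0.3]
equilibrated = [0.01, 0.005, 0.02, 0.008]
print(mann_whitney_u(compartmentalized, equilibrated))
print(exact_two_sided_p(compartmentalized, equilibrated))
```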

Evolutionary history of viral populations within the first three years of life
Bayesian Evolutionary Analysis by Sampling Trees (BEAST) [33] was used to estimate the time to most recent common ancestor (TMRCA) of the entire viral populations and the compartmentalized populations (Table 1). For the majority of the subjects, transmission was predicted to have occurred at birth ±6 months, as shown by the good concordance between the TMRCA and the age of the child at the time of sampling (Figure 3A). For 5 subjects, the predicted TMRCA was substantially higher than the age of the child, which, based on further analysis (discussed below), was probably due to multiple transmitted viruses.
As the majority of subjects were probably infected at or around the time of birth, we were able to depict the occurrence of compartmentalization as a function of time (Figure 3B). Before age 18 months, the populations in half the children were equilibrated, with more intermediate populations than compartmentalized in the remaining half. After age 18 months, about half of the children continued to have equilibrated populations while in the remainder, compartmentalized populations were now much more prevalent than intermediate populations. These data further support a potential transition over time in one-half of the children to increasing CNS compartmentalization in the absence of antiretroviral therapy.

Figure 1 (legend fragment). (c) Two subjects demonstrating the presence of a statistically significant compartmentalized population within the CSF relative to the blood, as assessed using the Slatkin-Maddison test [31]. doi:10.1371/journal.ppat.1003094.g001

Table 1 footnotes (continued):
g P value <0.05 was obtained for subject 4013, but visual assessment of neighbor-joining tree structure indicated significant compartmentalization was not present.
h HIV-1 population characteristics in the CSF compartment (compart). Eq, equilibrated blood plasma and CSF populations; Inter (Intermediate), a minor subpopulation of the CSF was compartmentalized; Comp, significant compartmentalization in the CSF; Amp, clonal amplification of ≥3 variants detected in the CSF; N/A, not applicable when the CSF viral load was too low to obtain enough CSF env sequences.
i The time to most recent common ancestor (TMRCA) analyzed by Bayesian Evolutionary Analysis by Sampling Trees (BEAST) [33].

Evidence of multiple transmitted variants
For 5 subjects for whom the predicted TMRCA was substantially greater than the age of the subject (Figure 3A), further analysis revealed the presence of multiple transmitted viruses (Table 1). For one compartmentalized subject (3002; age 5 months), phylogenetic analysis revealed a deep bifurcation separating two distinct viral populations, one comprised almost exclusively of CSF variants, and the other comprised primarily of blood variants (Figure 4A). Sequence analysis demonstrated that the viral populations were genetically distinct and had minimal recombination (Figure 4B). The overall TMRCA was 52 months, while the TMRCAs for the distinct plasma and CSF populations were 6 and 8 months, respectively. Together, these results indicate that the mother likely transmitted two genetically distinct viruses to the child, and one variant was sequestered within the CNS while the other was maintained within the periphery. A multiple variant transmission event was observed in one additional compartmentalized subject (Figure S2), two intermediate subjects (Figures S3 and S4) as well as one equilibrated subject (Figure S5). For the compartmentalized and intermediate subjects, one virus was sequestered within the CNS. For the equilibrated subject, both transmitted variants expanded within the blood and CSF. These data suggest that some variants can become selectively established in the CNS early after transmission.

Evolution of CSF viruses to infect cells with low levels of CD4 surface expression
Macrophage tropism of HIV-1 is associated with the ability to infect cells expressing low levels of CD4 [11,12,13], while R5 T cell-tropic viruses infect these cells very poorly and require high levels of CD4 to enter cells [18,19]. However, different preparations of macrophages vary significantly in their ability to be infected due to differing levels of CD4 in separate preparations of monocyte-derived macrophages (MDM) (Joseph et al., in preparation). To avoid this confounding variability, we have turned to a cell line that has regulatable levels of CD4 and CCR5, i.e. 293-Affinofile cells [34]. Entry phenotype was assessed by measuring the ability of pseudotyped reporter viruses to enter cells expressing either high or low levels of CD4. Viruses pseudotyped with Env proteins derived from virus in equilibrated subjects were only able to infect Affinofile cells with high CD4 surface expression and were considered R5 T cell-tropic (Figure 5A). Viruses pseudotyped with Env proteins derived from virus in intermediate subjects were also only able to infect cells expressing high levels of CD4, which we infer defines R5 T cell tropism (Figure 5B). A partially evolved entry phenotype was observed in subjects 4007 and 4013, where CSF variants were able to infect cells with low CD4 at modest levels, potentially identifying a precursor population to the low CD4 entry phenotype. Examples of viruses with a low CD4 entry phenotype were observed in two compartmentalized subjects (4049 and 4058) (Figure 5C). For both of these subjects, only the Env-pseudotyped viruses derived from the compartmentalized CSF population, not virus from the blood, were able to infect cells with low CD4 surface expression. These results indicate that subtype C HIV-1 viruses with a low CD4 entry phenotype can be detected in the CSF of children, but this is not a common occurrence within the first three years.
Thus, we hypothesize that in most children replication in the CNS is sustained by growth in T cells, while in a subset (10-20% of children with CNS compartmentalized virus) the virus evolves to replicate in cells with low CD4 surface expression, potentially macrophages and/or microglia.

Discussion
Our study design involves cross-sectional sampling and thus has several limitations, especially with regard to inferring temporal relationships. However, there is a wide distribution of ages of the subjects within the enrollment criteria, allowing us to compare between different age groups and to draw correlations based on age. Also, the virus carries the history of longitudinal evolution in its sequence, thus allowing us to infer dates of bottlenecks in the history of viral replication. While cross-sectional analyses are inherently limited, we have some basis for suggesting temporal relationships in the observed phenomena. Based on our results we hypothesize four distinct states to describe the relationship between virus in the CSF/CNS and virus in the blood/periphery. The first state has no genetic evidence for HIV-1 replication in the CNS, wherein the only virus detected in the CSF is genetically similar to that in the blood and is typically present at 1% or lower of the level in the blood, possibly due to some import or spill pathway from the blood into the CSF. The second state occurs prior to 18 months of age and involves minor compartmentalization of the CSF viral population, suggesting some local replication in the CSF/CNS but not to a level where the viral load increases in the CSF. In 10-20% of these children there is evidence for the initial evolution of virus that can use lower levels of CD4, potentially on a path to becoming macrophage-tropic. The third state occurs in about half of the children older than 18 months of age, and in this state the viral population in the CSF shows strong evidence of genetic compartmentalization, indicative of local replication and evolution within the CNS. This is also accompanied by a higher relative viral load in the CSF due to the local production of virus well above the low level that is imported from the periphery.
In about 10-20% of the children with compartmentalized virus in the CSF, we identified variants that had evolved to use low levels of CD4, which we presume indicates that the virus was now growing in macrophages and/or microglia within the CNS. The fourth state involves multiple variant transmissions from mother to infant of which one variant preferentially replicates in the CNS and another replicates in the periphery. As HIV-1 replication in the CNS can contribute to neurological disease, further research should determine whether the ability to detect different states of CSF viral populations within the CNS of young children could guide strategies to monitor and prevent neurodevelopmental disorders in HIV-infected children.
Vertically transmitted viruses are often highly homogeneous, representing infection seeded by a single variant and characterized by low diversity [35,36,37,38]. We identified five infants who appeared to be infected with multiple variants, with the viral populations in the remaining infants having a phylogenetic age consistent with the age of the infant, which we assume indicates infection with a single variant. Surprisingly, in four of the five infants infected with multiple variants, one of the variants was largely sequestered in the CNS/CSF. For this to occur, either one of the transmitted variants had a selective tropism for the CNS, or infection of the CNS was a low probability event influenced by the chance introduction of a founder virus. Alternatively, since the CNS is a somewhat immune-privileged site, the absence of the sequestered virus in the periphery may be due to selection by maternal antibodies or the initial infant immune response in that compartment. All of the env genes tested from three of these subjects (3002, 3017, and 4002; see Figure 5) showed that the pseudotyped viruses required high levels of CD4 to enter cells, i.e. were R5 T cell-tropic. In our previous work we found multiple variants in approximately 30% of infants infected vertically [38], which is not substantially different from the number found here. The sequestration of virus in the CNS shortly after transmission suggests that inferring the number of transmitted variants based on the complexity of virus in the blood may result in an underestimate of the frequency of transmission of multiple variants.
We can infer several other features of compartmentalization by comparing the age of the infant to the inferred age of the viral population using BEAST for those infants infected with a single variant. In the remaining 10 infants with compartmentalization who were infected with a single variant, the age of the CSF/CNS compartmentalized viral population was significantly less than the age of the entire viral population when compared to the age of the infant (P = 0.002; Wilcoxon signed rank test). In these cases it appears that compartmentalization is established after the initial stages of infection, with the compartmentalized virus emerging on average at 13.5 months but with outgrowth in the CNS only becoming apparent in the CSF approximately 18 months after birth. Thus we can identify two distinct pathways to compartmentalization: early sequestration of a transmitted virus in the CNS, or the later establishment of independently replicating virus that originates in the periphery. The compartmentalized lineage in the intermediate group appeared earlier after birth (mean 5.1 months) compared to the compartmentalized lineage in the compartmentalized group (mean 13.5 months) (P = 0.008; Mann-Whitney test). This may be an indication that the intermediate state represents susceptibility to viral replication in the CNS but that there is a subsequent bottleneck that defines the CNS population.
Genetically compartmentalized R5 T cell-tropic and macrophage-tropic HIV-1 subtype B populations have been shown to be associated with neurological complications in adults [9]. The macrophage-tropic populations were genetically diverse, representing established CNS infections, while the R5 T cell-tropic populations were clonally amplified and associated with pleocytosis [9]. Macrophage-tropic HIV-1 variants are generally characterized by the ability to infect cells with low CD4 surface expression [11,12,13]. However, infection using MDM from healthy donors is highly variable, and the variability is correlated with different levels of CD4 (Joseph et al., in preparation). For this reason, it is more quantitative to use a cell line where the levels of CD4 are regulatable and reproducible. Thus we have used Affinofile cells [34] as a surrogate for the entry phenotype of viruses able to use low levels of CD4 versus those requiring high levels of CD4. Our results demonstrated that compartmentalized R5 T cell-tropic and what we infer to be macrophage-tropic populations can also be found in the CSF of children infected with HIV-1 subtype C. A partial entry phenotype was observed in two intermediate subjects; we hypothesize that this is evidence for the initial evolution of virus that can use lower levels of CD4, potentially on a path to becoming macrophage-tropic.
Viral replication in the CNS results in the local production of inflammatory and neuronal destruction molecules such as monocyte chemoattractant protein (MCP-1), neopterin, IP-10, and neurofilament light subunit (NFL). Production of these inflammatory markers has been observed in animal models [39,40] and has been linked to HIV-1-associated neurocognitive damage in adults [41,42,43]. The potential for long-term neurocognitive damage in children as a result of HIV-1-associated production of inflammatory markers within the CNS, our findings of compartmentalized viral replication with viral lineage established at 13.5 months on average, and the ability of a transmitted variant to become sequestered in the CNS shortly after transmission add further justification to the policy of early initiation of antiretroviral treatment in children, in this case as part of an effort to prevent the establishment of compartmentalized viral populations that may contribute to neurological complications.

Study subject population
The study was approved by the Institutional Review Boards of the University of North Carolina at Chapel Hill and the University of Malawi College of Medicine in Blantyre. Permission to participate in the research study was obtained for all children through written informed consent by the caregiver. All subjects included in this study were HIV-1 subtype C-infected children between 3 and 35 months of age. HIV-1 infection was verified at time of enrollment by a positive PCR for HIV DNA/RNA if <18 months of age or two positive rapid HIV antibody tests after age 18 months. Samples were collected at a single pre-HAART baseline visit for all subjects. CSF and blood plasma samples were used for viral genetic compartmentalization and Env protein phenotypic analyses. Blood plasma and CSF HIV-1 viral loads (copies/mL) were determined by the UNC Chapel Hill Center for AIDS Research Virology Core.

Single genome amplification
Subtype C HIV-1 RNA was isolated from blood plasma and CSF samples as previously described [16]. Briefly, viral RNA was isolated from blood plasma and CSF samples (140 µL) using the QIAamp Viral RNA Mini kit (Qiagen). Prior to RNA isolation, all blood plasma and CSF samples (0.1-0.5 mL) were pelleted by centrifugation at 25,000 × g for 1.5 hours at 4°C to increase template number and improve sampling. Purified viral RNA (10-50 µL) was reverse transcribed using Superscript III Reverse transcriptase (Invitrogen) and an oligo-d(T) primer according to the manufacturer's instructions. Single genome amplification (SGA) of the full-length HIV-1 env gene through the 3′ LTR U3 end was conducted as previously described [30]. Briefly, cDNA was endpoint diluted and nested PCR was performed using Platinum Taq High Fidelity polymerase (Invitrogen). The primers Vif1 [30] and 2.R3.B6R (5′-TGAAGCACTCAAGGCAAGCTTTATTGAGGC-3′; nt 9607 to 9636) were used for the first round of PCR, and the primers EnvA [30] and Low2c (5′-TGAGGCTTAAGCAGTGGGTTCC-3′; nt 9591 to 9612) were used for the second round. PCR amplicons were sequenced from the start of env through the end of env gp41 (gp160 end; HXB2 numbering of positions 6110-8833). Chromatograms with double peaks, indicating amplification from more than one cDNA template, as well as sequences with frameshift mutations resulting in premature stop codons, were excluded from analysis.
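The rationale for endpoint dilution can be illustrated with a Poisson sketch. If templates are distributed randomly across PCR wells, the fraction of positive wells determines how many of those positives are expected to derive from a single cDNA template. The function names and the 30% example value below are illustrative assumptions; the dilution target used in this study is not stated in the text.

```python
import math

def mean_templates_per_well(fraction_positive):
    """Poisson mean implied by the observed fraction of PCR-positive wells."""
    if not 0.0 < fraction_positive < 1.0:
        raise ValueError("fraction_positive must be strictly between 0 and 1")
    # P(well is negative) = exp(-lambda), so lambda = -ln(1 - fraction_positive).
    return -math.log(1.0 - fraction_positive)

def fraction_single_template(fraction_positive):
    """Expected share of positive wells that started from exactly one template."""
    lam = mean_templates_per_well(fraction_positive)
    p_one = lam * math.exp(-lam)          # P(exactly one template)
    p_positive = 1.0 - math.exp(-lam)     # P(at least one template)
    return p_one / p_positive

# Hypothetical example: 30% of wells positive at the chosen dilution.
print(round(fraction_single_template(0.30), 3))
```

Wells that still receive more than one template are the ones expected to produce double peaks in the chromatogram, which is why such sequences are excluded.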

Phylogenetic analysis of env viral sequences
DNA sequence alignments of env genes were performed using ClustalW [44]. Sequences for each subject were codon aligned (MEGA 4.0) and phylogenetic trees were generated using the neighbor-joining method (MEGA 4.0) [45]. Compartmentalization of viral sequences was assessed using the Slatkin-Maddison test [31] available through HyPhy [46] using 10,000 permutations. No contamination occurred between samples (Figure S6).

Bayesian analysis
A Bayesian Markov Chain Monte Carlo (MCMC) approach, as implemented in BEAST v.1.6.1 [33], was used to estimate the TMRCA for each patient sample. A substitution rate of 3.5 × 10^-5 substitutions/site/generation was fixed under a strict clock model, as determined by calculation of inter-patient percent difference in the plasma nucleotide sequence. The HKY nucleotide substitution model was used with estimated base frequencies and gamma-distributed rate heterogeneity (4 gamma categories). A coalescent Bayesian Skyline tree prior with a piecewise-constant skyline model was used (4 groups). The MCMC algorithm was run for 30 million generations, logging every 1000 generations, with a 10% burn-in. The results from at least two independent runs were combined, and the effective sample size for all estimates was >200. A generation time of 1.5 days was used for estimation of time to the MRCA [47].
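Because the substitution rate is expressed per viral generation, a TMRCA measured in generations has to be converted to calendar time using the 1.5-day generation time. The sketch below shows that arithmetic only; the function names and the average month length of 365.25/12 days are assumptions for illustration and are not taken from the BEAST configuration used in the study.

```python
DAYS_PER_YEAR = 365.25
DAYS_PER_MONTH = DAYS_PER_YEAR / 12  # average month length (assumption)

def generations_to_months(generations, generation_time_days=1.5):
    """Convert a tree height in viral generations to months."""
    return generations * generation_time_days / DAYS_PER_MONTH

def months_to_generations(months, generation_time_days=1.5):
    """Inverse conversion: months of replication expressed as viral generations."""
    return months * DAYS_PER_MONTH / generation_time_days

def rate_per_site_per_year(rate_per_generation=3.5e-5, generation_time_days=1.5):
    """Express the per-generation substitution rate as a per-year rate."""
    return rate_per_generation * DAYS_PER_YEAR / generation_time_days

# About 274 generations correspond to 13.5 months at 1.5 days per generation.
print(months_to_generations(13.5))
print(rate_per_site_per_year())
```

Under these assumptions, the fixed rate corresponds to roughly 8.5 × 10^-3 substitutions/site/year.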

Construction of HIV-1 env clones
SGA amplicons, selected based on the subject's phylogenetic tree structure, were re-amplified from the first-round nested PCR product using the Phusion hot start high-fidelity DNA polymerase (Finnzymes) and the primers EnvAClon (5′-CACCGGCTTAGGCATCTCCTGTGGCAGGAAGAA-3′; nt 5950-5982) and EnvN [30] following the manufacturer's instructions. HIV-1 env amplicons were then gel purified using the Qiagen gel extraction kit (Qiagen). Purified HIV-1 env genes (50 ng) were cloned into the pcDNA3.1D/V5-His-TOPO expression vector (Invitrogen) using the pcDNA 3.1 directional TOPO expression kit (Invitrogen) and the entire cloning reaction (6 µL) was transformed into MAX Efficiency Stbl2 competent cells (50 µL) as per the manufacturer's instructions. Bacterial colonies were screened for correct insertion of the HIV-1 env gene using colony PCR, and DNA was extracted from 3-6 positive colonies using the QIAprep spin miniprep kit (Qiagen).

Env-pseudotyped viruses
Each Env-pseudotyped luciferase reporter virus was generated using the Fugene 6 transfection reagent and protocol (Roche) to co-transfect 293T cells with an env expression vector and the pNL4-3.LucR-E-HIV-1 backbone (obtained from the NIH AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH). Prior to transfection, 293T cells were seeded at a density of 4.8 × 10^5 cells/well in 6-well tissue culture plates coated with 10% poly-L-lysine. Transfection medium was replaced five hours post-transfection with fresh culture medium and the cells were incubated at 37°C for 48 hours, after which viral supernatants were filtered with 0.45 µm filters (Millipore) and stored at -80°C.

Single-cycle infection of 293-Affinofile cells
Env-pseudotyped luciferase reporter viruses were first titered in triplicate in a 96-well plate format on 293-Affinofile cells [34] expressing the maximum induction levels for both CD4 (6 ng/ml doxy) and CCR5 (5 µM ponA) surface expression as previously described [16]. In order to ensure that each infection assay was performed within the linear range, we used the volume of each virus needed to produce 800,000 relative light units (RLUs) when used to infect Affinofile cells expressing the highest levels of CD4 and CCR5. Two days prior to infection, 96-well, black tissue culture plates were coated with 10% poly-L-lysine and then seeded with 293-Affinofile cells at a density of 1.85 × 10⁴ cells/well. After 18-24 hours, expression of CD4 and CCR5 was induced at two conditions in triplicate: CD4 high/CCR5 high (6 ng/ml doxy and 5 µM ponA, respectively) and CD4 low/CCR5 high (0 ng/ml doxy and 5 µM ponA). After a further 18 to 24 hours, the induction medium was removed and gently replaced with 100 µl of fresh, warmed culture medium containing env-pseudotyped virus. The infection plates were spinoculated [48] at 2,000 rpm for 2 hours at 37°C, and then incubated for an additional 48 hours at 37°C. Infection medium was then removed, the cells were lysed, and luciferase activity was assayed using the luciferase assay system (Promega).
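The two calculations implied by this assay can be sketched as below. This is a hedged illustration rather than the authors' analysis: it assumes that RLU output scales linearly with input virus volume within the titration range, and that entry at low CD4 is summarized as the mean CD4 low signal expressed as a percentage of the mean CD4 high signal. Neither formula is spelled out in the text, and all function names and example readings are hypothetical.

```python
from statistics import mean

TARGET_RLU = 800_000  # target signal at maximal CD4/CCR5 induction (from the text)


def volume_for_target_rlu(measured_rlu, volume_used_ul, target_rlu=TARGET_RLU):
    """Virus volume (µl) expected to give target_rlu, assuming linear scaling."""
    if measured_rlu <= 0:
        raise ValueError("measured_rlu must be positive")
    return volume_used_ul * target_rlu / measured_rlu


def percent_entry_at_low_cd4(cd4_low_rlus, cd4_high_rlus):
    """Mean CD4-low signal as a percentage of the mean CD4-high signal."""
    high = mean(cd4_high_rlus)
    if high <= 0:
        raise ValueError("mean CD4-high signal must be positive")
    return 100.0 * mean(cd4_low_rlus) / high


if __name__ == "__main__":
    # Hypothetical titration: 10 µl of virus gave 1.6 million RLU.
    print(volume_for_target_rlu(1_600_000, 10.0), "µl needed")

    # Hypothetical triplicate readings for one Env clone.
    low = [9_000, 11_000, 10_000]
    high = [790_000, 810_000, 800_000]
    print(percent_entry_at_low_cd4(low, high), "% of CD4-high entry")
```

Under this summary, a macrophage-tropic Env would be expected to retain a comparatively large fraction of its entry signal at low CD4, whereas a T cell-tropic Env would not.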

Nucleotide sequence accession numbers
The HIV-1 env nucleotide sequences determined in this study have been deposited in GenBank under accession numbers KC186127-KC187733.

Highlighter plot of aligned env plasma and CSF sequences, generated at www.hiv.lanl.gov. The HXB2 base number is indicated on the x axis, and the sequence identifier is indicated on the y axis. Base changes are indicated by the following ticks on the highlighter plot: A, green; T, red; G, orange; and C, blue. Sequestered populations resulting from the transmitted viruses are separated by heavy black lines. One variant was sequestered within the CSF (top) and additional variants were established within the blood (middle). Another variant was also isolated from the CSF (bottom), but the population was not maintained. (TIF)

Figure S3 Intermediate subject 3017 exhibiting two transmitted viruses. Phylogenetic and sequence analysis of plasma and CSF HIV-1 populations for subject 3017. (a) Neighbor-joining tree. Sequences from the CSF are labeled with solid blue circles, and plasma sequences (PL) are labeled with solid red triangles. Bootstrap values ≥40 are indicated (*) at the appropriate nodes. Genetic distance is scaled at the bottom of the figure (0.001) and indicates the number of nucleotide substitutions per site between env sequences. The subject's age is noted, as well as the overall TMRCA. BEAST was unable to assign a TMRCA to the internal nodes due to recombination in the population. The CNS sequestered population is represented by an open black circle. (b) Highlighter plot of aligned env plasma and CSF sequences, generated at www.hiv.lanl.gov. The highlighter plot characteristics are the same as those stated in Figure S2. Subject was likely infected with two viral variants during transmission.
The two sequences that are closest to the parental strains are PL_14 (top) and CSF_3 (bottom), and recombination between the transmitted viruses appears to account for much of the env genetic diversity detected in both the plasma and CSF populations. The transmitted CSF variant was maintained within the CNS, generating a minor CSF population with small local replication, accounting for the intermediate state observed for this subject. (TIF)

Figure S4 Intermediate subject 4048 exhibiting greater than two transmitted viruses. Phylogenetic and sequence analysis of plasma and CSF HIV-1 populations for subject 4048. (a) Neighbor-joining tree. Sequences from the CSF are labeled with solid blue circles, and plasma sequences (PL) are labeled with solid red triangles. Bootstrap values ≥40 are indicated (*) at the appropriate nodes. The phylogenetic tree characteristics are the same as those stated in Figure S3. (b) Highlighter plot of aligned env plasma and CSF sequences, generated at www.hiv.lanl.gov. The highlighter plot characteristics are the same as those stated in Figure S2. Several unique motifs were observed around 1000 base pairs (top), indicating that there were potentially greater than 2 transmitted viruses. Recombination between the transmitted viruses appears to account for much of the env genetic diversity detected in both the plasma and CSF populations. The transmitted CSF variant was maintained within the CNS, generating a minor CSF population with small local replication, accounting for the intermediate state observed for this subject. (TIF)

Figure S5 Equilibrated subject 4055 exhibiting two transmitted viruses. Phylogenetic and sequence analysis of plasma and CSF HIV-1 populations for subject 4055. (a) Neighbor-joining tree. Sequences from the CSF are labeled with solid blue circles, and plasma sequences (PL) are labeled with solid red triangles. Bootstrap values ≥40 are indicated (*) at the appropriate nodes.
Genetic distance is scaled at the bottom of the figure (0.001) and indicates the number of nucleotide substitutions per site between env sequences. The subject's age is noted, as well as the overall TMRCA. (b) Highlighter plot of aligned env plasma and CSF sequences, generated at www.hiv.lanl.gov. The highlighter plot characteristics are the same as those stated in Figure S2. Subject was likely infected with 2 viral variants during transmission. The two sequences that are likely closest to the parental strains are CSF_16 and PL_4, and recombination between the transmitted viruses appears to account for much of the env genetic diversity detected in both the plasma and CSF populations. Both variants were maintained within the blood and CSF, accounting for the equilibrated state observed for this patient. (TIF)

Figure S6 No contamination was observed between subjects. Neighbor-joining phylogenetic tree (radial topology). env sequences from the CSF are labeled with solid blue circles and env sequences from the blood plasma are labeled with solid red triangles. Genetic distance is indicated at the bottom of the figure and indicates the number of nucleotide substitutions per site between env sequences. Each subject ID is indicated. (TIF)