Compartmentalized Replication of R5 T Cell-Tropic HIV-1 in the Central Nervous System Early in the Course of Infection

Compartmentalized HIV-1 replication within the central nervous system (CNS) likely provides a foundation for neurocognitive impairment and a potentially important tissue reservoir. The timing of emergence and character of this local CNS replication has not been defined in a population of subjects. We examined the frequency of elevated cerebrospinal fluid (CSF) HIV-1 RNA concentration, the nature of CSF viral populations compared to the blood, and the presence of a cellular inflammatory response (with the potential to bring infected cells into the CNS) using paired CSF and blood samples obtained over the first two years of infection from 72 ART-naïve subjects. Using single genome amplification (SGA) and phylodynamics analysis of full-length env sequences, we compared CSF and blood viral populations in 33 of the 72 subjects. Independent HIV-1 replication in the CNS (compartmentalization) was detected in 20% of sample pairs analyzed by SGA, or 7% of all sample pairs, and was exclusively observed after four months of infection. In subjects with longitudinal sampling, 30% showed evidence of CNS viral replication or pleocytosis/inflammation in at least one time point, and in approximately 16% of subjects we observed evolving CSF/CNS compartmentalized viral replication and/or a marked CSF inflammatory response at multiple time points suggesting an ongoing or recurrent impact of the infection in the CNS. Two subjects had one of two transmitted lineages (or their recombinant) largely sequestered within the CNS shortly after transmission, indicating an additional mechanism for establishing early CNS replication. Transmitted variants were R5 T cell-tropic. Overall, examination of the relationships between CSF viral populations, blood and CSF HIV-1 RNA concentrations, and inflammatory responses suggested four distinct states of viral population dynamics, with associated mechanisms of local viral replication and the early influx of virus into the CNS. 
This study considerably enhances the generalizability of our results and greatly expands our knowledge of the early interactions of HIV-1 in the CNS.


Introduction
While HIV-1 can be detected in both the cerebrospinal fluid (CSF) and brain tissue during the weeks after initial exposure [1][2][3][4][5][6][7], it is unknown when the virus actually begins replicating independently in the central nervous system (CNS). Independent viral replication within the CNS has two important implications. First, HIV-1 replication can lead to CNS dysfunction and injury, and while combination antiretroviral therapy (cART) has markedly reduced the incidence of HIV-associated dementia (HAD), the prevalence of milder HIV-associated neurological disorders (HAND) has increased [8,9] in the cART era. Second, independent CNS replication may also provide a reservoir distinct from that found in CD4+ T cells in the blood and lymphoid tissue. We do not know the time course of the virologic events that lead to neurological dysfunction and the potential establishment of a CNS reservoir, or the extent to which these long-term outcomes are predicted by the initial aspects of virus-host interaction.
While extensive independent, or compartmentalized, CSF/CNS replication is associated with severe HIV-1 clinical CNS dysfunction [1,10,11,12,13], genetically distinct virus can be detected in the CNS throughout the course of infection [4,10]. Thus far, two types of compartmentalization have been defined. In the first, a few variants are rapidly expanded, giving a CSF viral population of low complexity (clonal amplification) consisting of variants that require high levels of CD4 for entry (R5 T cell-tropic). The second type is characterized by a complex CSF viral population consisting of variants that can enter cells expressing low levels of CD4 (macrophage-tropic), indicative of a more prolonged period of isolated replication and evolution of the entry phenotype. Additionally, we have recently shown that after vertical transmission to children, CNS compartmentalization can be established via two mechanisms: the early sequestration of one of multiple transmitted variants in the CNS, or the later establishment of compartmentalized CNS virus originating from the periphery [14]. In a previous study, we demonstrated CSF HIV-1 compartmentalization in adults during the first year of HIV-1 infection [4]. However, that study examined viral population sequences from CSF and plasma in only a small number of subjects (eight), limiting the generalizability of our findings. Furthermore, the previous study had sparse assessment of longitudinal samples and included no analysis of the sources of compartmentalized HIV-1 within the CSF.
For the current study, we used single genome amplification (SGA) and phylogenetic analysis to assess viral populations in the CSF in the presence and absence of cellular inflammation (i.e. pleocytosis) in cross-sectional and often longitudinal paired blood plasma and CSF samples obtained during the first two years of HIV-1 infection in ART-naïve subjects. We also extended our analytical approach to include Bayesian Evolutionary Analysis by Sampling Trees (BEAST) to assess the time to most recent common ancestor (TMRCA) of CSF and blood HIV-1 populations [15], providing new insights into the timing of establishment of early compartmentalized populations, and we assessed the entry phenotypes of selected clones, further confirming the nature of the transmitted virus as R5 T cell-tropic [16]. Based upon a complex interplay between HIV-1 RNA concentration, viral compartmentalization, and CSF white blood cell (WBC) count, we suggest at least four different patterns to characterize the relationship between virus in the blood and virus in the CSF/CNS during the early period after infection. The current study considerably enhances the generalizability of our results and provides an unprecedented view of the early interactions of HIV-1 in the CNS.

Study population and analysis parameters
We analyzed the HIV-1 RNA concentrations in paired blood plasma and CSF samples collected from 72 adult subjects enrolled in an observational neurological study of primary HIV-1 infection, defined as within one year of initial infection. All subjects were infected with HIV-1 subtype B and were ART-naïve at all study intervals, except for one subject, 9018, who was treated with tenofovir, emtricitabine, and atazanavir between the first and second analyzed time points. Paired follow-up samples were assessed for 37 subjects with longitudinal samples available within the initial two years post infection (p.i.). In total, 144 paired samples were available for analysis. Baseline demographic and clinical characteristics at enrollment for the entire cohort (n = 72) and for the subset that had sufficient CSF viral RNA concentrations (defined as greater than 1,000 copies of viral RNA/ml) to allow adequate sampling of the viral population via SGA (n = 33) are shown in Table 1. For the 33 subjects whose samples were analyzed by SGA, a total of 55 blood plasma/CSF sample pairs were analyzed, including the longitudinal samples (Table 2; three time points for subject 9018 and one time point for subject 9040 beyond 2 years p.i. were analyzed but were not included in any overall population analysis). Following SGA and phylogenetic analysis, compartmentalization was assessed by three approaches. The choice to use multiple approaches was based on the recent findings of Zarate et al. [17] illustrating that different methods of assessing compartmentalization often yield divergent results. Thus, we assessed CNS compartmentalization using three methods: the tree-based Slatkin-Maddison test (SM) [18] and two distance-based methods, Wright's measure of population subdivision (F_ST) [19,20] and the nearest-neighbor statistic (S_nn) [21]. CNS lineages were interpreted as being significantly compartmentalized if all three tests were significant (P values < 0.05).
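The decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, argument names, and example P values are hypothetical, and the P values themselves would come from separate Slatkin-Maddison, F_ST, and S_nn analyses that are not reproduced here.

```python
def is_compartmentalized(p_sm, p_fst, p_snn, alpha=0.05):
    """Return True only if all three tests are significant (P < alpha).

    p_sm, p_fst, p_snn are P values from the Slatkin-Maddison, F_ST and
    S_nn tests, computed elsewhere (hypothetical inputs in this sketch).
    """
    return all(p < alpha for p in (p_sm, p_fst, p_snn))


# Hypothetical P values, for illustration only.
print(is_compartmentalized(0.001, 0.010, 0.020))  # all three significant
print(is_compartmentalized(0.001, 0.200, 0.020))  # one test not significant
```

Requiring agreement among all three tests is the conservative choice: a sample pair flagged by only one or two methods is not called compartmentalized.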
We define CNS phylogenetic states as follows (Fig. 1): i) equilibrated, with similar populations between the blood and CSF, with no evidence of independent CNS replication (Fig. 1A); and ii) compartmentalized, with a genetically distinct CSF population as indicated by three statistically significant measures of compartmentalization (Fig. 1B). In our analyses, compartmentalized populations typically included clonally amplified variants, often with more genetically diverse variants (Fig. 1B, Sub. 9040), and sometimes with the presence of recombinants between two clonally amplified variants (Fig. 1B, Sub. 9096). In order to determine how often our statistical assessment of CNS compartmentalization was due to the presence of clonally amplified viruses, we performed additional analyses in which we collapsed clonally amplified sequences into a single sequence and repeated the tests of compartmentalization. We determined that of the eight sample pairs that displayed both clonal amplification and CNS compartmentalization, three remained significantly compartmentalized after collapsing the clonal sequences. This suggests that clonal amplification may drive the statistical assessment of compartmentalization or may be a symptom of ongoing CNS replication. Under this latter scenario, we propose that ongoing viral replication in the CNS may produce diverse, CNS-specific lineages and trigger an influx of T cells that amplify some of these lineages. Table 2 shows the clinical and virologic assessment for each subject who had at least one time point analyzed by SGA, and S1 Table shows this information for the 39 subjects who had no time points analyzed by SGA.
We also considered what role cellular inflammation (pleocytosis) might play in determining both the HIV-1 RNA concentrations and the nature of the viral population in the CSF, specifically whether an equilibrated population might exist with higher CSF HIV-1 RNA concentrations due to an influx of cells, including infected cells, during an inflammatory response. For our analyses, we chose a cut-off of 10 WBC/μl to define a state of CSF pleocytosis, as this is twice the published upper limit of normal values of CSF WBC (5 cells/μl) [22], ensuring that the measured pleocytosis was a robust marker for an inflammatory response.

Approximately two-thirds of subjects have low CSF viral RNA concentrations during the first two years of infection
In order to assess temporal patterns of CNS viral replication and inflammation, we treated all 144 paired samples from the 72 subjects as independent observations (a limitation of the analysis but one that allowed us to categorize the samples by time post infection). We divided the samples into the following three windows: acute, 0-4 months p.i.; early, 5-12 months p.i.; or established, 13-24 months p.i. The choice of these times also allowed us to bin the data into groups with reasonable sample sizes. Regardless of length of time since HIV-1 infection, the CSF viral RNA concentration was less than 1,000 copies/ml in approximately two-thirds of samples (Fig. 2A), suggestive of little local production of virus in the CNS, at least as assessed by viral load. Analysis of the viral populations in the remaining samples showed that in approximately 30% of the paired samples the sequence composition of the viral population in the CSF was similar to the viral population in the blood (i.e. equilibrated), and pleocytosis was detected in the CSF of one-quarter to one-half of the samples with these equilibrated populations (Fig. 2A).
Compartmentalized viral populations were also detected but exclusively after the first 4 months, in 11% of samples in the early group (total n = 70) and 3% of the established group (total n = 30). However, those samples in which pleocytosis was detected were significantly less likely to have viral RNA levels in the CSF of less than 1,000 copies/ml compared to the total group of samples (see below). The percentages of samples in each phylogenetic group with pleocytosis (Fig. 2B) and without pleocytosis are noted (S1 Fig.).
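The binning and categorization used in this analysis can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function and argument names are hypothetical, time is assumed to be given in whole months post infection, and the handling of values exactly at the cut-offs (1,000 copies/ml and 10 WBC/μl) is an assumption.

```python
def time_window(months_pi):
    """Bin whole months post infection into the three analysis windows."""
    if months_pi < 0:
        raise ValueError("months post infection cannot be negative")
    if months_pi <= 4:
        return "acute"
    if months_pi <= 12:
        return "early"
    if months_pi <= 24:
        return "established"
    return "outside analysis window"


def csf_state(csf_rna_copies_per_ml, csf_wbc_per_ul, compartmentalized):
    """Assign one sample pair to one of the four plotted categories.

    Assumptions: samples at or below 1,000 copies/ml are treated as not
    analyzed by SGA, and pleocytosis is taken as WBC >= 10 cells/ul.
    """
    if csf_rna_copies_per_ml <= 1000:
        return "not analyzed by SGA"
    if compartmentalized:
        return "compartmentalized"
    if csf_wbc_per_ul >= 10:
        return "equilibrated (+)"
    return "equilibrated (-)"


# Hypothetical sample: 6 months p.i., 4,500 copies/ml, 14 WBC/ul, equilibrated.
print(time_window(6), "|", csf_state(4500, 14, compartmentalized=False))
```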
We next assessed factors associated with the CSF viral load. We first examined subjects without evidence of inflammation (i.e. no pleocytosis) and either low viral load in the CSF (less than 1,000 copies/ml) or an equilibrated population as evidence of no sustained local replication. In these subjects the HIV-1 RNA concentration in the CSF was proportionally 1-2% of the level in the blood (Spearman's rank correlation coefficient = 0.48, P<0.0001) (Fig. 2C). We could not determine whether this proportional relationship was maintained for those samples in which the CSF viral load was undetectable (<50 copies/ml), although there was a trend for these samples to be from subjects with low plasma HIV-1 RNA concentrations (Fig. 2C). We hypothesize that this low level of virus in the CSF is due to the normal trafficking of T cells into the CNS, including infected CD4+ T cells that release the observed virus.
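For readers who want to reproduce this kind of comparison, the sketch below computes the CSF-to-plasma ratio for each sample pair and a Spearman rank correlation coefficient in plain Python. The viral load values are invented for illustration and are not data from this study; the function names are hypothetical, and no P value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical paired viral loads (copies/ml), for illustration only.
plasma = [20000, 50000, 100000, 400000, 1000000]
csf = [300, 600, 1500, 5000, 12000]

for p, c in zip(plasma, csf):
    print(f"plasma {p:>8}  CSF {c:>6}  CSF/plasma = {100 * c / p:.2f}%")
print("Spearman rho:", spearman_rho(plasma, csf))
```

A rank-based coefficient is a natural choice here because viral loads span several orders of magnitude and are not normally distributed.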

Pleocytosis can be associated with elevated viral load in the CSF and an influx of virus from the blood
Pleocytosis was detected in 36% of all subjects (Table 2 and S1 Table). Consistent with our prior findings in HIV-infected subjects, lymphocytes were the predominant cell type in subjects with and without elevated levels of cellular inflammation [23]. When pleocytosis was detected, the HIV-1 RNA concentration was twice as likely (64%) to be above 1,000 copies/ml in the CSF compared to the overall population, of which two-thirds were below 1,000 copies/ml. Additionally, when pleocytosis was present and the CSF viral load was high enough for SGA analysis, the viral populations were most often equilibrated (Fig. 2D, open circles), with a CSF viral load that was significantly higher than for equilibrated populations without pleocytosis (P = 0.0008, Mann-Whitney Test) (Fig. 2D). This suggests that the influx of infected CD4+ T cells associated with pleocytosis brings virus into the CSF/CNS from the blood. There was a trend toward increased CSF:blood albumin ratio (a marker for reduced blood-brain barrier integrity) in the presence of pleocytosis, which could contribute to an influx of virus (Table 2). However, there was a similar (and statistically significant) increase in CSF:blood albumin ratio in the compartmentalized subjects that did not account for the increase in CSF HIV-1 RNA concentration compared to the samples with equilibrated populations without pleocytosis (Fig. 2D; see below), which was instead due to local virus production. These data suggest that increased viral burden in the CNS can result from two factors: independent CNS replication, generating compartmentalized CSF populations, or an influx of infected cells as the result of the inflammatory response of pleocytosis, producing a viral population in the CSF that is similar to that in the blood.
It is possible that much of the pleocytosis seen in these subjects is in response to HIV-1 replication in the CNS where the influx of infected cells from the blood produces elevated levels of virus that obscure the detection of a lower level of the locally produced and compartmentalized virus responsible for inducing the inflammation. Other agents that induce pleocytosis should have a similar effect on viral load in the CSF and this was seen in an incident of neurosyphilis in subject 9018 when sampled at day 687 p.i. (Table 2). However, we note that pleocytosis is not always associated with higher viral load (Table 2 and S1 Table), indicating a more complex and perhaps dynamic relationship between pleocytosis and viral load.

Fig. 2 (legend, partial). States represented include: Not Analyzed by SGA, due to CSF viral load <1,000 copies/ml; Equilibrated (−), CSF WBC <10 cells/μl; Equilibrated (+), CSF WBC ≥10 cells/μl; and Compartmentalized. We used a cut-off of 10 WBC/μl to define a state of substantial CSF pleocytosis because this is twice the upper limit of the normal value of CSF WBC (0 to 5 cells/μl) [22], ensuring that the measured pleocytosis was a robust marker for an inflammatory response.

Early compartmentalized viral populations are associated with higher CSF viral loads, and local replication and/or inflammation can persist or reoccur over time
Little evidence for local CNS replication was detected until after 4 months of infection (Fig. 2A), suggesting that detectable CNS-compartmentalized populations are present at greater frequency with a longer time since HIV-1 exposure. As noted above, CSF samples with compartmentalized populations had elevated viral RNA loads when compared to the samples with only equilibrated populations in the absence of pleocytosis (Fig. 2D), consistent with local production of virus when compartmentalization is detected.
We used cross-sectional analyses to examine when these markers of local replication and inflammation were observed. We estimated the percentage of samples with no evidence of viral involvement in the CNS from the number with a CSF viral RNA concentration less than 1,000 copies/ml (Fig. 2A; unanalyzed samples) and the number with equilibrated populations in their CSF and little to no pleocytosis (Fig. 2A; equilibrated (−)). In contrast, we estimated the percentage of samples with evidence of viral involvement in the CNS from the number with equilibrated populations in their CSF and pleocytosis (Fig. 2A; equilibrated (+)) and the number with compartmentalized populations in their CSF (Fig. 2A; compartmentalized). In these analyses we interpret equilibrated populations with pleocytosis/inflammation as an indirect marker of local viral replication, i.e. the inflammatory response to local viral replication. Based on these analyses we determined that the vast majority of samples collected during the three time periods (0-4, 5-12 and 13-24 months p.i.) had no evidence of CNS involvement (82, 78 and 90%, respectively), while the remaining samples (18, 22 and 10%, respectively) had evidence of CNS involvement. However, if viral replication and its associated markers (e.g. pleocytosis) are transient or produce small signals, these values will underestimate the actual proportion of subjects with viral replication in their CNS.
We used a longitudinal analysis to evaluate the persistence of viral replication in the CNS. Of the 37 subjects with longitudinal sampling, nearly half (17 of 37) had low CSF viral loads at all sampling time points and were not analyzed by SGA (S1 Table). If we focus on the 20 subjects that had higher CSF viral loads (>1,000 copies of HIV-1 RNA/ml) and were analyzed by SGA, we observe that 9 (45%) had equilibrated populations with minimal to no pleocytosis at all time points analyzed (Fig. 3A). The remaining 11 subjects (55%, or 30% of all longitudinally sampled subjects) all showed evidence of pleocytosis and/or compartmentalized viral replication at one or more time points within the initial two-year period (Fig. 3B), and 6 (30%, or 16% of all longitudinally sampled subjects) had one of these states at two or more time points (Fig. 3B, subjects below dotted line). Thus, 55% of longitudinally sampled subjects analyzed by SGA showed evidence of viral replication and/or marked pleocytosis in the CNS at one or more time points, and 30% showed evidence of persistent viral replication in the CNS. Finally, if the initial sample showed evidence of compartmentalization or an equilibrated population with pleocytosis, then subsequent samples were significantly more likely to also be compartmentalized or have pleocytosis than if the initial sample was not in one of these states (P = 0.001).
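Because the percentages in this paragraph switch between two denominators (the 20 subjects analyzed by SGA and all 37 longitudinally sampled subjects), a short worked calculation may help. The counts are taken from the text above; the variable and function names are illustrative only.

```python
longitudinal_subjects = 37   # subjects with longitudinal sampling
low_viral_load_only = 17     # CSF viral load low at all time points, no SGA
sga_subjects = longitudinal_subjects - low_viral_load_only  # 20 analyzed by SGA
equilibrated_only = 9        # equilibrated, minimal or no pleocytosis throughout
any_involvement = sga_subjects - equilibrated_only          # 11 subjects
repeated_involvement = 6     # involvement at two or more time points


def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


print(pct(equilibrated_only, sga_subjects))              # 45
print(pct(any_involvement, sga_subjects))                # 55
print(pct(any_involvement, longitudinal_subjects))       # 30
print(pct(repeated_involvement, sga_subjects))           # 30
print(pct(repeated_involvement, longitudinal_subjects))  # 16
```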

Genetic evidence for the persistence of HIV-1 CNS replication
In two subjects, 9040 and 9021, a compartmentalized CNS population was observed at enrollment and for all analyzed longitudinal time points, spanning a period of 753 and 201 days p.i., respectively (one time point beyond 2 years p.i. was analyzed for subject 9040 but was not included in any overall population analysis). Further analysis of these subjects identified distinct trends of how HIV-1 becomes established in the CNS.
One pattern was seen in the viral evolution in subject 9040 (Fig. 4A). In this case a compartmentalized, clonally amplified population present at the initial time point (day 165) was replaced with a second compartmentalized, clonally amplified variant at day 352, but the sampling included a recombinant between the first and second clonally amplified variants. Several recombinants between these two early populations were maintained through day 918; however, the overall population was equilibrated at this last time point. Using BEAST analysis to estimate the number of generations of the viral population, the time to most recent common ancestor (TMRCA) of the blood population at the initial time point was estimated to be 209 days, reasonably consistent with the reported date of transmission (165 days prior to sampling). We also estimated that the initial clonally amplified CSF population was established 33 days prior to sampling (approximately 170 days after transmission), followed by a subsequent clonal amplification event and recombination between the two lineages. These data showed the early establishment of a lineage within the CNS that persisted over a period of at least 2 years. To our knowledge, this was the first study to show the maintenance and evolution of a compartmentalized viral population within the CNS over a long duration of time starting during early infection.
A second trend was seen for subject 9021 (Fig. 4B). Again, the reported date of transmission, 140 days prior to the first sampling date, was reasonably close to the transmission bottleneck estimated using BEAST at 159 days prior to sampling. In this subject one compartmentalized, clonally amplified lineage was detected in the CSF at the first sampling time point with an estimated age of 102 days, or starting 57 days post infection. This lineage was not present at the second sampling time point (at 341 days) but was replaced by a different compartmentalized, clonally amplified variant with an estimated age of 55 days. Thus for this subject we observed a permissive environment for viral replication in which variants were successively and independently amplified within the CNS.
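The timing estimates for these two subjects follow from simple subtraction: the estimated time of establishment of a CSF lineage, expressed in days post infection, is the BEAST-estimated time of the transmission bottleneck (in days before sampling) minus the estimated age of the CSF lineage at sampling. The sketch below reproduces that arithmetic from the values quoted in the text; the function name is hypothetical, and using the BEAST-estimated rather than the reported transmission date for subject 9040 is an assumption.

```python
def establishment_day_pi(transmission_days_before_sampling, lineage_age_days):
    """Days post infection at which a CSF lineage was established."""
    return transmission_days_before_sampling - lineage_age_days


# Subject 9021: bottleneck 159 days before sampling, CSF lineage age 102 days.
print(establishment_day_pi(159, 102))  # 57 days post infection

# Subject 9040: blood TMRCA 209 days, CSF lineage established 33 days before
# sampling. This gives 176 days, in line with "approximately 170 days".
print(establishment_day_pi(209, 33))
```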

Timing of introduction of HIV-1 into the CNS
We detected compartmentalization as early as 140 days p.i. (Table 2). Using BEAST to estimate the time to most recent common ancestor (TMRCA), we were able to show that these CSF populations were often established much earlier. Four subjects with longitudinal sampling (11 samples total) had a compartmentalized CSF population detected between 5 and 12 months post infection (Table 2); three of these subjects had an initial clonally amplified CSF population (7146, 9021 and 9040; Fig. 4). When we determined the TMRCAs of these CSF populations, we estimated that the initial clonally amplified populations were all established within approximately the first four months after infection and that two were established within the first two months of infection (Table 2). These data show that the CNS compartment is permissive for HIV-1 replication in at least a subset of subjects from very early times after infection.
We have recently shown that after vertical transmission to children, CNS compartmentalization can be established early via the sequestration of one of multiple transmitted variants in the CNS [14]. When we reanalyzed data from a previously described subject (7146, Fig. 4C) [4] using BEAST, we showed that the phenomenon of transmission of two variants with one sequestered in the CSF/CNS shortly after transmission also occurs in adults. In this case the two transmitted variants diverged from each other in the donor (BEAST-estimated TMRCA of 965 days but with a reported transmission date of 156 days prior to sampling) while the transmitted variants diversified in the recipient. One lineage that was present in both the blood and CSF went through a bottleneck (presumably the transmission bottleneck) at 134 days prior to sampling, again consistent with the reported transmission date of 156 days prior to sampling. The other variant was sequestered in the CNS with an estimated bottleneck of 85 days, appearing early as a clonally amplified variant. In addition, a series of recombinants between these two lineages appeared in both the blood and CSF over time. This additional mechanism for establishing a compartmentalized viral population within the CNS early following transmission was also observed for subject 9018 (Table 2).

All transmitted variants are R5 T cell-tropic and are predominantly selected to use high levels of CD4 for entry
Macrophage-tropic HIV-1 variants can infect cells expressing low levels of CD4 while R5 T cell-tropic viruses are selected for replication in cells with high levels of CD4 for entry [24][25][26][27][28][29][30][31]. Macrophage-tropic HIV-1 is seen most reliably as a compartmentalized CSF/CNS population in a subset of individuals with HAD. To further our understanding of viral characteristics in the CNS early during infection, we analyzed the entry phenotype of viruses isolated from our adult primary infection cohort. Affinofile cells, on which CD4 and CCR5 surface expression can be differentially induced [32], are a more reproducible model for entry tropism analysis compared to primary cells [33]. We assessed entry phenotypes by measuring the ability of pseudotyped reporter viruses to enter Affinofile cells expressing either high or low levels of CD4.
Viruses pseudotyped with Env proteins derived from 24 subjects, representing all phylogenetic states and a wide range of days p.i., all required high levels of CCR5 and CD4 for efficient entry and were considered R5 T cell-tropic (Fig. 5A-B). However, Env proteins from CSF samples containing compartmentalized viral lineages (all collected more than 4 months post infection) were significantly better at entering cells expressing low CD4 than Env proteins derived from equilibrated CSF samples (ANOVA; P<<0.001). While this enhanced ability to enter low CD4 cells did not approach the level of a macrophage-tropic Env protein (Fig. 5A-B, Ba-L), the detectably elevated levels suggest that compartmentalized lineages may be adapting to enter low CD4 cells in the CNS (discussed below). In contrast, viruses isolated from the CSF of adults within four months of infection are uniformly poor at infecting low CD4 cells, indicating that they have been selected for replication in T cells. This is consistent with multiple studies showing that macrophage-tropic viruses are not transmitted [16,27,28,34,35,36,37,38,39]. Together these results indicate that the CNS is initially exposed to R5 T cell-tropic variants and that viruses in the CNS remain R5 T cell-tropic for the first two years of infection, but the data provide suggestive evidence that the evolution of macrophage tropism may begin during this period.
We also assessed the infectivity of a subset of the pseudotyped viruses on monocyte-derived macrophages (MDMs) generated from three separate donors. Infectivity was first assessed on Affinofile cells expressing high levels of CD4, with equal levels of infectious virus then added to each of the MDM preparations. There was general concordance between infectivity on Affinofile cells and infectivity on MDMs, although the level of infectivity differed significantly between the three donors (S2 Fig.). Viruses representing subjects with equilibrated populations and with low infectivity on Affinofile cells with low CD4 (Fig. 5A) had low infectivity on MDM (from subjects 9027, 9045, 9055, 9063, and 9073; S2 Fig.).
Viruses from four of five subjects with CNS compartmentalization were also assessed for their ability to enter Affinofile cells expressing low levels of CD4 and MDMs. Both blood- and CSF-derived viruses from one subject (subject 9096) were observed to have intermediate infectivity on low CD4 Affinofile cells (Fig. 5B) and MDMs (S2 Fig.). Similarly, two of the compartmentalized subjects (9018 and 9021) had CSF-derived viruses with elevated infectivity of low CD4 Affinofile cells (Fig. 5B) and MDMs (S2 Fig.). Finally, both blood- and CSF-derived viruses from one compartmentalized subject (subject 7146) were unable to efficiently enter either low CD4-expressing Affinofile cells or MDMs (S2 Fig.). Thus infectivity of MDMs is generally consistent with the conclusions derived from infection of Affinofile cells, although the variability of infectivity of MDMs between donors precludes an accurate assessment of the range of CD4 entry phenotypes that can be observed using Affinofile cells.

Discussion
Independent HIV-1 replication in the CNS has been associated with neurological disorders [10,40,41] and may represent a distinct reservoir from that found in the blood and lymphoid tissue [42,43]. We examined the virologic characteristics associated with early CNS infection through analysis of paired cross-sectional and longitudinal blood plasma and CSF samples from a large cohort of 72 ART-naïve subjects infected with HIV-1 for less than two years. Our current study significantly builds upon a previous preliminary study by our group [4], enabling us to propose a model with four distinct states to describe the relationship between viral populations in the CSF/CNS and viral populations within the peripheral blood during early HIV-1 infection. These states are based upon details revealed by the current study on mechanisms of establishment of viral compartmentalization within the CNS, relationships between cellular inflammation, HIV-1 RNA levels and phylogenetic state, and insight into longitudinal maintenance and evolution of compartmentalization.
The first state (Fig. 6A) was observed in subjects with little evidence of CNS replication or pleocytosis, with CSF HIV-1 RNA concentrations proportionally 1-2% of the viral load in the periphery (Fig. 2B). In many of these subjects, the CSF HIV-1 RNA level was very low, below the limit of detection of standard assays. Minimal CSF viral burden has been observed in a prior report on a portion of this primary infection cohort [5]. With little or no pleocytosis, HIV-1 is likely entering the CSF/CNS at low levels via incomplete partitioning of virus at the blood-brain barrier, or low-level trafficking of immune cells, including small numbers of infected CD4+ T cells. In this circumstance the viral population is very similar to the population in the blood. It is possible that some HIV-1 is replicating independently in the CNS at low levels in these subjects, but we were not able to detect these putative genetic variants above the low-level background of virus recently imported from the periphery into the CSF/CNS. An argument in favor of even this low-level viremia in the CSF being the result of T cell trafficking is the observation that in neuro-asymptomatic subjects with CD4+ T cells below 50 cells/μl in the blood, the viral load in the CSF is on average lower than in subjects with higher CD4+ T cell counts [44,45].
In a second state (Fig. 6B), we observed a relationship between equilibrated viral populations with elevated viral load and high levels of pleocytosis (Fig. 2C). These equilibrated populations were most likely the result of the release of virus from increased numbers of infected CD4+ T cells trafficking from the periphery into the CNS. Though pleocytosis of >10 cells/μl might have been due to a variety of inflammatory conditions (e.g. neurosyphilis as documented in one individual), it is most likely in response to HIV-1 replication in these PHI subjects. We screened for syphilis in this cohort, and detailed clinical and imaging assessment did not reveal other contributing causes of pleocytosis. Furthermore, other 'background' non-HIV causes of CSF WBC ≥10 cells/μl are unlikely in these subjects, as our parallel studies of 54 HIV-uninfected volunteers recruited from a similar local community demonstrated median CSF WBC counts of 1 cell/μl (IQR 0-2), and none of these 54 subjects had a CSF WBC as high as 10 [5]. In the setting of pleocytosis, while low levels of local CNS replication may have been occurring, the virus imported from the periphery by infected CD4+ T cells dominated the population as it raised the CSF HIV-1 RNA concentration by release of virus imported from the blood. If the inflammatory immune response was successful, pleocytosis might eventually result in low CSF viral loads, a condition observed in a small subset of subjects with pleocytosis but very low levels of virus in the CSF (S1 Table). Pleocytosis was also observed in several subjects with an intermediate viral population phenotype and half of the subjects with compartmentalized viral populations (Fig. 2D), suggesting pleocytosis may result in dynamic changes in the viral population in the CSF. An association between equilibrated compartments and high pleocytosis was also observed in a previous study analyzing four HIV-infected subjects during therapy interruption [46].
In a third state (Fig. 6C), we observed clonally amplified CSF populations of low complexity (Fig. 1 and 4) representing the recent expansion of identical or nearly identical variants that required high levels of CD4 for entry (R5 T cell-tropic; Fig. 5). High levels of pleocytosis were observed in approximately half of the subjects with clonally amplified CSF populations, making it possible that the influx of activated CD4+ T cells may also have provided cellular targets for further transient amplification of a CSF variant. We [47] (Dukhovlinova et al., in preparation) and others [48][49][50] have observed clonal amplification in the genital tract as well as in the CSF both early [4,14] and at later times in infection [41]. Clonal amplification appears to be a distinct type of virus-host interaction where infection of a population of CD4+ T cells in a compartment is a low probability event and when it occurs there is transient rapid expansion of the viral population. Due to the daily rapid turnover of the CSF viral RNA load, the elevated CSF viral RNA load that is often observed during clonal amplification, and the fact that these clonally amplified lineages generate their own diversity that can persist within the CNS, it is highly unlikely that clonally amplified virus represents virus produced from a single cell. The detection of clonally amplified populations in the CSF within the first year of infection has allowed us to estimate the establishment of these populations within the CNS to within the first 2-6 months (Table 2, Fig. 4), and such amplified populations were detected in 8% of subjects in this study within the first year.
In the final state, we observed more genetically complex compartmentalized viral replication within the CSF/CNS (Fig. 6D) indicative of persistent replication beyond a single clonal amplification event. In an effort to get a more complete view of the interaction of the virus within the CNS at these early times of infection we have interpreted the presence of persistent replication in the CNS based on four criteria: i) sequential clonal amplification events that indicated a permissive CNS environment for viral replication (Fig. 4B); ii) overlapping clonal amplification events that gave rise to compartmentalized recombinants showing continuous replication between the sampled time points (Fig. 4A); iii) intermittent compartmentalization and pleocytosis suggesting an inflammatory immune response to ongoing replication (Fig. 3B); and iv) sequestration of a transmitted variant within the CNS (Fig. 4C). Collectively these markers defined approximately 30% of subjects in the first two years as having evidence of viral replication in the CNS in at least one time point, and 16% having evidence of replication and/or inflammation at multiple time points within this period. This suggests that the CNS compartment is permissive for HIV-1 replication in at least a subset of subjects from a very early period after infection.
Entry tropism analysis revealed that all compartmentalized variants required high levels of CD4 for entry. It is now widely described in the literature that macrophage-tropic variants utilize low levels of CD4 for entry [24][25][26][27][28][29][30][31], are not transmitted, and that the transmitted virus is R5 T cell-tropic [16,27,28,[34][35][36][37][38][39], an understanding further supported by our phenotypic analysis (Fig. 5). Our finding that the viruses involved in this early persistent CNS replication were adapted to replication in CD4+ T cells is distinct from previous studies of individuals with HAD where genetically complex compartmentalized CSF populations that had been replicating as an isolated population had evolved to replicate in macrophages/microglia [41]. Thus, adaptation to use low levels of CD4 for entry, a hallmark of macrophage tropism, is not a feature of the transmitted virus and does not evolve during the early stages of CNS infection in adults, at least as reflected in the compartmentalized virus detected in the CSF. However, we do note that the compartmentalized viruses from the CSF show a small but statistically significant increase in the ability to enter cells with low levels of CD4 compared to CSF virus from equilibrated subjects (Fig. 5B). One explanation for this small difference is that the virus in the CNS is carrying out at least a portion of its replication in a cell with low levels of CD4 which allows for at least a low level of selection for a low CD4 entry phenotype. We found no consistent differences in glycosylation site count or positions, or consistent sequence changes in sites previously described as being associated with macrophage tropism in comparing the viral sequences from the plasma to the compartmentalized sequences in the CSF (S2 Table and S3).
The only CNS tissue that is readily sampled in volunteer human subjects is CSF, which, though not identical to brain, is produced within the brain in the choroid plexus, and reflects brain inflammation and infection in the context of CNS infections including HIV-1. Measures of immune activation, HIV-1 burden, and neural injury detected in CSF are markers of brain involvement in HIV-1 that correlate to clinical and pathologic disease [51]. While the cellular source of HIV-1 RNA detected in the CSF is not certain and may differ during different stages of infection [52], compartmentalization of HIV-1 detected in CSF associates with clinical dementia in humans [10] and immunopathology in the brain in rhesus macaques [53]. Despite limitations of generalizing CSF findings to those of the CNS more broadly, our studies have used the best methods available in living humans to assess HIV-1 populations derived from the CNS in a unique cohort of subjects enrolled during primary HIV-1 infection. Our results show that in cross-sectional analysis over the first two years of HIV-1 infection, 30% of subjects have evidence of either local viral replication in the CNS or a robust CNS inflammatory response, and that in approximately 16% of subjects this CNS involvement can persist over time. We have found that the viral population in the CSF is dynamic as the result of local replication and/or the influx of virus in infected CD4+ T cells as part of an inflammatory response. This early viral replication in a subset of subjects may represent an inability to protect the CNS from infection, potentially leading to HAND later in infection, and may also define a distinct reservoir of infected cells within the body. 
Longitudinal follow-up of these subjects to examine the long-term impact of the presence of early active HIV-1 replication in the CNS will help to define the significance of these findings for clinical neurologic disease outcomes and compartmentalized viral reservoirs in the setting of HIV-1.

Ethics statement
The study was approved by the Institutional Review Boards at UCSF, Yale University, and the University of North Carolina at Chapel Hill. All study participants were adults (age ≥18 years). Written informed consent was obtained from all participants.

Study design
We assessed samples obtained through an observational longitudinal neurological study of primary HIV-1 infection to determine viral characteristics associated with early HIV-1 CNS infection. Subjects were referred from the community and were eligible if they met prospectively determined criteria for laboratory confirmation of primary HIV-1 infection, as previously described [4]. Subjects were screened at study enrollment for systemic syphilis by blood RPR testing. Subsequent blood and CSF samples were tested for RPR and VDRL, respectively, if an outside test suggested syphilis exposure or CSF WBC was markedly elevated compared to that in an earlier longitudinal sample. No subjects had clinical evidence of other inflammatory neurologic disorders such as multiple sclerosis or CNS opportunistic infections based on interview and examination by an HIV-1 neurologist at each visit. A total of 57 of the 72 subjects volunteered to participate in magnetic resonance imaging of the brain for the overall study protocol; scans were reviewed by a neuroradiologist and none revealed evidence of encephalitis, tumor or opportunistic infection. Blood and CSF samples were collected at enrollment, six weeks, and every six months thereafter. This analysis included samples obtained up to two years post-infection from subjects enrolled prior to 4/1/2012. CSF and plasma HIV-1 RNA concentrations were determined as described [4]; paired samples were selected for further SGA analysis if CSF HIV-1 RNA was greater than 1,000 copies/ml (to ensure adequate sampling). Samples with lower viral loads could be analyzed if larger volumes were committed to concentrate the virus, but we chose to use a cut-off of 1,000 viral RNA copies/ml for these studies.
Primary study endpoints included SGA of the HIV-1 env gene for viral genetic compartmentalization and phenotypic analyses, CSF and blood HIV-1 RNA concentrations, measures of CSF cellular inflammatory response (white blood cell count, WBC) and blood-brain barrier disruption (CSF:plasma albumin ratio).

Single genome amplification
When collected, CSF samples were initially placed on wet ice and delivered to the lab for processing within one hour. For virology analyses, CSF samples were centrifuged at 1,200 x g for 10 min to remove contaminating cells or cellular debris, and the supernatant was subsequently aliquoted and stored at −70°C for later assay. Viral RNA was isolated as previously described from the CSF supernatant [4]. Briefly, RNA was isolated from samples (140 μl) with viral loads >10,000 copies/ml using the QIAamp Viral RNA Mini kit (Qiagen). To increase template number, samples with viral loads <10,000 copies/ml were first pelleted by ultracentrifugation. cDNA was generated using an oligo-d(T) primer. Single genome amplification/template endpoint dilution PCR [38] of the env gene through the 3' LTR U3 end was conducted using the cDNA as template as previously described [4,14]. Sequences for full-length env were generated (samples analyzed previously were sequenced from the start of V1 through the ectodomain of gp41 [4]).
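The viral load thresholds used for sample selection and RNA extraction can be summarized as a simple routing rule. The following is a minimal sketch, not the laboratory's actual workflow: the names `Handling` and `triage_csf_sample` are hypothetical, and the treatment of a load of exactly 10,000 copies/ml is an assumption, since the text specifies only ">10,000" and "<10,000".

```python
from enum import Enum


class Handling(Enum):
    """Hypothetical labels for how a sample is processed."""
    NOT_SELECTED = "not selected for SGA"
    PELLET_FIRST = "pellet virus by ultracentrifugation, then extract RNA"
    DIRECT = "extract RNA directly from 140 ul of sample"


# Thresholds quoted in the text (HIV-1 RNA copies/ml).
SGA_CUTOFF = 1_000
DIRECT_EXTRACTION_CUTOFF = 10_000


def triage_csf_sample(csf_rna_copies_per_ml: float) -> Handling:
    """Route a sample by its CSF viral load.

    Assumption: a load of exactly 10,000 copies/ml is handled like the
    lower-load case, because the text does not cover that boundary.
    """
    if csf_rna_copies_per_ml <= SGA_CUTOFF:
        return Handling.NOT_SELECTED
    if csf_rna_copies_per_ml > DIRECT_EXTRACTION_CUTOFF:
        return Handling.DIRECT
    return Handling.PELLET_FIRST


if __name__ == "__main__":
    for load in (500, 5_000, 50_000):
        print(load, "->", triage_csf_sample(load).value)
```

Keeping the two cut-offs as named constants makes it easy to see how lowering the SGA cut-off (by committing larger sample volumes, as noted in the study design) would change which samples are analyzed.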

Phylogenetic analysis of env viral sequences
Phylogenetic analyses were conducted in a manner similar to our previous study [14]. In brief, DNA sequences were aligned (MUSCLE) [54][55][56] using EBI web tools [57], and phylogenetic trees were generated (neighbor-joining method, MEGA 4.0 [58]). Phylogenetic states were determined by statistical evaluation using the Slatkin-Maddison (SM) test [18] as previously described [14], Wright's measure of population subdivision (F_ST) [19,20] and the Nearest-neighbor statistic (S_nn) [21]. CSF populations were defined as compartmentalized if all three statistical tests (SM, F_ST and S_nn) yielded significant results (P values < 0.05) or equilibrated if one or more of the tests yielded non-significant results. Low bootstrap values were used (35) because of the overall low diversity of the viral populations early after infection. Clonally amplified lineages (short branch lengths with bootstrap values 99 and a clade of 3 variants) were also identified. No contamination occurred between samples (S3 Fig.) [4].
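The two-way classification rule described above can be written out explicitly. This is a minimal sketch under the stated threshold only; the function name `classify_csf_population` is hypothetical, the P values are assumed to have been computed elsewhere, and the sketch does not attempt to capture the intermediate or clonally amplified phenotypes, which rely on additional tree-based criteria.

```python
ALPHA = 0.05  # significance threshold stated in the text


def classify_csf_population(p_sm: float, p_fst: float, p_snn: float) -> str:
    """Apply the three-test rule to one paired CSF/plasma sample.

    'compartmentalized' requires every test (Slatkin-Maddison, F_ST, S_nn)
    to be significant; a single non-significant test gives 'equilibrated'.
    """
    p_values = (p_sm, p_fst, p_snn)
    if all(p < ALPHA for p in p_values):
        return "compartmentalized"
    return "equilibrated"


if __name__ == "__main__":
    # Hypothetical P values, for illustration only.
    print(classify_csf_population(0.001, 0.010, 0.040))
    print(classify_csf_population(0.001, 0.200, 0.040))
```

Requiring agreement among all three tests is the conservative choice: it reduces false calls of compartmentalization at the cost of possibly labeling weakly structured populations as equilibrated.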

Bayesian analysis
A Bayesian Markov Chain Monte Carlo (MCMC) approach implemented in BEAST v.1.6.1 [15] was used to estimate the TMRCA for each viral population. A substitution rate of 1.5 x 10^−5 substitutions/site/generation and a standard deviation of 3.0 x 10^−6 were fixed under an uncorrelated lognormal relaxed clock model. The rate was calculated via tip dating, using a consensus sequence (set as day 0), and the estimated days post infection. The HKY nucleotide substitution model had estimated base frequencies and gamma-distributed rate heterogeneity (4 gamma categories). A coalescent Bayesian Skyline tree prior with a piecewise-constant skyline model was used (10 groups). The MCMC algorithm was run for 30 million generations, logging every 1,000 generations, with a 10% burn-in. The results from at least two independent runs were combined, and the effective sample size for all estimates was >200. A generation time of 1.0 day was used.
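The run settings above imply a fixed number of posterior samples per run, and the stated generation time converts the per-generation rate into a per-day rate. The arithmetic can be sketched as follows; the function and variable names are hypothetical, and the count ignores the initial state that some loggers also record.

```python
CHAIN_LENGTH = 30_000_000      # MCMC generations per run (from the text)
LOG_EVERY = 1_000              # logging interval (from the text)
BURN_IN_PERCENT = 10           # burn-in (from the text)
GENERATION_TIME_DAYS = 1.0     # viral generation time (from the text)
RATE_PER_GENERATION = 1.5e-5   # substitutions/site/generation (from the text)


def posterior_sample_counts(chain_length: int, log_every: int, burn_in_percent: int):
    """Return (logged, discarded, retained) sample counts for one run."""
    logged = chain_length // log_every
    discarded = logged * burn_in_percent // 100
    return logged, discarded, logged - discarded


def generations_to_days(generations: float, generation_time_days: float) -> float:
    """Convert a time measured in viral generations into days."""
    return generations * generation_time_days


logged, discarded, retained = posterior_sample_counts(
    CHAIN_LENGTH, LOG_EVERY, BURN_IN_PERCENT
)
rate_per_day = RATE_PER_GENERATION / GENERATION_TIME_DAYS

if __name__ == "__main__":
    print(f"logged={logged}, discarded={discarded}, retained={retained}")
    print(f"rate per site per day = {rate_per_day:.1e}")
```

With a generation time of 1.0 day, times expressed in generations and in days coincide, so a TMRCA estimate can be read directly as days before the sampling date.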

Construction of HIV-1 env clones
Full length HIV-1 env genes were re-amplified from the first-round SGA products as previously described [41]. The PCR product was cloned into the pcDNA3.1D/V5-His-TOPO expression vector (Invitrogen) using the pcDNA 3.1 directional TOPO expression kit (Invitrogen).

Env-pseudotyped viruses
Env-pseudotyped luciferase reporter viruses were generated as previously described [14]. Briefly, 293T cells were cotransfected with an env expression vector and the pNL4-3.LucR-E- HIV-1 backbone (obtained from the NIH AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH) using the Fugene 6 transfection reagent and protocol (Roche). Transfection medium was replaced with fresh culture medium five hours post-transfection and the cells were incubated at 37°C for 48 hours, after which viral supernatants were filtered with 0.45 μm filters (Millipore) and stored at −80°C.

Single-cycle infection of 293-Affinofile cells
As described previously [14], Env-pseudotyped luciferase reporter viruses were first titered on 293-Affinofile cells [32] expressing CD4^high/CCR5^high. For viral infections, black tissue culture plates (96 wells) were coated with 10% poly-L-lysine and seeded with 293-Affinofile cells (1.85 x 10^4 cells/well). Eighteen to 24 hours later, expression of CD4 and CCR5 was induced at CD4^high/CCR5^high and CD4^low/CCR5^high. Eighteen to 24 hours later, the induction medium was removed and replaced with 100 μl of fresh, warmed culture medium containing Env-pseudotyped virus. The plates were spinoculated [59] at 2,000 rpm for 2 hours at 37°C, and incubated for 48 hours at 37°C. Infection medium was removed, cells were lysed, and luciferase activity was assayed using the luciferase assay system (Promega). Clone sequences were not compared to the original parental sequence prior to pseudotyping and Affinofile cell infection. Instead, entry tropism data for each parental amplicon included three replicates from 2-3 clones derived from the same parental amplicon. In this analysis we assume any PCR-introduced error would either not change the entry phenotype or would create a nonfunctional protein which would not be included in the subsequent analysis. The concordance of the entry phenotype of the replicate clones was taken to represent the phenotype of the amplicon sequence.
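One common way to summarize such an assay is to express infectivity on CD4-low cells as a percentage of infectivity on CD4-high cells, and then to average across the clones derived from one amplicon. The sketch below assumes that readout; it is not taken from the text, it omits background subtraction, and the function names and RLU values are hypothetical.

```python
from statistics import mean


def percent_infectivity_at_low_cd4(rlu_low, rlu_high):
    """Mean RLU on CD4-low cells as a percentage of mean RLU on CD4-high cells."""
    high = mean(rlu_high)
    if high <= 0:
        raise ValueError("CD4-high signal must be positive")
    return 100.0 * mean(rlu_low) / high


def amplicon_phenotype(clones):
    """Average the per-clone percentages for one parental amplicon.

    `clones` maps a clone name to a pair of replicate lists:
    (RLU on CD4-low cells, RLU on CD4-high cells).
    """
    per_clone = [
        percent_infectivity_at_low_cd4(low, high) for low, high in clones.values()
    ]
    return mean(per_clone)


if __name__ == "__main__":
    # Hypothetical luciferase readings: two clones, three replicates each.
    example = {
        "clone_a": ([1000, 1200, 800], [100000, 110000, 90000]),
        "clone_b": ([3000, 3000, 3000], [100000, 100000, 100000]),
    }
    print(f"{amplicon_phenotype(example):.1f}% of CD4-high infectivity")
```

Normalizing each clone to its own CD4-high signal removes differences in virus input between stocks, which is why the ratio, rather than the raw CD4-low signal, is compared across amplicons.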

Statistical analysis
Compartmentalization was assessed statistically using the Slatkin-Maddison test for gene flow [18], Wright's measure of population subdivision (F_ST) [19,20] and the Nearest-neighbor statistic (S_nn) [21]. Differences between groups were examined for statistical significance using the Mann-Whitney test. The one exception was our analysis examining whether env genes derived from samples with and without CSF compartmentalized lineages differed in their ability to enter Affinofile cells expressing low levels of CD4. We used a linear model for this analysis (performed in R) and used the stepAIC function to perform stepwise model selection. All correlations employed Spearman's rank correlation coefficient. For all statistical tests, P values less than 0.05 were considered significant.
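To illustrate what the nearest-neighbor statistic measures, the sketch below computes a simplified S_nn on aligned sequences together with a label-permutation P value. It is an illustration only, not the software used in this study: it assumes Hamming distance between aligned sequences, counts tied nearest neighbors fractionally, and uses hypothetical function names and toy sequences.

```python
import random


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def snn(seqs, labels):
    """Mean fraction of each sequence's nearest neighbors (ties included)
    that come from the same compartment as the sequence itself."""
    n = len(seqs)
    total = 0.0
    for j in range(n):
        dists = [(hamming(seqs[j], seqs[k]), k) for k in range(n) if k != j]
        d_min = min(d for d, _ in dists)
        nearest = [k for d, k in dists if d == d_min]
        same = sum(labels[k] == labels[j] for k in nearest)
        total += same / len(nearest)
    return total / n


def snn_permutation_p(seqs, labels, n_perm=1000, seed=0):
    """Return (observed S_nn, P value) by shuffling compartment labels."""
    rng = random.Random(seed)
    observed = snn(seqs, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if snn(seqs, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated P value above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy alignment: three "CSF" and three "plasma" sequences.
    toy_seqs = ["AAAA", "AAAT", "AATA", "GGGG", "GGGC", "GGCG"]
    toy_labels = ["CSF"] * 3 + ["plasma"] * 3
    observed, p_value = snn_permutation_p(toy_seqs, toy_labels)
    print(f"S_nn = {observed:.2f}, permutation P = {p_value:.3f}")
```

Values of S_nn near 1 indicate that sequences cluster with others from their own compartment, whereas well-mixed populations give lower values; with very few sequences, as in the toy example, the permutation test has little power regardless of how cleanly the compartments separate.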

Nucleotide sequence accession numbers
The HIV-1 env nucleotide sequences determined in this study have been deposited in GenBank under accession numbers KM353586-KM355197.

Supporting Information

MDMs were infected with two positive controls that are known to be macrophage-tropic (4051_C3 [41] and Ba-L [60]), two negative controls that are known to be T cell-tropic (4051_P8 [41] and JRCSF [61]) and pairs of pseudoviruses from nine subjects described in this study. Each pair comprised a CSF-derived virus ("C" clone) and a plasma-derived virus ("P" clone). Each virus was used to infect three replicate wells of cells from donors 1 and 3 and two replicate wells from donor 2. MDMs were generated and infected using the protocol described in Joseph et al. [62]. Briefly, monocytes were isolated from the blood of three healthy donors and differentiated for seven days in medium containing recombinant human macrophage colony stimulating factor (M-CSF). MDMs were then infected with the volume of each pseudovirus stock that we previously determined to generate 800,000 relative light units (RLU) of luciferase expression when used to infect maximally induced Affinofile cells. Five days after infection, MDMs were lysed and luciferase expression was measured. The y-axis shows the mean RLUs from replicate wells. (TIFF)

Table. Selective analysis of amino acid differences between compartmentalized and plasma populations in subjects analyzed for entry tropism in MDMs. (DOCX)