In-Depth Characterization of Viral Isolates from Plasma and Cells Compared with Plasma Circulating Quasispecies in Early HIV-1 Infection

Background
The use of in vitro models to unravel the phenotypic characteristics of circulating viral variants is key to understanding HIV-1 pathogenesis but is limited by the availability of primary viral isolates from biological samples. Moreover, the overall in vivo genetic variability of HIV-1 within a subject may not be reflected in the viable viral population obtained after isolation. Although several studies have tried to determine whether viral populations expanded in vitro are representative of in vivo findings, the answer remains unclear because of the small number of clonal sequences analyzed or samples compared. To overcome these experimental limitations, we applied Deep Pyrosequencing (DPS) technology in combination with phenotypic experiments to analyze and compare, in unprecedented detail, the composition of viral isolates and in vivo quasispecies.

Methodology/Principal Findings
We applied DPS to HIV-1 genomic regions covering gag, protease, integrase and env-V3 to characterize paired isolates from plasma and peripheral blood mononuclear cells and compare them with total plasma viral RNA in four recently HIV-1-infected subjects. Our study demonstrated the presence of unique haplotypes scattered between sample types, with conservation of major variants. In addition, no differences in intra- and inter-population encoded protein variability were found between the different types of isolates or when these were compared to plasma viral RNA within subjects. Furthermore, in vitro experiments demonstrated phenotypic similarities in terms of replicative capacity and co-receptor usage between viral isolates and plasma viral RNA.

Conclusion
This study is the first in-depth comparison and characterization of viral isolates from different sources and plasma circulating quasispecies using DPS in recently HIV-1-infected subjects. Our data support the use of primary isolates, regardless of their plasma or cellular origin, to define the genetic variability and biological traits of circulating HIV-1 quasispecies.


Introduction
Human immunodeficiency virus type 1 (HIV-1) exhibits a high degree of genetic diversity that is particularly difficult to characterize due to the complexity of RNA viral populations. This complexity is associated with factors such as the lack of proof-reading activity of the HIV-1 polymerase, the high rate of generation of viral particles, and the recombination and hypermutagenesis processes favored by host cellular proteins [1,2,3,4,5,6,7]. Consequently, the HIV-1 population is composed of a swarm of genetically related variants, known as viral quasispecies, which grant the virus the ability to adapt quickly to various selective pressures. Examples of the rapid adaptive machinery of HIV-1 are the selection of mutations enabling escape from the humoral and cellular host immune responses [8,9,10,11] and the selection of mutations generating resistance to currently available antiretroviral drugs [12]. Therefore, defining the composition of HIV-1 quasispecies and identifying virus diversity or variability within a single infected subject or at the population level is essential to understanding the pathogenesis of HIV-1 and designing optimal antiretroviral treatments and vaccines.
Some studies have associated pathogen diversity with poor prognosis [13,14,15], and increased diversity of HIV-1 has been related to disease progression [16,17]. As a result, the maintenance of virus population structures in primary isolates is a key requirement for the accurate study of specific viral biological traits, such as fitness and co-receptor usage, which are central to completing our understanding of HIV-1 pathogenesis. The recent development of a new generation of massively parallel sequencing technologies has enabled comprehensive studies of the genotypic characteristics of viral populations, genetically comparing thousands of sequences and increasing the chances of identifying minority variants. Deep Pyrosequencing (DPS) technology has made it possible to describe the complexity of viral dynamics during immune escape, to quantify the presence of minority drug resistance variants, and to define virus co-receptor use for the management of CCR5 antagonists [18,19,20,21,22].
This study uses DPS technologies to investigate whether viral isolates from biological samples preserve the variability of circulating viruses and the phenotypic features found in vivo. To that end, we compared paired HIV-1 isolates obtained from plasma and cells with total plasma viral RNA in four recently HIV-1-infected subjects. We combined multiple-amplicon DPS covering gag, protease, integrase, and env-V3 with in vitro replicative capacity and virus co-receptor use assays in order to address the genetic and phenotypic associations between HIV-1 isolates and viral quasispecies.

Efficiency of HIV-1 recovery correlates with sample viral load for both plasma-derived and cell-derived viral isolates
In order to compare the efficiency of the methods used to obtain primary HIV-1 isolates from plasma or peripheral blood mononuclear cells (PBMCs), we analyzed a total of 94 samples from different subjects at unique time-points, excluding the four subjects included in the study: 56 plasma samples and 38 PBMC samples with viral loads ranging from 10 to >10⁶ copies/ml. Of those, we recovered a total of 63 primary isolates (34 from plasma samples and 29 from PBMCs). After stratification of samples by viral load, we observed an increase in the efficiency of virus recovery concomitant with the increase in viral load for both the plasma and PBMC HIV-1 isolation methods (Fig. 1). Furthermore, the categorization of viral load ranges into linear values demonstrated a direct correlation between sample viral load range and efficiency of virus recovery (Plasma: r = 0.94, p<0.016; PBMCs: r = 0.94, p<0.016 [Spearman correlation test]). Therefore, the overall efficiency of the HIV-1 isolation methods used was similar and correlated with sample viral load.
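As an illustration of the rank-based test named above, the following is a minimal sketch of a Spearman correlation between ordinal viral load categories and recovery efficiency. The category codes and percentages are invented placeholders, not the study data, and the function names are hypothetical; the sketch computes only the coefficient, not the p-value.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient, computed as the Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical example: viral load ranges coded 1..6, recovery efficiency in %.
viral_load_category = [1, 2, 3, 4, 5, 6]
recovery_pct = [10.0, 30.0, 25.0, 60.0, 55.0, 95.0]
rho = spearman_rho(viral_load_category, recovery_pct)
print(f"Spearman rho = {rho:.2f}")
```

Coding the viral load ranges as consecutive integers mirrors the "categorization of viral load ranges into linear values" described in the text; any monotone coding would give the same coefficient because only ranks are used.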
Phylogenetic analysis of multiple-amplicon DPS reveals clusters of interspersed variants between cell virus isolates, plasma virus isolates, and plasma viral RNA
Four treatment-naïve, recently HIV-1-infected subjects were enrolled in the study. A summary of their clinical and epidemiological characteristics is shown in Table 1. For comparative purposes, three sample types were obtained from a single blood sample per subject, as represented in Fig. 2: (1) total plasma viral RNA (RNA); (2) plasma virus isolates (VP), obtained after HIV-1 capture from plasma and in vitro virus expansion; and (3) cell virus isolates (VC), obtained from PBMC co-culture and in vitro virus expansion.
VP and VC primary isolates were expanded in vitro for a period of 2 to 3 weeks and 3 to 4 weeks, respectively. Afterwards, viruses were harvested for further genotypic (DPS) and phenotypic (tropism and replicative capacity) characterization (Fig. 2).
Multiple-amplicon DPS was carried out in the three sample types (RNA, VP, and VC), covering the gag, protease, integrase, and env-V3 regions with an average number of reads per nucleotide of 4039, 4193, 3629, and 4488, respectively. Data extracted using DPS were corrected for sequencing errors, filtered to a final number of unique reads, and merged into haplotypes (unique sequences represented in ≥1%); a summary of the sequences obtained after the various filtering steps is shown in Table 2. Final haplotypes were used to build phylogenetic trees based on the best-inferred model for the conserved regions gag, protease and integrase as well as the variable region env-V3 of the HIV-1 proteome. As shown in Fig. 3, the phylogenetic trees for gag, protease, and integrase did not show segregation of clusters between VC, VP, and RNA variants, with low genetic distances between sample types and preservation of major variants after in vitro culture. A similar tree topology was observed for the variable env-V3 loop region, with clear interspersion of major variants from VC, VP, and RNA. A tendency toward clustering of VP was found in the case of P21 for gag, P23 for integrase, and P22 for env-V3. However, this pattern was not consistent for other genes within the same subjects. In summary, VC, VP, and total plasma viral RNA populations were structured in closely related quasispecies represented by interspersed variants with a low genetic distance between them.
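The filtering outlined above (and detailed in the data clean-up section of the methods) can be sketched roughly as follows. This is a simplified illustration, not the pipeline used in the study: the function names are hypothetical, reads are assumed to be already aligned and in reading frame 0, homopolymer correction is omitted, and the 1% threshold is assumed to be applied to the reads that remain after stop-codon removal.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_stop_codon(seq, frame=0):
    """True if an in-frame stop codon occurs (codons with gaps never match)."""
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3].upper() in STOP_CODONS:
            return True
    return False


def collapse_to_haplotypes(reads, min_fraction=0.01):
    """Drop reads with stop codons, collapse identical reads, and keep
    haplotypes whose frequency is at least `min_fraction`."""
    kept = [r for r in reads if not has_stop_codon(r)]
    counts = Counter(kept)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        seq: n / total
        for seq, n in counts.items()
        if n / total >= min_fraction
    }


# Hypothetical toy reads: two common variants, one read with a stop codon,
# and one rare variant that falls below the 1% threshold.
reads = ["ATGGCA"] * 150 + ["ATGGCG"] * 48 + ["ATGTAA"] + ["ATGGCC"]
haplotypes = collapse_to_haplotypes(reads)
for seq, freq in sorted(haplotypes.items(), key=lambda kv: -kv[1]):
    print(seq, f"{freq:.1%}")
```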
Low intra- and inter-population variability for VC, VP, and RNA variants among HIV-1 proteins
To define in detail the VC, VP, and total viral RNA populations, we calculated intra- and inter-population variability, defined as the tendency for individual genomes to vary from one another in a population. For that purpose, we simulated a viral population by considering the sequences obtained in the DPS run as a sample of the real population. We measured pairwise intra- and inter-population variability according to sample type for each HIV-1 protein and subject. We found low intra-population variability, with values close to zero for VC, VP, and RNA populations in all subjects and genetic regions (Table 3). Additionally, inter-population analyses comparing RNA with VC, RNA with VP, and VC with VP (Table 3) demonstrated a similar pattern of low variability. These results indicated that VC, VP, and plasma viral RNA populations were composed of HIV-1 variants with low intra- and inter-population variability.

VC and VP isolates display similar in vitro replicative capacity in primary cultures
In certain cases, heterogeneity in the distribution of quasispecies during in vitro passage of HIV-1 has been shown to modify virus fitness in the absence of changes in the consensus sequence [23]. In order to test whether the minor genetic changes described in our populations (synonymous changes, differences in the number of unique variants) affected the phenotypic properties of VC and VP isolates, we measured replicative capacity for VC and VP isolates in primary cells. After infection, viral growth was monitored by p24 production for one week, and the log-transformed data from the exponential growth phase were used to calculate the virus growth rate (slope of the linear regression) for each type of isolate. As shown in Fig. 4, VC and VP pairs displayed similar replication kinetics, with no differences in replicative capacity per pair in any of the study subjects. Thus, in spite of minor genetic differences in quasispecies composition, our data revealed no differences in replicative capacity between VC and VP isolates for each subject.
Phenotypic determination and genotypic prediction of co-receptor usage in VC isolates, VP isolates, and plasma RNA
HIV-1 co-receptor use is a key determinant of viral pathogenesis; the presence of CXCR4-using strains has been related to disease progression, and detection of minor CXCR4 variants has a clear clinical interest in the management of CCR5 antagonists [21,24,25,26]. Therefore, in order to understand the relationship between VC, VP and plasma RNA, we compared virus co-receptor use by means of U87 cells in VC and VP isolates and by means of the Enhanced Sensitivity Trofile Assay (ESTA) in plasma RNA. Furthermore, for genotypic prediction of virus co-receptor use, we applied the PSSM and g2p algorithms to the env-V3 loop of the most frequent haplotypes, with cut-off values of −4.75 and ≤3.5, respectively. These results are summarized in Table 4. Phenotypic data show a concordance of 100% between U87 and ESTA results. Moreover, genotypic prediction of co-receptor usage with PSSM was 75% (9/12) concordant with g2p. In spite of minor discrepancies between the methods used, plasma RNA, VC, and VP isolates exhibited good matching in terms of virus co-receptor use per study subject and sample type. Additionally, a more detailed prediction of co-receptor use was made in VC, VP, and RNA by inference of g2p and PSSM scores in the unique env-V3 sequences extracted from DPS. We observed a cluster of combined variants from VC, VP, and RNA with low intra-patient deviation and preferential R5 use, with the exception of P20 (Fig. S1). In the case of P20, g2p and PSSM scores from DPS sequences suggest the presence of a homogeneous population of X4R5 dual-tropic virus when compared to previously defined R5+X4R5 or X4R5 HIV-1 isolates (Fig. S1) [27]. Therefore, inference of phenotypic and genotypic tropism in VC and VP pairs and plasma viral RNA demonstrated concurrence in virus co-receptor usage among sample types for each study subject.
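The concordance figures quoted above are simple proportions of matching calls across the 12 subject and sample-type combinations. A minimal sketch of that arithmetic is given below; the tropism labels are invented placeholders chosen only to reproduce a 9/12 agreement and do not correspond to the entries of Table 4, and the function name is hypothetical.

```python
def concordance(calls_a, calls_b):
    """Return (matches, total, fraction) for two paired lists of tropism calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return matches, len(calls_a), matches / len(calls_a)


# Hypothetical calls for 4 subjects x 3 sample types (placeholders only).
pssm_calls = ["R5", "R5", "R5", "X4", "X4", "X4",
              "R5", "R5", "R5", "R5", "R5", "R5"]
g2p_calls = ["R5", "R5", "R5", "X4", "X4", "X4",
             "R5", "R5", "R5", "X4", "X4", "X4"]

matches, total, fraction = concordance(pssm_calls, g2p_calls)
print(f"{matches}/{total} = {fraction:.0%}")
```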

Discussion
Primary viral isolates play a key role in our understanding of HIV-1 pathogenesis and are commonly used in various in vitro studies such as antibody neutralization, drug testing, or virus co-receptor use assays. Furthermore, the relevance of using primary isolates for an accurate description of virus phenotype has been highlighted by differences in replicative capacity found between recombinant viruses and full isolates [28,29].
Previous studies in the HIV-1 field have examined whether viral populations from primary isolates are representative of in vivo findings, with contradictory results. Some report a decrease in HIV-1 gp120 diversity in isolates [30], while others support the maintenance of major variants in blood after co-culture with PBMCs [31]. Additionally, most of these studies are limited by the number of samples analyzed, the number of clonal sequences obtained, and their focus on comparing proviral DNA to primary isolates recovered from co-cultured PBMCs. In order to overcome previous experimental limitations, we carried out multiple-amplicon DPS to genetically compare thousands of sequences in four regions of the HIV-1 genome and clearly define phylogenetic relationships between primary isolates (VP and VC) obtained after in vitro HIV-1 expansion and plasma circulating quasispecies (RNA) in vivo. Our results demonstrate a structured population of interspersed major VC, VP, and RNA variants, with fluctuations in low-frequency unique sequences in most of the HIV-1 genes studied (gag, protease, integrase, and env-V3) and among subjects, but with no significant differences in the total numbers of unique haplotypes (data not shown). The presence of major variants at similar frequencies in VC and VP primary isolates, when compared to in vivo RNA, demonstrates the maintenance of high-frequency variants after in vitro expansion in both VC and VP isolates. Furthermore, the low intra- and inter-population variability, with values close to zero, reflects homogeneous populations both across HIV-1 proteins and across sample types. Nevertheless, relatively homogeneous viral populations have been reported in both proviral HIV-1 DNA and plasma HIV-1 RNA during early infection [16,32].
As a consequence, the low level of genetic variability found among primary viral isolates and total RNA could be related to the short time since seroconversion in our samples, when homogeneous viral populations are expected before diversification at later stages of disease [16]. On the other hand, recent studies on founder virus evolution support early variation in the HIV-1 genome after transmission and accumulation of changes over the first year after infection [33]. In this context, our results suggest an adequate representation of RNA circulating quasispecies after HIV-1 in vitro expansion. However, these results should be viewed with caution until they are confirmed in samples from chronically infected subjects.
RNA virus populations are composed of a swarm of closely related genotypes or quasispecies in which viral evolution operates as a unit and adaptation is the result of cooperative interactions between multiple genomes [15]. Various studies have demonstrated how minor genetic differences in composition and quasispecies heterogeneity can modulate HIV-1 fitness in the absence of changes in population sequence [23,34]. Additionally, genetic similarities in the studied regions cannot be extrapolated to the whole viral genome. Therefore, similarities in virus genotype might not translate into similarities in virus phenotype. In this context, our data revealed no differences in terms of virus replicative capacity in paired VC and VP isolates, regardless of minor differences in the genotypic composition of the viral quasispecies studied. However, our approach is limited by the short-term in vitro culture of the replicative capacity experiments. The presence of antiretroviral drugs, neutralizing antibodies, cytotoxic T lymphocytes, or other selective pressures may induce unpredictable fluctuations in closely related viral populations, which are not captured in this study.
Together with replicative capacity, HIV-1 co-receptor use is an essential trait when defining HIV-1 pathogenesis. The presence of CXCR4-using HIV-1 variants is associated with disease progression [24,25,26], and detection of minor CXCR4 HIV-1 populations has become a key marker for the management of CCR5 antagonists [21,35]. A previous study showed high concordance of co-receptor usage in paired plasma and PBMC samples during primary infection [36]. In agreement with this observation, we found concordance in co-receptor use between VC isolates, VP isolates, and plasma RNA as measured both by ESTA and U87. Comparable results were obtained by genotypic inference of virus co-receptor use in DPS env-V3 sequences with g2p and PSSM. Regardless of small differences between the methods applied, intra-subject co-receptor use was very homogeneous.
Many studies have described the use of DPS in combination with genotypic algorithms in the env-V3 variable region as a key tool for detecting minor CXCR4 populations for the management of CCR5 antagonists. We used the same approach to compare VC isolates, VP isolates, and circulating plasma quasispecies. We found clusters of mixed VC, VP, and RNA sequences with homogeneous populations and preferential R5 use. These data contrast with those of previous studies, where DPS revealed the presence of more heterogeneous populations in proviral quasispecies [35], but argue in favor of homogeneous replication-competent populations obtained after in vitro expansion in VC primary isolates, regardless of the heterogeneity in proviral DNA.
In summary, our study provides the first direct comparison of viral isolates with plasma circulating quasispecies using DPS in recently HIV-1-infected subjects. Our data demonstrate that VC and VP share genotypic characteristics with HIV-1 quasispecies and maintain the presence of major variants after in vitro virus expansion. In spite of minor genetic differences, phenotypic data reveal similarities in paired VP and VC isolates with regard to replicative capacity and co-receptor use. Our data support the potential use of VP or VC primary isolates as a reliable tool to characterize the circulating quasispecies. Nevertheless, further comparisons will help to clarify whether our findings also apply to later stages of the disease.

Study subjects and Ethics Statement
The study sample comprised four treatment-naïve HIV-1-infected subjects. Epidemiological and clinical data are summarized in Table 1. Virus subtype was assigned based on gag, pol, and env-V3 sequences using the REGA HIV-1 Subtyping tool. The study was approved by the institutional review board of Hospital Germans Trias i Pujol, and all four subjects gave their written informed consent to participate.

Cell lines
The following reagents were obtained through the NIH AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH: TZM-bl from Dr. John C. Kappes, Dr. Xiaoyun Wu and Tranzyme Inc; U87CXCR4 and U87CCR5 from Dr. HongKui Deng and Dr. Dan R. Littman as previously described [37,38].

Plasma virus isolation
Viral isolates were obtained from plasma samples using anti-CD44 beads following the manufacturer's protocol (Miltenyi Biotec, Germany) with minor modifications as previously described [29]. Briefly, before virus extraction, PBMCs from three HIV-1-seronegative donors were isolated and CD8+ T cells depleted using the RosetteSep human CD8+ depletion cocktail (Stemcell Technologies, France). Pooled CD8+-depleted PBMCs were then stimulated under three different conditions ('3×3' method, Miltenyi Biotec). After 72 hours, cells were mixed to a final concentration of 10⁶ cells/ml in R10 supplemented with IL-2 (100 U/ml) (Roche, Spain), and 200 µl of the extracted virus was added to the culture. Cultures were fed weekly with 10⁶ cells/ml fresh 3×3-stimulated cells. Viral growth was monitored weekly using p24 enzyme-linked immunosorbent assay (ELISA) (Innogenetics, Spain). Virus isolates were harvested when the p24 concentration in the supernatant reached at least 100 ng/ml and then stored at −80°C.

Cell virus isolation
Viral isolates from cryopreserved cells were obtained by co-culture of PBMCs from each HIV-1-infected subject with a pool of PBMCs from three HIV-1-seronegative subjects that had been previously stimulated with phytohemagglutinin (PHA) (3 µg/ml) and IL-2 (10 U/ml) for 72 hours. Viral growth was monitored weekly by p24 ELISA and cultures were fed weekly with fresh cells. Viral stocks were harvested and stored at −80°C.

PCR amplification and amplicon preparation
Total viral RNA was extracted (QIAamp Viral RNA Mini Kit™, QIAGEN, CA) from plasma (2 ml), plasma viral isolates (1 ml), and cell viral isolates (1 ml) in order to carry out PCR amplification. gag, pol, and env-V3 were amplified using one-step reverse transcriptase polymerase chain reaction (RT-PCR) (SuperScript® III One-Step RT-PCR System with Platinum® Taq High Fidelity, Invitrogen, Carlsbad, CA, USA) based on a primer set containing 5′-GCA GAA TGG GAT AGA TTG CAT CCA-3′ (1,417→1,440, HXB2) and 5′-CCT TGT TAT GTC CTG CTT GAT ATT CAC-3′ (5,438←5,464, HXB2), and 5′-TAG AGC CCT GGA AGC ATC CAG GAA G-3′ (5,853→5,877, HXB2) and 5′-TTG CTA CTT GTG ATT GCT CCA TGT-3′ (8,913←8,936, HXB2) for gag, pol, and env-V3, respectively. Amplification conditions were as follows: 30 minutes at 52°C during reverse transcription, 2 minutes at 94°C, and 25 cycles of 30 seconds at 94°C, 30 seconds at 55°C, and 4 minutes at 68°C. A final polymerization step of 5 minutes at 68°C was applied. Amplicons for DPS were generated using primers that carried 454 adaptor A and subject-specific multiple identifiers; pyrosequencing was unidirectional. The conditions for the enzyme were 5 minutes at 94°C, 30 seconds at 52°C, and 1 minute at 68°C for 25 cycles. A final polymerization step of 5 minutes at 68°C was applied. The enzyme used was Platinum High Fidelity , and 5′-CTG GGT CCC CTC CTG AGG-3′ (7,315←7,332, HXB2) for gag, protease, integrase, and env-V3, respectively. All PCR reactions were performed in triplicate to reduce amplification bias and founder effects. Triplicate amplifications were pooled before the purification procedure. Reactions were purified using the Agencourt AMPure Kit (Beckman Coulter, Germany) to eliminate the primer-dimers produced. The number of molecules was quantified by fluorometry using the Quant-iT PicoGreen dsDNA assay kit (Invitrogen, USA).
When concentrations were below 5 ng/µl, amplicon quality was assessed by spectrometry using a BioAnalyzer (Agilent Technologies, USA). Quantitative multiple-amplicon DPS was performed in a 454 Genome Sequencer FLX (454 Life Sciences/Roche, USA) using FLX chemistry. A pNL4.3 clone was sequenced to assess the likelihood of errors during DPS. Discrepancies between the data obtained by DPS and by Sanger sequencing of the pNL4.3 clone were attributed to the DPS process.

Multiple amplicon DPS data clean-up and phylogenetic analysis
Data were cleaned in order to increase the quality of the sequences for downstream analysis after multiple-amplicon DPS. The first step was to retrieve from the sequencing run those sequences with a similarity >70% when compared with HXB2. We then manually corrected the homopolymer tracts, since errors in these are the most common sequencing errors produced by the technique. Sequences with stop codons within the open reading frame of the protein were removed from the analysis. Sequences containing gaps were retained and included in the analysis rather than being removed, with each gap treated conservatively as an unknown nucleotide at that position. Identical sequences were collapsed into a single unique sequence or haplotype. Haplotypes with less than 1% presence in the population were removed from the analysis. A summary of the number of reads after the various filtering steps and the final number of haplotypes is shown in Table 2. Phylogenetic trees were built on the nucleotide alignment for the total unique reads collapsed into unique haplotypes. The best phylogenetic model was inferred using jModeltest v0.1.1 [39] for each HIV-1 protein in all subjects. Phylogenetic trees were constructed taking into account the inferred model in PhyML over

Population variability per HIV-1 protein and sample type
To study and reproduce the variability according to sample type and among HIV-1 proteins, we simulated a viral population, treating the sequences obtained in the sequencing run as a sample of the real population. The percentage of each sequence in the sequencing run was used to create a population of 100 sequences in which each haplotype was represented as many times as indicated by that percentage. This population of 100 sequences was used to infer variability among populations in the same patient and among HIV-1 proteins. We measured pairwise intra- and inter-population variability using the best model found by jModeltest v0.1.1, as implemented in MEGA4.
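The population simulation described above can be sketched as follows. This is an illustrative approximation only: the study computed distances under the best-fitting substitution model in MEGA4, whereas this sketch substitutes the simple uncorrected p-distance (proportion of differing sites), and the haplotype sequences, percentages, and function names are hypothetical.

```python
from itertools import combinations, product


def build_population(haplotype_pct, size=100):
    """Expand {sequence: percentage} into a list in which each haplotype
    appears in proportion to its percentage (rounded to whole copies, so
    the total can deviate slightly from `size`)."""
    population = []
    for seq, pct in haplotype_pct.items():
        population.extend([seq] * round(pct * size / 100))
    return population


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_intra(population):
    """Mean pairwise distance within one population."""
    pairs = list(combinations(population, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_inter(pop_a, pop_b):
    """Mean pairwise distance between members of two populations."""
    total = sum(p_distance(a, b) for a, b in product(pop_a, pop_b))
    return total / (len(pop_a) * len(pop_b))


# Hypothetical haplotype percentages for two sample types of one subject.
rna_pop = build_population({"ACGTACGTAC": 90, "ACGTACGTAT": 10})
vp_pop = build_population({"ACGTACGTAC": 100})

print(f"intra RNA: {mean_intra(rna_pop):.4f}")
print(f"inter RNA-VP: {mean_inter(rna_pop, vp_pop):.4f}")
```

With a dominant haplotype at 90% and a single-site minority variant at 10%, both values stay close to zero, which is the pattern the text reports for populations dominated by shared major variants.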

Replicative capacity experiments
Viral isolates obtained from plasma and cells were titrated in the TZM-bl immortalized cell line. Replicative capacity experiments were carried out using PBMCs from three seronegative individuals; before infection, PBMCs were stimulated for 72 hours with PHA (3 µg/ml) and IL-2 (10 U/ml). Stimulated PBMCs were then infected in triplicate with an equal multiplicity of infection of each viral variant at 37°C for 2 hours. Pellets were washed twice with phosphate-buffered saline (PBS) and cultured at 37°C and 5% CO₂ in R20 supplemented with IL-2 (20 U/ml) (Roche, Spain) [41]. Viral growth was measured by p24 ELISA in supernatants over 10 days (Perkin Elmer, Spain). Replicative capacity was calculated by fitting a linear model to the log10-transformed data of p24 production and comparing the slopes as previously described [42].
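The slope calculation described above amounts to an ordinary least-squares fit of log10(p24) against time. A minimal sketch is shown below; the time points and p24 values are invented for illustration, the function name is hypothetical, and the statistical comparison of slopes between isolates used in the study is not reproduced.

```python
import math


def growth_rate(days, p24):
    """Least-squares slope of log10(p24) versus time, in log10 units per day.
    Only time points from the exponential growth phase should be passed in."""
    if len(days) != len(p24) or len(days) < 2:
        raise ValueError("need at least two paired time points")
    y = [math.log10(v) for v in p24]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(days, y))
    return sxy / sxx


# Hypothetical p24 readings (ng/ml) for a paired VC and VP isolate.
days = [3, 5, 7]
vc_slope = growth_rate(days, [1.0, 10.0, 100.0])
vp_slope = growth_rate(days, [2.0, 20.0, 200.0])
print(f"VC: {vc_slope:.2f} log10/day, VP: {vp_slope:.2f} log10/day")
```

In this toy example the VP isolate starts from a higher p24 level but grows at the same rate, so the two slopes coincide; the slope reflects growth kinetics rather than absolute antigen levels.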

Determination of virus co-receptor use
Viral tropism of VP and VC isolates was measured in U87 immortalized cell lines expressing CCR5 or CXCR4, as previously described [27,38]. Briefly, 5,000 cells were plated on a 96-well plate and infected overnight with 2 ng of p24 for each viral variant. The next day, virus was removed by washing 3 times with 200 µl of PBS, and fresh medium was added to a final volume of 200 µl. Five days after infection, virus growth was identified microscopically by observation of syncytium formation, and the results were corroborated by p24. Furthermore, virus tropism was assessed in plasma samples at similar time-points using the Enhanced Sensitivity Trofile Assay (ESTA, with a detection limit of 0.3% for non-R5 variants). In addition, two algorithms were used to infer virus co-receptor use based on env-V3 loop sequences from DPS: PSSM (http://indramullins.microbiol.washington.edu/webpssm) and geno2pheno (g2p) (http://www.geno2pheno.org/) with a false positive rate of 10%. Cut-off values to define non-R5-using sequences were −4.75 for PSSM and ≤3.5 for g2p [21].

Figure S1. Genotypic prediction of co-receptor use in DPS sequences from VC, VP and total plasma RNA. Unique sequences obtained from the DPS of the env-V3 loop region were used to run the PSSM and g2p algorithms to infer virus co-receptor use per subject and sample type: VC (light symbols), VP (dark symbols) and RNA (empty symbols).