The Genotypic False Positive Rate Determined by V3 Population Sequencing Can Predict the Burden of HIV-1 CXCR4-using Species Detected by Pyrosequencing

Objective The false-positive rate (FPR) is a percentage score provided by the Geno2Pheno algorithm indicating the likelihood that a V3 sequence is falsely predicted as CXCR4-using. We evaluated the correlation between the FPR obtained by V3 population sequencing and the burden of CXCR4-using variants detected by V3 ultra-deep sequencing (UDPS) and the Enhanced-Sensitivity Trofile assay (ESTA). Methods Fifty-four maraviroc-naïve patients infected with HIV-1 subtype B, with viremia >10,000 copies/ml, were analyzed. HIV tropism was assessed by V3 population sequencing, UDPS (considering variants with >0.5% prevalence), and ESTA. Results By UDPS, CCR5-using variants were detected in 53/54 patients, irrespective of FPR values, and their intra-patient prevalence increased progressively with increasing FPR obtained by V3 population sequencing (rho = 0.75, p = 5.0e-8). Conversely, the intra-patient prevalence of CXCR4-using variants in the 54 patients analyzed decreased progressively with increasing FPR (rho = −0.61; p = 9.3e-6). Indeed, no CXCR4-using variants were detected in 13/13 patients with FPR >60. They were present in 7/18 (38.9%) patients with FPR 20–60 (intra-patient prevalence range: 2.1%–18.4%), in 5/7 (71.4%) with FPR 10–20, in 4/6 (66.7%) with FPR 5–10, and in 10/10 (100%) with FPR <5 (intra-patient prevalence range: 12.1%–98.1%). Conclusions The FPR obtained by V3 population sequencing can predict the burden of CXCR4-using variants. This information can be used to optimize the management of tropism determination in clinical practice. Due to its low cost and short turnaround time, V3 population sequencing may represent the most feasible test for HIV-1 tropism determination. More sensitive methodologies (such as UDPS) might be useful when V3 population sequencing provides an FPR >20 (particularly in the range 20–60), allowing a more careful identification of patients harboring CXCR4-using variants.


Introduction
Human immunodeficiency virus type 1 (HIV-1) entry into host cells requires coordinated interactions of the envelope glycoprotein gp120 with the CD4 receptor and with one of the chemokine receptors, CCR5 or CXCR4. Pure CCR5-tropic and pure CXCR4-tropic viruses use only the CCR5 and CXCR4 co-receptors, respectively, to enter target cells, while dual-tropic viruses can use both co-receptors [1].
HIV-1 co-receptor usage has been correlated with the rate of disease progression in HIV-1 infected individuals [2][3][4]. Determining HIV-1 co-receptor usage is also critical because the CCR5 co-receptor has become the target of a new class of anti-HIV-1 drugs that specifically block the entry of CCR5-tropic HIV-1 strains into target cells by allosteric inhibition of the CCR5 co-receptor [5]. Maraviroc, the first approved CCR5 antagonist, entered clinical practice in 2007. Since then, assessment of HIV-1 co-receptor usage has been mandatory for the clinical use of this drug (http://www.aidsinfo.nih.gov/ContentFiles/AdultandAdolescentGL.pdf) [6].
HIV-1 co-receptor usage can be assessed with either phenotypic or genotypic approaches. The commercial Trofile assay (Monogram Biosciences, San Francisco, California, USA) and its newer version, the enhanced sensitivity Trofile assay (ESTA) (with a nominal lower limit of sensitivity of 0.3% for detecting CXCR4-using virus within a clonal mixture), have so far been the most widely applied phenotypic tests. Due to the logistical and financial limitations of the Trofile assays, different genotypic assays have been developed (http://www.aidsinfo.nih.gov/ContentFiles/AdultandAdolescentGL.pdf) [6]. They are based on the amplification and sequencing of the patient-derived HIV-1 gp120 V3 domain, which is the major determinant for co-receptor binding [7][8][9].
Two approaches have been used for V3 sequencing: V3 population sequencing and V3 ultra-deep pyrosequencing (UDPS). V3 population sequencing is currently used in routine clinical practice, especially in Europe, while UDPS is mainly used for research purposes [6,10–21]. In comparison to V3 population sequencing, UDPS can capture a detailed cross-section of co-receptor use across a patient's viral population and quantify the prevalence of CXCR4-using variants within the patient. The genetic information contained in the V3 sequence (generated by either V3 population or ultra-deep sequencing) is then used to infer HIV-1 tropism with web-based bioinformatic interpretation algorithms. Among them, Geno2Pheno (http://coreceptor.bioinf.mpi-inf.mpg.de/) is so far the most commonly used interpretation algorithm in clinical practice in Europe [6]. For the tropism prediction, Geno2Pheno provides a score called the false-positive rate (FPR). The FPR is a percentage score (range 0-100) indicating the likelihood that a V3 sequence is falsely predicted as CXCR4-using. Thus, a viral sequence with a high FPR has a high probability of being CCR5-using. Although several studies have investigated the performance of genotypic tropism testing (based on V3 population sequencing) in comparison with phenotypic testing [16,18,19,22], none of them has investigated the potential correlation between the FPR and the burden of CXCR4- or CCR5-using species circulating in a patient.
In this light, this study aimed at: i) investigating the correlation between the FPR obtained by V3 population sequencing and the burden of X4 species detected by UDPS; ii) analyzing the correlation between quasispecies diversity and the frequency of CXCR4-using variants.

Patients
Stored plasma samples, obtained during routine clinical assessment of HIV-1 resistance from fifty-four HIV-1 infected patients, were retrospectively retrieved and included in the analysis. Ethics approval was deemed unnecessary because, under Italian law, biomedical research is subject to prior approval by ethics committees only in the case of clinical trials on medicinal products for clinical use (art. 6 and art. 9, leg. decree 211/2003). The research was also conducted on previously anonymized RNA samples and data, according to the requirements set by the Italian Data Protection Code (leg. decree 196/2003). All of the selected specimens had a viral load >10,000 copies/ml at the time of sampling, and all patients were infected by HIV-1 subtype B, as determined by phylogenetic analysis of pol sequences and confirmed by V3 analysis [19]. For each specimen, HIV-1 tropism was assessed by V3 population sequencing (based on a single PCR) and V3 ultra-deep sequencing (based on 4 PCR replicates). For 44 out of 54 samples, viral tropism was also determined phenotypically by ESTA.

V3 Population Sequencing
The protocol for V3 population sequencing, based on a single round of amplification, was developed and optimized as previously described [18,19]. A detailed description of this protocol is reported in SI text (S1).

V3 Ultradeep Pyrosequencing
UDPS was carried out with the 454 Life Sciences platform (GS-FLX; Roche Applied Science) as described in [10,11,17], on plasma samples from all 54 enrolled patients. Nucleic acid extraction, quantification of the templates actually undergoing UDPS, and V3-specific reverse transcription PCR were performed as described in [17]. Unique in-house designed stretches of eight nucleotides (multiplex identifiers) were used to tag each sample. To maximize the genetic heterogeneity of the viral population present in 1 ml of plasma, and thus to ensure a good sampling of the viral population, amplicons from at least 4 replicate PCR reactions were pooled for each sample. To minimize procedural/experimental errors arising from the error rates of the high-fidelity polymerase and of the high-throughput pyrosequencing platform, a correction pipeline was adopted as previously described in [12,17]. In particular, after translation of the nucleotide sequences, only the coding ones having at least one forward and one reverse sequence were analysed.
Table 1. Demographic characteristics of the study population.
To estimate the UDPS error rate, a plasmid clone containing the region of interest was sequenced in parallel with the Sanger method [9,14]. Any nucleotide differences between the two methods were considered to be GS-FLX sequencing errors. Within the env region, the crude error rate was 0.43%, reduced to 0.058% after the application of the correction pipeline (0.043% for non-homopolymeric regions and 0.11% for homopolymeric regions). Taking into account the estimated error rate for the high-fidelity polymerase used to obtain the amplicons (~1×10⁻⁶ mutations/bp per duplication), mutation frequencies at each nucleotide site exceeding the corrected error rate by at least eight times were considered by our in-house developed correction pipeline to reflect true variability and not procedural/experimental errors. Considering the number of viral templates actually undergoing UDPS and the corrected error rate, the threshold of sensitivity was set to 0.5%.
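The relationship between the corrected error rate, the eight-fold criterion, and the 0.5% sensitivity threshold can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constants are the values quoted in the text, while the function names are hypothetical and are not part of the in-house correction pipeline, which also takes the number of input templates into account.

```python
# Values quoted in the text, expressed as percentages.
CRUDE_ERROR_RATE = 0.43        # before the correction pipeline
CORRECTED_ERROR_RATE = 0.058   # after the correction pipeline
FOLD_OVER_ERROR = 8            # a site must exceed the error rate by this factor
SENSITIVITY_THRESHOLD = 0.5    # prevalence threshold adopted in the study


def min_true_variant_frequency(corrected_error_rate, fold=FOLD_OVER_ERROR):
    """Lowest mutation frequency (%) regarded as true variability (hypothetical helper)."""
    return fold * corrected_error_rate


def is_reported(prevalence_percent, threshold=SENSITIVITY_THRESHOLD):
    """A variant is considered only if its prevalence is above the threshold."""
    return prevalence_percent > threshold


if __name__ == "__main__":
    floor = min_true_variant_frequency(CORRECTED_ERROR_RATE)
    print(f"8 x corrected error rate = {floor:.3f}%")
    print(f"Error reduction by the pipeline: {CRUDE_ERROR_RATE / CORRECTED_ERROR_RATE:.1f}-fold")
    print(f"Variant at 0.6% reported: {is_reported(0.6)}")
```

Eight times the corrected error rate gives 0.464%, which lies just below the 0.5% threshold adopted in the study.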

Genotypic Prediction of Viral Tropism
HIV-1 co-receptor usage was inferred from the V3 nucleotide sequence by using the Geno2Pheno algorithm available at http://coreceptor.bioinf.mpi-inf.mpg.de/. Co-receptor usage of V3 sequences obtained by both population and ultra-deep sequencing was inferred with the clonal version of Geno2Pheno set at an FPR of 5.75. This cut-off, used in all the analyses carried out in this study, was chosen since it has been shown to be a good predictor of virological response to a maraviroc-containing regimen in both multi-experienced and drug-naïve patients [6,14,20]. In addition, to estimate the concordance, sensitivity and specificity of tropism prediction by UDPS using ESTA as reference, FPR cut-offs of 5.75 and 10 were used.
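The cut-off-based calling and the comparison with ESTA described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually implemented in the study: the function names and the sample data are hypothetical, a sequence is assumed to be called CXCR4-using when its FPR is strictly below the cut-off, and a sample is assumed to be X4-positive when at least one X4 variant exceeds the 0.5% prevalence threshold.

```python
def is_x4(fpr, cutoff=5.75):
    """Call a V3 sequence CXCR4-using when its FPR is below the cut-off (assumed strict '<')."""
    return fpr < cutoff


def sample_has_x4(variants, cutoff=5.75, min_prevalence=0.5):
    """variants: list of (prevalence_percent, fpr) pairs for one sample."""
    return any(prev > min_prevalence and is_x4(fpr, cutoff) for prev, fpr in variants)


def diagnostic_metrics(predicted, reference):
    """Concordance, sensitivity and specificity, with 'X4 detected' as the positive class."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    return {
        "concordance": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Hypothetical UDPS results (not study data): (prevalence %, FPR) per variant.
samples = [
    [(90.0, 45.0), (10.0, 2.0)],
    [(100.0, 70.0)],
    [(99.7, 30.0), (0.3, 1.0)],   # X4 variant below the 0.5% threshold
    [(80.0, 8.0), (20.0, 40.0)],  # X4 only at the cut-off of 10
]
esta_x4 = [True, False, False, True]  # hypothetical reference results

if __name__ == "__main__":
    for cutoff in (5.75, 10):
        calls = [sample_has_x4(s, cutoff) for s in samples]
        print(cutoff, diagnostic_metrics(calls, esta_x4))
```

Raising the cut-off from 5.75 to 10 classifies more variants as CXCR4-using, which tends to increase sensitivity at the possible expense of specificity.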

Heterogeneity Parameters Calculation
The amino acid UDPS sequences resulting from the correction pipeline were analyzed to assess diversity and quasispecies complexity. To assess diversity, the mean genetic distance of amino acid sequences was calculated by PROTDIST using the Jones-Taylor-Thornton matrix and with an in-house written code. Quasispecies complexity was calculated using the normalized Shannon entropy, $S_n = -\sum_i (p_i \ln p_i) / \ln N$, where $p_i$ was the frequency of each distinct nucleotide sequence and $N$ was the total number of sequences analyzed.
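The normalized Shannon entropy defined above can be computed directly from a list of reads. The sketch below is illustrative only and is not the in-house code used in the study; the function name and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def normalized_shannon_entropy(sequences):
    """Sn = -sum(p_i * ln p_i) / ln N, with p_i the frequency of each distinct
    sequence and N the total number of sequences analyzed."""
    n = len(sequences)
    if n < 2:
        return 0.0  # ln N is zero or undefined; no complexity can be measured
    counts = Counter(sequences)
    entropy = sum(-(c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(n)


if __name__ == "__main__":
    # Toy reads, not study data.
    print(normalized_shannon_entropy(["AAA", "AAA", "AAA", "AAA"]))
    print(normalized_shannon_entropy(["AAA", "AAA", "AAT", "AAT"]))
    print(normalized_shannon_entropy(["AAA", "AAT", "ATA", "TAA"]))
```

The value is 0 when all sequences are identical and 1 when every sequence is distinct, so samples with different read counts can be compared on the same scale.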

Statistical Analysis
Data were analyzed using the statistical software package SPSS (SPSS Inc., Chicago, IL). In particular, the correlation between the prevalence of X4 and R5 variants and the values of FPR at V3 population sequencing was assessed by Spearman's rank correlation coefficient. P-values less than 0.05 were considered statistically significant.
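For readers wishing to reproduce this type of analysis outside SPSS, Spearman's coefficient can be obtained as the Pearson correlation of the ranks, with tied values receiving their average rank. The following is a minimal sketch with hypothetical function names and made-up data; it computes rho only and does not compute a p-value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example (not study data): FPR by population sequencing vs X4 prevalence (%).
fpr = [1.2, 4.0, 8.5, 15.0, 35.0, 62.0, 80.0]
x4_prevalence = [98.0, 40.0, 12.0, 5.0, 2.1, 0.0, 0.0]

if __name__ == "__main__":
    print(f"rho = {spearman_rho(fpr, x4_prevalence):.3f}")
```

Because the coefficient depends only on ranks, it captures monotonic trends without assuming a linear relationship between the FPR and variant prevalence.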
We observed that the FPR obtained by V3 population sequencing was directly correlated with the median FPR of V3 sequences detected by UDPS (p<0.001) (Fig. 1), thus suggesting that the CCR5 usage of the entire viral population progressively increases with the FPR at V3 population sequencing.

Correlation between the FPR by V3 Population Sequencing and the Amount of R5 and X4 Species Detected by V3 Ultra-deep Sequencing
The next step of this study was to evaluate the correlation between the FPR detected by V3 population sequencing and the burden of CXCR4-using species detected by UDPS. In this analysis, at least 1 CCR5-using variant was detected in 53 out of 54 patients, irrespective of the FPR values obtained by V3 population sequencing. Their intra-patient prevalence increased progressively with increasing FPR (rho = 0.75, p = 5.0e-8) (Fig. 2), while the intra-patient prevalence of X4 variants decreased progressively with increasing FPR (rho = −0.61; p = 9.3e-6) (Fig. 2).

Discussion
This study highlights an inverse correlation between the FPR detected by V3 population sequencing and the burden of CXCR4-using species detected by UDPS in patients infected with HIV-1 subtype B.
In particular, no CXCR4-using variants were detected in patients with FPR >60 by V3 population sequencing. These results were supported in an independent dataset of 15 HIV-1 infected patients tested for HIV-1 tropism by V3 ultra-deep sequencing (454 GS-Junior). In this dataset, none of the 3 patients with FPR >60 had X4 variants (0.1% cut-off) (Ceccherini-Silberstein et al., personal communication). These results can also explain a recent study aimed at determining the prevalence and the correlates of co-receptor switch in antiretroviral-naïve patients [20]. The authors found that the FPR, obtained by V3 population sequencing at baseline, was the only variable associated with co-receptor switch in the observation period of 2 years. In particular, no switches from R5-using virus to X4-using virus were observed in patients with FPR >50 [20].
For 1 patient with FPR of 60.9 by V3 population sequencing, exceptionally, the ESTA result reported an X4-tropism while UDPS reported only the presence of R5-using species. Discordances between genotypic and phenotypic tropism testing have been previously described, and can be explained by the existence of additional positions in the env gp160, beyond those within the V3 loop, which may influence viral tropism [23][24][25][26][27]. Moreover, due to the laborious ESTA procedure, we cannot exclude that such genotypic/phenotypic discordance may be due to technical issues.
Interestingly, in our study, the intra-patient prevalence of CXCR4-using variants by UDPS decreased progressively with increasing FPR obtained by V3 population sequencing. In particular, CXCR4-using variants were observed in 38.9% (7/18) of patients with FPR ranging from 20 to 60 (X4 prevalence range: 2.1%-18.4%), in 75% (9/12) of patients with FPR ranging from 5 to 20 (X4 prevalence range: 0.6%-98.7%), and in 100% (10/10) of patients with FPR <5 (X4 prevalence range: 12.1%-100%). The presence of CXCR4-using variants in almost all patients with FPR <20 by V3 population sequencing is in line with the current guidelines [6], which recommend an FPR of 20% as the cut-off for identifying candidates for maraviroc treatment when genotypic testing is based on a single round of PCR amplification.
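The stratified proportions reported above can be tabulated from per-patient data as sketched below. This is an illustration only: the function names and the patient list are hypothetical, and the handling of values falling exactly on a stratum boundary (5, 20 or 60) is an assumption, since the text does not specify it.

```python
def fpr_stratum(fpr):
    """Assign an FPR value to a stratum; boundary handling is an assumption."""
    if fpr < 5:
        return "<5"
    if fpr < 20:
        return "5-20"
    if fpr <= 60:
        return "20-60"
    return ">60"


def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


def summarize(patients, min_prevalence=0.5):
    """patients: list of (fpr, x4_prevalence_percent) pairs.
    Returns {stratum: (n_with_x4, n_total, percent_with_x4)}."""
    counts = {}
    for fpr, x4 in patients:
        with_x4, total = counts.get(fpr_stratum(fpr), (0, 0))
        counts[fpr_stratum(fpr)] = (with_x4 + (x4 > min_prevalence), total + 1)
    return {s: (w, t, percent(w, t)) for s, (w, t) in counts.items()}


# Hypothetical patients (not study data): (FPR, X4 prevalence % by UDPS).
patients = [(2.0, 55.0), (3.5, 12.1), (12.0, 0.0), (15.0, 4.2),
            (30.0, 2.1), (45.0, 0.0), (70.0, 0.0)]

if __name__ == "__main__":
    for stratum, row in summarize(patients).items():
        print(stratum, row)
    # Percentages recomputed from the counts reported in the text.
    print(percent(7, 18), percent(9, 12), percent(10, 10))
```

Recomputing the percentages from the counts given in the text (7/18, 9/12 and 10/10) yields 38.9%, 75.0% and 100.0%, matching the reported values.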
Furthermore, for the specific set of patients with FPR ranging from 20 to 60, V3 population sequencing (based on a single amplification) may not be sufficient for proper determination of HIV-1 tropism; more sensitive methodologies, such as V3 UDPS or the phenotypic ESTA, might therefore be used to identify candidates for maraviroc treatment more precisely. This is important since analyses from the MERIT and MOTIVATE trials have recently shown that the presence of as little as 2% of non-R5 viruses is independently associated with an increased risk of virological failure of maraviroc-containing regimens [17,20,28].
In particular, V3 UDPS has been shown to be highly predictive of clinical outcome to CCR5 antagonists in retrospective analyses of large clinical studies [17]. However, it is so far available only in specialized settings (mainly at specific academic or commercial service units), and, since it is expensive and requires considerable computing capacity and interpretation expertise, its use in current routine clinical practice could be limited. Nevertheless, our results (even if based on a small number of patients) may suggest a guided use of V3 UDPS, especially for patients with FPR ranging from 20 to 60. This could help to rationalize the use of this methodology in clinical practice.
For all these reasons, genotypic testing based on V3 population sequencing still remains the preferred test for tropism determination in several clinical settings. In this light, this study contributes to further support the use of genotypic testing as valid testing for tropism determination in line with the recommendation of recent guidelines on clinical management of HIV-1 tropism testing [6].
Finally, it is intriguing that, in line with previous results [12], intra-patient X4 frequencies were always positively correlated with parameters of quasispecies heterogeneity. This finding may suggest either a possible evolutionary pathway, during which heterogeneity accumulation is necessary to give rise to X4 variants, or, otherwise, that X4 variants are intrinsically more heterogeneous. Studies on longitudinal samples are needed in order to confirm this hypothesis.
In conclusion, this study shows that the FPR determined by V3 population sequencing can predict the burden of CXCR4-using variants in the infecting viral quasispecies, and suggests that the FPR score should be considered more carefully before a CCR5 antagonist is prescribed. Due to its low cost and short turnaround time, V3 population sequencing may represent the most feasible test for HIV-1 tropism determination. More sensitive methodologies might be useful when V3 population sequencing provides an FPR >20, and particularly in the range from 20 to 60, allowing a better identification of patients harboring CXCR4-using variants.