Prediction of coreceptor usage by five bioinformatics tools in a large Ethiopian HIV-1 subtype C cohort

Background
Genotypic tropism testing (GTT) has been developed largely on HIV-1 subtype B. Although a few reports have analysed the utility of GTT in other subtypes, more studies using HIV-1 subtype C (HIV-1C) are needed, considering the huge contribution of HIV-1C to the global epidemic.

Methods
Plasma was obtained from 420 treatment-naïve HIV-1C infected Ethiopians recruited 2009–2011. The V3 region was sequenced and the coreceptor usage was predicted by five tools: Geno2Pheno clinical and clonal models, PhenoSeq-C, C-PSSM and Raymond's algorithm. The impact of baseline tropism on antiretroviral treatment (ART) outcome was evaluated.

Results
Of 352 patients with successful baseline V3 sequences, the proportion of predicted R5 virus varied between the methods by 12.5% (78.1%-90.6%). However, only 58.2% of the predictions were concordant and only 1.7% were predicted to be X4-tropic across the five methods. Compared pairwise, the highest concordance was between C-PSSM and Geno2Pheno clonal (86.4%). In bivariate intention to treat (ITT) analysis, R5 infected patients achieved treatment success more frequently than X4 infected at month six as predicted by Geno2Pheno clinical (77.8% vs 58.7%, P = 0.004) and at month 12 by C-PSSM (61.9% vs 46.6%, P = 0.038). However, in the multivariable analysis adjusted for age, gender, baseline CD4 and viral load, only tropism as predicted by C-PSSM showed an impact on month 12 (P = 0.04, OR 2.47, 95% CI 1.06–5.79).

Conclusion
Each of the bioinformatics models predicted R5 tropism with comparable frequency but there was a large discordance between the methods. Baseline tropism had an impact on outcome of first line ART at month 12 in multivariable ITT analysis but only based on prediction by C-PSSM which thus possibly could be used for predicting outcome of ART in HIV-1C infected Ethiopians.


Introduction
Human immunodeficiency virus type 1 (HIV-1) enters cells through the use of the CD4 receptor and one of the two coreceptors, CCR5 or CXCR4 [1,2]. These variants are termed R5 and X4 virus, respectively [3]. Initially, determination of strain tropism was done by phenotypic assays [4], but several factors such as cost and turn-around time have limited their clinical use. In contrast, genotypic tropism tests (GTT), which allow computational prediction of coreceptor usage by sequencing the V3 region of the envelope of HIV-1, have become widespread. Prediction methods include the charge rule (11/24/25) and several machine learning classification algorithms, including position specific scoring matrix (C-PSSM), support vector machine (SVM) based (e.g. Geno2Pheno) and PhenoSeq. Their performance is influenced by the training data used and most are based on HIV-1 subtype B (HIV-1B) [5], raising the question of how well they perform in non-B subtypes.
Studies validating GTT report decreased sensitivity to detect X4 tropic viruses and differences between subtypes when compared to phenotypic assays [6]. We recently showed that the HIV-1 subtype C (HIV-1C) epidemic in Ethiopia is monophylogenetic and found a discordance between the predicted tropism based on the clonal and the clinical models of Geno2Pheno [7]. Moreover, Ethiopian HIV-1C strains exhibit several unique characteristics in other parts of the genome which are likely to influence their replication characteristics [8,9].
The increasing influx of migrants to European countries has resulted in an increase of compartmentalized epidemics of non-B subtypes and recombinant forms [10], especially HIV-1C, in several countries e.g. Sweden [11], United Kingdom [12] and Greece [10]. In addition, intracountry cross-transmission between heterosexually infected and men who have sex with men contributes to the increased number of circulating HIV-1C strains [10,13]. Evaluations of the utility of different bioinformatics tools for GTT in HIV-1C are thus warranted. Also, R5 and X4 viruses exhibit distinct biological properties, e.g. with regard to replication kinetics, and a preferential selection of X4 virus has been reported in HIV-1B patients on antiretroviral therapy (ART) [14]. However, no study has evaluated the impact of the viral coreceptor tropism on the outcome of non-maraviroc containing ART in low- and middle-income countries. We therefore investigated the degree of concordance between different bioinformatics methods in a large Ethiopian HIV-1C infected cohort and correlated the findings with the outcome of ART.

Eastern Ethiopia; Jimma, Western Ethiopia; Gondar and Mekele, Northern Ethiopia; Hawassa, Southern Ethiopia; Addis Ababa, Central Ethiopia), and a clinic treating the Ethiopian military, the Mobile Group. Subjects were selected randomly, stratified by site, with 60 participants from each site. An additional seven non-randomized patients were included only for the study of coreceptor switch at ART failure. The first line ART used consisted of two nucleoside analogue reverse transcriptase inhibitors (stavudine, zidovudine or tenofovir, plus 3TC) and one NNRTI (efavirenz or nevirapine).
The plasma samples were temporarily stored at -20˚C at each site, transported to the central laboratory of the Ethiopian Health and Nutrition Research Institute (EHNRI) and stored at -80˚C. HIV-1 RNA load (VL) was analysed by MT 2000 real time PCR (Abbott, USA) and CD4+ T-cell count was enumerated using FACSCount (Becton Dickinson), according to standard operating procedures at EHNRI.
Ethical approval for the study was obtained from the Ethiopian Ministry of Science and Technology and the EHNRI institutional review board. Written informed consent was obtained from all participants.
As bulk sequencing often produces sequences with sites of ambiguity whereby two or more nucleotides occur with similar frequency, V3 nucleotide sequences were used as input for PhenoSeq-C, which along with the tropism prediction converts bulk nucleotide V3 sequences containing ambiguity into multiple unambiguous amino acid sequences, by generating and translating all possible nucleotide combinations [18]. The obtained amino acid sequences were used as input for the other algorithms. PhenoSeq-C and C-PSSM provide a binary answer: the virus can use X4 or the virus cannot use X4. For Geno2Pheno, sequences with a false positive rate (FPR) below 10% were considered X4-tropic and the result was coded as a binary value. The virus was considered "X4-tropic" or "R5-tropic" if all sequences from a particular patient were predicted to be X4- or R5-tropic, respectively, by the above-mentioned tools. Viruses that were predicted to contain sequences from both X4- and R5-tropic viruses were classified as mixed "R5/X4-tropic".
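The expansion and classification steps described above can be illustrated with a minimal Python sketch. It is not the PhenoSeq-C or Geno2Pheno implementation; the function names (`expand_and_translate`, `g2p_call`, `patient_call`) are hypothetical, and the sketch assumes standard IUPAC ambiguity codes, the standard genetic code and an in-frame input sequence.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_and_translate(nt_seq):
    """Return the sorted unique amino acid sequences encoded by an
    ambiguous, in-frame nucleotide sequence (hypothetical helper)."""
    nt_seq = nt_seq.upper()
    usable = len(nt_seq) - len(nt_seq) % 3  # ignore an incomplete last codon
    variants = product(*(IUPAC[base] for base in nt_seq[:usable]))
    proteins = {
        "".join(CODON_TABLE["".join(v[i:i + 3])] for i in range(0, usable, 3))
        for v in variants
    }
    return sorted(proteins)


def g2p_call(fpr_percent, cutoff=10.0):
    """Binary call from a Geno2Pheno FPR: below the cutoff means X4."""
    return "X4" if fpr_percent < cutoff else "R5"


def patient_call(per_sequence_calls):
    """Combine the calls for all expanded sequences of one patient."""
    calls = set(per_sequence_calls)
    if not calls:
        raise ValueError("no sequence-level calls supplied")
    if calls == {"X4"}:
        return "X4"
    if calls == {"R5"}:
        return "R5"
    return "R5/X4"  # mixed: both X4 and R5 sequences predicted
```

For example, the ambiguous fragment `AAYRGA` expands to four nucleotide sequences but only two distinct peptides, because `AAC` and `AAT` both encode asparagine. Note that the number of combinations grows multiplicatively with each ambiguous site, so heavily ambiguous bulk sequences can become expensive to enumerate.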

Statistical analysis
Baseline socio-demographic and laboratory characteristics (gender, age, year of HIV diagnosis, year of enrolment, CD4+ T-cells, VL) were analysed. The treatment outcome was assessed with both intention-to-treat (ITT) and on-treatment (OT) analysis. In the ITT analysis, ART failure was defined as either detectable VL (>1000 copies/ml), death or lost-to-follow-up (LTFU). In the OT analysis, we included only those who had actual VL data at the follow-up time points and excluded patients who were dead or LTFU. Descriptive analyses included frequencies for categorical variables, and mean and standard deviation or median and interquartile range for continuous variables. The Chi-square test or Fisher's exact test was used to test differences between categorical variables. The independent t-test, Mann-Whitney, ANOVA and Kruskal-Wallis tests were used to assess differences in numerical variables between two or more categories. Logistic regression models were used for the multivariable analysis of virological responses to compare differences between R5 and X4 infected patients as predicted by 4 different methods, adjusting for age, gender, baseline CD4+ T-cell count and VL. Odds ratios (OR), 95% confidence intervals (CI) and p-values were used to present the results of the regression models. Results were adjusted for design effect due to the cluster nature of the study design. P-values <0.05 were considered significant. Data analysis was done with STATA software 13 (Stata Corp. College Station, USA) and IBM SPSS Statistics, version 22 (IBM Corp).
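The two outcome definitions can be made concrete with a short Python sketch. The function name `treatment_outcome` and the status labels are hypothetical. The handling of a missing VL in the ITT analysis is an assumption (counted as not successful), since the definition above does not state it explicitly.

```python
def treatment_outcome(status, viral_load=None, analysis="ITT", threshold=1000):
    """Classify one patient at one follow-up time point.

    status     : "alive", "dead" or "ltfu" (hypothetical labels)
    viral_load : copies/ml, or None if no VL result is available
    analysis   : "ITT" or "OT"
    Returns "success", "failure", or None when the patient is excluded.
    """
    if analysis not in ("ITT", "OT"):
        raise ValueError("analysis must be 'ITT' or 'OT'")

    if analysis == "OT":
        # On-treatment: only patients with an actual VL result are analysed.
        if status != "alive" or viral_load is None:
            return None
        return "failure" if viral_load > threshold else "success"

    # Intention-to-treat: death and LTFU count as failure.
    if status in ("dead", "ltfu"):
        return "failure"
    if viral_load is None:
        # ASSUMPTION: a missing VL is treated as not successful.
        return "failure"
    return "failure" if viral_load > threshold else "success"
```

The same patient can therefore contribute a failure to the ITT analysis while being dropped from the OT analysis, which is why the two analyses have different denominators.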

V3 loop sequences
At baseline, V3 loop sequencing was successful in 352 of 420 (84%) patients. No difference was found in viral load or other characteristics between patients with and without successful sequencing. The length of the V3 loop was 35 amino acids (aa) in 333 (94.6%) of the 352 sequences, while 15 sequences were 34 aa long, two sequences were 37 aa long, and one sequence each was 36 and 38 aa long. HIV-1C was found in 350 (99.4%) patients and A1 in two (0.6%) patients.

R5 prediction at baseline
Patients predicted to be infected with R5 or X4 strains were compared with regard to age, gender, CD4+ T-cell count and VL at baseline. No statistically significant difference was found for any comparison, except for the Geno2Pheno clinical model where the CD4+ T-cells were higher in R5 patients (cells/μl: 132 vs 58; P<0.001) (Table 1).
There was a tendency to higher CD4+ T-cells in R5 patients also for the Geno2Pheno clonal model (p = 0.09) and the C-PSSM (p = 0.067), but not for PhenoSeq-C.
The proportion of patients predicted to be infected with R5 virus varied between the methods by 12.5 percentage points, with the highest (90.6%) by Raymond's algorithm and the lowest (78.1%) by PhenoSeq-C (Table 1). Also, the proportion of patients predicted to be infected with mixed R5X4 viruses was similar (Geno2Pheno clinical: 2.0%; Geno2Pheno clonal: 3.1%; PhenoSeq-C: 2.3%; C-PSSM: 2.0%). However, when the five methods were compared, only 205 (58.2%) of the predictions were concordant for all five tools (Table 2). The PhenoSeq-C and the Geno2Pheno clinical model showed the least concordance (70.5%), while the highest was between C-PSSM and the Geno2Pheno clonal model (86.4%).
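As an illustration of how overall and pairwise concordance figures of this kind can be computed, the following Python sketch takes one tropism call per patient per tool. The function names and the toy calls are hypothetical and are not data from this study.

```python
from itertools import combinations


def overall_concordance(calls_by_tool):
    """Fraction of patients for whom every tool gives the same call."""
    tools = list(calls_by_tool)
    n = len(calls_by_tool[tools[0]])
    if any(len(calls_by_tool[t]) != n for t in tools):
        raise ValueError("all tools must have one call per patient")
    agree = sum(
        1 for i in range(n) if len({calls_by_tool[t][i] for t in tools}) == 1
    )
    return agree / n


def pairwise_concordance(calls_by_tool):
    """Fraction of identical calls for every pair of tools."""
    result = {}
    for a, b in combinations(calls_by_tool, 2):
        pairs = list(zip(calls_by_tool[a], calls_by_tool[b]))
        result[(a, b)] = sum(x == y for x, y in pairs) / len(pairs)
    return result


# Hypothetical calls for four patients and three tools (illustration only).
toy_calls = {
    "tool_A": ["R5", "R5", "X4", "R5"],
    "tool_B": ["R5", "X4", "X4", "R5"],
    "tool_C": ["R5", "R5", "X4", "X4"],
}
```

Overall concordance can never exceed the lowest pairwise value, which helps explain how pairwise agreement of 70.5% to 86.4% can coexist with agreement across all five tools of only 58.2%.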

Impact of baseline tropism on ART outcome
Of the 352 patients, 33 patients had died, 28 were LTFU and 15 had a missing VL, at month six. Of the remaining 276 (78.4%) patients, 37 (12.7%) had >1000 copies/ml (mean VL log10, copies/ml: 5.06; range: 3.1-7.0). By OT bivariate analysis, there was no significant difference in achieving viral suppression at month six between patients with R5 or X4 virus at baseline, by any method (Table 3).
Among the 33 patients who were reported to have died at month 6, the proportion of X4 infected patients was 42.4%, 27.3%, 15.2%, 33.3% and 15.2% by G2P clinical, G2P clonal, PhenoSeq-C, C-PSSM and Raymond, respectively. Among the 28 LTFU patients the corresponding figures were 14.3% for G2P clinical, G2P clonal and PhenoSeq-C, 17.9% for C-PSSM and 3.6% for Raymond. When patients were grouped as alive, dead or LTFU at month 6, a statistically significant difference (p = 0.001) was found between the groups only when tropism was predicted by G2P clinical. At month 12, a further 10 patients had died, 28 were LTFU and 55 (21.7%) had a missing VL. Of the remaining 198 (56.3%) patients, 22 (8.6%) had >1000 copies/ml (mean VL: 4.9; range: 3.4-6.8). Thus, 176 out of 352 (50.0%) reached treatment success in an ITT analysis at month 12. By OT bivariate analysis, X4 infected patients, as predicted by C-PSSM, achieved viral suppression less frequently than R5 infected patients, although the difference was not statistically significant (79.4% vs 90.2%, P = 0.08). However, no difference was shown between R5 and X4 infected patients at months six and 12 in multivariable OT analysis (Table 3).
In bivariate ITT analysis, a significant difference in treatment outcome between R5 and X4 infected patients was observed at month six by the Geno2Pheno clinical model (77.8% vs 58.7%, P = 0.004) and at month 12 by C-PSSM (61.9% vs 46.6%, P = 0.038) (Table 4).
The Geno2Pheno clinical finding did not remain significant in the multivariable analysis, when adjusted for age, gender, baseline CD4 and VL. Instead, tropism as predicted by C-PSSM had an impact at month 12 in the multivariable ITT analysis, with patients with R5 tropism having 2.47 times higher odds of achieving VL suppression than patients with X4 tropism (P = 0.04, OR 2.47, 95% CI 1.05-5.79) (Table 5).
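For readers less familiar with logistic regression output, the sketch below shows the usual relation between a regression coefficient, its standard error and the reported odds ratio with a Wald-type 95% confidence interval. It is an illustration under assumptions: the function names are hypothetical, and because the reported results were adjusted for design effect, the published interval need not be an exact Wald interval.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a log-odds coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def back_compute(odds_ratio, lower, upper, z=1.96):
    """Approximate coefficient and standard error implied by a reported
    odds ratio and its confidence limits (assumes a symmetric interval
    on the log scale)."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Values reported above for C-PSSM at month 12 (ITT, multivariable).
beta, se = back_compute(2.47, 1.05, 5.79)
or_, lo, hi = odds_ratio_ci(beta, se)
```

Under these assumptions the reported interval is close to symmetric around the log odds ratio, and its lower limit lies only just above 1, which is consistent with the borderline P value of 0.04.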

Tropism switch at months six and 12
At month six, virological failure occurred in 37 randomized patients and in seven non-randomized additional patients who were included only for the study of coreceptor switch. Plasma samples at baseline and at failure were available in 41 of them. At baseline, V3 sequencing was successful in 34 patients. The most frequent R5 tropic virus prediction was by Raymond's algorithm (30/34, 88.2%) and the lowest rate by C-PSSM (26/34, 76.5%). The most frequent

[24][25][26]. In order to compare the clinical usefulness of five commonly used bioinformatics tools in HIV-1C infected patients, we analysed the V3 loop sequences derived from a large cohort of HIV-1C infected Ethiopians and compared the predicted tropism with clinical data and ART outcomes.
The bioinformatics methods have been developed largely on HIV-1B data, although PhenoSeq-C [18] and C-PSSM [19] have been trained on HIV-1C. In our study, each bioinformatics tool predicted a similar prevalence of R5 viruses in the HIV-1C infected Ethiopians. This is in concordance with previous comparative studies claiming a reliable performance of GTT tools [27][28][29], with none clearly performing better than the others [30] when compared to phenotypic assays. Studies comparing the concordance between genotypic tools are however scarce. In our study, large differences were found between the five bioinformatics tools with an overall concordance of 58.2% in 352 treatment-naïve patients. This discrepancy is likely due to the use of different statistical models and to differences in, e.g., how insertions, deletions and ambiguous positions are handled [31]. In agreement with our finding, one study which employed Geno2Pheno (clinical as well as clonal), PSSM, and Raymond's algorithm reported concordance in 29 of 50 samples (58%) [30]. In contrast, the prediction in patients failing ART seemed to have a better concordance, although only few patients were analysed. A preferential X4-tropic dependent elimination during ART has been reported [14]. Also, we found that a switch from X4 to R5 viruses at failure at month six was more common with C-PSSM. If these selection events result in a more homogeneous viral population at failure, in connection with a lower viral load, this could possibly explain the higher concordance between the methods. When comparing the prediction tools pairwise, the C-PSSM and Geno2Pheno clonal models were highly concordant (86.4%), in line with previous studies which demonstrated the best concordance (>85%) between these two tools among several [32][33][34]. A V3 length other than 35 amino acids has been reported to be the only factor independently associated with prediction disagreement for the Geno2Pheno clonal model and PSSM (Seclen et al. 2011). However, we could not find any association of any factor, including the V3 length, with prediction disagreement in our data set. It should be noted that in Seclen's study nearly 20% of the V3 sequences analysed had a length other than 35 amino acids, while in our study the vast majority of V3 sequences (94.6%) had a length of 35 amino acids.
Data regarding the impact of baseline tropism on first line, non-maraviroc containing, ART outcome are scarce. Some reports have described poorer viral load suppression or CD4+ T-cell increase in patients with X4 virus at baseline [35][36][37][38], while others show similar rates [39,40]. In addition, a recent study showed that the presence of CXCR4-using viruses was associated with the virological failure of antiretroviral treatment initiated during primary HIV infection [41]. The clinical value of GTT in HIV-1B has thus not clearly been shown for non-maraviroc containing ART. Also, it has been claimed that tools based on HIV-1B perform well for HIV-1C [20], while others report the contrary [19]. For most comparisons, we did not find any significant association of the predicted coreceptor usage with clinical parameters or outcome of ART. The presence of R5 virus was associated with higher CD4+ T-cells only with the Geno2Pheno clinical model, which also correlated with the month six ART outcome in a bivariate analysis. However, in the multivariable ITT analysis only tropism predicted by C-PSSM correlated with ART outcome, at month 12. Thus, the clinical value of GTT for predicting outcome of ART is limited in an Ethiopian setting. To the best of our knowledge, this is the first study to analyse the impact of baseline viral tropism on first line ART outcome, using a large data set from a cohort of exclusively HIV-1C infected patients.
There are some shortcomings in our study. We classified patients who were predicted to have a mixed R5X4 infection as having X4 virus in the statistical analysis, although no clonal analysis was done, since X4 viruses are uncommon in HIV-1C patients. However, only about 2% of the samples exhibited such a pattern by each prediction method, and this did not affect the results. We did not compare our genotypic data with a phenotypic method, which is usually considered the gold standard, so determination of sensitivity and specificity of the GTT methods for HIV-1C was not possible. However, the purpose of our study was rather to compare the results with clinical information in the largest cohort of HIV-1C patients analysed so far with GTT. Consequently, we could not identify a single GTT tool that correlated best with clinical outcome in HIV-1C patients, although tropism predicted by C-PSSM correlated with outcome at month 12 in an ITT analysis. Moreover, we did not adjust for multiple comparisons since this study focused on only a few scientifically sensible comparisons, rather than every possible comparison. For this reason there is a risk of false positive findings.

Conclusion
Each of the GTT tools predicted R5 tropism with comparably high frequency in HIV-1C infected Ethiopians, but there was a large discordance and only 1.7% were X4-tropic by all five methods. Baseline tropism had an impact on the outcome of first line ART only in the multivariable ITT analysis based on C-PSSM, at month 12, although no difference was found at month 6. Thus, among all of the tropism prediction methods tested, C-PSSM could possibly be used to predict outcome of ART in HIV-1C infected patients.

Supporting information
S1 File. Demographic and clinical information of study subjects at baseline and follow-up points (month 6 and month 12). (XLSX)