Presence, persistence and effects of pre-treatment HIV-1 drug resistance variants detected using next generation sequencing: A Retrospective longitudinal study from rural coastal Kenya

Background The epidemiology of HIV-1 drug resistance (HIVDR) determined by Sanger capillary sequencing, has been widely studied. However, much less is known about HIVDR detected using next generation sequencing (NGS) methods. We aimed to determine the presence, persistence and effect of pre-treatment HIVDR variants detected using NGS in HIV-1 infected antiretroviral treatment (ART) naïve participants from rural Coastal Kenya. Methods In a retrospective longitudinal study, samples from HIV-1 infected participants collected prior [n = 2 time-points] and after [n = 1 time-point] ART initiation were considered. An ultra-deep amplicon-based NGS assay, calling for nucleotide variants at >2.0% frequency of viral population, was used. Suspected virologic failure (sVF) was defined as a one-off HIV-1 viral load of >1000 copies/ml whilst on ART. Results Of the 50 eligible participants, 12 (24.0% [95% CI: 13.1–38.2]) had at least one detectable pre-treatment HIVDR variant against Protease Inhibitors (PIs, n = 6 [12%]), Nucleoside Reverse Transcriptase Inhibitors (NRTIs, n = 4 [8.0%]) and Non-NRTIs (n = 3 [6.0%]). Overall, 15 pre-treatment resistance variants were detected (frequency, range: 2.3–92.0%). A positive correlation was observed between mutation frequency and absolute load for NRTI and/or NNRTI variants (r = 0.761 [p = 0.028]), but not for PI variants (r = -0.117 [p = 0.803]). Participants with pre-treatment NRTI and/or NNRTI resistance had increased odds of sVF (OR = 6.0; 95% CI = 1.0–36.9; p = 0.054). Conclusions Using NGS, pre-treatment resistance variants were common, though observed PI variants were unlikely transmitted, but rather probably generated de novo. Even when detected from a low frequency, pre-treatment NRTI and/or NNRTI resistance variants may adversely affect treatment outcomes.


Conclusions
Using NGS, pre-treatment resistance variants were common, though observed PI variants were unlikely transmitted, but rather probably generated de novo. Even when detected from a low frequency, pre-treatment NRTI and/or NNRTI resistance variants may adversely affect treatment outcomes.

Background
Highly active antiretroviral therapy (HAART) can successfully suppress HIV replication in vivo. However, the expansion of HAART, as expected, has led to a parallel increase in the emergence and transmission of HIV drug resistance (HIVDR) mutations [1,2]. Transmitted HIVDR viruses confer reduced susceptibility to, and are known to compromise the effectiveness of first-line HAART regimens [3,4]. The prevalence of transmitted resistance is reported to be significantly higher but stable in high-income countries, and low but on the increase in low/middle income countries [1,2]. Consequently, genotypic antiretroviral resistance testing (GART) for HIVDR is currently recommended to guide clinical care in developed settings [5], and for surveillance purposes in resource limited settings [6]. However, the inability of conventional GART using Sanger sequencing to detect resistant variants at frequencies <20% of the circulating viral quasispecies [7,8] likely leads to an underestimation in the actual burden of HIVDR [9].
Highly sensitive methods, including codon-specific point mutation assays (allele specific PCR [AS-PCR] and Oligonucleotide Ligation Assays [OLA]) and partial/whole genome ultradeep sequencing assays are becoming increasingly available and cost effective [10]. Compared to Sanger capillary sequencing, these ultrasensitive assays have the ability to detect HIVDR variants at low frequencies of <1% circulating quasispecies [11][12][13]. A systematic review and meta-analysis of HIV-1 low frequency variants studies, all from HIV-1 subtype B predominant regions and comparing AS-PCR and GART estimates, reported up to an 8-fold higher detection of some transmitted resistance variants by AS-PCR [14]. Importantly, a pooled analysis of individual data from 10 studies, all from North America and Europe, reported a significant dose-response association between presence of HIV-1 low frequency variants, particularly those conferring reduced susceptibility to NNRTIs, and increased risk of virologic failure [15], even amongst patients with high levels of adherence [16].
Only a handful of studies, all in the context of women receiving single dose Nevirapine for prevention of vertical HIV-1 transmission, and all using point mutation assays, have been reported from Africa, where the burden of HIV-1 is highest [17][18][19][20]. More recently, a study using the point mutation OLA for NNRTI variants K103N, Y181C and G190A reported a tenfold increased risk of virological failure among treatment naïve adults initiated HAART in Kenya [21].
Hence, whilst the epidemiology of transmitted HIVDR by GART has been widely studied, much less is known about HIVDR variants detected from low frequencies using next generation sequencing (NGS) assays. The few existing studies are largely from developed settings, and applied codon-specific point mutation assays particularly targeting NRTI-associated M184V and NNRTI-associated K103N, Y181C and G190A mutations. To date, about 93 transmitted HIVDR mutations conferring reduced susceptibility to NRTIs (n = 34), NNRTIs (n = 19) and protease inhibitors (PIs, n = 40) have been identified and are recommended for surveillance by the WHO [22]. We aimed to describe the presence, persistence and effect of HIVDR variants detected using an ultra-deep amplicon-based NGS assay in chronically HIV-1 infected ART-naïve individuals from a rural HIV clinic in Coastal Kenya, an HIV-1 non-B subtype predominant region.

Study site
Participants were recruited from the HIV clinic at Kilifi County Hospital, a public health facility located in rural Coastal Kenya. The hospital has a catchment population of an estimated 260,000 people, majority of whom practice subsistence farming as the mainstay socio-economic activity. HIV care and treatment services in the hospital are provided according to national guidelines [23]. In brief and at the time of the study, HAART eligibility included individuals with WHO clinical staging III/IV regardless of CD4 T-cell count, or individuals with CD4 T-cell count of <350 cells/mm 3 regardless of WHO clinical staging, or both. Standard first-line regimen comprised two nucleoside reverse transcriptase inhibitors (NRTI) and a non-nucleoside reverse transcriptase inhibitor (NNRTI). Individuals failing first-line regimen were switched to a second line regimen comprising two NRTIs and a boosted protease inhibitor (PI). Previous work suggests that this is largely a non-B subtype region, with pre-dominance of HIV-1 subtypes A1, C and D [24].

Study design
A retrospective longitudinal study design, nested in a facility-based HIV surveillance cohort was adopted. In brief, HIV-infected, newly diagnosed individuals enrolling for care at the clinic were recruited and followed up between 2008 and 2013. All participants reported no previous exposure to antiretroviral therapy, including prevention of mother to child transmission (pMTCT) interventions, at enrolment into care. Remnant blood samples from routine diagnostic investigations were used to obtain plasma, which was archived at -80˚C.
For the purpose of this study, ART-naïve individuals with at least three longitudinal archived samples that met the following criteria were considered: i) an enrolment sample (collected at enrolment into care and at least 6 months prior to treatment initiation); ii) an ART baseline sample (collected at most 3 months prior to treatment initiation); and iii) an on-treatment sample (collected at least 6 months after treatment initiation).

Laboratory methods
i) RNA extraction, PCR amplification and next generation sequencing. These have been previously described in detail elsewhere [25,26]. In brief, HIV-1 viral RNA was manually extracted using the QIAamp UltraSens protocol (QIAGEN, Hilden Germany), with a starting volume of 200 μL of plasma and eluted in 60 μL Buffer AVE.
An amplicon-based approach, targeting the pol subgenomic region of HIV-1 containing protease and part of reverse transcriptase gene (between positions 2250-3550 of the reference HxB2 sequence) was used. The fragment was amplified by single-step reverse transcription (RT)-PCR followed by a nested PCR reaction. For samples that failed to amplify, a second repeat PCR was attempted. Archived plasma was retrieved and a repeat RNA extraction done on samples that failed the repeat PCR. Then, a third repeat PCR attempt was done. Results from the repeat extraction and PCR attempts are summarised in the supplementary material (S1 Fig). Nested PCR products were quantified with the Qubit 2.0 fluorometer using the Quant-iT assay kit (Life technologies). One ng/μl of the amplified DNA was used for library preparation with the Nextera XT DNA sample prep kit (Illumina).
NGS was performed on the MiSeq system (Illumina). In brief, samples concentrations were validated and standards added to the consolidation plate. The consolidation plate was then replicated and quantified to two dilution plates of 1ng/ul and 0.2 ng/ul respectively. The second dilution plate was further replicated into a Nextera XT library preparation plate where adapter sequences are attached for tagmentation and PCR amplification. The amplified plate underwent normalization and fragmentation into a pooled amplicon library (PAL) plate, which was quantified and further reconstituted into the diluted amplicon library (DAL). DAL was then loaded and attached onto a HiSeq flow cell and reagents, which were then subsequently loaded onto a MiSeq catridge for sequencing. Short read fragments were generated in a standard FASTQ format ii) Bioinformatics and sequence handling. This was done using an automated computational pipeline developed in-house using Python and C++ and run through a workflow in Galaxy. In brief, the raw FASTQ data were taken through quality trims and filters. A subset of the reads from each FASTQ file was compared to a local database of HIV reference sequences using BLAST in order to identify an optimum reference sequence. Reference mapping was then performed using BWA-MEM (version 0.7.5). The resulting files were converted into BAM format using SAMTools in preparation for the in-house developed QuasiBAM software, which generated consensus sequences with user-specified variant thresholds for the inclusion of minority nucleotides. For the purpose of this study, HIV-1 drug resistance mutations were called if they had nucleotide variants occurring at frequencies >2.0% of the sample quasispecies. While this threshold has been shown to yield better reproducibility in the detection of HIV-1 drug resistance variants generated from NGS assays [26], it may still not represent an accurate estimate of the frequency of circulating variants, especially in samples with low viral loads, as the actual number of viral templates input used in the sequencing reaction was not empirically determined.
iii) HIV-1 subtyping and drug resistance interpretation. Consensus sequences were submitted to the REGA HIV-1 subtyping tool v3.0 [27], and cross referenced with the subtype classification using Evolutionary Algorithm (SCUEAL) tool on Datamonkey [28]. Indeterminate or discordant sequences were further submitted to the jumping profile hidden Markov Model (jphMM) tool [29] for determination of recombinants.
Consensus sequences were submitted to the Stanford HIVDR database to identify resistance mutations and interpret genotypic susceptibility [30]. Transmitted and acquired HIVDR mutations were confirmed using the list for surveillance of transmitted drug resistance and the updated list of major resistance mutations from the HIV Drug Resistance Database respectively [22,31]. iv) HIV-1 viral load quantification. HIV-1 RNA viral load quantification was done using an in-house real time qPCR assay with a detection threshold of 400 copies/ml. In brief, 25ul reactions of 12.5 ul RNA extract and 12.5 ul mastermix of the QIAGEN QuantiTect kit, probes and a set of integrase forward and reverse primers were used. Quantification was done on the ABI Prism 7500 real time PCR machine and results interpreted based on a standard curve with calibrated in-house standards. A one-off viral load was used to determine suspected virologic failure (sVF), defined as HIV-1 RNA viral load of >1000 copies/ml whilst on HAART.

Data analysis
Data analysis was done using Stata/SE version 13.1 (Stata corp. TX). The overall prevalence of pre-treatment HIVDR (95% confidence intervals, CI) was determined as a percentage of participants with at least one detectable resistance variant at >2.0% frequency in the two pretreatment time-points over the total number of participants included in the analysis. The prevalence estimate was further stratified by time points and antiretroviral drug classes. In addition, the overall and drug-class specific relationship between mutational frequencies and absolute mutational load (as a product of mutational frequencies and viral load) was assessed using Pearson's correlation coefficients. Logistic regression was used to determine the effect of pre-treatment HIVDR on sVF.

Ethical considerations
Science and Ethics approvals were granted by the National Scientific Steering Committee and the Ethics and Review Committee of the Kenya Medical Research Institute (SSC No. 1341). All participants gave written informed consent to participate in the study. All pre-treatment consensus HIV-1 pol sequence fasta files are available from GenBank (Accession numbers MH575294-MH575373).
Three longitudinal samples were retrieved from each participant at a median duration of 11.6 (min/max: 6.4-42.2) months prior to ART initiation, 1.2 (0.0-2.9) months prior to ART initiation and 21.3 (6.1-45.9) months after ART initiation, respectively. The median CD4 Tcell count dropped from the first time-point to the second-time point prior to ART initiation but increased after ART initiation as expected. Similarly, the median BMI was relatively low in the first and second time points prior to ART initiation but increased after treatment initiation ( Table 2).

Presence of HIV-1 pre-treatment drug resistance variants
We were able to successfully amplify and sequence at least one sample from all the 50 participants, either from the first (n = 48) or the second (n = 35) time points prior to ART initiation. There were no significant differences in successful amplification of time- . This was also consistent within drug classes (Fig 1[A]). High-level resistance was observed against Nevirapine, Efavirenz, Lamivudine and Emtricitabine, whilst intermediate level resistance was observed against Zidovudine and Efavirenz (Fig 1[B]). Overall, 15 mutations (frequency, range: 2.3-92.0%) were detected conferring reduced susceptibility to PIs (n = 7), NRTIs (n = 4) and NNRTIs (n = 4), with the most common being the PIs-associated D30N (n = 2) and the NNRTIs-associated G190A (n = 2). Only one participant had dual class resistance with mutations against both NRTIs (D67G) and NNRTIs (V106IM). Similarly, only one participant had more than one resistance variant within a single drug class (NNRTIs; K103N and G190A) ( Table 3). There were no significant differences in the presence of pre-treatment HIVDR by HIV-1 subtype; 18.8% of individuals with HIV-1 subtype A1 had pre-treatment HIVDR, compared to 33.3% of those with HIV-1 non-A1 subtypes (p = 0.246).

Persistence of pre-treatment HIV-1 drug resistance variants
Of the 50 participants included in the study, we were able to successfully amplify and sequence pre-treatment longitudinal pairs (time points one and two) from 32 (64.0%) participants. In a phylogenetic tree, all the paired individual sequences clustered together (Fig 2). Of these, five had at least one detectable HIVDR variant in either (or both) of the two pre-treatment time points.  (Table 3).

Table 3. Characteristics and distribution of participants with HIV-1 pre-treatment drug resistance variants detected by next generation sequencing in a rural HIV clinic in Coastal Kenya (N = 50
Overall, presence of any pre-treatment HIVDR did not have an effect on sVF (OR [95% CI], p-value: 1.7 [0.3-8.2], p = 0.527). However, participants with pre-treatment NRTI and/or NNRTI resistance had a modest six-fold increased odds of sVF, compared to those without pre-treatment NRTI and/or NNRTI resistance (OR [95% CI], p = value: 6.0 [1.0-36.9], p = 0.054).
We were able to amplify and sequence six of the nine samples with sVF. Of these, 3 (50.0%) had acquired at least one HIVDR variant, with two having dual class resistance mutations to NRTIs and NNRTIs (Table 4). Amongst those with detectable acquired resistance, all had intermediate to high-level resistance to Nevirapine and Efavirenz. No PI-associated mutations were detected, with all the participants remaining fully susceptible to recommended second line PI options.

Discussion
Our data from rural Coastal Kenya suggest that about one in four ART naive patients with chronic HIV infection have at least one surveillance HIVDR mutation detected at >2% frequencies using highly sensitive NGS methods. This represents an overall 10-20 fold increased detection of resistance variants at lower thresholds, when compared to recent estimates from the same setting but determined by conventional GART using Sanger capillary sequencing [32]. There was no significant difference between estimates from the two studies, when compared at >20% threshold, confirming that the increased fold difference is attributed to detection of additional low frequency variants.
Given the wide range of pre-treatment mutations observed, even within drug classes, use of codon specific point mutation assays, mostly targeting amino acid positions K103, Y181, G190 and M184 would have significantly underestimated the overall prevalence of surveillance HIVDR mutations determined by NGS in this setting. Indeed, recent OLA data for NNRTI variants K103N, Y181C and G190A from a Kenyan study report pre-treatment HIVDR estimates of 3.9% amongst ART-naïve individuals [21], which is significantly underestimated HIV-1 drug resistance variants detected using next generation sequencing from rural Kenya when compared to this study, and may warrant consideration for the screening for all WHO recommended drug resistance mutations by NGS, at least for surveillance purposes. However, given resource constraints and technical capacity challenges, it is also important to acknowledge that OLA may present a more feasible option for routine diagnostics in resource limited settings, compared to NGS platforms.
The observed decline in the prevalence of detectable pre-treatment resistance variants from the first time-point to the second time-point, especially within reverse transcriptase inhibitors variants, is likely attributed to decreased viral fitness of transmitted variants and the subsequent replacement by wild type virus with better replicative capacity [33,34]. Indeed, cross sectional GART data from treatment experienced patients in this setting report prevalent acquired mutations at codon positions M184, K103, Y181 and G190 [35], which would be expected to be propagated to new infections. Given that the sampled population is largely chronic, presence of the relatively stable and persistent NNRTI mutations G190A and K103N, together with the thymidine analogue mutations K70R and D67G, and the absence of the less stable M184V and Y181C mutations, which have high fitness costs and may have decayed to very low levels below the limit of detection of our assay, is therefore not surprising [33,34].
High levels of pre-treatment PI mutations compared to NRTI and NNRTI mutations were observed, which is consistent with data from some developed [13,36] and developing settings [37]. Whilst these may, in part, be attributed to the high levels of PI coverage in developed settings, the same may not apply in our setting where PIs were only recommended as a second line option and coverage was low, at less than 5%, at the time of the study. Most of the observed PI mutations were sporadic in appearance, occurred in isolation and at very low frequencies of <5%. In addition, whilst there was a positive correlation between NRTI and NNRTI mutation frequency and absolute mutation load, no correlation was observed amongst PI mutations. These data therefore suggest that the observed PI mutations, at least in our context, are likely to have developed de novo from viral diversification, and strengthens the suggestion that absolute mutation load may likely be a better indicator for identifying surveillance HIVDR at low frequencies [15,38]. A good proportion of our pre-treatment samples failed to successfully amplify, despite multiple attempts at RNA extraction and PCR amplification. Given that almost all the samples that failed to amplify were from the second time point, it is likely that these participants may have been initiated on HAART elsewhere and after our time point 1 visit, thus achieved virologic suppression by the time they were returning to our clinic for the second time point sampling.
The observed levels of sVF are consistent with literature from low-and middle-income settings [39] and comparable to what has been previously reported in this setting [35]. Despite the low numbers, participants with a pre-treatment NRTI or NNRTI mutation demonstrated a six-fold increased odds of sVF. However, and in spite of the presence of pre-treatment NRTI or NNRTI mutations in some participants, a significant drop in their viral load was observed, with some achieving virologic suppression after treatment initiation, which may be attributed to the multiple positive effects of combination antiretroviral therapy. Thus, it remains unclear if those with sVF were a result of partial suppression with on-going viral replication, or complete suppression followed by virologic rebound secondary to acquired resistance.
Unlike the pre-treatment variants, most of the acquired resistance mutations out-competed wild type virus and constituted more than 90% of the circulating quasispecies, which may be a consequence of selection by antiretroviral therapy. Of particular interest is the absence of the G190A mutation after treatment initiation in a participant who had this mutation at enrolment into care. Noteworthy is that the mutation had undergone viral decay and was replaced with wild type virus even before initiation of antiretroviral therapy, which may explain its absence after treatment initiation. As we did not quantify viral templates inputted into each RT reaction, it is also possible that the G190A variant was present as a minority variant and was detected at a higher frequency due to differential input of viral templates. Thus, the resulting sVF may therefore be explained by non-adhrence to HAART. This finding is consistent with data from other studies reporting absence of most variants detected prior to treatment initiation during virologic failure [40] and warrant further investigations on the re-emergence and potential effects of transmitted variants that revert to the wild type and are undetectable at treatment initiation, even at low frequencies, on long term treatment outcomes.
Of concern, however, is the presence of dual class resistance mutations conferring reduced susceptibility to both NRTIs and NNRTIs in two participants around a year after starting therapy, with one of these having multiple thymidine analogue mutations D67N, K70R, T215F, K219E, and the multidrug resistance T69N variant. These participants did not have any pretreatment resistance, suggesting that the acquired mutations are likely to be due to sub-optimal adherence [41,42], underpinning the vital role of adherence support in ART programs in resource limited settings, especially in light of recent policy move to put all HIV-infected individuals on HAART [43]. The expected absence of PI mutations after first line non-PI regimen, even at low frequencies, complements evidence supporting continued recommendation of PIbased second line regimen in this setting.
Our study had a few inherent limitations. Firstly, we had a small sample size with few longitudinal time points, which results to wide uncertainties in our observations. As such, it is likely that the observed sVF may in part be attributed to pre-treatment mutations at frequencies >20% or acquired mutations resulting from sub-optimal adherence. Secondly, we studied a largely chronic population with long-standing HIV infection. Hence, it is likely that we have missed out less stable transmitted variants that may have been out-competed by wild type virus by the time of study enrolment [33,44], which may have resulted in an underestimation of the prevalence of pre-treatment resistance in this population. Lastly, we used a conservative threshold of >2% frequency to effectively exclude random mutations arising from errors related with reverse transcription, PCR amplification and sequencing artefacts [45]. However, the presence of genuine pre-treatment HIVDR variants occurring at <2% frequency cannot be ruled out.
In conclusion, and limitations notwithstanding, our pilot data from a rural HIV clinic in Coastal Kenya suggest significantly higher levels of pre-treatment HIVDR detected by a more sensitive NGS method. Results from this non-subtype B predominant region with low PI coverage suggest that the pre-treatment PI variants observed are unlikely transmitted, but rather probably de novo generated. Whilst the study was not powered to determine an independent effect of pre-treatment low frequency variants on treatment outcomes, these data point towards a possible deleterious effect of NRTI and/or NNRTI mutations detected by NGS from a low frequency on treatment outcomes, though their pathways and cut-off thresholds of clinical significance remain unknown. Bigger longitudinal studies are therefore warranted to clearly delineate the effect and independent pathways of pre-treatment resistance variants detected by NGS from low frequency thresholds on treatment responses. There is also need to assess the performance of mutational frequency and absolute mutational load on treatment outcomes and to empirically determine their cut-off thresholds of clinical significance.