Progression to AIDS in South Africa Is Associated with both Reverting and Compensatory Viral Mutations

We lack the understanding of why HIV-infected individuals in South Africa progress to AIDS. We hypothesised that in end-stage disease there is a shifting dynamic between T cell imposed immunity and viral immune escape, which, through both compensatory and reverting viral mutations, results in increased viral fitness, elevated plasma viral loads and disease progression. We explored how T cell responses, viral adaptation and viral fitness inter-relate in South African cohorts recruited from Bloemfontein, the Free State (n = 278) and Durban, KwaZulu-Natal (n = 775). Immune responses were measured by γ-interferon ELISPOT assays. HLA-associated viral polymorphisms were determined using phylogenetically corrected techniques, and viral replication capacity (VRC) was measured by comparing the growth rate of gag-protease recombinant viruses against recombinant NL4-3 viruses. We report that in advanced disease (CD4 counts <100 cells/µl), T cell responses narrow, with a relative decline in Gag-directed responses (p<0.0001). This is associated with preserved selection pressure at specific viral amino acids (e.g., the T242N polymorphism within the HLA-B*57/5801 restricted TW10 epitope), but with reversion at other sites (e.g., the T186S polymorphism within the HLA-B*8101 restricted TL9 epitope), most notably in Gag and suggestive of “immune relaxation”. The median VRC from patients with CD4 counts <100 cells/µl was higher than from patients with CD4 counts ≥500 cells/µl (91.15% versus 85.19%, p = 0.0004), potentially explaining the rise in viral load associated with disease progression. Mutations at HIV Gag T186S and T242N reduced VRC, however, in advanced disease only the T242N mutants demonstrated increasing VRC, and were associated with compensatory mutations (p = 0.013). These data provide novel insights into the mechanisms of HIV disease progression in South Africa. 
Restoration of fitness correlates with loss of viral control in late disease, with evidence for both preserved and relaxed selection pressure across the HIV genome. Interventions that maintain viral fitness costs could potentially slow progression.


Introduction
With few exceptions, untreated individuals infected with Human Immunodeficiency Virus Type 1 (HIV-1) develop Acquired Immunodeficiency Syndrome (AIDS), associated with opportunistic infections, malignancies and, eventually, death. Some patients progress to AIDS quickly, whilst others maintain undetectable plasma viral loads without therapy and do not become unwell for many years. Deciphering the correlates of this heterogeneous protection is important, as there are implications for the design of vaccines and other interventions.
The pace of HIV disease progression is multifactorial: a mixture of host and pathogen genetics combined with factors such as the immune response and viral adaptation. In genome-wide association studies a limited number of SNPs and alleles correlate with lower viral loads [1] [2], and HLA Class I and the human MHC associate reproducibly [3]. The role of the cell-mediated immune system in HIV-associated disease has received much scrutiny, especially the effect of different HLA Class I alleles. Well-documented examples include the protection conferred by HLA B*57 and B*27 [4] [5] in Caucasian individuals and HLA B*5801 and B*8101 in patients from South Africa [6].
What determines this differential HLA Class I effect is unclear. 'Beneficial' HLA Class I alleles may be associated with T cell clones with broader cross-reactivity to viral variants due to reduced thymic selection [7], and thus broader and more pervasive selection pressure. However, HIV is adept at adapting to selection pressures invoked by both antiretroviral drugs (ARVs) and the immune system in the forms of drug resistance [8] [9] and immune escape mutations [10] [11], respectively. The latter are widespread across the HIV-1 genome [12][13] [14], and may influence outcome in individual patients [15] and across different populations [16].
Escape from an effective immune response is determined by the strength of the imposed selection pressure and may explain why the prevalence of HLA-associated polymorphisms is greater for HLA Class I alleles associated with protection [17]. Although the adapted virus maintains a fitness advantage in the presence of the selection pressure conferred by cytotoxic T cells (CTL), there may be a significant drop in replicative capacity compared to a wildtype virus in a selection-free environment [18] [19]. We, and others, have previously hypothesised that immune escape may therefore result in the maintenance of relatively lower viral loads and clinical advantage [17] [20,21]. This is supported by high reversion rates of escape mutations selected by 'beneficial' HLA Class I alleles following transmission to HLA-mismatched recipients [22] [23].
These interactions between HLA Class I imposed selection, viral escape and viral fitness are together likely to influence clinical progression; however, the mechanisms that lead to progression to late disease and AIDS are poorly defined. We hypothesised that the nature of these interactions can be understood better by investigating patients with late-stage HIV infection to determine if, or how, CD8+ve T cells are maintaining selection pressure, and whether the virus shows different patterns of adaptation and fitness costs compared to patients with earlier infection. There are limited reports that, despite the loss of CD4+ve cells, CD8+ve T cells may still be functional in AIDS, although with varied avidity, less polyfunctionality and less differentiation [24], and targeting Env rather than Gag [25]. We proposed that if CD8 T cell pressure is 'relaxed' due to HIV-induced immunodeficiency this might facilitate reversion of costly escape mutations, leading to a restoration of viral fitness and the subsequent rise in viraemia seen in AIDS. Reversion of costly drug resistance mutations has been associated with a rise in viral load and clinical progression [26], and therefore a precedent exists to potentially explain the rise in viral load associated with the onset of AIDS. Alternatively, CTL pressure may be maintained, but the virus might develop secondary compensatory mutations, which offset the replicative cost of the initial escape mutation. Compensatory mutations have been reported in chronic infection [27,28], but whether they explain the AIDS-associated rise in viraemia is not known.
In this study we question whether progression to AIDS is associated with an increase in viral fitness, and whether this is related to changes in T cell imposed selection pressure. We start by comparing T cell ELISPOT responses in patients across different CD4 count strata, and then examine how the variation in these responses correlates with different patterns of selection across the HIV genome in patients with very low CD4 cell counts compared with less progressed infections. Finally, we measure the viral fitness of these variants to correlate the replicative cost of adaptation with outcome. We investigated 1053 untreated HIV-1 infected South African individuals from Bloemfontein and Durban, and found that viral fitness was greater in patients with advanced disease with examples of both viral compensatory mutations and of reduction in escape mutations, the latter possibly due to 'immune relaxation'. Despite the complex interplay between host and pathogen, these data support a key role for the restoration of viral fitness in association with AIDS progression and help to explain the rise in viraemia in terminal stages.

Ethics Statement
Full ethical approval was gained for the study of both cohorts. Written informed consent was provided by all study participants. The study ethics for the Bloemfontein cohort was approved by the University of the Free State (ETOVS 10/04 and ETOVS 206/05) and for the Durban cohort by the University of KwaZulu-Natal Review Board.

Study subjects
Participants from two South African cohorts were studied (total n = 1053): the 'Bloemfontein' cohort from the Free State (n = 278) and the 'Durban' cohort from KwaZulu-Natal (n = 775). Both are established cohorts, described elsewhere [29] [6]. In summary, each cohort comprised antiretroviral-naïve, chronically HIV-1 subtype C-infected adults from neighbouring provinces in the central east region of South Africa. Of these, 916 patients with data on CD4 cell count, HLA class I type and either HIV gag, pol or nef sequences were included in the analysis of HLA-linked polymorphisms, and comprised the 'total' cohort. Subsequent analyses involved stratification of the patients according to HLA type, plasma viral load and CD4 T cell count. In the CD4 T cell count stratification, patients were assigned to either 'High CD4' (CD4 T cell counts >500 cells/µl, n = 299) or 'Low CD4' (CD4 T cell counts <100 cells/µl, n = 196).

HLA typing
Participants' HLA Class I type was determined to the oligoallelic level using Dynal RELI™ Reverse Sequence-Specific Oligonucleotide kits for the HLA-A, -B and -C loci (Dynal Biotech). To obtain four-digit typing, Dynal Biotech Sequence-Specific priming kits were used, in conjunction with the Sequence-Specific Oligonucleotide type.

T Cell ELISPOT assays
Peptides, 18 amino acids in length (n = 410), overlapping by 10 amino acids, and spanning the entire expressed HIV genome, were synthesized based on the consensus of available C-clade sequences in 2001. These peptides were used in a 'mega-matrix' of 11-12 peptides per pool to test patient samples for HIV-specific T cell responses by interferon-gamma ELISPOT assay, as previously described [30]. Confirmation of recognised individual 18-mer peptides within a peptide pool was carried out in separate ELISPOT assays.
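The tiling scheme described above (18-mers overlapping by 10 residues, i.e. a shift of 8 residues per peptide) can be sketched as follows. This is only an illustration: the function name, the toy sequence and the handling of the C-terminal remainder are assumptions, not the procedure used to synthesise the 410 peptides in the study.

```python
def overlapping_peptides(sequence: str, length: int = 18, overlap: int = 10) -> list[str]:
    """Tile a protein sequence with fixed-length peptides sharing `overlap` residues."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap  # 8-residue shift for 18-mers overlapping by 10
    if len(sequence) <= length:
        return [sequence]
    starts = list(range(0, len(sequence) - length + 1, step))
    # Assumption: if the regular tiling stops short of the C-terminus, add one
    # final peptide flush with the end (it overlaps its neighbour by more than 10).
    if starts[-1] + length < len(sequence):
        starts.append(len(sequence) - length)
    return [sequence[s:s + length] for s in starts]


# Hypothetical 40-residue sequence, used only to exercise the function.
toy_protein = "ACDEFGHIKLMNPQRSTVWY" * 2
peptides = overlapping_peptides(toy_protein)
for peptide in peptides:
    print(peptide)
```

Each consecutive pair of regular peptides shares exactly 10 residues, so any epitope of up to 11 amino acids is fully contained in at least one peptide.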

Sequencing of HIV gag, pol and nef
HIV gag, pol and nef were sequenced using previously described methods and primers [17]. In brief, viral RNA was extracted from plasma, reverse transcribed to cDNA and amplified using a nested polymerase chain reaction (PCR) protocol. Population sequencing of the HIV pol gene was carried out using ABI Big Dye terminator sequencing kits (Applied Biosystems), according to the manufacturer's instructions. Sequences were aligned manually using X11 and Se-Al software.

Identification of HLA class I linked polymorphisms
Amino acid polymorphisms associated with host HLA class I alleles were identified using previously described methods utilising 'phylogenetic dependency networks' [31]. Briefly, the analysis combines phylogenetic correction with a statistical model of evolution to evaluate associations between host HLA class I alleles and viral amino acid site-specific polymorphisms. The analysis adjusts for confounding factors including founder effects, linkage disequilibrium of host HLA types and co-variation in HIV codons, and corrects for multiple statistical tests. The significance of an association is expressed as a 'q value', which estimates the false discovery rate for each p-value. In this study an association is considered significant for q<0.2. 'Escape' describes increased polymorphisms observed at a particular amino acid site in the presence of a specific HLA class I allele. 'Reversion' describes decreased polymorphisms observed at a particular amino acid site in the absence of a specific HLA class I allele. The strength of associations between HLA class I alleles and viral polymorphisms in the 'total' cohort, 'high CD4' group or 'low CD4' group was derived using a logistic regression model that corrected for phylogeny, and is reported as a log2-adjusted odds ratio.
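To make the two reported quantities concrete, the sketch below computes a log2 odds ratio from a 2×2 table and a false-discovery-rate adjustment for a list of p-values. It is a simplified stand-in, not the method used in the study: it applies no phylogenetic correction, it uses the Benjamini-Hochberg adjustment rather than the q-value estimator of the phylogenetic dependency network software, and all counts and p-values are hypothetical.

```python
import math


def log2_odds_ratio(a: float, b: float, c: float, d: float, pseudocount: float = 0.5) -> float:
    """Unadjusted log2 odds ratio for a 2x2 table.

    a: HLA-positive with polymorphism    b: HLA-positive without polymorphism
    c: HLA-negative with polymorphism    d: HLA-negative without polymorphism
    The pseudocount (an assumption) avoids division by zero in sparse tables.
    """
    a, b, c, d = (x + pseudocount for x in (a, b, c, d))
    return math.log2((a * d) / (b * c))


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts and p-values, for illustration only.
print(log2_odds_ratio(30, 10, 20, 40))
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print([q < 0.2 for q in adjusted])  # the study's threshold was q < 0.2
```

A positive log2 odds ratio indicates that the polymorphism is enriched in carriers of the allele; comparing the value between the 'high CD4' and 'low CD4' groups is what the later tables report.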
Identified polymorphisms were mapped to known cytotoxic T cell epitopes. Epitope maps were defined from experimental data from the Durban cohort [6], and from those listed in the A-list of the Los Alamos Database (http://www.hiv.lanl.gov/content/immunology/pdf/2008/optimal_ctl_article.pdf). The epitope flanking region was also included in the analysis, defined as the five amino acids neighbouring the epitope in both the C- and N-terminal directions.
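The mapping rule above (inside the epitope, or within five residues either side) can be sketched as a small lookup. The two epitope sequences are those named in this paper; the Gag start positions (180 for TL9, 240 for TW10) are an assumption based on commonly used HXB2 numbering and are not stated in this section, and the function names are hypothetical.

```python
FLANK = 5  # residues either side of the epitope, as described in the text

# Epitope sequences are from the text; start positions are assumed HXB2 Gag
# coordinates and should be checked against the epitope map actually used.
EPITOPES = {
    "TL9": ("TPQDLNTML", 180),
    "TW10": ("TSTLQEQIAW", 240),
}


def classify_site(position: int, epitope: str, flank: int = FLANK) -> str:
    """Return 'epitope', 'flank' or 'outside' for a Gag amino acid position."""
    sequence, start = EPITOPES[epitope]
    end = start + len(sequence) - 1
    if start <= position <= end:
        return "epitope"
    if start - flank <= position <= end + flank:
        return "flank"
    return "outside"


def residue_at(position: int, epitope: str) -> str:
    """Consensus residue of the epitope at a given Gag position."""
    sequence, start = EPITOPES[epitope]
    if not start <= position < start + len(sequence):
        raise ValueError("position lies outside the epitope")
    return sequence[position - start]


print(classify_site(186, "TL9"), residue_at(186, "TL9"))
print(classify_site(242, "TW10"), residue_at(242, "TW10"))
```

Under these assumed coordinates, positions 186 and 242 both fall on a threonine inside their epitopes, which is consistent with the T186S and T242N nomenclature used in the text.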
The pNL4-3 HIV plasmid had been mutated to contain unique restriction enzyme (BstEII) sites at the 5′ end of the gag gene and 45 nucleotides downstream from the 3′ end of the protease gene, as described elsewhere [19]. The BstEII enzyme digest results in deletion of HIV gag-protease and self-ligation of the remaining plasmid (pNL4-3Δgag-protease). For transfection, the pNL4-3Δgag-protease was linearised by BstEII digestion for two hours at 60°C (10 U/ml of enzyme per 1000 µg/ml of plasmid). For each sample, 10 µg of the linearised pNL4-3Δgag-protease was co-transfected with 5 µg of patient-derived gag-protease PCR product in 3.9×10⁶ tat-driven GFP reporter T cells (GXR cells of CEM origin). After transfection by electroporation (250 V, 950 µF, for 30-40 msec), 10 ml polybrene (4 mg/ml) and 1×10⁶ GXR cells were added and incubated at 37°C. Positive control samples were generated using co-transfection of pNL4-3Δgag-protease with gag-protease amplified from pNL4-3, pHXB2 (subtype B), and pMJ4 (subtype C). After five days, GFP expression was measured every 1-2 days using a FACSCalibur flow cytometer (BD Biosciences, San Jose, CA). Supernatant was harvested when >15% of cells were positive for GFP, and stored at −80°C.

Measurement of viral replication capacity (VRC)
Viral titration was performed in 1×10⁶ GXR cells to determine the multiplicity of infection (MOI). The VRC was assayed by adding recombinant virus supernatant to 1×10⁶ GXR cells at an MOI of 0.003. The VRC was measured, at least in duplicate, using gating strategies as previously described [19]. The growth of virus was calculated from the slope of the natural log of the percentage of GFP-positive cells between day 2 and day 7, as appropriate for an exponential growth curve. The final VRC of each variant is expressed as its growth rate as a percentage of that of recombinant NL4-3 strains. In addition, for each VRC assay, viral supernatants were collected on day 7 for viral RNA extraction and sequencing to confirm the population sequence of the virus whose VRC was measured.
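The VRC calculation described above can be sketched as a least-squares slope of ln(% GFP-positive cells) against time, normalised to the NL4-3 reference. This is a minimal illustration under stated assumptions: the function names and the simulated readings are hypothetical, and the published analysis may have handled replicates and gating differently.

```python
import math


def growth_slope(days: list[float], pct_gfp: list[float]) -> float:
    """Least-squares slope of ln(% GFP-positive cells) against time in days."""
    ys = [math.log(p) for p in pct_gfp]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, ys))
    return sxy / sxx


def vrc_percent(days: list[float], pct_gfp_sample: list[float], pct_gfp_nl43: list[float]) -> float:
    """Growth rate of a recombinant virus as a percentage of the NL4-3 reference."""
    return 100.0 * growth_slope(days, pct_gfp_sample) / growth_slope(days, pct_gfp_nl43)


# Simulated exponential growth curves (hypothetical data, days 2 to 7).
days = [2, 3, 4, 5, 6, 7]
nl43 = [0.1 * math.exp(1.0 * (d - 2)) for d in days]    # reference growth rate 1.0/day
sample = [0.1 * math.exp(0.9 * (d - 2)) for d in days]  # 10% slower than the reference
vrc = vrc_percent(days, sample, nl43)
print(round(vrc, 1))
```

Working on the log scale means the slope is the exponential growth rate itself, so the ratio of slopes is independent of the starting percentage of infected cells.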

Statistics
Student t-tests (two tailed) and Mann-Whitney tests (two tailed) were used to compare various clinical parameters and VRC measurements. The statistical analyses were carried out using Prism4 for Macintosh (GraphPad Software, version 4.0c). Associations between HLA Class I alleles and viral polymorphisms were determined using phylogenetically-corrected algorithms, as described above.

Structure of cohorts
Patients were recruited from the Bloemfontein and Durban cohorts [29] [6]. The Bloemfontein cohort comprises 884 drug-naïve patients at all stages of HIV infection recruited through the South African government ARV program. Plasma samples were studied from 278 of the 884 participants: 96 of 110 patients in the high (CD4 count >500 cells/µl), 18 of 308 in an intermediate (CD4 count = 201-400 cells/µl), and 164 of 183 in the low (CD4 count <100 cells/µl) strata. The samples were used for the HLA-association study and to make recombinant viral strains for the analysis of fitness. To increase the power of the HLA-association analysis we incorporated HIV viral sequences from the neighbouring Durban cohort, which comprised 775 adults, naïve to antiretroviral therapy. Of these, 219 were asymptomatic women identified through antenatal clinics. The remaining 556 subjects were recruited from out-patient clinics. For all 775 patients host HLA class I alleles were genotyped. ELISPOT assays were carried out using overlapping peptides designed from the subtype C majority consensus. For 656 of these patients, sequences of viral gag, pol or nef genes were available.

Breadth of ELISPOT responses narrows in advanced disease
To measure the T cell-imposed selection pressure acting on HIV at different stages of infection, ELISPOT assays using overlapping peptides (OLPs) covering the entire HIV genome were analysed from 775 patients. Results were stratified according to CD4 cell count and are presented according to the number of OLPs recognised per patient for each gene (Figure 1a), and the percentage contribution of each protein to all responses in each stratum (Figure 1b). The figure shows that over the course of infection, there is a general narrowing of the absolute breadth of responses (Figure 1a). Moreover, the relative contribution of Gag to the total responses decreased (p<0.0001, r² = 0.055), in contrast to the proportion of responses to other proteins such as Env (p<0.0001, r² = 0.024) and Vif (p = 0.003, r² = 0.011), which was greater in patients with lower CD4 cell counts. We hypothesized that the reduction in breadth of CD8+ve T cell responses might result in, and therefore correlate with, a decline in selection pressure, particularly in Gag in light of the relative shift away from Gag targeting. As a result, this 'immune relaxation' might allow the reversion of costly viral escape mutations back to a fitter wild type state. Accordingly, we next analysed HLA Class I-associated polymorphisms in the HIV-1 gag, pol and nef genes to identify any difference in the strength of selection pressure in patients with high and low CD4 cell counts.

Identification of HLA class I linked polymorphisms in HIV-1 gag, pol and nef
To identify evidence of differential HLA Class I imposed selection pressure in patients with high and low CD4 cell counts, we carried out a study to identify HLA Class I associations with particular amino acid polymorphisms in this cohort. We accounted for potential confounding effects such as phylogeny, founder effect and multiple comparisons by using phylogenetic dependency networks [31], an approach which reports associations between specific viral polymorphisms and HLA Class I alleles according to p value, q value (an estimate of the false discovery rate) and a phylogenetically-adjusted odds ratio.
Initially, we analysed 916 HIV-1 gag, pol and nef sequences from the combined Durban and Bloemfontein cohorts and then we carried out sub-group analyses on patients with either 'high' (>500 cells/µl, n = 299) or 'low' (<100 cells/µl, n = 196) CD4 cell counts. We then selected those HIV amino acid polymorphisms significantly associated with an HLA Class I allele in either, or both, of the 'high' and 'low' CD4 count groups for further analysis. The results for gag, pol and nef are presented in Tables 1, 2 and 3, respectively. The tables detail each viral polymorphism and its associated HLA Class I allele, whether the association lies within, or in the flanking region of, a known restricted epitope and whether this association is expected to revert in HLA-mismatched hosts [35]. The q values for the statistically significant associations for the whole cohort, the 'high' CD4 count subgroup (>500 cells/µl) and 'low' CD4 count subgroup (<100 cells/µl) are shown, followed by the log2-adjusted odds ratios for the 'high' and 'low' groups. In the final column, the p value is reported when there is a significant difference between the two odds ratios.
We identified 40, 91 and 20 associations, respectively, between an HLA Class I allele and a viral polymorphism in the HIV-1 gag, pol and nef genes, in either, or both, of the 'high' and 'low' CD4 count subgroups. Of these, 16, 28 and 7 associations in HIV-1 gag, pol and nef, respectively, had significantly different log2-adjusted odds ratios in the two subgroups. These 51 associations are shown in Figure 2, according to HIV-1 gene. In gag, pol and nef, respectively, there were 5/16 (31%), 24/28 (86%) and 7/7 (100%) associations that were more prevalent in the 'low CD4' group. These data show that, where we could detect a difference, associations were more prevalent at lower CD4 counts in pol and nef, suggestive of continued selection pressure, but that certain associations, particularly in HIV-1 gag, were weaker. Why might HLA associations become weaker as disease progresses? One possible explanation is that as CTL pressure weakens due to immune exhaustion and loss of breadth, escape mutations with replicative fitness costs are no longer required by the virus. These therefore revert back to wild-type, resulting in a reduction in the observed log2-adjusted odds ratio. Thus, low-cost mutations accumulate over the course of infection, leading to higher prevalence in Nef and Pol, whereas high-cost mutations begin reverting, leading to a lower prevalence in Gag.
If changes in selection pressure were resulting in an increase or decrease in the prevalence of wild type virus during late-stage disease, we would expect to find changes in viral replicative capacity in patients with low CD4 cell counts. We therefore undertook assays of viral fitness. As the HIV gag gene was associated with the most potential examples of immune relaxation we implemented a fitness assay targeting the gag gene and focussed on three beneficial HLA Class I alleles (HLA B*57, HLA B*5801, and HLA B*8101) and one allele associated with more rapid progression, HLA B*5802. In particular, we highlighted two associations: the T186S mutation in the TPQDLNTML (TL9) epitope, restricted by B*8101, with an observed decrease in log2-adjusted odds ratio from 18.38 to 7.25 in patients with high and low CD4 cell counts, respectively (consistent with immune relaxation, p = 0.0397, likelihood ratio test), and the B*57/B*5801 restricted epitope TSTLQEQIAW (TW10), with log2-adjusted odds ratios of 25.4 and 23.2 in the high and low CD4 cell count strata, respectively (p = 0.12, likelihood ratio test, showing no significant evidence of immune relaxation). We focussed on these two associations as published data suggest that they are both associated with significant viral fitness costs [32] [18,27] and yet in this dataset they behave very differently in terms of changes in log2-adjusted odds ratios.

Increased viral replication capacity (VRC) of chimeric HIV-1 NL4-3 at low CD4 counts and high plasma viral loads
To investigate further possible reasons for declining selection pressure within Gag epitopes with progression to AIDS, we considered the hypothesis that loss of CD8+ T-cell responses as CD4 count declines results in reversion of escape mutants, with a resulting increase in viral replicative capacity. To investigate the impact of the gag-protease mutations selected by different HLA class I alleles at different clinical stages, we constructed 148 chimeric HIV-1 NL4-3 viruses containing autologous gag-protease from patients of the Bloemfontein cohort. Patients with specific HLA class I alleles previously associated with control of HIV in this population (HLA-B*57/5801/8101) or lack of control (B*5802) [6] were selected for chimeric virus construction and stratified by CD4 cell count. 88 viruses were constructed from patients with low CD4 cell counts (<100 cells/µl), comprising HLA-B*57 (n = 2), HLA-B*5801 (n = 15), HLA-B*5802 (n = 54) and HLA-B*8101 (n = 10) plus 14 'neutral/other' (i.e. neither protective nor disadvantageous) HLA Class I alleles. From patients with high CD4 cell counts (>500 cells/µl), 42 viruses were made: HLA-B*57 (n = 7), HLA-B*5801 (n = 8), HLA-B*5802 (n = 15), HLA-B*8101 (n = 10) plus 10 'neutral/other' HLA Class I alleles. Sixteen patients co-expressed two of these HLA class I alleles. A further 18 viruses were made from patients randomly selected from intermediate CD4 T cell counts. The VRC of the chimeric virus was expressed as percentage growth rate compared to the pNL4-3Δgag-protease plasmid recombined with the wild type HIV-1 NL4-3 gag-protease insert.
Initially we determined whether CD4 cell count alone was associated with VRC. The VRC of the chimeric viruses differed significantly between patients from different CD4 T cell count strata (Figure 3a). The VRC of viral isolates from patients with the lowest CD4 T cell counts (<100 cells/µl, median 91.15%; IQR: 86.1-98.0%) was significantly higher than that from patients with high CD4 T cell counts (>500 cells/µl, median 85.19%; IQR: 79.8%-91.8%), p = 0.0004 (Mann-Whitney test), showing that progression to AIDS is associated with an increase in viral fitness.
There was also a weak but statistically significant positive correlation between VRC and log10 plasma viral load (Figure 3b, p = 0.0003, r² = 0.088, n = 142). The slope of the best-fit line (0.029 ± 0.0079) indicated that a 34.4% increase in VRC accounted for an increase of one log10 in plasma viral load.
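The conversion from the fitted slope to the VRC change per log10 of viral load is a simple reciprocal, sketched below. The interpretation of the slope as log10 viral load per percentage point of VRC is an assumption inferred from the reported figures; the small difference between the reciprocal of the rounded slope and the reported 34.4% presumably reflects rounding of the published slope.

```python
# Values taken from the text; the variable names are illustrative.
slope = 0.029          # assumed units: log10 viral load per percentage point of VRC
slope_error = 0.0079   # reported uncertainty of the slope

# VRC change (percentage points) corresponding to one log10 of viral load.
vrc_per_log10 = 1 / slope

# Rough range implied by the reported uncertainty of the slope.
vrc_per_log10_low = 1 / (slope + slope_error)
vrc_per_log10_high = 1 / (slope - slope_error)

print(f"{vrc_per_log10:.1f} ({vrc_per_log10_low:.1f} to {vrc_per_log10_high:.1f})")
```

The width of that range, together with r² = 0.088, is a reminder that VRC explains only a small share of the variation in viral load in this cohort.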
To determine whether the association between HLA Class I and viral fitness was additionally impacted by disease stage we stratified the VRCs according to high and low CD4 cell counts (Figure 4b).
This sub-group analysis shows that the mean VRC values are higher for each HLA Class I allele in the low CD4 count group, suggestive of an increase in viral fitness in advanced disease, although in most cases these differences are not statistically significant. This is likely to reflect a lack of power, as patient numbers become smaller with each sub-categorisation. Interestingly, there is a significant difference in the VRCs of viruses in the non-protective 'neutral/other' HLA Class I alleles (p = 0.0092, Mann-Whitney test), showing that although there is an HLA-specific effect on viral fitness, there is an additional association with CD4 cell count, independent of the more beneficial HLA Class I alleles.

The impact of specific mutations on VRC in the context of HLA-B*8101 and HLA-B*57/B*58
These analyses have shown that viral fitness is greater in viruses from patients with advanced disease, and that certain protective HLA Class I alleles (such as B*57, B*8101 and B*5801) are associated with greater fitness costs. The HLA-association analysis presented above (Tables 1, 2 and 3) revealed that in patients with low CD4 counts HLA B*57/B*5801 was associated with persistence of T242N in the TW10 epitope, whereas HLA B*8101 was associated with a potential loss of selection pressure reflected by the decreased prevalence of T186S mutants in the TL9 epitope. We therefore focused further on these well-characterised escape mutants to investigate how they impacted viral fitness at different stages of disease progression.
For HLA-B*8101, we excluded from the analysis any viruses with key mutations restricted by other HLA Class I alleles (that is, T242N and A163G) which were known to impact fitness. Viruses from patients with HLA B*8101 were significantly less fit than viruses from patients in the rest of the cohort (p = 0.0059, Mann-Whitney test; Figure 5a) and mutations in the B*8101-restricted Gag TL9 epitope were associated with lower VRCs both in the whole cohort (p = 0.024) and in patients with HLA B*8101 (p = 0.025). The main single mutation conferring loss of replicative fitness was the T186S immune escape mutation (p<0.0001). Neither the Q182S nor the T190X mutation had an impact on VRC in this cohort (data not shown). Although patient numbers were small, there was no significant difference between the VRCs of viruses with the T186S mutation according to high and low CD4 count stratification. These data show that although B*8101 imposes a fitness cost on the virus through selection of T186S, we found no evidence that, when this mutation persists, fitness is restored at low CD4 cell counts.
We compared the fitness costs associated with HLA B*8101 with those associated with B*57 and B*5801, as the HLA-association analysis suggested that for these latter alleles escape mutations persisted in advanced disease. We pooled the data for patients with HLA B*57 and B*5801 as these alleles have highly related binding motifs, and both present the Gag TW10 epitope and select the T242N escape mutation [36]. As for HLA B*8101, viruses from patients with HLA-B*57 or HLA-B*5801 were significantly less fit than those from the rest of the cohort (p = 0.0299, Mann-Whitney test; Figure 5b), supporting the association between protective HLA Class I alleles and viral replicative cost (Figure 4a). In the whole cohort there was no significant effect of all mutations in the TW10 epitope on VRC. For patients with HLA B*57 or B*5801, there was a non-significant trend for all mutations within the TW10 epitope to impair fitness, although the power of this analysis was limited as only one of the 22 patients had not selected a mutant epitope. However, the fitness cost of the T242N mutation was confirmed when the whole cohort was analysed, where it was associated with a significant drop in VRC (p = 0.0285). This result was independent of the effect of mutations in other B*57/B*5801 associated epitopes such as Gag KF11 or Gag IW9 (p = 0.024; data not shown), showing that the fitness cost was associated with T242N itself, rather than with carriage of HLA B*57/B*5801.
In the HLA-association analysis we had found that T242N was maintained in late disease, even though these data, as well as previous reports, suggest that this mutation is associated with a fitness cost. We explored this apparent discrepancy by stratifying the fitness analysis of T242N according to CD4 count (Figure 5b), and found that in patients with advanced disease (CD4 counts ,100 cells/ml), viral fitness had been restored even though the T242N persisted (p = 0.0028).     The table shows the results of the association analysis between HLA Class I alleles and HIV viral polymorphisms in HIV Pol. The columns of the table detail the HIV protein, the associated HLA Class I allele and viral polymorphism, whether the association lies in, or within, the flanking region of a known restricted epitope and whether this association is expected to revert in HLA-mismatched hosts. The q values for the statistically significant associations for the whole cohort, the 'high' CD4 count subgroup (.500 cells/ml) and 'low' CD4 count subgroup (,100 cells/ml) are shown, followed by the log 2 -adjusted odds ratios for the 'high' and 'low' groups. In the final column, the p value is reported, where there is a significant difference between the two odds ratios. doi:10.1371/journal.pone.0019018.t002 Table 3. HLA-related associated polymorphisms in the HIV Nef protein in different CD4 count strata. The table shows the results of the association analysis between HLA Class I alleles and HIV viral polymorphisms in HIV Nef. The columns of the table detail the HIV protein, the associated HLA Class I allele and viral polymorphism, whether the association lies in, or within, the flanking region of a known restricted epitope and whether this association is expected to revert in HLA-mismatched hosts. 
The q values for the statistically significant associations for the whole cohort, the 'high' CD4 count subgroup (.500 cells/ml) and 'low' CD4 count subgroup (,100 cells/ml) are shown, followed by the log 2 -adjusted odds ratios for the 'high' and 'low' groups. In the final column, the p value is reported, where there is a significant difference between the two odds ratios. doi:10.1371/journal.pone.0019018.t003 Figure 2. Changes in associations between HLA Class I alleles and HIV amino acid polymorphisms for patients with CD4 cell counts ,100 cells/ml compared with patients with CD4 cell counts .500 cells/ml. The change in the strength of associations between HLA Class I alleles and specific HIV amino acid polymorphisms in patients with high and low CD4 cell counts is shown for HIV-1 (a.) Gag, (b.) Pol and (c.) Nef. Change in strength of the association is reported on the Y-axis as a log(2)-adjusted value. Associations are recorded on the X-axis according to the restricting HLA Class I allele and the specific viral polymorphism -e.g. B45-H28R in Gag means that in the presence of HLA B*45, Gag amino acid 28 mutates from a Histidine (H) to an Arginine (R). In this case there is at least a 40-fold change in the log(2)-adjusted odds ratio for this association in patients with CD4 counts ,100 cells/ml compared with less progressed patients with CD4 cell counts .500 cells/ml. (All fold changes are capped at a value of 40). doi:10.1371/journal.pone.0019018.g002 Compensatory mutations accrue in advanced HIV infection in patients with HLA B*57 and B*5801

As the T242N mutation has previously been linked with compensatory mutations, we analysed our cohort to see whether these accrued in late-stage disease, which could explain the rise in viral fitness despite the maintenance of T242N. Compensatory mutations for the T242N mutation in TW10 have previously been published (H219Q, I223V and M228I) [27], and so we examined the frequency of these mutations in the different CD4 T cell count strata (Figure 6). Among patients with T242N, we found a cumulative increase in the number of compensatory mutations in individuals with lower CD4 cell counts (P = 0.013), consistent with the increase in fitness in the previous analyses.
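The stratified count described above can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the record layout, the function names and the toy patients are hypothetical, only the 'low' (<100 cells/µl) and 'high' (>500 cells/µl) strata are modelled, and no significance test is performed.

```python
from collections import Counter

# Compensatory substitutions for T242N as listed in the text (Gag codon -> residue).
COMPENSATORY = {219: "Q", 223: "V", 228: "I"}


def cd4_stratum(cd4):
    """Return 'low' (<100 cells/ul), 'high' (>500 cells/ul) or None otherwise."""
    if cd4 < 100:
        return "low"
    if cd4 > 500:
        return "high"
    return None


def count_compensatory(gag):
    """Count how many of the three compensatory residues are present.

    `gag` is a hypothetical {codon: amino acid} mapping for one patient.
    """
    return sum(gag.get(pos) == aa for pos, aa in COMPENSATORY.items())


def tally(patients):
    """Per stratum, tally T242N carriers by number of compensatory mutations."""
    out = {"low": Counter(), "high": Counter()}
    for p in patients:
        stratum = cd4_stratum(p["cd4"])
        # Skip patients outside the two strata and those without T242N.
        if stratum is None or p["gag"].get(242) != "N":
            continue
        out[stratum][count_compensatory(p["gag"])] += 1
    return out


# Hypothetical toy records for illustration only; these are not study data.
patients = [
    {"cd4": 650, "gag": {219: "H", 223: "I", 228: "M", 242: "N"}},
    {"cd4": 40, "gag": {219: "Q", 223: "V", 228: "M", 242: "N"}},
    {"cd4": 80, "gag": {219: "Q", 223: "I", 228: "I", 242: "N"}},
    {"cd4": 30, "gag": {219: "H", 223: "I", 228: "M", 242: "T"}},
]
result = tally(patients)
```

In this toy example the single high-CD4 carrier of T242N has no compensatory mutations, whereas both low-CD4 carriers have two; the fourth patient is excluded because the virus is wild type at codon 242.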
Together these data suggest that advanced HIV disease is associated with an increase in viral fitness, which may contribute to the rise in plasma viral load frequently seen in AIDS. However, this increase in fitness is multifactorial and we report two processes specific to different HLA Class I alleles (reversion due to immune relaxation, and compensatory mutations) that could potentially explain these findings.

Discussion
In this study, we have explored the hypothesis that progression to AIDS is associated with a rise in viral replicative capacity, and the mechanisms associated with this. We proposed that a rise in viral fitness in AIDS could be due either to a relaxation of CTL-imposed selection pressure, resulting in the reversion of costly mutations, or, alternatively, if immune pressure is maintained, to compensatory mutations that offset the fitness costs of persisting escape mutations.
We initially tested the hypothesis by looking for evidence of changes in T cell-imposed selection pressure measured by gamma interferon ELISPOT assays. We found that the breadth of the response narrowed as the CD4 count declined and that, specifically, the contribution of Gag responses to the overall response decreased. It has previously been reported that acute HIV infection is associated with narrow, high affinity T cell responses which broaden as patients progress into chronic infection [37]. Here, we see the reverse of this process with a return to a much narrower range of T cell responses. It would also be interesting to measure whether there was a loss of high affinity T cells in the advanced stage patients in this cohort as this might be compatible with a weakening in selection pressure.
Although we identified a narrowing of the T cell responses in AIDS, the ELISPOT data alone are not sufficient to infer changes in selection pressure. We therefore looked for evidence of differential selection in patients with high and low CD4 cell counts using a statistical HLA Class I association study. By combining sequence and HLA data from the Bloemfontein cohort and the Durban cohort, we were able to define two groups of patients with CD4 counts less than 100 cells/µl (n = 196) and greater than 500 cells/µl (n = 299). These CD4 count strata were defined to represent extremes of HIV-related disease without reducing the power of the analysis. The WHO have defined that antiretroviral therapy should be commenced at a CD4 count of 350 cells/µl. Below 200 cells/µl, opportunistic infections become more common. A count of 100 cells/µl or below, therefore, represents advanced HIV disease and progression to AIDS. Above 500 CD4 cells/µl it would be highly unusual for there to be opportunistic infections and the most recent NIH guidelines do not recommend treating above this level [38].
To identify mutations in the HIV genome associated with individual HLA Class I alleles we used an established technique based on phylogenetic dependency networks [31], which has been widely implemented in the HIV literature [12,14,35,39]. In this approach, the risk of error due to viral founder effects and misidentification of co-variant sites is corrected for by integrating the phylogeny of the sequences into the analysis. Identified associations are hypothesised to be the result of cytotoxic T cell imposed selection pressure and in other analyses such associations have been proven experimentally. Although there have been a number of HLA-association studies reported, none have stratified patients according to disease progression and compared the strength of selection in different strata according to log2-adjusted odds ratios.
In our cohorts we identified 151 HLA-associated mutations in HIV-1 gag, pol or nef that were significant in patients with either high or low CD4 cell counts. By comparing log2-adjusted odds ratios in the CD4 strata, we found that there was evidence for both immune relaxation and intensification in patients with low CD4 counts compared with high CD4 counts. This suggests that for some HLA Class I-restricted responses the selection pressure imposed by CTL is maintained at very low CD4 cell counts but that for others there may be reversion of escape mutations associated with a reduction in selection pressure. Interestingly, the majority of events compatible with immune relaxation were in HIV Gag, which was also the protein for which there was a relative narrowing of T cell ELISPOT responses in AIDS patients.
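As a rough illustration of how the strength of one HLA-polymorphism association might be compared between CD4 strata, the following Python sketch computes a log2 odds ratio from a simple 2x2 table in each stratum and takes the difference. This is a deliberate simplification and not the method used here: it omits the phylogenetic correction described above, the 0.5 continuity correction and the symmetric cap applied to the difference are assumptions, and the function names and counts are hypothetical.

```python
import math


def log2_odds_ratio(a, b, c, d):
    """Unadjusted log2 odds ratio for a 2x2 table.

    a: HLA-positive with polymorphism, b: HLA-positive without,
    c: HLA-negative with polymorphism, d: HLA-negative without.
    Adding 0.5 to each cell (an assumed continuity correction) avoids
    division by zero in sparse tables.
    """
    return math.log2(((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5)))


def change_in_association(high, low, cap=40.0):
    """Difference in log2 odds ratio (low CD4 minus high CD4), capped at +/-cap."""
    delta = log2_odds_ratio(*low) - log2_odds_ratio(*high)
    return max(-cap, min(cap, delta))


# Hypothetical counts for illustration only; these are not study data.
high_cd4_table = (30, 10, 20, 140)
low_cd4_table = (10, 10, 20, 60)
delta = change_in_association(high_cd4_table, low_cd4_table)
```

Under this convention a negative difference indicates a weaker association at low CD4 counts, which would be compatible with immune relaxation, and a positive difference indicates a stronger association.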
Two key sites potentially reflecting immune 'persistence' and 'relaxation' were the HLA B*57/B*5801 associated T242N mutation and the B*8101 associated T186S mutation, respectively. For the former, the association analysis provided evidence for on-going mutation associated with compensatory mutations, whereas for the latter there was evidence for reversion. A limitation of this study is that it is cross-sectional rather than longitudinal, potentially biasing our data interpretation. An alternative explanation for immune relaxation is that the patients who progress are the ones who do not make an immune response and so do not select mutants, therefore enriching patients with low CD4 counts with wild-type viruses. This would be compatible with the finding that targeting certain key epitopes may be associated with different speeds of clinical progression [40], but does not explain the eventual outcome of those patients enriched for immune escape mutations with high CD4 cell counts. To determine which occurs would require a prospective longitudinal study of untreated progression to AIDS. In our cohort, analyses of HLA B*8101 positive individuals with data on ELISPOT responses and viral sequence show that for patients with CD4 counts between 200-500 cells/µl, 50% (3/6) who did not recognise TL9 carried the T186S mutant. In contrast, for patients with CD4 counts <100 cells/µl there were no ELISPOT negative patients with T186S (0/2), suggesting that these had reverted rather than having been consistently wild type, although in this study numbers are too small to be conclusive.
Viral replication capacity according to HLA Class I and CD4 cell count. (a) Viral replication capacity (VRC) of chimeric NL4-3 viruses containing patient autologous gag-protease stratified by HLA Class I. The VRC is expressed as percentage growth rate compared to control strains. The p-value is calculated using Mann-Whitney test. Panel (b) shows further stratification according to CD4 cell count. The y-axis depicts VRC, expressed as percentage growth, and the x-axis depicts HLA class I, with each HLA class I category divided further into "low" CD4 T cell count strata (<100 cells/mm3) and "high" CD4 T cell count strata (>500 cells/mm3). The p-value is calculated using Mann-Whitney test. doi:10.1371/journal.pone.0019018.g004
Viral fitness was measured using a recombinant assay in which a patient-derived RT-PCR amplicon of HIV-1 gag-protease was inserted into a gag-protease-deleted pNL4-3 plasmid. The advantage of this approach is that it uses the full gag gene from the patient sequence as well as the associated protease that encodes the enzyme for cleavage of the Gag-Pol protein. The disadvantage of this assay is that it is in a parallel rather than competitive format, and therefore may lack sensitivity for small fitness differences. In addition, the construct is a recombinant of a subtype B backbone with a subtype C insert. Although this approach has been published elsewhere [32], any data should be interpreted relative to the control virus rather than as an absolute measure of viral fitness. Choice of control is important. We used a subtype B wildtype backbone (pNL4-3Δgag-protease) into which we recombined the pNL4-3 gag-protease. This controlled for any change in fitness due to the recombination step.
We found that VRC was significantly greater in patients with low CD4 cell counts and there was a significant, although weak, correlation with plasma viral load. These data, which are consistent with a population-based study of HIV subtype B-infected individuals [33], are amongst the first supporting a link between viral fitness and plasma viral load, and show that this ex vivo assay reflects viral replication in vivo. Analysing our data by HLA Class I revealed that HLA Class I alleles that are associated with clinical advantage (B*8101, B*57, B*5801) were associated with impaired viral fitness, as had been inferred from previous analyses of the reversion of transmitted escape mutations in HLA mismatched hosts [22,23]. Interestingly, however, the increase in viral fitness with lower CD4 cell counts did not appear to be restricted to beneficial HLA alleles and was also present in a selection of viruses from patients with 'neutral/other' alleles.
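The comparison of VRC between CD4 strata relies on the Mann-Whitney test. A minimal Python sketch of that test is given below for illustration; it is not the software used in this study. The function names and the VRC values are hypothetical, and the p value uses a normal approximation without tie or continuity correction, which is unreliable for samples as small as the toy example. An established routine such as `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math


def mann_whitney_u(x, y):
    """U statistic for sample x: pairs with x_i > y_j score 1, ties score 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


def two_sided_p(u, n1, n2):
    """Two-sided p value from the normal approximation to U (no tie correction)."""
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical VRC values (% of control growth rate); these are not study data.
vrc_low_cd4 = [90.0, 92.0, 95.0]
vrc_high_cd4 = [80.0, 85.0, 91.0]

u = mann_whitney_u(vrc_low_cd4, vrc_high_cd4)
p = two_sided_p(u, len(vrc_low_cd4), len(vrc_high_cd4))
```

Because the test is rank-based, it makes no assumption that VRC values are normally distributed, which suits growth rates expressed as a percentage of a control virus.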
The HLA-amino acid association analysis showed that for HLA B*8101 there was a decrease in mutations in the Gag TL9 epitope at low CD4 cell counts, whereas for HLA B*57/B*5801 there was persistence of the T242N escape mutation in TW10. The fitness assays showed that the T186S mutation in TL9 was associated with a fitness cost (P<0.0001) and it is therefore possible that in patients with B*8101 and low CD4 counts, the decrease in prevalence of the T186S mutation reflects reversion to a fitter strain coincident with the weakening of T cell responses. For B*57/B*5801 we saw a different pattern. Here, possession of HLA B*57/B*5801 was associated with less fit viruses in conjunction with the T242N mutation in the Gag TW10 epitope, but in patients with low CD4 counts fitness was restored even though T242N persisted. We therefore sought compensatory mutations to explain this finding. For T242N there are four documented compensatory mutations at Gag codons 219, 223, 228 and 248 [27]. For this analysis of subtype C HIV-1 we excluded codon 248, as the 248A variant represents the subtype C consensus. When we stratified patients with T242N according to CD4 strata we found a significant (p = 0.013) increase in numbers of compensatory mutations at lower CD4 counts, consistent with the rise in VRC.
These data show that viral fitness increases with progression to AIDS. Whether this is cause or effect is difficult to determine, although the fact that we see evidence for both reversion and compensatory mutations in the same cohort indicates that progression may be a mixture of the two. What is clear is that viral fitness is a significant component of progression to AIDS and, as such, should be considered as a target for intervention, for example through vaccines aimed at epitopes which escape with high fitness costs. Trials such as DART have shown that antiretroviral therapy can be extremely effective in regions such as sub-Saharan Africa [41]; however, the costs and logistics of provision are complex. Any intervention that could prevent new infections or delay a requirement for therapy could have major economic and health implications, and therefore combining such a vaccine with antiretroviral provision could be a productive strategy.
Figure 6. Prevalence of compensatory mutations with the T242N mutation in the HLA-B*57 and B*5801 restricted epitope, Gag TW10. In the 100% stacked column, the y-axis shows the number of mutations accumulated at the three Gag protein amino acid sites (H219Q, I223V and M228I) that compensate for the T242N mutation [27]. The x-axis depicts the categorical CD4 T cell count strata and the number of patients within each stratum (the "High CD4" group refers to patients with CD4 T cell counts >500 cells/µl, the "Intermediate CD4" group refers to patients with CD4 T cell counts between 200 and 400 cells/µl, and the "Low CD4" group refers to patients with CD4 T cell counts <100 cells/µl). doi:10.1371/journal.pone.0019018.g006