HIV-1 epitopes presented by MHC class I types associated with superior immune containment of viremia have highly constrained fitness landscapes

Certain major histocompatibility complex class I (MHC-I) types are associated with superior immune containment of HIV-1 infection by CD8+ cytotoxic T lymphocytes (CTLs), but the mechanisms mediating this containment are difficult to elucidate in vivo. Here we provide controlled assessments of fitness landscapes and CTL-imposed constraints for immunodominant epitopes presented by two protective (B*57 and B*27) and one non-protective (A*02) MHC-I types. Libraries of HIV-1 with saturation mutagenesis of CTL epitopes are propagated with and without CTL selective pressure to define the fitness landscapes for epitope mutation and escape from CTLs via deep sequencing. Immunodominant B*57- and B*27-presented epitopes are highly limited in options for fit mutations, with most viable variants recognizable by CTLs, whereas an immunodominant A*02-presented epitope is highly permissive for mutation, with many options for CTL evasion without loss of viability. Generally, options for evasion overlap considerably between CTL clones despite highly distinct T cell receptors. Finally, patterns of variant recognition suggest population-wide CTL selection for the A*02-presented epitope. Overall, these findings indicate that these protective MHC-I types yield CTL targeting of highly constrained epitopes, and underscore the importance of blocking public escape pathways for CTL-based interventions against HIV-1.

Introduction
HIV-1-specific CD8+ cytotoxic T-lymphocytes (CTLs) play a significant protective role in the pathogenesis of HIV-1 infection [1][2][3], but ultimately fail to prevent disease progression in most persons. Myriad failure mechanisms have been proposed, but given the remarkable mutation rate and sequence plasticity of HIV-1 [4,5], the major factor is viral epitope escape mutation resulting in a cascade of viral persistence, CTL exhaustion, dysfunction, and senescence in chronic infection [6]. Indeed, evasion of CTLs is the major determinant of viral evolution in vivo [7][8][9][10]. Moreover, the major histocompatibility complex class I (MHC-I) locus is the best defined genetic determinant of disease progression rate in genome-wide association [11][12][13] and epidemiologic studies [14,15], indicating that MHC-I-associated properties of CTLs are important determinants of their efficacy.
Several studies of persons with "protective" MHC-I types who contain viremia without treatment have shown limited variation in targeted epitopes. Some have suggested that these are limited escape mutations with high fitness costs, based on examination of a few epitope variants observed in vivo [16][17][18][19][20][21][22][23][24]. However, the generality and mechanisms behind this observation are unclear, and the contributions of viral versus immune constraints for HIV-1 escape from CTLs are incompletely understood. Properties of the targeted epitope could be important; HIV-1 sequence plasticity is not uniform and epitopes likely vary in their constraints for mutation [25]. Alternatively, properties of the CTLs could differ; it has been proposed that the T cell receptors (TCRs) associated with protective MHC-I types either have greater cross-reactivity for epitope mutants and thus better limit possibilities for escape [26][27][28], or rather are better matched to common epitope variants [29]. Thus it is unresolved whether the limited escape is due to properties of the epitopes versus CTLs.
Finally, CTL responses against a given epitope are generally comprised of multiple clones with differing TCRs [30,31]. Because individual clones recognizing the same epitope can vary in the recognition of different variants [32][33][34], it has been proposed that clonal breadth may be important for preventing escape [30], but protective MHC-I types do not appear to yield greater TCR breadth overall [31]. This suggests qualitative differences in the composition or function of TCRs, and it is unclear to what degree the constraints for HIV-1 to escape CTLs are shared ("public escape") versus specific for each clone ("private escape").
Such issues are difficult to address in vivo, where the CTL response is polyclonal, the starting sequences of HIV-1 are typically undefined, and it is impossible to normalize selective pressure between epitopes. Here we assess the effect of HIV-1-specific CTLs on the fitness landscape of viral epitope mutation at clonal resolution. Libraries of HIV-1 epitope mutants are propagated under selective pressure to define the options for immune escape for multiple CTL clones associated with protective and non-protective MHC-I types, addressing these issues with an experimentally controlled approach to reveal CTL escape pathways for HIV-1.

Construction of HIV-1 epitope libraries
Saturation mutagenesis was applied to three immunodominant HIV-1 epitopes in Gag (Table 1). Degenerate nucleotide DNA synthesis was utilized for each codon encoding the epitope and its flanking amino acids, as well as every combination of two codons, followed by substitution into the whole proviral genome of HIV-1 strain NL4-3 (Fig 1). The resulting plasmid libraries were found by deep sequencing to contain a full representation (100% for each epitope) of single amino acid variants and partial representation of double amino acid variants (38 to 43%) achieving the threshold frequency of 2.5 × 10⁻⁵ that was considered adequate for detectable virus production after transfection (Table 2). As expected, the consensus epitope sequence was overrepresented in each library because consensus amino acids were included in every degenerate codon (Fig 2A). These proviral DNA libraries were transfected into producer cells to yield starting virus libraries after a week of expansion. Deep sequencing of viral RNA in these libraries again demonstrated that the consensus epitope variant was predominant, but also demonstrated that a minority of the adequately represented variants in the plasmid library persisted as replication-competent variants (Table 2 and Fig 2A), suggesting that most epitope mutations were deleterious (36.4 to 86.6% of single codon mutants, 99.12 to 99.97% of double codon mutants). Epitope variants with a frequency below the threshold of 10⁻⁴ in two experimental replicates of virus libraries were considered to be nonviable, because they tended to decay if present in only one library, indicating insufficient replicative capacity. [...] (Fig 2A). Library propagation with the addition of epitope-targeted CTLs yielded distinctly different profiles of epitope variants, indicating superimposed selective pressure by the CTLs (Fig 2B).
Control CTLs targeting an irrelevant epitope did not induce a profile distinct from passaging without CTLs, and the magnitude of the epitope-specific CTL-induced change was dose-dependent (Fig 2C). [...] "No CTL" panel) had several tolerated mutations mostly at residues 2, 5, and 6. Evaluation of these epitopes under additional CTL selection also demonstrated patterns of epitope enrichment relative to the consensus epitope sequences (RE_{+CTL} > 0). The addition of CTLs generally appeared to augment enrichment of epitope variants with intrinsic growth advantages in the absence of CTLs (Figs 3-5), although there were also some intrinsically disadvantageous variants that gained enrichment with the addition of CTLs. Conversely, some intrinsically advantageous variants were selected against with the addition of CTLs, particularly those with substitutions at the -1 position of the SL9 epitope. The net effect of CTL selection (ΔRE = RE_{+CTL} − RE_{-CTL}) was examined for each epitope variant (Fig 6). The relevance of this value for identifying potential CTL escape variants was confirmed by generating HIV-1 clones corresponding to library variants with defined ΔRE values, and testing their susceptibility to inhibition of replication by CTLs (Fig 7). Thus this parameter showed that many single substitutions conferred benefits against CTL selection (Fig 6). A major exception was the N-terminal flanking amino acid of the SL9 epitope (position -1), where most substitutions increased susceptibility to CTLs.
Overall, these data demonstrate epitope-specific constraints for mutation and evasion of CTLs.

[Table 2 legend] The number of possible single and double amino acid variants within each library (excluding stop codons) is given, as well as the number of these variants detected above threshold in the plasmid library (≥2.5 × 10⁻⁵, assumed to be adequate for representation after transfection of producer cells and passaging in 5 × 10⁶ cells, thus yielding 20x sampling), and the number that carried forward above threshold (≥10⁻⁴ in both biological replicates, assumed to have replicated after transfection) in the virus libraries. Epitope variants below this frequency tended to decay if present in only one library, and thus were considered nonviable. If a variant was present above threshold in the plasmid library but inadequately represented in the virus library produced by transfection of the plasmid library into producer cells, it was also considered to be nonviable.

KF11 and KK10 escape from CTLs is more constrained than SL9 mostly due to more limited fitness landscapes for epitope mutation
Quantitative analyses were extended to all epitope variants in the libraries, including double amino acid mutants (Table 1), to compare epitopes. First examining RE_{-CTL} (Fig 8, S4-S6 Figs, first columns), the SL9 library yielded more variants with neutral to moderately decreased replication capacity (RE_{-CTL} ≥ 0 or RE_{-CTL} ≥ −0.5) compared to KF11 and KK10, whereas KF11 and KK10 were similar (Fig 9). There were 30 and 59 (0.34% and 0.68% of all single and double SL9 mutants adequately represented in the plasmid library) SL9 variants with RE_{-CTL} ≥ 0 and ≥ −0.5 respectively, compared to 2 and 17 (0.018% and 0.031%) and 3 and 16 (0.16% and 0.16%) of KF11 and KK10 epitopes reaching those thresholds (Fig 9 top).
The distributions of measurements showed increasing numbers of lower RE_{-CTL} variants, consistent with insufficient replicative capacity for the variants in the plasmid library that were not detected in the virus library (Fig 9 bottom). Comparing susceptibilities of epitope variants to CTLs (ΔRE), many variants had neutral to enriched effects under CTL selection (Fig 8). Across all variants in the virus libraries (excluding variants with mutations in epitope flanking residues, to isolate effects of changes in CTL epitope recognition from epitope processing), this parameter displayed a range of values that was normally distributed (Fig 10). The mean ΔRE value across all SL9-specific CTLs was similar to that for KF11- and KK10-specific CTLs (1.31 versus 1.54 and 1.32 respectively), although the percentage of variants with at least 5-fold advantage under CTL selection (ΔRE ≥ 0.7) was significantly higher (92.9% versus 78.1% and 74.8% respectively, Fig 10 top). Over the range of 2- to 10-fold relative enrichment with CTLs, a stable profile of selected variants was observed (S4-S6 Figs, third columns), and thus 5-fold selection (ΔRE ≥ 0.7) was chosen as a definition of potential escape. Finally, considering the numbers of potential escape variants under this definition with at least moderate replicative capacity (RE_{-CTL} ≥ −0.5) as viable options for escape, SL9 had significantly more options than KF11 or KK10. Using mean values across CTLs, SL9 had 19 variants (0.22% of variants in the virus library) compared to 4 (0.036%) and 3 (0.031%) variants for KF11 and KK10 respectively meeting these criteria (Fig 11). In summary, the SL9 epitope offers more options for viable mutations than KF11 or KK10, and average CTL coverage of those mutations is similar or perhaps modestly decreased for SL9 compared to KF11 and KK10 (Fig 12).

Different CTL clones targeting each epitope overlap considerably in escape mutations
To address the similarity of epitope variant recognition between differing CTL clones, the ΔRE values for clones were compared (Fig 13). In general, CTL susceptibilities of variants were highly correlated for SL9, KF11, and KK10 epitopes, with few exceptions. Overall, these data suggest that most potential escape mutations for each epitope are shared across different CTL clones (representing "public" options).
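The selection criteria used in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: it assumes that RE is the log10 fold change in a variant's frequency relative to the fold change of the consensus epitope over the same passage (an assumption consistent with ΔRE ≥ 0.7 corresponding to roughly 5-fold selection), and the function names are hypothetical. The thresholds (ΔRE ≥ 0.7 for potential escape, RE_{-CTL} ≥ −0.5 for viability) and the 10⁻⁶ floor for decayed variants are taken from the text.

```python
import math

PSEUDO_FREQ = 1e-6     # frequency assigned to variants that decayed to 0 (Methods)
ESCAPE_DELTA_RE = 0.7  # about 5-fold advantage under CTL selection
VIABLE_RE = -0.5       # at least moderate replicative capacity without CTLs


def relative_enrichment(var_start, var_end, cons_start, cons_end):
    """log10 fold change of a variant relative to consensus (assumed definition)."""
    var_end = max(var_end, PSEUDO_FREQ)  # avoid log10(0) for decayed variants
    return math.log10((var_end / var_start) / (cons_end / cons_start))


def classify(re_no_ctl, re_with_ctl):
    """Return (delta_re, viable, potential_escape) for one epitope variant."""
    delta_re = re_with_ctl - re_no_ctl
    return delta_re, re_no_ctl >= VIABLE_RE, delta_re >= ESCAPE_DELTA_RE


# Hypothetical frequencies: the variant holds steady without CTLs and expands
# 5-fold with CTLs while the consensus epitope halves.
re_minus = relative_enrichment(1e-3, 1e-3, 0.5, 0.5)
re_plus = relative_enrichment(1e-3, 5e-3, 0.5, 0.25)
delta_re, viable, escape = classify(re_minus, re_plus)
```

In this hypothetical case the variant is both viable and a potential escape variant, which is the category counted as a "viable option for escape" above.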

Discussion
This study addresses the fitness landscape for mutational variation of three HIV-1 epitopes and the restrictions imposed by CTLs. While in vivo observations have revealed the effects of CTL on viral evolution to escape, our data dissect this process in greater detail, resolving the interaction at the level of individual CTL clones and defined starting virus quasispecies populations. For each epitope, the effect of every single amino acid polymorphism (as well as about a third of all double amino acid polymorphisms) versus the subtype B consensus sequence is assessed by frequency change as a reflection of fitness during serial passaging, as well as the impact of clonal CTL selection on these variants. Two epitopes presented by protective MHC-I types B*57 (KF11) and B*27 (KK10) and an epitope presented by the non-protective type A*02 (SL9) are examined in detail.

[Fig 2 legend, fragment] ... differs significantly between time points (p<0.00001), and additionally significantly different (p<0.00001) between libraries cultured without versus with 3.23T after 1 and 2 weeks. C. Epitope variant frequencies are plotted after one week of passaging in the presence of a control CTL clone 68A62 recognizing an A*02-restricted epitope in reverse transcriptase, or different effector to target (E:T) ratios of clone 3.23T.
The quantities of mutation options in the absence of CTL selection markedly differ between these epitopes. The SL9 epitope exhibits many variants with similar or higher fitness compared to consensus, whereas KF11 and KK10 epitopes appear to have very few. This finding indicates that the SL9 epitope is much less constrained for mutation than KF11 and KK10, suggesting that HIV-1 generally has fewer options for mutational escape in KF11 and KK10 (Gag p24) than SL9 (Gag p17) epitopes. This result agrees with prior observations that efficient immune containment of HIV-1 corresponds to CTL targeting of p24 [35], that immunodominance of p24 targeting is commonly associated with protective MHC-I types (including B*57 and B*27) [36,37], and that p24 is highly conserved overall [25]. However, studies delineating associations of particular CTL responses with immune containment of HIV-1 demonstrate that protective epitope targeting is not limited to p24 [37,38], suggesting that sequence constraint at the level of the individual epitope overrides the particular source protein in importance for escape and thus CTL efficacy.
The epitope variants that were enriched under CTL selection further illuminate the constraints for escape mutation. For SL9, there are several highly CTL-enriched variants with intrinsic fitness near the consensus epitope. In contrast, KF11 and KK10 both exhibit few CTL-enriched variants with preserved fitness, in agreement with prior studies showing that CTL escape mutations for these epitopes require high fitness costs [16][17][18][19][21][22][23][24]. Moreover, the variants enriched by CTL selection recapitulate several previously reported escape variants in vivo, such as Y79F in SL9 [16] and A163G in KF11 [18], although some other reported escape variants such as KF11 A163G/S165N [18] were present in the initial plasmid library but appeared replication incompetent. As a whole, these data support the concept that protective MHC-I types such as B*27 and B*57 are beneficial through generating CTL responses against epitopes for which escape occurs only at a high fitness cost to HIV-1.
Regarding the alternative hypothesis that protective MHC-I types yield TCRs with greater promiscuity for epitope variation [26][27][28], our findings do not provide definitive evidence. While KF11- and KK10-specific CTLs do appear to recognize more variants on average than SL9-specific CTLs, the average impacts of CTLs on epitope variants do not vary significantly between epitopes. However, these measurements are limited to CTL interactions only with viable variants, and are thus not a comprehensive evaluation of promiscuity across all epitope variation. Within the subset of viable mutants, there is no clear difference in coverage by CTLs across the three epitopes, and the findings are consistent with a study suggesting that better immune containment of HIV-1 is mediated by CTL responses that are more focused on viable epitope variants despite recognizing fewer epitope variants overall [29].
An unexpected finding is that CTL recognition of SL9 is enhanced by various substitutions at the N-terminus flanking amino acid. This suggests that these substitutions increase epitope presentation compared to the consensus sequence. Although various mutations within the SL9 epitope have been shown to reduce its proteasomal processing and presentation [39], impairment of processing associated with the N-terminus flanking residue in the consensus sequence has not been reported. Given the high prevalence of A*02 and the capacity of other MHC-I types such as B*40 to present the SL9 epitope, it is plausible that the consensus sequence represents escape adaptation across the human population. Also unexpected is the observation that several SL9 epitope variants had apparently higher fitness than the consensus sequence. Both these findings support the proposal that HIV-1 can accumulate escape mutations in the consensus sequence for circulating strains, as has been suggested specifically for SL9 [40] and more generally across the HIV-1 genome [7,8].

[Fig 3 legend, fragment] ... detected above threshold in the plasmid library but not in the virus library were considered non-replicating and assigned RE_{-CTL} = −2.0 for these analyses. https://doi.org/10.1371/journal.ppat.1006541.g003
We previously reported the differential ability of CTL clones targeting the same epitope to cross-recognize escape variants [32][33][34]. Here we confirm such differences between clones, but find that the overall options for escape are strikingly similar even between TCRs with entirely different variable chains. For each epitope, the amino acid substitutions resulting in CTL evasion follow stereotypic patterns mostly sparing the main MHC-I anchor-binding residues. Although such substitutions could affect proteasomal processing, epitope stability, or MHC-I binding, this suggests shared mutational pathways for ablating binding of sequence-distinct TCRs, and that these "public escape" pathways may predominate for these epitopes, consistent with prior population-based studies of HIV-1 escape "footprints" in vivo [8,41].
Several caveats must be considered for the interpretation of our data. Our libraries provide complete coverage for single amino acid polymorphisms in the epitopes, but incomplete coverage for double amino acid polymorphisms, and no coverage for three or more changes. However, most reported escape mutations are single or double polymorphisms compared to consensus, and our data show sharply decreased viability for double mutants compared to single mutants, suggesting that very few triple mutants would be viable. The RE values for epitope variants are semiquantitative reflections of HIV-1 fitness, given the saturating conditions for viral growth that can exaggerate the competitive advantage of the most fit variants. Moreover, the selective pressure exerted by CTLs is dependent on the experimental conditions, i.e., the number of added cells and functional activity of the cells. While these parameters are kept as constant as possible between experiments, there is biologic variability that is difficult to control entirely; thus setting RE values based on consensus sequence epitopes provides a frame of reference for comparisons between different experiments and between the SL9, KF11, and KK10 epitopes, because HIV-1 with consensus sequences in all three epitopes is shared between all libraries. Finally, fitness costs for sequence polymorphisms can vary considerably in different genomic contexts, and our results in HIV-1 strain NL4-3 using single epitope targeting may not reflect the outcome for a different virus with CTL pressure on multiple epitopes simultaneously. Related to this point is the inability to assess for compensatory mutations. However, the general patterns we observe are striking, and provide insight into the overall levels of constraints for these epitopes.
In summary, our findings indicate that two immunodominant epitopes associated with protective MHC-I types have highly restricted fitness landscapes for mutation compared to one that is not associated with protection, and that this allows very limited options for escape from CTLs. Additionally, most escape pathways appear to be public and shared between different clones recognizing these epitopes. These results have implications for harnessing CTL responses as vaccines and/or immunotherapies. An early attempt at therapeutic adoptive transfer of CTLs resulted in rapid viral escape [42], and analysis of the failed Step trial demonstrated a "sieve" effect in infected individuals, reflecting viral escape from vaccine-induced CTLs [43]. Thus, a successful CTL-based approach will require understanding of the constraints for escape and strategies to block HIV-1 escape routes through reducing HIV-1 options for mutational escape and/or increasing CTL coverage of mutation options.

Epitope mutational libraries of plasmid HIV-1 NL4-3
Double-stranded DNA fragments spanning the Gag epitope regions of interest were commercially synthesized (gBlock, Integrated DNA Technologies, Coralville, IA) using NNK degenerate codons (where "N" is any nucleotide, and "K" is guanine or thymidine) at each single or double codon position for the epitope and its flanking codons. These gBlock DNA fragments were then PCR amplified using primers 5'-ATCTCTAGCAGTGGCGCCC-3' with 5'-TTTGGCTGACCTGGCTGTTG-3' for the fragment containing the SLYNTVATL (Gag 77-85, SL9) epitope, and 5'-AGACACCAAGGAAGCCTTAGATAAGA-3' with 5'-TACCTCTTGTGAAGCTTGCTCG-3' for the fragments containing the KAFSPEVIPMF (Gag 162-172, KF11) and KRWIILGLNK (Gag 263-272, KK10) epitopes. These primer sequences corresponded to the start and end sequences of the synthesized DNA fragments. A modified HIV-1 NL4-3 provirus plasmid was created to reduce LTR-driven recombination during cloning, with 5' U3 and 3' U5 regions of the HIV LTR removed (to reduce LTR homology), flanked by the CMV immediate-early promoter and the BGH polyA sequence (Fig 1). Additionally, this vector was modified to delete the synthesized epitope regions except the first and last 15 nucleotides; the junctions of the deleted regions were modified to have blunt-cutting restriction enzyme sites: SfoI for the region containing SL9, and AfeI for the region containing KF11 and KK10. After linearizing each plasmid vector with the appropriate enzyme, the PCR-amplified gBlock DNA fragments were inserted via the 15 nucleotide homology by "Infusion" (Clontech, Mountain View, CA) to create whole genome plasmid libraries. The resulting plasmids were then transformed into Stellar chemocompetent E. coli (Clontech, Mountain View, CA), plated onto 100mm LB/ampicillin plates at ~2 × 10⁴ colonies/plate and grown for 24 hours at 30°C. Colonies were collected by washing the bacteria from the plates with Luria broth with ampicillin. The plasmid DNA isolated from these bacteria served as the initial "plasmid libraries" for each epitope.

[Fig 12 legend] Epitope variants (including the immediate flanking residues) with mean ΔRE ≥ 0.7 were considered potential escape variants, and those with RE_{-CTL} ≥ −0.5 were considered viable. The percentages of viability and susceptibility to CTLs for double amino acid variants that were missing in our plasmid library were inferred to be the same as those that were present and tested in this study. Top row: Venn diagrams indicate total numbers of possible single and double amino acid variants (white circles), numbers of viable variants (red circles), and numbers of viable variants susceptible to CTLs (green circles). The surface areas of each circle are approximately proportional to the numbers of variants contained. Bottom panel: Bar graphs indicate the same data, where gray bars indicate non-viable variants, green bars indicate viable variants that are recognized by CTLs, and red bars indicate viable variants that escape CTLs. https://doi.org/10.1371/journal.ppat.1006541.g012
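The NNK scheme used for library synthesis can be checked with a short enumeration. The sketch below builds the standard genetic code table, lists the 32 NNK codons, and confirms that they encode all 20 amino acids with a single stop codon. It also counts the possible single and double amino acid variants for a mutagenized region; the number of positions passed to `count_variants` (epitope length plus flanking residues) is an illustrative input rather than a value taken from the paper, and all names here are hypothetical.

```python
from itertools import product
from math import comb

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
NNK_CODONS = ["".join(c) for c in product(BASES, BASES, "GT")]
NNK_AMINO_ACIDS = {CODON_TABLE[c] for c in NNK_CODONS} - {"*"}
NNK_STOPS = [c for c in NNK_CODONS if CODON_TABLE[c] == "*"]


def count_variants(n_positions, alternatives=19):
    """Possible single and double amino acid variants (stop codons excluded)."""
    singles = n_positions * alternatives
    doubles = comb(n_positions, 2) * alternatives ** 2
    return singles, doubles


print(len(NNK_CODONS), len(NNK_AMINO_ACIDS), NNK_STOPS)
# Illustrative only: a 9-residue epitope with one flanking residue per side.
print(count_variants(11))
```

Restricting the third base to G or T halves the codon count while keeping every amino acid available, and removes two of the three stop codons from the library.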

Creation of HIV-1 epitope libraries
The plasmid libraries of each epitope were lipofected into two T75 flasks of 70% confluent HEK 293T cells (obtained from Dr. Irvin S. Y. Chen, University of California, Los Angeles) using 20μg DNA with BioT lipofection reagent (Bioland Scientific, Paramount, CA). After 24 hours the media was removed, and 10⁷ T1 cells [44] (obtained from Dr. Bruce D. Walker, Harvard University) in 20mL RPMI 1640 medium supplemented with 10% FCS, L-glutamine, HEPES, and penicillin-streptomycin (R10) were added to each flask to promote cell-cell infection of the T1 cells. After 24 hours, the nonadherent cells were removed and transferred to a new flask. These cells were then cultured for 6 to 8 days in R10 media until at least 50% of the cells were infected with HIV-1 (determined by expression of p24 antigen in the cells by intracellular staining and flow cytometry). The supernatant was then filtered through a 0.45 micron filter and cryopreserved to be utilized as the "starting virus library." All virus libraries were produced in duplicate, and all experiments utilized both libraries in parallel, with duplicates for cultures without CTLs (two replicates for each library, four total) and singles for cultures with CTLs (one replicate for each library, two total).

Ethics statement
CTL clones (Table 1) were previously isolated from chronically HIV-1-infected persons and maintained as previously described [46][47][48] from blood obtained with written informed consent under a University of California, Los Angeles Institutional Review Board-approved protocol, with the exception of 68A62 provided by Dr. Bruce D. Walker (Harvard University).

HIV-1-specific CTL derivation and maintenance
In brief, peripheral blood mononuclear cells (PBMCs) were enriched for the CTLs of interest by culture with the appropriate epitope, followed by cloning at limiting dilution. Some experiments utilized KK10-specific CTLs previously produced by stable lentiviral transduction of allogeneic CD8+ T-cells with a KK10-specific T cell receptor (TCR) sequence identified by quantitative spectratyping [31] (TCR5) that had been cloned into a lentiviral vector as previously described [34]. CTLs were maintained by periodic stimulation with 200ng/mL of the monoclonal anti-CD3 12F6 antibody [49] with irradiated allogeneic PBMCs (obtained anonymously through the UCLA AIDS Institute Virology Core Facility) in R10 media supplemented with recombinant human interleukin-2 (NIH AIDS Reference and Reagent Repository) at 50IU/mL (R10-50). For the CTL clones, TCR beta variable (BV) chain sequences were determined after RNA isolation using Trizol reagent (ThermoFisher Scientific, Waltham, MA), amplification and cloning of the BV gene using the SMARTER 5' RACE kit (Clontech, Mountain View, CA) with a constant region primer (5'-CTTCTGATGGCTCAAACAC-3'), and sequencing using the same primer.

Passaging of virus libraries
5 × 10⁶ permissive cells (10⁶ cells for the SL9 library passaged with the 1.9 CTL) were infected with the starting virus library, yielding about 10-20% infected cells after 72-96 hours (determined by intracellular staining for p24). The cells were then washed twice and resuspended at 5 × 10⁵ cells/mL in R10-50. CTLs were added at effector:target ratios of 1:8 (except 1:2 for the SL9 library with CTL 1.9), with parallel no-CTL controls. These cultures were fed every 3 days by removing and replacing half of the media. After 7 days the supernatant was filtered through a 0.45 micron filter and cryopreserved; virus in the supernatant was quantified via p24 ELISA (Xpress Bio, Frederick, MD). This virus was utilized to infect cells for a second passage in the same manner using 5 × 10³ pg p24 per 10⁶ target cells (10³ pg p24 per 10⁶ target cells for the KK10 library), followed by collection and cryopreservation as before. All passaging with CTLs was performed with duplicate virus libraries, and passaging without CTLs was done in quadruplicate (2 replicates for each virus library).

Deep sequencing of passaged virus libraries
The passaged virus supernatant was treated with DNase I (New England Biolabs, Ipswich, MA) to remove residual plasmid DNA. HIV-1 RNA was isolated with the QIAamp viral RNA mini kit (Qiagen, Hilden, Germany), reverse-transcribed with the high capacity cDNA reverse transcription kit (ThermoFisher Scientific, Waltham, MA), and quantified by real-time PCR with SsoFast EvaGreen supermix on a CFX96 (Bio-Rad, Hercules, CA) with gag-specific primers (5'-ATCTCTAGCAGTGGCGCCC-3' and 5'-TTTGGCTGACCTGGCTGTTG-3') compared to an NL4-3 plasmid standard to ensure ≥5 × 10⁵ copies/μL of cDNA per specimen. This cDNA and the starting plasmid libraries were prepared for deep sequencing by PCR amplification using primers tagged with 6 base-pair customized barcodes. The gene-specific portions of the primers were:

Deep sequencing analyses
The sequence data were parsed using the SeqIO function of open source BioPython software (http://biopython.org/). Sequences from different samples were de-multiplexed by the barcodes and mapped to the corresponding region in the HIV-1 genome. Since both forward and reverse reads covered the mutated region, paired reads were used to compensate for sequencing errors. A polymorphism was accepted as valid only if observed in both reads and with a quality score ≥30. Further filtering for errors was done by comparison to control deep sequencing of the index NL4-3 plasmid; variants present at a frequency <10⁻⁴ were only accepted if their frequencies in duplicate virus libraries exceeded 10-fold the observed frequency of the variant in the control plasmid sequences (due to background error). The sequencing depth was >6 × 10⁵ and >4 × 10⁶ for the virus and plasmid libraries respectively. All the data processing and analysis was performed with customized Python scripts, which are available upon request. Variants above threshold in initial virus libraries whose frequencies decayed to 0 after passaging were assigned a frequency of 10⁻⁶ for calculation of RE values. All sequences have been uploaded to GenBank (PRJNA394927).
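The two error filters described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' scripts (which are available from them on request): it assumes the reverse read has already been reverse-complemented and aligned to the same coordinates as the forward read, and that a read pair failing either check is discarded rather than corrected. The function and parameter names are hypothetical; the quality cutoff of 30, the 10⁻⁴ frequency cutoff, and the 10-fold background margin come from the text.

```python
def accept_read_pair(fwd, fwd_qual, rev, rev_qual, reference, min_q=30):
    """Return the agreed sequence, or None if the pair fails validation.

    A polymorphism (base differing from the reference) is kept only if both
    reads show it and both base qualities are at least min_q.
    """
    if not (len(fwd) == len(rev) == len(reference) == len(fwd_qual) == len(rev_qual)):
        raise ValueError("reads, qualities and reference must have equal length")
    for f, fq, r, rq, ref in zip(fwd, fwd_qual, rev, rev_qual, reference):
        if f != r:
            return None  # reads disagree at this position
        if f != ref and min(fq, rq) < min_q:
            return None  # polymorphism not supported at Q30 in both reads
    return fwd


def passes_background_filter(freq_a, freq_b, freq_control, low_freq=1e-4, fold=10):
    """Accept low-frequency variants only if both duplicate libraries exceed
    `fold` times the frequency seen in control plasmid sequencing."""
    if min(freq_a, freq_b) >= low_freq:
        return True
    return freq_a > fold * freq_control and freq_b > fold * freq_control
```

Discarding a failing pair is the conservative choice; an alternative reading of the text would revert unsupported polymorphisms to the reference base instead.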

Statistics
Comparisons for correlations of replicate experiments and selection of epitope variants by different CTL clones were performed using Spearman rank correlation. Comparisons of means of two groups were performed using Student's t-test. Comparisons of frequencies between two groups were performed using Fisher's exact test.
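As an illustration of the rank correlation used to compare ΔRE values between CTL clones, the following is a minimal pure-Python sketch of Spearman's rho, computed as the Pearson correlation of average ranks so that ties are handled. The statistical software the authors used is not stated here; this code and its input values are hypothetical, and in practice a library routine such as `scipy.stats.spearmanr` would normally be preferred since it also reports a p-value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical ΔRE values for the same four variants under two CTL clones.
clone_a = [0.2, 1.5, 0.9, 2.1]
clone_b = [0.1, 1.7, 1.9, 2.5]
rho = spearman_rho(clone_a, clone_b)
```

Because only ranks enter the calculation, the result is insensitive to the saturating growth conditions that make RE values semiquantitative.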