Predominance of positive epistasis among drug resistance-associated mutations in HIV-1 protease

Drug-resistant mutations often have deleterious impacts on replication fitness, posing a fitness cost that can only be overcome by compensatory mutations. However, the role of fitness cost in the evolution of drug resistance has often been overlooked in clinical studies or in vitro selection experiments, as these observations only capture the outcome of drug selection. In this study, we systematically profile the fitness landscape of resistance-associated sites in HIV-1 protease using deep mutational scanning. We construct a mutant library covering combinations of mutations at 11 sites in HIV-1 protease, all of which are associated with resistance to protease inhibitors in clinic. Using deep sequencing, we quantify the fitness of thousands of HIV-1 protease mutants after multiple cycles of replication in human T cells. Although the majority of resistance-associated mutations have deleterious effects on viral replication, we find that epistasis among resistance-associated mutations is predominantly positive. Furthermore, our fitness data are consistent with genetic interactions inferred directly from HIV sequence data of patients. Fitness valleys formed by strong positive epistasis reduce the likelihood of reversal of drug resistance mutations. Overall, our results support the view that strong compensatory effects are involved in the emergence of clinically observed resistance mutations and provide insights to understanding fitness barriers in the evolution and reversion of drug resistance.


Introduction
Antibiotics and antiviral drugs have achieved great success in recent history [1]. However, therapeutic failure may occur due to low adherence and the emergence of drug resistance [2,3]. The increasing amount of drug resistant pathogens is a global threat to public health [4][5][6][7][8][9][10][11]. The genetic barrier to drug resistance, defined as the number of mutations needed to acquire resistance, is a major determining factor of treatment outcomes [12][13][14]. Another important but often overlooked aspect of drug resistance is the fitness barrier [15][16][17]. Resistance associated mutations (RAMs) in pathogen proteins may decrease enzymatic activities, interfere with molecular interactions, or destabilize the protein structure [18][19][20][21][22]. Because of the impaired replication capacity without drug selection, drug-resistant mutants cannot normally outcompete wild-type or establish in the population [23][24][25]. However, drug-resistant mutants can sometimes reach substantial frequency in the population. Fluctuating drug concentrations may create time windows when drug-resistant mutants replicate better than wildtype virus [26]. Moreover, compensatory mutations can rescue the impaired replication capacity of mutants and stabilize drug resistance [22,27,[27][28][29]. Thus, comprehensive quantification of the fitness landscape is needed to predict the evolution of drug resistance [30,31].
Epistasis, i.e. genetic interactions between mutations, is prevalent in molecular evolution [30][31][32][33][34]. Negative epistasis decreases fitness of the double mutant, posing constraints on gaining multiple mutations [35,36]. It plays an important role in shaping the local fitness landscape [37]. Positive epistasis increases replication capacity of the double mutant, facilitating pathogens to acquire and maintain drug resistance [38][39][40]. Positive epistasis may create a fitness valley that prevents drug resistant mutations from reversal [41]. Collectively, positive and negative epistasis determine the topography of the fitness landscape [42] and the course of drug resistance evolution [32]. Empirical studies on the genetic interactions between RAMs, especially in high-order mutants, are still rare [43,44].
HIV-1 protease inhibitors are important components of combination antiretroviral therapy [45] that target HIV-1 protease enzymatic activity [46,47]. Second-generation protease inhibitors have extremely high binding affinity to viral protein [48]. Resistance to them typically requires more mutations than resistance to first-generation protease inhibitors and other antiretroviral drugs [49,50]. For example, mutation K103N on reverse transcriptase is sufficient to confer HIV-1 nevirapine (NVP) resistance [51], while more than 4 de novo mutations are needed for protease inhibitor Darunavir (DRV) resistance [52]. Protease inhibitor-resistant viruses with multiple RAMs also have significantly reduced fitness [53,54]. HIV-1 gained RAMs on protease during sub-optimal protease inhibitor therapy [55]. Most resistance mutations directly affect the binding affinity between HIV-1 protease and the inhibitor, but they are likely to be deleterious because they also reduce binding to the native substrate of HIV-1 protease. To compensate the deleterious effect, some other RAMs stabilize HIV-1 protease, allowing drug-resistant virus to replicate as efficiently as its parental wild-type virus [27,56]. The compensatory effects between pairs of RAMs have been studied in several studies and are available on the Stanford HIV drug resistance database [22,[57][58][59][60][61]. Meanwhile, reversals of protease inhibitor resistance-associated mutations were rarely seen clinically, even when therapy was interrupted [62] or when mutant virus infected drug-naïve patients [63,64]. These observations indicate that epistasis may be important for the evolution of protease inhibitor resistance. Recent analyses of sequence co-variation in drug-targeted HIV Pol proteins (protease, reverse transcriptase and integrase) and co-evolutionary Potts model provide evidence that epistasis plays an important role in drug resistance. Despite being disfavored in the wild-type background, primary resistance mutations can become entrenched by the complex mutation patterns which arise in response to drug therapy [65,66].
Here, we present a quantitative high-throughput genetics approach [67,68] to study the fitness distribution and epistasis of HIV-1 protease inhibitor RAMs. Combining these data with clinical data and fitness models, we found that positive epistasis was predominant and especially enriched among RAMs, and prevalent along drug resistance evolutionary paths. Our results suggest that fitness hills created by epistasis result in barriers that entrench RAMs, and thus drug-resistant viruses are unlikely to revert after transmission to drug-naïve patients or discontinuation of anti-retroviral drug treatment.

Fitness profiling of RAMs in HIV protease
To study the interactions among RAMs in HIV protease, we constructed a library of virus mutants that covers combinations of amino acid substitutions at 11 resistance-associated sites in HIV protease ( Fig 1A, Table 1, 2 9 × 3 2 = 4608 genotypes). To ensure sufficient coverage, we harvested more than 30000 colonies after transforming E. coli. These sites have been annotated as major drug resistance sites in Stanford Drug Resistance Database [61,69], and all have been shown to be strongly associated with drug resistance [3]. In our mutant library, 9 sites have one amino acid substitution and the other 2 sites have 2 amino acid substitutions (Fig 1A,  Table 1). 2736 out of 4608 possible genotypes (59.38%) were covered in the plasmid library.
We quantified the relative fitness of mutants using high-throughput fitness profiling ( Fig  1B, See Material and methods for details). We performed 3 independent transfection experiments to validate the reproducibility of fitness profiling. 20 million 293T cells were transfected and 50 million T cells were infected in each experiment. For each biological replicate, relative fitness was calculated independently. The Pearson's correlation coefficients of single, double and triple mutations between replicates range from 0.80 to 0.82 ( Fig 1C and S1 Fig). After filtering out mutants with low frequency or low reproducibility among replicates of input virus libraries (see Material and methods for details), we were able to estimate the relative fitness of 1219 genotypes. The fitnesses of all single mutants, and more than 70% of double and triple mutants, were quantified (S2 Fig).
To validate the quantification of relative fitness, we conducted competition experiments with individually constructed protease mutants. We performed two sets of validation experiments. For the first set, we packaged the mutant virus and wild-type virus independently and mixed them in pairs for head-to-head competition. The frequency of the mutant virus and wild-type virus were quantified by deep sequencing and the relative fitness was calculated in the same way as we did in library screening. A total of 7 mutants were constructed and validated. For the second set of experiments, we mixed all 7 single mutants with wild-type virus in competition experiments. The relative fitness was defined in the same way. The fitness measured in validation experiments was highly correlated with the fitness in library screening ( Fig  1D, R = 0.84 for each independent validation, Pearson's correlation test). In addition, we compared the selection coefficients of HIV-1 protease mutants measured in an independent study . The side chains of selected resistance associated residues are shown. (B) Workflow of the fitness profiling. Protease mutations were introduced into NL4-3 background. T cells were infected by the mutant virus library. The frequency of mutants before (input library) and after (output library) selection were deep sequenced. (C) The correlation of relative fitness between two biological replicates. Pearson correlation coefficient (R) is 0.80. (D) Two independent validation experiments were performed. We constructed 7 protease single mutant plasmids and recovered viruses independently. We mixed each mutant virus with wild-type virus (validation 1, black dots) and passaged in T cells for 6 days. We also mixed all 7 mutant viruses together with wild-type (validation 2, red dots) and infected T cells for 6 days. The relative fitness of each mutant was quantified by the same means as that in the library. Pearson correlation coefficients (R) for validation 1 and validation 2 are both 0.84. Error bar is standard deviation (n = 3). (E) The correlation of relative fitness in this study with the experimental selection coefficients in [71]. Pearson correlation coefficients (R) is 0.79.  Table). The experimental results from two studies show a good correlation (Pearson's correlation coefficient is 0.79), supporting the reliability of our experimental methods.

Positive epistasis rescues the mutational load of RAMs
We first looked at fitness effect of RAMs. In our definition, a mutant virus of relative fitness −1 means that the relative frequency of this mutant drops 10 fold after infection in cell culture. All single mutations were deleterious to virus replication (Fig 2A). The relative fitness of single mutants ranged from -2.33 (V82F) to -0.19 (L90M). This is consistent with previous reports that randomly introduced mutations were mostly deleterious to protease enzymatic activity or HIV-1 replication capacity [34, [72][73][74]. Random mutagenesis in other viruses also revealed a lack of beneficial mutations in well-adapted systems [73,[75][76][77]. RAMs in particular were also reported to be deleterious to virus replication [31,44]. They may destabilize viral protein, affect enzymatic activities or impact other protein-protein interactions [21,78].
We then analyzed epistasis between all pairs of RAMs. Previous studies have shown the prevalence of epistasis among pairs of random mutations [34,37,75] or spontaneously accumulated mutations [79]. However, studies focused on the epistasis among drug resistance mutations are still limited [30,39,72,75,80]. Based on the fitness effect of single RAMs, we predicted the relative fitness of double mutants with the assumption that no epistasis existed among any two single mutations (i.e., the predicted relative fitness of a double mutant was the sum of those of two single mutants)( Fig 2B). Surprisingly, the observed relative fitness of most double mutants were significantly higher than the predicted values (p = 2.2 × 10 −6 , two-sided Wilcoxon rank sum test), suggesting that positive epistasis is prevalent among RAMs (Fig 2B  inset). Pairwise epistasis between two RAMs is quantified as ε i,j = f i,j − f i − f j , f i represents the relative fitness of mutants i. The distribution of epistasis ranged from -0.69 (M46I and L90M) to 2.34 (L76V and V82F) and 86.6% of pairwise interactions between RAMs are positive.
We also analyzed the extent of epistasis among high-order mutants. We observed a trend that relative fitness decreased as the order of mutants increased (S3 Fig). This is consistent with previous reports that mutational load restricted virus replication capacity [30, 81,82]. To better quantify the fitness cost of multiple mutations, we calculated the frequency of viable mutants by different thresholds, f > −2 or f > −4. The frequency of viable mutant virus decreased as the number of mutations increased (Fig 2C), consistent with previous observations in HIV-1 and other RNA viruses [83][84][85][86]. We then predicted the relative fitness of highorder mutants by summing the relative fitness of corresponding single mutants. We observed more viable mutants than would be predicted without epistasis ( Fig 2C). This indicated pervasive positive epistasis rescued high-order mutants from lethal relative fitness, which is consistent with other clinical observations in protease inhibitor resistant virus [30,44]. As a result, positive epistasis partially relieved HIV-1 mutational load and allowed viruses to explore more sequence space.

Enrichment of positive epistasis among RAMs
There are two possible explanations for the observed positive epistasis among RAMs of HIV protease. The first hypothesis is that all mutations in HIV protease tend to interact positively. The second hypothesis is that epistasis among random mutations in HIV protease is on average zero, but positive epistasis is enriched among RAMs. We introduced the Potts model to test our hypotheses, while simultaneously testing whether our finding of prevalent positive epistasis among RAMs carries over to the clinical setting. Potts models, originally developed in statistical physics, have been employed previously to use the population-level frequencies and correlations between different mutations to estimate their fitness effects [87][88][89][90]. In the Potts  effects. We applied a maximum entropy method [91] to an alignment of 20911 HIV-1 clade B protease sequences from drug naïve patients, obtained from the Los Alamos National Laboratory HIV sequence database (hiv.lanl.gov, accessed 24 March 2017) to calculate these two sets of Potts parameters. Then we calculated EðÃÞ for all mutants in our protease library. We found that the Potts energy for single, double or triple mutants (ΔE = E mut − E WT ) is significantly correlated with the relative fitness we measured in our screening (ρ = −0.46, p = 1.2 × 10 −14 , Spearman's correlation test, Fig 3B). The correlation was lower than previous analysis in HIV-1 Gag and Env region [88,90]. This may be due in part to strong phylogenetic bias on the inferred Potts parameters, because protease is highly conserved. It is also possible that epistatic interactions with cleavage sites on other parts of the HIV-1 genome and complicated anti-innate immunity functions of protease [57, 59, 92] obscure the effects of individual mutations on replicative fitness in vitro.
The Potts couplings J ij (A i , A j ) give the contribution of pairwise epistatic interactions between amino acids A i and A j at sites i and j, respectively. We compared the couplings among RAMs and among all other possible mutations on protease ( Fig 3C). Couplings of other protease mutations clustered near 0, while those of RAMs are significantly more positive than that of other mutations (D = 0.22, p = 2.1 × 10 −07 , two-sided K-S test). Moreover, J ij (A i , A j ) among RAMs were also more positive than those between RAMs and other residues (D = 0.22, p = 5.1 × 10 −07 , two-sided K-S test). Although the fields h i (A i ) of RAMs are more negative than other mutations, the difference is not significant (Fig 3D, D = 0.25, p = 0.20, two-sided K-S test). We note that the magnitude and the variation of field parameters is much larger than that of coupling parameters (Fig 3C and 3D). The Interquartile Range (IQR, i.e. the middle 50%) of field parameters is 3.55, while the IQR of coupling parameters is 0.15. The standard deviation of field parameters is 2.29, while the standard deviation of coupling parameters is 0.37. Overall, analysis based on the Potts model is consistent with our experimental results that positive epistasis is enriched among RAMs, and lends support to our second hypothesis that epistasis among random mutations in HIV protease is on average zero.

Implications of positive epistasis in evolution
To study the role of epistasis in evolution, we analyzed the evolutionary pathways covering all genotypes with up to 4 amino acid substitutions from the wild-type virus (13 single mutants, 67 double mutants, 176 triple mutants and 290 quadruple mutants) ( Fig 4A). Mutants are linked if they differ by one amino acid substitution.
We have found that all 13 RAMs are deleterious on the wild-type background (Fig 2A). However, the fitness effect of a single RAM becomes less deleterious on genetic backgrounds where other RAMs have been fixed (S4 Fig). Following the generalized definition of epistasis proposed by Shah et al. [93], we define trajectory-based epistasis ε M,j that measures the deviation of the fitness effect if the order of mutations were reversed.
where f M and f j represent the relative fitness of background M and single mutant j [94]. For example, mutation j can be deleterious on the wild-type background but beneficial on another genetic background that mutation i has been fixed. Trajectory-based epistasis is calculated for each amino acid substitution and averaged over genetic backgrounds with a certain Hamming distance to the wild-type (Fig 4B). For all RAMs profiled in this study, we find that trajectorybased epistasis is overall positive and increases steadily with the number of substitutions, i.e. the fitness contribution of a specific amino acid substitution becomes more positive if more RAMs have been fixed. Our results are consistent with previous analyses of sequence co-variation in HIV-1 protease [65,66], where inferred epistastic interactions among mutations at PI resistance associated sites lead to entrenchment of primary drug resistance mutations. In this study, we combine the analyses of co-variation (Potts model) with comprehensive experimental fitness data of HIV-1 protease mutants (including a large number of higher-order mutants) to provide direct evidence of positive epistasis among RAMs of second-generation PIs.
We tested the hypothesis that positive epistasis prevented resistance associated genotypes from reverting to wild-type [41, 95,96]. Although RAMs incurred significant fitness cost, some drug resistant mutants would not revert to wild-type after transmitting to a drug naïve patient. We quantified the frequency of accessible evolutionary pathways between mutants and wild-type in our experimentally measured fitness landscape of HIV protease RAMs. A reversal path is defined to be accessible if and only if the virus fitness increases monotonically along the path. For example, quadruple mutant V32I_M46I_I54L_V82F has many paths to revert to wild-type ( Fig 4A). Among them, reversing V32I, I54L, V82F and M46I in order is an accessible path (Fig 4A, red line). On the contrary, reversing I54L, V82F, M46I and V32I is not an accessible path because there are 2 steps with decreasing fitness (Fig 4A, blue line). We found that among double mutants, 44 have two accessible reversal paths to the wild type, 20 have only one accessible reversal path, and interestingly 3 of them have none. These 3 mutants (I50V_T74P, M46I_I54M and L76V_V82F) represent local fitness peaks and the reversal to wild-type is blocked by a fitness valley. We found that the number of accessible reversal paths decreased with the accumulation of RAMs (Fig 4C). This indicates that protease mutants become less likely to revert to wild-type as the number of RAMs increases. Our results are consistent with clinical observations that protease inhibitor resistance associated mutations seldom reverted even when therapies were interrupted [25, 62] or drug-naïve patients were infected [63,64]. The difficulty of reversal also explains the rising frequency of drug resistant HIV-1 viruses in acute phase patients [41,96].

Discussion
In this study, we systematically quantified the fitness effect of RAMs of HIV-1 protease. While all RAMs reduced the virus replication fitness, pervasive positive epistasis among RAMs alleviated the fitness cost substantially. Moreover, we analyzed the HIV sequence data from patients by the Potts model. We found the statistical energy inferred from HIV sequences in vivo correlated well with the replication fitness measured in vitro. Based on our fitness data and the mutational couplings inferred by the Potts model, we showed that positive epistasis is enriched among RAMs of HIV-1 protease, in both local fitness landscape and evolutionary paths. Finally, we studied the role of epistasis in evolutionary pathways. We found that positive epistasis among RAMs entrenches drug resistance and blocks the reversal paths to wild-type virus, which has important implications for the design of anti-retroviral therapies. Through this project, we also established a high-throughput platform to quantify the genetic interactions among a group of mutations. Another independent study profiled the fitness effect of all single amino acid change on HIV protease [71]. The data showed significant correlation with our study ( Fig  1E, Pearson's correlation coefficient (R) is 0.79).
There are a few limitations of this study. Firstly, we only measured the fitness effect of RAMs in the absence of protease inhibitors. We are not able to quantify drug resistance of RAMs because protease inhibitors block multiple rounds of virus infection and prevent us from accurate examination of mutant frequency under drug selection. Also, we did not sequence other genes of HIV-1. HIV-1 mutates rapidly due to low fidelity of reverse transcriptase [97,98]. There might be compensatory mutations occurring on other proteins that rescued the protease RAMs. Secondly, the correlation between our validation experiments and high-throughput screening experiments was less than the correlation observed in similar experiments in bacteria and yeast [99,100]. The correlation between Potts energy and experimental fitness is also lower than previous reports on Gag and Env regions [88,90]. Mechanistic difference between logistic growth and viral growth may complicate the quantification of viral fitness [101]. Direct measurement of viral frequency may not linearly correlate to the probability of replication [102]. Moreover, we tested a large number of higher-order mutants (i.e. multiple mutations from the wild-type virus). Our experimental dataset not only contains clinically observed genotypes but also combinations of mutations that was not observed in patients, which are highly deleterious and may suffer from higher experimental errors. If we exclude higher-order mutants and very deleterious genotypes (S5 Fig), the Spearman's correlation between fitness and Potts energy is higher (ρ = −0.54, compared to ρ = −0.46 in Fig 3B). Thirdly, we did not cover all clinically observed polymorphism, given the bottlenecks in virus library screening. We chose to prioritize for RAMs of second-generation protease inhibitors Darunavir (DRV) and Tipranavir (TPV), which are considered to have high genetic barriers (i.e. multiple RAMs are involved in the emergence and reversal of drug resistance) [52]. According to Stanford Drug Resistance Database [61,69], the RAMs that we chose contribute to the resistance to DRV and TPV (S2 Table). The only exception is L90M, which is frequently found in drug resistant viruses. The RAMs and the combinatorial genotypes in our library are prevalent in patients and documented in Stanford Drug Resistance Databases (Table 1). Future work could be extended to cover more clinically observed polymorphism in HIV-1 protease and other drug-targeted proteins. Finally, the correlation between Potts energy and experimental fitness is confounded by many factors, like different selection pressures in vivo and in vitro, or phylogenetic bias. Nonetheless, we observe moderate but statistically significant correlation between the coupling parameters in the Potts model and the experimental epistasis (S6 Fig, Spearman's correlation test, p = 6.8 × 10 −3 ). We note that the coupling parameters in the Potts model and the experimental measure of epistasis (calculated for WT genetic background) are conceptually different, representing Fourier coefficients and Taylor coefficients of the fitness landscape [103]. Our findings are consistent with the literature that Potts model couplings are strongly associated with contact residues in the three-dimensional structure of protein families [104,105]. We tested a series of different statistical models, including the binary (Ising) model inferred via ACE, the Potts model inferred via pseudo-likelihood maximization (a popular approach to analyzing sequence data from protein families), and the Potts model inferred via ACE, to examine the epistatic effects among drug resistance mutations (S7 Fig). We found that the Potts model inferred via ACE is the best choice to analyze epistasis in our study.
Statistical models suggest a pervasive negative distribution of fitness effect for single mutations on HIV-1 [31, 88,106]. Previous models also predicted the entrenchment of deleterious RAMs by positive epistasis [65,66]. This dataset provides a unique chance to experimentally test these statistical hypotheses. The predominance of positive epistasis is also observed in HIV-1 [30] and in other organisms [39,107]. However, they either relied on naturally-occurring resistant clones or indirectly activating gene functions. This report is the first dataset to systematically quantify the epistasis among functional residues in HIV-1 drug resistance evolution, without the bias of drug selection and in vivo evolution. Overall, our results are important for understanding drug resistance evolution. We found positive epistasis plays a critical role in HIV-1 gaining and maintaining drug resistance. Epistasis makes the fitness landscape rugged, preventing RAMs from reversion to wild-type, even when antiviral therapy is interrupted or virus transmits to a healthy individual [95,108].
Positive epistasis involves many kinds of molecular mechanisms. We find that the relative fitness of single mutants is not a significant factor of positive epistasis. We compared h i in the Potts model for all RAMs and other single mutants. They were not significantly different (p = 0.20, K-S test). Physical distance between residues is a significant factor contributing to positive epistasis. The physical distances between these residues were significantly less than those between any two random residues on HIV-1 protease (D = 0.32, p = 3.9 × 10 −10 , twosided K-S test, S8 Fig), suggesting that physical contact among RAMs might contribute to the observed positive epistasis. Notably, their average distance was more than 10 Å, indicating most of them did not have direct contact. Some mutations may have structurally stabilizing effect to other residues. We used FoldX and Rosetta to predict the folding free energy (ΔΔG) as a quantification of protein stability [109,110] for all mutants in our library (S8 Fig). We notice that mutation V82F contributed to the positive epistasis on many genetic backgrounds ( Fig  4B), but it did not contribute much to the stabilizing effect. Thus, structurally stabilizing effects cannot fully explain the predominance of positive epistasis observed in this study. Future studies on the structure and function of HIV-1 protease mutants will help elucidate the molecular mechanisms underlying the interactions among RAMs.

Plasmid library construction
HIV-1 RAMs were picked according to their prevalence in protease inhibitor treated patients [3]. We chose 11 residues with 13 mutations to construct a combination of HIV-1 protease mutant library (Table 1).
We used a ligation-PCR method to construct the library on NL4-3 backbone, which is an infectious subtype B strain. All possible combinations of these 13 mutations are 2 9 × 3 2 = 4608 genotypes. The mutagenesis region spanned 243 nucleotides on HIV-1 genome. We split the region into 5 oligonucleotides and ligate them in order by T4 ligase (from New England Bio-Labs). The sequence of oligonucleotides are shown in S3 Table. After each ligation, we recovered the product by PCR and used restriction enzyme BsaI-HF (from New England BioLabs) to generate a sticky end for the next step ligation.
After making the 243-nucleotide mutagenesis fragment, we PCR amplified the upstream and downstream regions near this fragment and used overlap extension PCR to ligate them together. We then cloned it into full length HIV-1 NL4-3 background. We harvested more than 30,000 E. coli colonies to ensure sufficient coverage of the library complexity.

Virus production
The plasmid DNA was purified by HiPure Plasmid Midi Prep Kit (from Thermo Fisher Scientific). To produce virus, we used 16 μg plasmid DNA and 40 μL lipofectamine 2000 (from Thermo Fisher Scientific) to transfect 2 × 10 7 293T cells, in 3 independent biological replicates. We changed media 12 hours post transfection. The supernatant was harvested 48 hours post transfection, labeled as input virus and frozen at -80˚C. We harvested 40mL viruses from each transfection. Virus was quantified by p24 antigen ELISA kit (from PerkinElmer).

Library screening
CEM cells were cultured in RMPI 1640 (from Corning) with 10% FBS (from Corning). To passage library in T cells, we added 25 mL viruses and 120 μg polybrene to 50 million CEM cells. We achieved 10 ng p24 (10 8 physical viral particles) for every million CEM cells during infection. We washed cells and completely changed media 6 hours post infection. We supplemented the cells with fresh media 3 days post infection and harvested supernatant 6 days post infection. We centrifuged supernatant at 500 × g for 3 minutes to remove the cells and cell debris. The rest of supernatant was frozen at -80˚C.
In summary, we carefully controlled the experiment scales to ensure the library complexity was maintained in every step. Briefly, we harvested >3 × 10 4 E. coli colonies during bacteria transformation, which ensured *6-fold coverage of the expected complexity (4608 genotypes). We then transfected 2 × 10 7 HEK 293T cells with 16 μg plasmid library to package infectious viruses. We used 25 mL viruses (500 ng p24, * 5 × 10 9 viral particles) to infect 2 × 10 7 million CEM cells for each biological replicate.

Sequencing library preparation
We used QIAamp viral RNA mini kit (from QIAGEN) to extract virus RNA from supernatant. We then used DNase I (from Thermo Fisher Scientific) to remove the residual DNA. We used random hexamer and SuperScript III (from Thermo Fisher Scientific) to synthesize cDNA. The virus genome copy number was quantified by qPCR. The qPCR primers are 5' -CCTTGT TGGTCCAAAATGCGAAC-3' and 5' -ATGGCCGGGTCCCCCCACTCCCT-3'.
At least 2 × 10 5 copies of viral genome were used to make sequencing libraries. We PCR amplified the mutagenesis regions using the following primers: 5' -CTAATCCTGGAGTCTTT GGCAGCGACCC-3' and 5' -GAAGACCTGGAGTGCAGCCAATCTGAGT-3'. We then used BpmI (from New England BioLabs) to cleave the primers and ligate the sequencing adapter to the amplicon. We used PE250 program on Illumina MiSeq platform to sequence the amplicon.

Calculation of fitness and epistasis
We used custom python codes to map the sequencing reads to reference NL4-3 genome. Mutations were called if both forward and reverse reads have the same mutation and phred quality scores are both above 30. All codes are available on https://github.com/Tian-hao/ protease-inhibitor. All data were deposited in SRA (short read archive) database under accession PRJNA546460. For each replicate of the virus library from the transfected 293T cells, we reached 4.45 × 10 5 to 6.05 × 10 5 sequencing depth. We filtered out the genotypes with frequency fewer than 5 × 10 −5 in any biological replicate and the genotypes whose frequency differ more than 10 folds between any two biological replicates.
Relative fitness f m,r of mutant m in experiment r (biological replicates) was defined as Eq 1.
F m,r,input is the frequency of mutant m before screening. F m,r,output is the frequency of mutant m after passaging. F WT,r,input is the frequency of wild-type virus before screening. F WT,r,output is the frequency of wild-type virus after passaging.
The relative fitness f m was defined as the average of 3 biological replicates (Eq 2). However, if relative fitness was missing in one replicate, we only average the other two replicates. The relative fitness value of all mutants was shown in S1 Table. f m ¼ where R is the number of biological replicates. Pairwise epistasis ε i,j between mutant i and mutant j was defined as: where f i,j refers to the relative fitness of double mutant i and j. Trajectory-based epistasis ε M,j between a multi-mutation genotype M and another genotype differ by one mutation j was defined as:

Potts model
Data used to infer parameters for the Potts model were downloaded from the Los Alamos National Laboratory HIV sequence database, as described in the main text. Sequences were processed as previously described [111]. Briefly, we first removed insertions relative to the HXB2 reference sequence. We also excluded sequences labeled as "problematic" in the database, and sequences with gaps or ambiguous amino acids present at >5% of residues were removed. Remaining ambiguous amino acids were imputed using simple mean imputation.
Each sequence in the multiple sequence alignment (MSA) is represented as a vector of vari-ablesÃ ¼ fA 1 ; A 2 ; . . . ; A N g, where N = 99 is the length of the sequence. Each of the A i represents a (set of) amino acid(s) present at residue i in the protein sequence. To choose the amino acids at each site that would be explicitly represented in the model, we first computed the frequency p � i ðAÞ of each amino acid A at each site i in the MSA. To compute these frequencies, we weighted the sequences such that the weight of all sequences from each unique patient was equal to one, thereby avoiding overcounting in cases where many sequences were isolated from a single individual. We then explicitly modeled the q i most frequently observed amino acids at each site that collectively capture at least 90% of the Shannon entropy of the distribution of amino acids at that site [111]. All remaining, rarely observed amino acids were grouped together into a single aggregate state. For these data, this choice resulted in an average of three explicitly modeled states at each site (minimum of 2, maximum of 6).
The Potts model is a probabilistic model for the 'compressed' sequencesÃ, where the probability of observing a sequenceÃ is Here the normalizing factor ensures that the probability distribution is normalized. We used ACE [91] to infer the set of Potts fields h i (A i ) and couplings J ij (A i , A j ) that result in average frequencies and correlations between amino acids in the model (5) that match the frequencies p � i ðA i Þ and correlations p � ij ðA i ; A j Þ observed in the data. We used a regularization strength of γ = 7 × 10 −5 in the inference, which is roughly equal to one divided by the number of unique patients from which the sequence data were obtained. We used "consensus gauge," where the fields and couplings for the most frequent residue at each site in the protein are set to zero. We confirmed that the parameters inferred by ACE resulted in a Potts model that accurately recovered the correlations present in the data.

Validation experiments
We constructed 7 single mutants by site-directed mutagenesis. The primers used this experiment are listed in S3 Table. We used overlap-extension PCR to amplify the fragment with mutated nucleotides. We ligated the fragment with NL4-3 backbone using ApaI and SbfI. We transformed competent E.coli and picked single colonies. We sequenced the protease region of plasmids to make sure there is only desired mutant in this region. 7 mutants were L10F, I47V, T74P, L76V, V82F, V82T, L90M.
We produced mutant viruses in 293T cells, mixed them with wild-type and infected CEM cells. The frequencies of mutant virus before and after infection were quantified by deep sequencing. We did 2 biological replicates with each validation method. For validation 1, we pairwisely mixed the mutant and wild-type virus oor competition. For validation 2, we mixed all 7 mutants and wild-type virus.

Protein stability prediction
Mutants stability was predicted using either FoldX or Rosetta. For FoldX, we used the protease structure (PDB: 3S85) as reference and repaired the structure using the RepairPDB function. The free energy of the mutants was computed by using the BuildModel function under default parameters. For Rosetta analysis, we used the protease crystal structure (PDB: 6DGX) as reference and score function ddg_monomer to evaluate the effect of mutations. Each mutants were evaluated 10 times and the average score was used as ΔΔG.

Ethics statement
Reagents were acquired from the NIH AIDS Reagent program. The work is approved by UCLA IRB.   Barbour