HCV Genotypes Are Differently Prone to the Development of Resistance to Linear and Macrocyclic Protease Inhibitors

Background Because of the extreme genetic variability of hepatitis C virus (HCV), we analyzed whether specific HCV-genotypes are differently prone to develop resistance to linear and macrocyclic protease-inhibitors (PIs). Methods The study includes 1568 NS3-protease sequences, isolated from PI-naive patients infected with HCV-genotypes 1a (N = 621), 1b (N = 474), 2 (N = 72), 3 (N = 268), 4 (N = 54), 5 (N = 6), and 6 (N = 73). Genetic-barrier was calculated as the sum of nucleotide-transitions (score = 1) and/or nucleotide-transversions (score = 2.5) required for drug-resistance-mutations emergence. Forty-three mutations associated with PIs-resistance were analyzed (36A/M/L/G-41R-43S/V-54A/S/V-55A-Q80K/R/L/H/G-109K-138T-155K/Q/T/I/M/S/G/L-156T/V/G/S-158I-168A/H/T/V/E/I/G/N/Y-170A/T-175L). Structural analyses on NS3-protease and on putative RNA-models were also performed. Results Overall, NS3-protease was moderately conserved, with 85/181 (47.0%) amino-acids showing <1% variability. The catalytic-triad (H57-D81-S139) and 6/13 resistance-associated positions (Q41-F43-R109-R155-A156-V158) were fully conserved (variability <1%). Structural-analysis highlighted that most of the NS3-residues involved in drug-stabilization were highly conserved, while 7 PI-resistance residues, together with selected residues located in proximity of the PI-binding pocket, were highly variable among HCV-genotypes. Four resistance-mutations (80K/G-36L-175L) were found as natural polymorphisms in selected genotypes (80K present in 41.6% HCV-1a, 100% of HCV-5 and 20.6% HCV-6; 80G present in 94.4% HCV-2; 36L present in 100% HCV-3-5 and >94% HCV-2-4; 175L present in 100% HCV-1a-3-5 and >97% HCV-2-4). Furthermore, HCV-3 specifically showed non-conservative polymorphisms (R123T-D168Q) at two drug-interacting positions.
Regardless of HCV-genotype, 13 PIs resistance-mutations were associated with low genetic-barrier, requiring only 1 nucleotide-substitution (41R-43S/V-54A-55A-80R-156V/T: score = 1; 54S-138T-156S/G-168E/H: score = 2.5). By contrast, by using HCV-1b as reference genotype, nucleotide-heterogeneity led to a lower genetic-barrier for the development of some drug-resistance-mutations in HCV-1a (36M-155G/I/K/M/S/T-170T), HCV-2 (36M-80K-155G/I/K/S/T-170T), HCV-3 (155G/I/K/M/S/T-170T), HCV-4-6 (155I/S/L), and HCV-5 (80G-155G/I/K/M/S/T). Conclusions The high degree of HCV genetic variability makes HCV-genotypes, and even subtypes, differently prone to the development of PIs resistance-mutations. Overall, this can account for different responsiveness of HCV-genotypes to PIs, with important clinical implications in tailoring individualized and appropriate regimens.


Introduction
Chronic hepatitis C virus (HCV) infection remains one of the most pressing health emergencies worldwide, with an estimated global prevalence of more than 170 million people [1]. Despite its devastating impact on cirrhosis and hepatocellular carcinoma, therapeutic options are still limited. Up to 2011, the standard-of-care treatment for HCV infection was combination therapy with peg-interferon and ribavirin [2]. Sustained virologic response (SVR) to this regimen was associated with improved liver histology, as well as with clinical benefits and reduced mortality [3,4]. However, nearly 50-60% of treated patients infected with the most prevalent genotypes HCV-1a and HCV-1b failed to achieve SVR [4][5][6][7].
The consequent need for innovative therapeutic strategies has led to the development of several specifically-targeted antiviral drugs, directed against essential HCV proteins [8]. Among these, two NS3-protease inhibitors (PIs), boceprevir and telaprevir, are now approved for clinical use [9] and several other PIs are in development or in clinical trials [10]. These first two PIs have been evaluated in early-phase clinical trials alone and in combination with peg-interferon and ribavirin, and appeared to be highly effective in achieving SVR [11][12][13][14][15][16][17].
Nevertheless, these encouraging data have been tempered by studies demonstrating both a differential sensitivity of HCV genotypes to PI-based therapy and an early selection of resistant variants.
Several factors can indeed affect the efficacy of anti-HCV treatment, compromising the achievement of SVR and strongly increasing the risk of drug-resistance development: the low fidelity and lack of proofreading activity of the RNA-polymerase, the high genetic variability of HCV (31%-33% nucleotide difference among the 6 known HCV-genotypes and 20%-25% among the nearly 100 HCV-subtypes), and its high replication rate (10^10-10^12 virions/day produced in an infected patient) [18][19][20].
The first PIs were developed on the basis of the HCV-1 NS3-protease structure and indeed showed reduced efficacy in clinical trials including other HCV-genotypes. For instance, the first PI BILN-2061 was found to be substantially less effective in individuals infected with HCV-2-3 [21][22][23]. Telaprevir also showed potent activity against HCV-1, less efficacy against HCV-2, and almost no efficacy against HCV-3-4-5 genotypes in vitro and in vivo [10,24,25,26]. Similarly, recent in vitro results showed marked differences in susceptibility of different genotypes also to macrocyclic inhibitors, such as danoprevir, vaniprevir and TMC435 [10,24,26]. On the contrary, within a small pilot study, boceprevir monotherapy (400 mg TID) recently resulted in a 1.37 and 1.7 log HCV-RNA reduction in HCV-2 and HCV-3 infected patients respectively, a decrease similar to that observed in HCV-1 subjects receiving the same monotherapy dose (M. Silva et al., presented at APASL 2011). Boceprevir also showed similar efficacy when tested in vitro against several isolates from HCV genotypes 2a, 3a, 5a, 6a, with less pronounced changes against HCV-3 than telaprevir or other macrocyclic PIs [26].
Differences were also observed at the level of HCV-subtypes. Indeed, during clinical trials, selection of resistant variants to first-generation PIs and viral breakthrough were observed consistently more frequently in patients infected with HCV-1a than HCV-1b [27][28][29], and drug-resistant variants emerged at frequencies of 5 to 20% of the total virus population as early as the second day after the beginning of treatment when either boceprevir or telaprevir was used as monotherapy [30].
While for HCV-1a and HCV-1b the different antiviral activity, viral-breakthrough and selection of resistant-variants to telaprevir, boceprevir or danoprevir (a new macrocyclic PI) have been associated with nucleotide-variability at position 155 (resulting in a lower genetic-barrier for the development of R155K resistance-mutation in HCV-1a) [10,20,36,37], the reason of a lower efficacy of PIs in HCV-2-3-4 is still largely unknown.
Considering these data, it is indeed conceivable that the genetic variability among HCV genotypes would have a great importance in HCV sensitivity to PIs, determining drug efficacy and even a different rate of selection of pre-existing resistant HCV variants [10,28,38,39]. However, the characterization of HCV genetic variability at NS3 positions critical for PIs drug-resistance is still missing, especially in non-1 HCV genotypes.
Therefore, the aim of this study was to define, at both nucleotide and amino acid level, the HCV-NS3 genetic variability among all HCV-genotypes and subtypes commonly spread worldwide, focusing on codons associated with development of resistance to both first- and second-generation (linear and macrocyclic) PIs. To ensure the quality of the data, public sequences were excluded from the analysis if they: a) contained stop-codons in the NS3 gene; b) contained ambiguities consisting of >2 bases per nucleotide position or >2 ambiguities per codon at individual drug-resistance associated positions.
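The exclusion criteria above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used in the study: the function and variable names are hypothetical, the sequence is assumed to be in frame starting at NS3 codon 1, ambiguities are assumed to be written as IUPAC codes, and both ambiguity criteria are assumed to apply only at the drug-resistance associated codons.

```python
# Hypothetical sketch of the sequence quality filter (not the authors' code).
STOP_CODONS = {"TAA", "TAG", "TGA"}
MULTI_BASE_CODES = set("BDHVN")        # IUPAC codes standing for >2 bases
AMBIGUITY_CODES = set("RYSWKMBDHVN")   # every IUPAC ambiguity code


def split_codons(seq):
    """Split an in-frame nucleotide sequence into complete codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def passes_quality_filter(seq, resistance_positions):
    """Return True if the sequence would be kept under the assumed criteria.

    resistance_positions: 1-based amino acid positions (assumption).
    """
    codons = split_codons(seq.upper().replace("U", "T"))
    # a) any stop codon in the gene excludes the sequence
    if any(codon in STOP_CODONS for codon in codons):
        return False
    # b) ambiguity checks at drug-resistance associated codons
    for pos in resistance_positions:
        if pos > len(codons):
            continue
        codon = codons[pos - 1]
        if any(base in MULTI_BASE_CODES for base in codon):
            return False  # an ambiguity covering more than 2 bases
        if sum(base in AMBIGUITY_CODES for base in codon) > 2:
            return False  # more than 2 ambiguous bases in the codon
    return True
```

With toy 3-codon sequences, `passes_quality_filter("GCGCRCATC", [2])` keeps the sequence (a single two-base ambiguity), whereas `passes_quality_filter("GCGCCNATC", [2])` rejects it.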

Protease Sequencing
Four in-house protocols for amplification and population sequencing of the NS3-protease gene for HCV-genotypes 1a-1b-2-3-4 were developed. The complete NS3-protease encoding region (181 aa) was amplified from stored serum samples by nested polymerase chain reaction (PCR) following a reverse-transcription reaction. HCV RNA was extracted using a standard commercial silica-gel membrane-binding method (QIAamp Viral-RNA Minikit; Qiagen, Valencia, CA), and the NS3-DNA-fragment was synthesized using the commercial SuperScript One-Step RT-PCR System (Invitrogen Corp, Carlsbad, CA). Primers used for amplification and sequencing are reported in Table S1. Two separate PCRs were performed for each sample. Reaction mixtures were prepared with 25 µl of RNA template, 8 µl of 5 mM Mg2+, 2 µl of DNase-RNase-free water, 0.75 µl of each primer at a concentration of 10 mM, 1 µl of RNase-out (40 U/µl), 1.5 µl of RT/Taq, 1 µl of dNTPs at a concentration of 10 mM for a total of 40 µl. Then, they were reverse-transcribed for 30 min at 45°C, denatured for 2 min at 94°C, and amplified by 40 cycles at 94°C for 30 seconds, 56°C for 30 seconds, and 68°C for 90 seconds. The amplified product was run on a 1% agarose gel. When the product was not visible, a nested PCR was performed. Five µl of amplified product was denatured at 94°C for 3 min and amplified with 35 cycles at 94°C for 30 sec, 54°C for 30 sec, and 68°C for 45 sec, using the following reaction mix: 5 µl of 10X Taq buffer, 4 µl of 25 mM Mg2+, 32.5 µl of DNase-RNase free water, 0.9 µl of each primer at a concentration of 10 mM, 1 µl of 10 mM dNTPs, 0.7 µl of Taq (5 U/µl) for a total of 45 µl. Positive samples were sequenced using the BigDye terminator v.3.1-cycle-sequencing-kit (Applied-Biosystems) and run on the automated-sequencer ABI-3100.

Conservation Analysis
Analyzing 1568 sequences, NS3-protease variability was first assessed by calculating the prevalence of the most common wild-type nucleotide at each position of the NS3 gene. Afterwards, the impact of nucleotide variability on the NS3 protein was determined by evaluating the prevalence of wild-type and mutated amino acids.
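As a minimal sketch of this kind of conservation analysis, the snippet below computes, for each column of a set of aligned sequences, the most common residue and the percentage of sequences that differ from it. It assumes pre-aligned, equal-length sequences; the function names and the toy data are hypothetical and are not taken from the study. The same logic applies to nucleotide or amino acid alignments.

```python
from collections import Counter


def position_variability(aligned_seqs):
    """Per alignment column: (most common residue, % of sequences differing)."""
    n = len(aligned_seqs)
    result = []
    for column in zip(*aligned_seqs):
        residue, count = Counter(column).most_common(1)[0]
        result.append((residue, 100.0 * (n - count) / n))
    return result


def conserved_positions(aligned_seqs, threshold=1.0):
    """1-based positions whose variability is below the threshold (in %)."""
    return [i + 1
            for i, (_, var) in enumerate(position_variability(aligned_seqs))
            if var < threshold]


# Toy alignment of four 4-residue sequences (illustrative only)
toy = ["APIT", "APIT", "APVT", "ASIT"]
print(position_variability(toy))
print(conserved_positions(toy))
```

A position is counted as conserved here when its variability is below 1%, mirroring the threshold quoted in the abstract.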

Structural Analysis
In order to visualize the distribution of conserved/variable NS3 residues, Protein Data Bank X-ray structures 3P8N and 2OC8 (available from http://www.rcsb.org/pdb) have been considered as 3D models of HCV-1b [42] and HCV-1a [43] NS3-protease respectively, and graphically inspected by PyMOL (The PyMol-Molecular-Graphics-System, ver.-1.3, Schrödinger, LLC). The crystallographic structures were selected considering the resolution of the models (3P8N 1.90 Å; 2OC8 2.66 Å) and excluding those crystals that showed a large number of deletions and mutations compared to the reference sequences.
The evaluation of boceprevir-protease interactions has been performed with Maestro-GUI (Maestro-Graphics-User-Interface, ver.-9.8, Schrödinger, LLC). To highlight the most relevant residues for boceprevir target recognition, the new computational approach GRID-Based-Pharmacophore-Model (GBPM) has been applied. This method, useful for designing pharmacophore models starting from detailed macromolecular structures, has been described in a recent publication [44]. In particular, it was developed with the aim of generating pharmacophore models useful for QSAR and virtual screening experiments by means of an unbiased computational protocol. The GRID-based pharmacophore model is created in a 6-step procedure. The first step performs the PDB file pre-treatment, producing three different model structures: the complex (subunits a+b), the receptor (subunit a) and the ligand (subunit b). The second step calculates the GRID molecular interaction fields (MIF) with a given probe on the three targets reported above. In the third step an energy comparison of the MIFs is performed by the GRID GRAB [45] utility, generating maps with focused information on the interaction areas. The fourth step identifies the most relevant interaction points. To obtain a suitable model, these operations should be repeated using at least three different probes: a generic hydrophobic (DRY), a hydrogen bond acceptor (O) and a hydrogen bond donor (N1). In the fifth step the information obtained from the different probes is unified into a preliminary pharmacophore model. We carried out the GBPM analysis up to the fifth step of the procedure, in order to highlight the residues most involved in the recognition areas.
In the GRID [45] calculations the lone pairs, the tautomeric hydrogen atoms and torsion angles, relative to the sp3 oxygen atoms and the amide atoms, have been allowed to be settled on the basis of the probe influence, while the coordinates of all the other atoms have been considered rigid (directive MOVE = 0). Default values have been used for the other parameters.
The component interaction analysis was performed starting from the experimental HCV protease wild-type complex (PDB 2OC8) in the following conditions: a) OPLS2005 as force field; b) GB/SA water implicit solvation model; c) dielectric constant equal to 1; d) a binding pocket defined considering protease residues within 12Å from boceprevir (Maestro Graphics User Interface, ver. 9.8, Schrödinger, LLC).
Because the obtained global energy minimum GRID points (Emin) were ranked in a wide range of values, graphical analysis of the GRID maps was carried out by considering, for each probe, an energy threshold (Ecut) equal to 60% of the protease-boceprevir complex Emin, as previously reported [46].

Putative Secondary RNA Structure
A full-length HCV-1b genome obtained from GenBank (accession number: AJ000009) was used for RNA secondary structure prediction by using the Mfold program at 37°C, available at the UNAFold server (http://mfold.rna.albany.edu) [47]. This algorithm, based on the thermodynamics of RNA structure motifs, including base-paired intramolecular stems and unpaired loops, identifies putative optimal minimum free energy structures. RNA structure models and free energy values were individually predicted using the original viral HCV-1b genome AJ000009, with and without the introduction in the NS3-protease coding region of specific resistance mutations at position 156: A156S, A156T, A156G, and A156V.
Secondary RNA structures were individually predicted by using also CONTRAfold-software, analyzing the NS3-fragment covering amino acid from 135 to 181. This software uses probabilistic parameters learned from a set of RNA secondary structures to predict base-pair probabilities and structures using the maximum expected accuracy approach [48,49].

Genetic Barrier Calculation
The genetic barrier for the evolution of any drug-resistance mutation was calculated according to a model previously described for HIV-1 and HBV [50,51], considering not only the number of nucleotide substitutions, but also their nature (i.e. transitions vs. transversions). In summary, a score of 1 was assigned to transitions (A↔G and C↔T) and a score of 2.5 to transversions (A↔C, A↔T, G↔C, G↔T), since transitions have been shown to occur, for steric reasons, on average 2.5 times more frequently than transversions [50,52]. An algorithm was built using JavaScript to calculate the genetic barrier at each individual NS3 position. The pipeline used to estimate the genetic barrier for drug-resistance development has been described elsewhere [51]. Briefly, due to the degeneracy of the genetic code, each NS3 amino acid associated with drug-resistance can be encoded by more than one nucleotide codon. Therefore, starting from the wild-type codon detected in drug-naïve patients, we calculated a numerical score by summing the number of nucleotide transitions and/or transversions required to generate a specific RAM. As a result, we obtained different scores for each pathway of nucleotide substitutions required to generate a specific RAM. The minimal genetic barrier score for each drug resistance mutation analyzed was considered.
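The scoring scheme can be illustrated with a short sketch. It is written in Python rather than the JavaScript used in the study, it is not the authors' algorithm, and all function names are hypothetical. It uses the standard genetic code and takes the minimum score over every codon encoding the target amino acid. The wild-type codons in the usage lines are illustrative assumptions, since the codon tables of the study are not reproduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}


def substitution_score(a, b):
    """0 if identical, 1 for a transition, 2.5 for a transversion."""
    if a == b:
        return 0.0
    same_class = (a in PURINES) == (b in PURINES)
    return 1.0 if same_class else 2.5


def codon_score(wt_codon, mutant_codon):
    """Sum of per-position substitution scores between two codons."""
    return sum(substitution_score(a, b) for a, b in zip(wt_codon, mutant_codon))


def genetic_barrier(wt_codon, target_aa):
    """Minimal score over all codons that encode the target amino acid."""
    wt = wt_codon.upper().replace("T", "U")
    return min(codon_score(wt, codon)
               for codon, aa in CODON_TABLE.items() if aa == target_aa)


# Assumed wild-type codons, for illustration only
print(genetic_barrier("CAA", "R"))  # Q -> R: one transition
print(genetic_barrier("CAA", "K"))  # Q -> K: one transversion
print(genetic_barrier("GGA", "K"))  # G -> K: two transitions
```

Under these assumed codons the sketch gives 1, 2.5 and 2 respectively, which is consistent with the scores reported in the text for the Q80R, Q80K and G80K pathways.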
Only amino acid mutations with prevalence >1% and found in >2 patients were considered.
In the majority of cases, conserved amino acids clustered into small regions, comprising 2 to 8 consecutive residues. For instance, we observed two highly conserved stretches encompassing NS3 positions 135-142 and 154-159 (Fig. 1). As expected, the catalytic-triad was highly conserved: residues D81 and S139 showed no amino acid variability, while the residue H57 was 100% conserved within HCV-1b strains (data not shown), and presented <1% variability among other HCV-genotypes. The catalytic oxyanion hole at G137 and the 4 residues involved in Zn2+ binding (C97-C99-C145-H149) were also highly conserved, showing <1% amino acid variability among all sequences analyzed (Fig. 1).
The comparison of consensus sequences between subtype 1b (used as reference in the present study) and HCV-1a showed that 17 residues out of 181 were different among the HCV-1 subtypes, with some mutations found exclusively in HCV-1a and in none of the other HCV genotypes (such as at positions 35-40-66-89) (Fig. 1).
Also several positions associated with enhanced replication or compensatory effect if mutated (72-86-89-153-176) [53] were found to be highly variable. In particular, positions 72, 86, 89 and 176 had an amino acid variability >10%, with evidence also of differences in wild-type amino acid usage, while position 162 was highly conserved.

Structural Insight of NS3 Protease
To better characterize the effect of HCV variability on the structure of NS3 protease and specifically on the PI binding-site, a NS3 protease-boceprevir contact analysis was carried out on an available HCV-1a NS3 protease-boceprevir complex model (PDB 2OC8). Several amino acid residues, essential for boceprevir- and substrate-binding, were identified by structural and GRID-Based-Pharmacophore-Model (GBPM) approaches. In particular, the inhibitor was found to establish 3 hydrogen bonds with A157, single hydrogen bonds with residues Q41, G137, S139 and R155, and also numerous (>10) non-bonded contacts with several residues (H57-I132-L135-K136-G137-S139-F154-R155-A156-A157-V158) (Table 1 and Fig. S1). In addition, the protease residues H57, I132, S139, A156 and A157 were well identified at the energy minimum threshold (data not shown), emphasizing their key role in enzyme catalytic activity and stabilization [54].
All together, these structural analyses highlighted the presence of some genotype-specific polymorphisms at positions close to the NS3-protease catalytic site, but also underlined the existence of many highly conserved residues involved in the catalytic functionality of the enzyme, which are thus excellent targets for a focused pharmacophoric design.

NS3 Genetic Variability and RNA Secondary-structure
By using CONTRAfold-software, a first analysis of all nucleotide-sequences of the NS3-protease coding-region showed the formation of a very complex RNA secondary-structure, almost exclusively organized in highly stable paired-stems. Analyzing the structure in more detail, by using the Mfold-software and one entire HCV-1b genome sequence, we noticed that base-paired RNA stretches were often composed of highly conserved codon pairs in the protease coding region (Fig. 3). For instance, the highly conserved codons for catalytic residues D81 and S139 were base-paired with the conserved S208-L209 and R196 NS2-residues, respectively (Fig. 3). In contrast, the codon for the catalytic residue H57, which presented a high synonymous nucleotide variability (>50%), was only partially base-paired in our RNA model, facing the highly variable codon for P67 (data not shown).
Within the NS3 coding region, a very stable stem-loop, including codons for amino acid positions 145 to 163, was observed (Fig. 3). Highly conserved codons (amino acid variability ≤1%) were base-paired within this loop, while codons with higher variability among HCV-genotypes, such as those for amino acids at positions 150-151-155, were not base-paired. Interestingly, the codon for residue A156 (gcu), associated with resistance to all linear and some macrocyclic PIs (including the second generation MK-5172), was found to be base-paired with the highly conserved codon for I153 (auc). When the RAMs 156S (ucu codon), 156T (acu codon), 156V (guu codon), or 156G (ggu codon) were introduced in the NS3-protease coding region, our model of RNA stem-loop conformation was not perturbed (Fig. 4). Indeed, all simulation models were associated with similar decreases of free energy (ΔG). The A156S development was associated with the largest decrease of ΔG, from -3693.99 kcal/mol of the wild-type model to -3698.29 kcal/mol of the A156S harboring model, followed by A156G (ΔG = -3697.99 kcal/mol), A156T (ΔG = -3694.19 kcal/mol), and A156V (ΔG = -3694.09 kcal/mol). This small decrease of ΔG values indicates a persistence of structural stability and also suggests that mutations 156S/T/V/G, if occurring during virological failure to PIs, should not drastically alter the HCV RNA secondary structure. These structural and energetic results were confirmed using CONTRAfold software (data not shown).
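To make the size of these free-energy changes explicit, the following sketch computes the difference between each mutant model and the wild-type model from the values quoted above. It assumes the wild-type value is negative, as the reported decrease implies; the variable and function names are hypothetical.

```python
# Predicted free energies (kcal/mol) quoted in the text; the wild-type
# sign is an assumption consistent with the reported "decrease".
WILD_TYPE_DG = -3693.99
MUTANT_DG = {
    "A156S": -3698.29,
    "A156G": -3697.99,
    "A156T": -3694.19,
    "A156V": -3694.09,
}


def delta_delta_g(mutant_dg, reference_dg=WILD_TYPE_DG):
    """Free-energy change of a mutant model relative to the reference."""
    return round(mutant_dg - reference_dg, 2)


differences = {name: delta_delta_g(dg) for name, dg in MUTANT_DG.items()}

# List mutants from the largest to the smallest decrease
for name, diff in sorted(differences.items(), key=lambda kv: kv[1]):
    relative = 100 * abs(diff) / abs(WILD_TYPE_DG)
    print(f"{name}: {diff:+.2f} kcal/mol ({relative:.3f}% of wild type)")
```

The differences range from -0.10 kcal/mol (A156V) to -4.30 kcal/mol (A156S), that is, roughly 0.1% of the wild-type value at most, which is why the predicted structure is described as essentially unperturbed.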

Genetic Barrier for PIs Resistance
The genetic barrier for the development of RAMs was explored on the whole data set of 1568 NS3-protease sequences. Starting from each wild-type codon detected in the dataset of sequences obtained from PI-naïve patients, we calculated a numerical score by summing the number of nucleotide transitions and/or transversions required to generate a specific RAM. As a result, we obtained different scores for each pathway of nucleotide substitutions required to generate a specific RAM. The minimal genetic barrier score for each drug resistance mutation analyzed was considered.
Regardless of HCV genotype, major RAMs 55A (for boceprevir), 54A/S (for boceprevir or telaprevir), 80R (for TMC435 or asunaprevir), 156T/V (for all linear and several macrocyclic PIs, including MK5172 second generation) and 168E/H (for all first generation macrocyclic PIs) needed only one nucleotide substitution (in the majority of cases, a transition, score = 1) to be generated and were thus associated with the lowest values of genetic barrier (Fig. 5 panel A). Accordingly, this may justify their very rapid selection under PI-treatment. Also several secondary RAMs had a low genetic barrier to development in all HCV genotypes, requiring only 1 transition (such as danoprevir mutations 41R and 43S/V: score = 1) or 1 transversion (138T and 156G/S: score = 2.5).
In contrast, nucleotide and/or amino acid variability among HCV-genotypes affected the calculated genetic barrier for the development of other major and minor RAMs (such as at positions 36, 80, 109, 155, 168, 170, 175). A detailed analysis of codon variability for major and minor RAMs is reported in Table 2 and in Table 3, respectively.
For instance, the NS3-residue R155, critical for both linear and macrocyclic PIs resistance, showed high degree of nucleotide variability (Table 2), leading to a different genetic barrier for the development of all RAMs at this position (Fig. 5 panel B). Notably, the calculated genetic barrier for the major R155T mutation, associated with high-level of resistance to linear PIs, was found to be lower in HCV-1a-2-3-5 genotypes (score = 2.5), in comparison to HCV-1b-4-6 (score = 5). Similarly, the development of R155K mutation, associated with high resistance to linear and macrocyclic PIs, required only 1 transition for its potential development in HCV-1a-2-3-5 genotypes (score = 1), in comparison to 1 transversion or more substitutions required in HCV-1b-4-6 genotypes (scores = 2.5-3.5-6). Also some minor RAMs (R155G/I/M) showed a lower genetic barrier for their development in HCV-1a-2-3-4-5-6 genotypes in comparison to HCV-1b. Differently, the potential development of the major RAM R155Q, associated with resistance to linear PIs and danoprevir, required only one transition in HCV-1b (score = 1) in comparison to other genotypes, such as HCV-1a-2-3-5, where 2 substitutions were required (scores = 3.5) ( Table 2 and Fig. 5, panel B).
According to the different wild-type codon usage at position 36, the calculated genetic barrier for the development of RAMs at this position varied among HCV-genotypes (Table 2, Fig. 5 panel C). For instance, the potential development of 36M (known to compensate impaired viral fitness of R155K mutation) had a reduced genetic barrier in HCV-1a (score = 1), but also in HCV-2-4 (score = 2.5), in comparison to HCV-1b (score = 3.5), HCV-3 (score = 5), and HCV-5-6 (score = 3.5).
Also the two positions associated with major resistance to macrocyclic PIs (80 and 168) were highly variable among genotypes (Fig. 1 and Fig. 5, panel D). In particular, at position 80, genotypes HCV-1a-1b-3-4-6 showed a Q as wild-type amino acid, and consequently the development of major Q80K and Q80R RAMs would potentially require just a single nucleotide substitution (score = 2.5 and 1, respectively). On the contrary, HCV-2 harbored the minor RAM 80G as wild-type, and presented a calculated genetic barrier score of 1 for the development of 80R and of 2 for the development of 80K (two transitions). Lastly, HCV-5 already had the major 80K as wild-type, and had a genetic barrier for major 80R development scored as 1.
Taken together, these results indicate that the high level of variability in codon usage among HCV-genotypes can favor genotype-specific pathways for resistance-mutations development. This can result in different responsiveness of HCV-genotypes to PIs and very rapid selection for specific resistance patterns for both linear and macrocyclic PIs.

Discussion
Analyzing more than 1500 HCV NS3-protease sequences, a high degree of genetic variability among all HCV-genotypes was found in PI-naïve HCV-infected patients, with only 85/181 (47.0%) conserved amino acids. This genetic heterogeneity among genotypes translated into significant molecular and structural differences, making HCV-genotypes, and even subtypes, differently sensitive to PIs treatment and differently prone to the development of PI resistance-mutations, for both linear and macrocyclic compounds. Indeed, the linear PI telaprevir showed less efficacy against HCV-2, and almost no efficacy against HCV-3-4-5 genotypes in vitro and in vivo [10,24,25,26], and similar results were also obtained for macrocyclic inhibitors, such as danoprevir, vaniprevir and TMC435 [10,24,26].
As a first consequence of HCV sequence heterogeneity, we observed that four resistance-mutations (80K/G and 36L-175L) were already present, as natural polymorphisms, in selected genotypes. In particular, the major RAM 80K (for macrocyclic compounds TMC435 and Asunaprevir) was detected in 41.6% of HCV-1a, in 100% of HCV-5 and in 20.6% of HCV-6 sequences. Secondly, a different codon usage among genotypes led to a different genetic-barrier for the development of some major and minor RAMs at positions 36-80-109-155-168-170.
Notably, among all HCV-genotypes, the more difficult-to-treat HCV-3 presented several polymorphisms at positions close to the PI-binding site (42-45-123-132-133-134-168-170), which might be related to the low antiviral efficacy of several PIs observed in vivo and in vitro against this genotype [10,24,26,33]. In particular, different wild-type amino acids at positions 123 and 168 resulted in non-conservative changes of charge. In co-crystallized structures of PIs and HCV-1 NS3-protease, the negatively charged D168 forms strong salt bridges with positively-charged residues R123 and R155 [20]. It has been proposed that mutations at either position 155 or 168 could disrupt this salt bridge and affect the interaction with PIs, potentially leading to drug-resistance [20]. The substitution of the D168 residue in HCV-3 with the polar uncharged Q168, and the replacement of R123 with the polar T123, can thus abrogate these key structural salt bridges, potentially altering the active site conformation of NS3 protease, and in turn impact the HCV-3 sensitivity to PIs.
Furthermore, HCV-3, together with HCV-2-4-5 genotypes, also presented two minor RAMs as natural polymorphisms (36L and 175L), known to confer low-level resistance to boceprevir and/or telaprevir in vitro [23,25]. Interestingly, both residues 36 and 175 are located near the protease catalytic domain of HCV NS3, but not close to the boceprevir and telaprevir binding sites in their respective complexes with HCV NS3-NS4 protease (Fig. 2) [20]. Even if mutations at positions 36 and 175 are probably not directly involved in resistance to PIs, they can influence the viral replication capacity. For instance, viruses with mutations V36A/L/M (as well as with other PI-resistance mutations such as R109K and D168E) demonstrated a comparable fitness to wild type reference virus [55]. However, since no crystallized structures are to date available for non-1 HCV proteases, the overall impact of such polymorphisms on the three-dimensional protein structure (and functionality) will need further investigation.
It is important to mention that very recent data demonstrated a pan-genotypic activity of the second generation macrocyclic PI MK-5172, even against HCV-3 genotype (Barnard R, presented at International Congress of Viral Hepatitis 2012, abstract no. 79340). Furthermore, MK-5172 retained activity also against HCV-1 viral strains harbouring key first generation PI RAMs, thus providing a great opportunity for patients infected with all different HCV-genotypes, including those without virological response to previous regimens.
For instance, HCV-1a and HCV-1b consensus sequences showed different wild-type amino acids at 17/181 (9.4%) NS3-protease positions, including some (i.e. 72-80-89-175) associated with resistance, enhanced replication or compensatory effects if mutated [35,53]. This amino acid variability (together with the nucleotide one) may potentially facilitate viral breakthrough and selection of specific resistant variants, which have indeed been observed consistently more frequently in patients infected with HCV-1a than HCV-1b, using both linear and macrocyclic PIs [27,29].
Figure 5. Calculated genetic-barrier for resistance-mutations. Mutations reported in panel (A) are those for which the calculated genetic-barrier was not affected by inter-genotype variability. Histograms in the panels below represent the calculated genetic-barrier score for RAMs at positions 155 (B), 36 (C), and 80 (D). The score was calculated by summing the number of transitions (score = 1) and transversions (score = 2.5) required for the generation of any degenerated codon associated with drug-resistance, starting from the predominant wild-type codon found in each HCV-genotype. doi:10.1371/journal.pone.0039652.g005
Table 2. Codon variability at HCV NS3 positions associated with major drug resistance to PIs and its impact on the genetic barrier to drug resistance development in HCV-genotypes 1-6. WT, wild-type. doi:10.1371/journal.pone.0039652.t002
Table 3. Codon variability at HCV NS3 positions associated with minor drug resistance to PIs and its impact on the genetic barrier to drug resistance development in HCV-genotypes 1-6. The wild-type amino-acid of HCV genotype 1b at each position associated with drug resistance is shown.
On the other hand, according to our GBPM structural analysis, highly conserved NS3-protease positions among all HCV genotypes were those pivotal for enzyme functionality and stability, such as the catalytic-triad (H57-D81-S139), the oxyanion hole at G137 and the residues involved in Zn 2+ binding (C97-C99-C145-H149), and also comprised the majority of residues essential for boceprevir-binding (Q41-F43-L44-H57-L135-K136-G137-S138-S139-F154-R155-A156-A157-V158-C159) [36,42]. Interestingly, we also observed two highly conserved stretches encompassing NS3 positions 135-142 and 154-159 that could assist in the rational design of new HCV inhibitors with more favourable resistance profiles.
A correlation between conserved NS3 amino acid residues and base-paired organization on the putative RNA secondary structure was also observed. Indeed, highly conserved positions at both amino acid and nucleotide levels were located in highly stable RNA paired stems. Probably, the requirement for base-pairing in these structures severely limits the number of "neutral" sites in the genome, constraining neutral HCV drift, since even synonymous mutations could potentially affect and disrupt the RNA-folding [56].
Interestingly, in our predicted RNA structure model, the conserved codon for resistance-associated residue A156 [10] was base-paired with the conserved codon for residue I153. The presence of RAMs at this position (156S/T/G/V), associated with resistance to all linear and some macrocyclic PIs (including the second generation MK-5172) [29,33,57], did not perturb the overall RNA structural conformation and was associated with a delta free-energy decrease similar to that observed in the wild-type model, suggesting that the selection of such RAMs might confer phenotypic drug-resistance without altering the stability of the RNA secondary structure.
Another specific aim of the study was to compare the genetic barrier for the evolution of PI resistance among all HCV genotypes. The genetic barrier was calculated by considering not only the number of nucleotide substitutions but also their nature (i.e. transitions, score = 1 vs. transversions, score = 2.5), according to recently published papers [50-52].
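The scoring scheme described above can be illustrated with a minimal sketch. It assumes that the score for a mutation is the lowest sum of per-position costs over all codons encoding the resistant amino acid; this minimisation, the function names, and the example codons are illustrative assumptions, not the software or data used in this study.

```python
from itertools import product

# Standard genetic code, built in the conventional T-C-A-G ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

TRANSITION_SCORE = 1.0     # purine<->purine or pyrimidine<->pyrimidine
TRANSVERSION_SCORE = 2.5   # purine<->pyrimidine


def substitution_score(a: str, b: str) -> float:
    """Cost of changing nucleotide a into nucleotide b."""
    if a == b:
        return 0.0
    same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
    return TRANSITION_SCORE if same_class else TRANSVERSION_SCORE


def genetic_barrier(wt_codon: str, resistant_aa: str) -> float:
    """Lowest summed score from wt_codon to any codon encoding resistant_aa.

    Assumption: the minimum over all synonymous target codons is taken.
    """
    wt = wt_codon.upper().replace("U", "T")
    targets = [c for c, aa in CODON_TABLE.items() if aa == resistant_aa]
    return min(
        sum(substitution_score(x, y) for x, y in zip(wt, target))
        for target in targets
    )


if __name__ == "__main__":
    # Illustrative wild-type codons (assumed, not taken from Table 2 or 3).
    print(genetic_barrier("AGG", "K"))  # Arg -> Lys, one transition: 1.0
    print(genetic_barrier("CGG", "K"))  # Arg -> Lys, two changes: 3.5
    print(genetic_barrier("ACT", "S"))  # Thr -> Ser, one transversion: 2.5
```

With these assumed codons, an arginine encoded by AGG reaches lysine with a single transition, whereas one encoded by CGG needs a transversion plus a transition, which is how codon usage alone can change the barrier to the same amino acid substitution.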
Altogether, these results help to explain experimental and clinical observations, indicating that mutations appearing rapidly and frequently in PI-treated patients are actually those with a lower genetic barrier in the specific genotype/subtype considered. Indeed, in patients failing telaprevir and/or boceprevir, the most common resistance mutations detected in HCV-1a infected patients were V36M, T54S, and R155K (all score = 1), whereas mutations T54A/S, V55A, A156S, and V170A (all score = 1 or 2.5) were specifically developed in HCV-1b patients [27,29,39] (Barnard R. et al., presented at AASLD 2011).
Furthermore, the genetic barrier is classically calculated with reference to the most prevalent wild-type codon found in each genotype. Nevertheless, as Table 2 and Table 3 clearly show, codon usage is highly variable even within single genotypes. For instance, we found 41.6% of HCV-1a sequences harboring the RAM 80K, and 4% of HCV-1b sequences with a reduced genetic barrier (score = 1) to develop R155K, suggesting that individual isolates may also respond differently to treatment and develop specific PI resistance mutations. In this regard, it is important to mention that natural HCV resistance has been described in a few reports [53,58-61], with a rare (<1%) natural presence of 155K found by population sequencing, exclusively in patients infected with HCV-1a [53,58-61].
In conclusion, the high degree of HCV genetic variability makes HCV-genotypes, and even subtypes, differently responsive to PIs and differently prone to the development of RAMs to linear and macrocyclic PIs. Drawing also on the experience with anti-HIV treatment, HCV genotypic resistance testing will thus provide clinicians with important information for the management of HCV infection and for the individual tailoring of antiviral therapy. In this direction, a better knowledge of the extent of genetic variability among genotypes could assist the identification of the RAMs most likely to develop in a particular setting, highlighting patients with a higher risk of failure.
Figure S1 2D representation of boceprevir interactions in the HCV-1 NS3-protease binding pocket within 5Å (PDB 2OC8). Hydrogen bonds are reported as light magenta lines. Grey, green, cyan, pink and violet areas are related, respectively, to non-polar uncharged, hydrophobic, polar uncharged, polar negatively charged and polar positively charged protease residues. (DOC)
Table S1 Primers for amplification and sequencing of NS3 protease of HCV-genotypes 1-2-3-4. (DOC)