Naturally Occurring Mutations in the Nonstructural Region 5B of Hepatitis C Virus (HCV) from Treatment-Naïve Korean Patients Chronically Infected with HCV Genotype 1b

The nonstructural 5B (NS5B) protein of the hepatitis C virus (HCV), which has RNA-dependent RNA polymerase (RdRp) activity, plays a pivotal role in viral replication. Monitoring its naturally occurring mutations is therefore important for the development of antiviral therapies and vaccines. In the present study, mutations in a partial NS5B gene (492 bp) were determined from 166 quasispecies of 15 treatment-naïve Korean patients chronically infected with genotype (GT) 1b, and mutation patterns and frequencies were evaluated with a focus on the T cell epitope regions. The mutation frequency within the CD8+ T cell epitopes was significantly higher than that outside the CD8+ T cell epitopes. Of note, the mutation frequency within the predicted CD4+ T cell epitopes, a particular mutational hotspot in Korean patients, was significantly higher than it was in patients from other areas, suggesting distinctive CD4+ T cell-mediated immune pressure against HCV infection in the Korean population. The mutation frequency in the NS5B region was higher in patients at the carrier stage than in those with progressive liver disease (chronic hepatitis, liver cirrhosis and hepatocellular carcinoma). Furthermore, the mutation frequency in four codons (Q309, A333, V338 and Q355) known to be related to the sustained virological response (SVR) and end-of-treatment response (ETR) was also significantly higher in Korean patients than in patients from other areas. In conclusion, a high mutation frequency in the HCV GT-1b NS5B region, particularly in the predicted CD4+ T cell epitopes, was found in Korean patients, suggesting the presence of distinctive CD4+ T cell pressure in the Korean population. This provides a likely explanation for the relatively high SVR rates observed after combined therapy with pegylated interferon (PEG-IFN) and ribavirin (RBV) in Korean patients chronically infected with GT-1b.


Introduction
According to the WHO, 3% of the global population is infected with the hepatitis C virus (HCV), with 3-4 million people newly infected each year [1][2][3][4]. Most HCV infections persist, with up to 80% of all cases leading to chronic hepatitis associated with liver fibrosis, liver cirrhosis (LC) and hepatocellular carcinoma (HCC) [5][6][7]. A combinatorial treatment with pegylated interferon (PEG-IFN) and ribavirin (RBV) provides good clinical efficacy in patients infected with genotypes (GTs) 2 and 3 but is less efficacious in patients infected with the most prevalent GT-1b, thereby emphasizing the urgent need for more effective specifically targeted antiviral therapies for GT-1b [8][9][10][11].
The HCV RNA-dependent RNA polymerase (RdRp) is an essential enzyme that lacks proofreading activity, thus leading to a population of distinctive but closely related viral variants, termed viral quasispecies, within an infected individual [12][13][14]. Monitoring the diversity of HCV quasispecies is important for the prediction of liver disease progression as well as HCV treatment outcomes [15][16][17][18][19]. Currently, studies of HCV quasispecies mainly focus on structural genomic regions; therefore, relatively limited data are available regarding nonstructural regions. Recently, variations in the nonstructural 5B (NS5B) protein, particularly in specific codons, were reported to be positively related to a sustained virological response (SVR) and end-of-treatment response (ETR) in patients infected with GT-1b [15,16].
It was also reported that the SVR rate in patients with HCV GT-1b treated with PEG-IFN plus RBV is higher in Asian patients than in Caucasians [10,20]. In particular, previous studies have shown that SVR rates in Korean patients infected with GT-1b range from 56% to 62% [21,22]. Recently, two SNPs of the IL28B gene, rs12979860 and rs8099917, which show the strongest association with treatment response, have been reported at a higher frequency in Korean patients with HCV GT-1b than in other ethnic groups [23,24]. Although prior investigations can partly explain the high SVR rates in Korean patients, other mechanisms may also contribute to this effect. In the present study, to address this issue, we investigated via quasispecies analysis the mutation frequencies and patterns in the partial NS5B from Korean patients infected with HCV GT-1b, as these are known to be related to the SVR rates.

Patients and HCV RNA Extraction
Serum samples were collected from a total of 73 treatment-naïve HCV-positive patients who visited Seoul National University Hospital in 2003. The clinical statuses of the HCV-positive patients were defined as carrier (C), chronic hepatitis (CH), LC or HCC. General definitions of the C and chronic liver disease types are as follows: the diagnosis of C can be made in the presence of positive anti-HCV antibodies, positive HCV RNA by RT-PCR, and normal alanine aminotransferase (ALT) levels (<40 IU/L, assay dependent) in at least three tests carried out at least two months apart over a period of six months [25,26]; CH was defined as an elevation of or fluctuation in serum ALT levels over 6 months without evidence of any other chronic liver disease [27]; LC was diagnosed through evidence of clinically relevant portal hypertension (esophageal varices and/or ascites, splenomegaly with a platelet count of 100,000/mm³) [28], ultrasonographic imaging features suggestive of liver cirrhosis [29], and a histological diagnosis with one of the following features: nodular regeneration, fragmentation of the biopsy with fibrosis at the margins and a wide postnecrotic collapse with an abnormal relationship between portal tracts and central veins, and evidence of active liver-cell hyperplasia [30]. Finally, HCC in cirrhotic patients was diagnosed either through radiological criteria (focal lesion >2 cm with arterial hypervascularization according to two coincident imaging techniques) or through combined criteria (focal lesion >2 cm with arterial hypervascularization according to one imaging technique associated with AFP levels >400 ng/ml) [31]. HCV RNA was purified using the Viral Gene-Spin Viral DNA/RNA Kit (iNtRON Biotechnology Inc., Seongnam, Korea) according to the manufacturer's guidelines. This work was approved by the institutional review board of Seoul National University Hospital (IRB No. C-1304-032-479).
The experiment was mainly based on the viral RNA extracted from isolates; therefore, the research was done without informed consent and a waiver of informed consent was agreed upon by the IRB.
Quantitative PCR (qPCR) and cDNA synthesis
A qPCR method was used to analyze viral RNA with an ABI7500 system (Perkin-Elmer Applied Biosystems, Warrington, UK). The primers were designed to amplify the NS2 region and the sequences were as follows: sense primer HCVF (5'-CGA CCA GTA CCA CCA TCC TT-3') and antisense primer HCVR (5'-AGC ACC TTA CCC AGG CCT AT-3'). For the detection of HCV RNA, the SensiFAST SYBR Lo-ROX kit (Bioline, Taunton, MA, USA) was used according to the manufacturer's instructions. Absolute quantification of the extracted HCV RNA was based on an HCV RNA standard, with a lower limit of detection of 1,350 copies/ml (500 IU/ml) established in earlier research (data not shown) [32,33]. Viral cDNA synthesis for reverse-transcriptase (RT) PCR was done using the Maxime RT PreMix kit (iNtRON Biotechnology Inc., Seongnam, Korea) according to the manufacturer's protocol.

Cloning and Sequencing Analysis
The PCR products of GT-1b were cloned using the TOPO TA Cloning kit (Invitrogen Corporation, Carlsbad, CA, USA). The NS5B regions were sequenced using the M13 primer. For each subject, 10 to 12 subclones were sequenced [35,36]. Sequencing was conducted using the Applied Biosystems model 377 DNA automatic sequencer (Perkin-Elmer Applied Biosystems, Warrington, UK). If there were sequence variations between the clones of a sample, the dominant sequence at each position was taken as the major sequence. Nucleotides were aligned and their similarities were calculated using the multiple-alignment algorithm in Megalign (DNASTAR, Windows Version 3.12e). A mutation in this study was defined as a sequence different from the consensus sequence of 20 GT-1b reference strains obtained from the LANL HCV database (http://hcv.lanl.gov) [accession numbers AB442219, AB691953, AF165047, D11168, D13558, D16435, D50485, D85516, D90208, EU256084, EU482859, FJ478453, HQ110091, HQ912958, J238799, L02836, M58335, M96362, S62220 and X61596] [37]. Because at aa 316 and 464 the two types of subclonal amino acids were conserved in each subject, both amino acids were considered to be a consensus sequence [38,39]. For a further comparison of the analyzed sequences, 60 HCV GT-1b sequences from other countries (China: 15, Japan: 15, Switzerland: 15 and the United States: 15) were also retrieved from the LANL HCV database and the relevant nucleotide positions were compared with the consensus sequences of the 15 subjects.
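The per-position majority rule and the comparison against a reference consensus described above can be sketched as follows. This is a minimal illustration, not the Megalign workflow used in the study: the function names `major_sequence` and `mutations` are hypothetical, the toy sequences are made up, and the clones are assumed to be already aligned to equal length.

```python
from collections import Counter

def major_sequence(clones):
    """Return the dominant residue at each position of aligned, equal-length clones.

    Ties are resolved in favor of the residue seen first (an assumption of this sketch).
    """
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clones must be aligned to the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*clones))

def mutations(seq, reference, start=1):
    """List (position, reference_residue, observed_residue) where seq differs from reference."""
    return [
        (pos, ref, obs)
        for pos, (ref, obs) in enumerate(zip(reference, seq), start=start)
        if ref != obs
    ]

# Toy, made-up data: three aligned subclones from one hypothetical subject.
clones = ["ACGTAC", "ACGTTC", "ACCTAC"]
major = major_sequence(clones)
reference = "ACGTAA"  # hypothetical reference consensus
print(major, mutations(major, reference))
```

The same two helpers apply unchanged to amino acid sequences, since they only compare characters column by column.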
Prediction of novel CD4+ T cell epitopes and determination of mutations inside and outside CD4+ or CD8+ T cell epitopes
15-mer peptides containing an association between a particular HLA class II molecule and the sequenced NS5B with binding capacity <500 mM were screened in silico for the presence of the relevant HLA-binding motif [42]. Mutations within the CD4+ and CD8+ T cell epitopes were defined as a sequence different from the consensus sequence within the four selected CD4+ T cell epitopes meeting the above criteria and within six known CD8+ T cell epitopes, respectively, on the basis of previous studies [37,[43][44][45]. Mutations outside the CD4+ or CD8+ T cell epitopes were counted as the total number of mutations minus the number of mutations within the respective epitope regions.
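The inside/outside counting rule can be sketched as below. The sketch assumes that a mutation frequency is the number of mutated residues divided by (number of sequences × number of positions), which matches the counts reported later for individual epitopes; the function name, argument layout and toy data are hypothetical.

```python
def epitope_frequencies(mutated_positions_per_seq, epitopes, region):
    """Return (inside_frequency, outside_frequency) for one set of epitopes.

    mutated_positions_per_seq: list of sets of mutated aa positions, one set per sequence
    epitopes: list of (start, end) inclusive aa ranges
    region: (start, end) inclusive aa range of the sequenced region
    """
    region_sites = set(range(region[0], region[1] + 1))
    inside_sites = set()
    for start, end in epitopes:
        inside_sites.update(range(start, end + 1))
    inside_sites &= region_sites

    n_seq = len(mutated_positions_per_seq)
    total = sum(len(m & region_sites) for m in mutated_positions_per_seq)
    inside = sum(len(m & inside_sites) for m in mutated_positions_per_seq)
    outside = total - inside  # total mutations minus those within the epitopes

    inside_denominator = len(inside_sites) * n_seq
    outside_denominator = (len(region_sites) - len(inside_sites)) * n_seq
    return inside / inside_denominator, outside / outside_denominator

# Toy, made-up example: a 20-residue region with one 8-residue epitope and two sequences.
freq_in, freq_out = epitope_frequencies([{301, 309}, {309}], [(308, 315)], (300, 319))
print(f"inside: {freq_in:.3f}, outside: {freq_out:.3f}")
```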

Statistical analyses
The results were expressed as percentages, means ± SD, or as medians (range). The differences between the categorical variables were analyzed using Fisher's exact test or a Chi-square test. For continuous variables, the Student's t-test was used when the data showed a normal distribution, or the Mann-Whitney U test was used when the data were not normally distributed. The level of significance of each test was adjusted for multiple tests via Bonferroni correction. A p-value of <0.05 (two-tailed) was considered to be statistically significant. Statistical analysis of the
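For readers unfamiliar with the categorical tests named above, the following is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2×2 table together with a Bonferroni adjustment. It is not the software used in the study; the function names are hypothetical, and the Bonferroni step is shown as p-value multiplication, which is one common way of applying the correction.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed one, with the margins held fixed.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Toy table: mutated vs. unmutated counts in two hypothetical groups.
print(fisher_exact_two_sided(3, 1, 1, 3))
print(bonferroni([0.01, 0.04, 0.5]))
```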

Phylogenetic analysis of GT-1b and its characteristics
A phylogenetic analysis based on the 492bp GT-1b sequenced NS5B region of randomly selected subclones showed distinct sequence variation between each subject (Fig. 1 Table S3). This finding indicates a positive correlation between viral replication and the clinical severity of liver disease. The nucleotide sequence of 166 subclones is available in the GenBank nucleotide sequence databases with the following accession numbers: KF422017-KF422027.

Distribution of mutations in the sequenced NS5B region
The distribution of the mutations in the sequenced GT-1b NS5B region (164 aa) is shown in Fig. 2. There were six known CD8+ T cell epitopes (Table S4) [37,[43][44][45], and the mutation frequencies inside the CD8+ T cell epitope regions (2.9%) were significantly higher than those outside the epitope regions (2.3%, p = 0.001). The mutation frequencies inside the predicted CD4+ T cell epitopes (4.8%) were significantly higher than those outside the CD4+ T cell epitopes (1.4%) and were even higher than those inside the known CD8+ T cell epitopes (p<0.001) (Table S5). We designated the region including the aa 333-355 section of the CD4+ T cell epitopes as a mutational hotspot, in which an extraordinarily high mutation frequency (6.7%) was observed (Fig. 2). Of note, the region was predicted to have high binding affinity for the various MHC class II HLA types prevalent in Koreans, raising the possibility that there may be distinctive MHC class II restricted immune pressure against HCV GT-1b in the mutational hotspot (Table 2).

Comparison of synonymous (dS) and nonsynonymous (dN) mutations according to the NS5B region
The distinctive CD4+ T cell-mediated immune pressure was examined by comparing dN to dS. The dN/dS ratio inside the known CD8+ T cell epitopes (0.29) was slightly higher than that of the outside region (0.21), with dN frequencies of 2.9% and 2.3%, respectively. The dN/dS ratio inside the predicted CD4+ T cell epitopes (0.49) was significantly higher than that outside (0.13), with dN frequencies of 4.8% and 1.4%, respectively, although the dS frequency outside the predicted CD4+ T cell epitopes was higher at a statistically significant level. The odds ratio of the dN inside and outside the predicted CD4+ T cell epitopes was 3.55. In the mutational hotspot, dN frequencies (6.7%) were found to be higher than dS frequencies (4.8%), resulting in an elevated dN/dS ratio (1.4). This suggests there are strong MHC class II restricted immune pressures against HCV NS5B in chronic Korean patients (Table 3).
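The arithmetic behind these figures can be checked with a short sketch. It assumes that the reported dN/dS ratio is the quotient of the two frequencies and that the odds ratio is computed from the rounded dN percentages quoted in the text; the helper names are hypothetical.

```python
def odds_ratio(p_inside, p_outside):
    """Odds ratio of two proportions: odds inside divided by odds outside."""
    return (p_inside / (1 - p_inside)) / (p_outside / (1 - p_outside))

def implied_ds(dn_frequency, dn_ds_ratio):
    """dS frequency implied by a dN frequency and a dN/dS ratio (assumes ratio = dN / dS)."""
    return dn_frequency / dn_ds_ratio

# dN frequencies inside (4.8%) and outside (1.4%) the predicted CD4+ T cell epitopes.
print(round(odds_ratio(0.048, 0.014), 2))

# Implied dS frequencies from the reported ratios (0.49 inside, 0.13 outside).
print(round(implied_ds(0.048, 0.49), 3), round(implied_ds(0.014, 0.13), 3))
```

Under these assumptions the odds ratio reproduces the reported 3.55, and the implied dS frequency is higher outside the epitopes than inside, in line with the statement above.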

Comparisons of dS and dN in the NS5B region between Korean patients and patients from other countries
To examine whether there was distinctive immune pressure against HCV NS5B at the CD4+ T cell level in Koreans, we compared dS and dN in the NS5B region between 15 Korean patients and 60 patients from other countries (China: 15, Japan: 15, Switzerland: 15 and the United States: 15). For the Korean subjects, we used the consensus sequences of NS5B from more than 10 subclones per patient. For the patients from other countries, we used sequences retrieved from the LANL HCV database. In the NS5B region, the dN/dS ratio for the Korean subjects (0.23) was higher than it was for those from other countries (0.14) with statistical support (p = 0.002). The dN frequency (3.1%) in the known CD8+ T cell epitopes from Korean patients was higher than that for the patients from other countries (2.1%), but the difference was not statistically significant (p = 0.078). However, the dN frequency (4.5%) in the predicted CD4+ T cell epitope regions in the Korean patients was significantly higher than that in those from other countries (2.2%) (p<0.001). The dN/dS ratios in the predicted CD4+ T cell epitope regions were higher in the Koreans (0.52) by nearly twofold compared to those of the patients from other areas (0.26). In particular, the difference in the dN frequency between the Koreans (6.4%) and the patients from other countries (2.3%) was more pronounced in the mutational hotspot. Collectively, these results suggest the presence of distinctive CD4+ T cell mediated immune pressure against HCV NS5B in Koreans (Table 4).

Correlation between NS5B mutations and the severity of liver disease
The overall mutation frequency of the entire NS5B region in C (2.8%) was significantly higher than in the comparison group, comprising patients with CH, LC and HCC (2.2%) (p = 0.002). The mutation frequency in known CD8+ T cell epitopes was also significantly higher in C than in the comparison group [C (3.4%) vs. CH + LC + HCC (2.6%), p = 0.05]. This tendency was also found, at a statistically significant level, in the predicted CD4+ T cell epitopes [C (5.7%) vs. CH + LC + HCC (4.2%), p = 0.001] and in the mutational hotspot [C (7.7%) vs. CH + LC + HCC (5.9%), p = 0.004]. This shows that increases in the mutation rate in the NS5B region are negatively correlated with the progression of liver disease in chronic hepatitis C patients (Table 5).

Mutation frequency in codons related to SVR and ETR in Korean patients
Mutations at codons 309, 333, 338 and 355 are reportedly more frequent in SVR and ETR groups than in non-responders (NR) [15]. Interestingly, a very high mutation rate in the four SVR-related codons was found in Korean treatment-naïve patients, with an average mutation frequency of 28.9% (192/664) in the quasispecies distributions. Of note, the average mutation frequency (31.7%) in the four codons as calculated from the 15 Korean patients was significantly higher than that in any of the other countries, including Japan (Table 6).
A quasispecies analysis showed a total of 10 mutations, including those related to SVR and antiviral resistance, in the sequenced NS5B region. These can be divided into two distinct groups. One is the diverse (D) type, which coexists with other quasispecies members in a patient, and the other is the conserved (C) type, which exists alone without a quasispecies counterpart in a patient (Fig. 2, Table 1). The coexistence of diverse quasispecies at a specific codon may be indirect evidence of an important target for immune pressure and/or viral fitness. Notably, the coexistence of Q and R at codon 309, located in one of the CD8+ T cell epitopes (aa 308 to 315), was found in all 15 Korean subjects via a quasispecies distribution analysis; this may be due to the distinct CD8+ T cell immune pressure against the region between aa 308 and 315 among Koreans (Table S6). In addition, there were other D-type mutations: A333V, S335N, V338A, P353L, E440G/K and C451H. On the other hand, there were only three C-type mutations (C316N, Q355K/R and E464Q). Interestingly, for all three C-type mutations, significantly different Cq values were found between the two counterparts of the respective mutation type (Table S3).
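The distinction between diverse and conserved mutation types can be expressed as a small rule applied to the residues observed at one codon across a patient's subclones. The sketch below is one possible reading of that definition, not the procedure used in the study; the function name, the returned labels and the toy residue lists are hypothetical.

```python
def classify_codon(residues, wild_type):
    """Classify the residues seen at one codon across a single patient's subclones.

    Returns "diverse" when a mutant residue coexists with other residues,
    "conserved" when a single mutant residue is the only one observed,
    and None when only the wild-type residue is present.
    """
    observed = set(residues)
    if observed == {wild_type}:
        return None
    return "diverse" if len(observed) > 1 else "conserved"

# Toy, made-up subclone residues for three hypothetical patient/codon pairs.
print(classify_codon(["Q", "R", "R", "Q", "R"], wild_type="Q"))  # diverse
print(classify_codon(["N"] * 10, wild_type="C"))                 # conserved
print(classify_codon(["A"] * 10, wild_type="A"))                 # None
```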

Discussion
The presence of distinct HLA types among an ethnic group could lead to distinct MHC class I or II restricted immune pressures within its population [37,43,44,47]. Therefore, the frequency and patterns of escape variants against structural and nonstructural HCV proteins reflect the background HLA types among an ethnic group [48,49]. The aim of the present study is to investigate the background mutation frequency and patterns of HCV NS5B, reportedly related to a high SVR, from treatment-naïve Korean patients chronically infected with GT-1b in an effort to explain the high SVR in Korean patients. The significant findings of this study are discussed below.
First, the overall mutation frequency in the sequenced NS5B region was higher in Cs than in patients showing disease progression (CH, LC and HCC) [C (2.8%) vs. CH + LC + HCC (2.2%), p = 0.002]. Furthermore, similar differences were noted within both the CD4+ (p = 0.001) and CD8+ T cell epitope regions (p = 0.05) (Table 5). This suggests that the accumulation of multiple mutations in NS5B may be induced by vigorous and multi-specific immune pressure in the acute phase of HCV infection and may lead to the functional abnormality of HCV RdRp activity, resulting in the attenuation of HCV pathogenic potential [19]. This strongly supports previous results which showed that mutations in NS5B were related to the high SVR and ETR of patients chronically infected with GT-1b [15].
Second, a pronounced dN frequency was found in the predicted CD4+ T cell epitopes in the NS5B region [Korean (4.5%) vs. patients from other countries (2.1%), p = 0.001], particularly in the mutational hotspot [Korean (6.4%) vs. other countries (3.1%)] (Table 4). This suggests that there is distinct intrahepatic MHC class II restricted immune pressure, at least against HCV NS5B, among the Korean population [19]. Broadly directed virus-specific immune pressure at the CD4+ T cell level was recently reported to play a pivotal role in spontaneous resolution at a very early phase of acute HCV infection [50]. Furthermore, the presence of a multi-specific CD4+ T cell response against HCV can aid not only the induction of a vigorous antiviral CD8+ T cell response but also antibody production for the inhibition of the spread of the virus [51]. In particular, because three codons (A333, V338 and Q355) out of the four reported to be related to the high SVR are located in the mutational hotspot, the acquisition of mutations within this region induced by the distinctive Korean immune pressure at the CD4+ T cell level may contribute to the high SVR found in Korean patients infected with GT-1b. In fact, the prediction of the MHC class II HLA allele showed that a region of the CD4+ T cell epitope from NS5B covering aa 333 to 347, one of two predicted epitopes comprising the mutational hotspot, has high binding affinity for most HLA DRB1 alleles prevalent in Korean populations [52]. In addition, HLA DQB1 03:01 and 03:02, prevalent at frequencies higher than 10% in Koreans, are also noted to be associated with viral clearance [53][54][55]. Our previous study also showed that there are distinct mutation patterns and a very high mutation frequency in the CD4+ T cell epitopes of the HBV preC/Core region in chronic Korean patients, strongly supporting the hypothesis of this study [56].
Third, the frequency of dN within the CD8+ T cell epitope region of NS5B was significantly higher than that outside the CD8+ T cell epitope region [inside CD8+ (2.9%) vs. outside (2.3%), p = 0.001], suggesting the presence of immune pressure at the CD8+ T cell level against HCV NS5B among Korean patients, as shown in patients from other areas (Table 3) [37,44,47,57]. However, pronounced differences in the mutation frequency among the six CD8+ T cell epitope regions were found. Two of the six CD8+ T cell epitopes (aa 308 to 315 and aa 451 to 459) with high binding affinity to two HLA allele types prevalent in Koreans, HLA-A02:01 and HLA-A24:02, showed a higher dN frequency compared to other epitopes [308-315: 96 (7.2%) and 451-459: 49 (3.3%)], suggesting the presence of distinct MHC class I restricted immune pressure in Korean patients [52]. In particular, it is noteworthy that the extraordinarily high dN/dS ratio (2.04) found in the CD8+ T cell epitope covering codons 308 to 315 was mainly due to the presence of frequent mutations in codon 309, one of the four codons related to SVR rates (Table S4, Fig. 2). The Q309R mutation is known to occur frequently in NS5B, particularly in Asian patients. However, even in comparison with patients from Japan, another Asian country, the strikingly high frequency of Q309R was observed only in the Korean patients [15,16]. All of the 15 patients harbored this mutation in their quasispecies distribution and more than half (96/166, 57.8%) of all quasispecies from the 15 patients had the mutation type R309. Interestingly, the coexistence of both mutated and wild types, rather than the existence of one type alone, was found in all 15 patients, suggesting that the coexistence of two variants in a patient has an advantage over the exclusive existence of either type alone in terms of escape from host immune surveillance or viral fitness (Table S6).
Therefore, the high frequency of the Q309R mutation in Korean patients may be induced by CD8+ T cell immune pressure which may in part provide a likely explanation for the high SVR rates in Koreans.
Finally, it is well known that mutations in NS5B can affect the HCV replication capacity [19]. We found a total of three types of mutations (C316N, Q355K/R and E464Q) which had a significant effect on HCV replication (Cq value: C316N and E464Q p = 0.033, Q355K/R p = 0.003) (Table S3). Interestingly, our quasispecies analysis showed that the two polymorphisms at aa 316, C316 and N316, were strongly related to the two polymorphisms at codon 464, Q464 and E464, respectively, in an exclusive manner (Fig. 1). The type with both C316 and Q464 signatures showed a significantly higher HCV replication capacity and was more related to patients with advanced liver disease compared to the type with both the N316 and E464 signatures. The exclusive combination of the SNPs of the two codons may be due to the structural constraint of NS5B. Furthermore, the coexistence of both types (C316/Q464 and N316/E464) was not found in any patient, suggesting that these two types may come from completely different sources rather than being different quasispecies versions induced by immune pressure within a patient. Our data showing phylogenetic segregation between the two types also support this hypothesis.
Our study has three potential limitations. First, the nested PCR protocol used in this study showed low sensitivity, amplifying only 23 of the 73 samples (31.5%). In particular, negative PCR amplifications were found with high frequency in samples with lower HCV viral loads, suggesting that the nested PCR protocol, including the primer sets and PCR conditions, should be modified in a future study to increase its sensitivity. Second, the population size (15 patients) is too small to support a firm conclusion about the relationship between NS5B mutations and liver disease progression. Third, as single-genome amplification and an end-point dilution strategy were not utilized, the cloning strategy employed in this study is limited in its ability to represent the genuine viral quasispecies in serum samples.
In conclusion, our data suggest that the distinct MHC class II restricted immune pressure against HCV NS5B in Korean patients leads to a pronounced high mutation frequency and distinct mutation patterns in HCV NS5B in Korean patients. This finding provides important insight into the high SVR and ETR rates during the treatment of GT-1b infected Korean patients.