Dynamics of Hepatitis B Virus Quasispecies in Association with Nucleos(t)ide Analogue Treatment Determined by Ultra-Deep Sequencing

Background and Aims Although the advent of ultra-deep sequencing technology allows for the analysis of heretofore-undetectable minor viral mutants, a limited amount of information is currently available regarding the clinical implications of hepatitis B virus (HBV) genomic heterogeneity. Methods To characterize the HBV genetic heterogeneity in association with anti-viral therapy, we performed ultra-deep sequencing of full-genome HBV in the liver and serum of 19 patients with chronic viral infection, including 14 therapy-naïve and 5 nucleos(t)ide analogue(NA)-treated cases. Results Most genomic changes observed in viral variants were single base substitutions and were widely distributed throughout the HBV genome. Four of eight (50%) chronic therapy-naïve HBeAg-negative patients showed a relatively low prevalence of the G1896A pre-core (pre-C) mutant in the liver tissues, suggesting that other mutations were involved in their HBeAg seroconversion. Interestingly, liver tissues in 4 of 5 (80%) of the chronic NA-treated anti-HBe-positive cases had extremely low levels of the G1896A pre-C mutant (0.0%, 0.0%, 0.1%, and 1.1%), suggesting the high sensitivity of the G1896A pre-C mutant to NA. Moreover, various abundances of clones resistant to NA were common in both the liver and serum of treatment-naïve patients, and the proportion of M204VI mutants resistant to lamivudine and entecavir expanded in response to entecavir treatment in the serum of 35.7% (5/14) of patients, suggesting the putative risk of developing drug resistance to NA. Conclusion Our findings illustrate the strong advantage of deep sequencing on viral genome as a tool for dissecting the pathophysiology of HBV infection.


Introduction
Hepatitis B virus (HBV) is a non-cytopathic DNA virus that infects approximately 350 million people worldwide and is a main cause of liver-related morbidity and mortality [1][2][3]. The absence of proofreading capacity in the viral-encoded RNA-dependent DNA polymerase, coupled with the extremely high rate of HBV replication, yields the potential to rapidly generate mutations at each nucleotide position within the entire genome [4]. Accordingly, a highly characteristic feature of HBV infection is the remarkable genetic heterogeneity at the inter- and intra-patient level. The latter form of variability, a population of closely-related but nonidentical genomes, is referred to as viral quasispecies [5,6]. It is well recognized that such mutations may have important implications regarding the pathogenesis of viral disease. For example, in chronic infection, a G-to-A point mutation at nucleotide (nt) 1896 in the pre-core (pre-C) region as well as A1762T and G1764A mutations in the core-promoter region are highly associated with HBeAg seroconversion, which in general results in low levels of viremia and consequent clinical cure [7][8][9]. In contrast, acute infection with the G1896A pre-C mutant represents a high risk for fulminant hepatic failure [10,11]. Although these facts clearly illustrate the clinical implications of certain viral mutations, increasing evidence strongly suggests that the viral genetic heterogeneity is more complicated than previously thought [12,13].
The major goals of antiviral therapy in patients with HBV infection are to prevent the progression of liver disease and inhibit the development of hepatocellular carcinoma [14]. Oral nucleos(t)ide analogues (NA) have revolutionized the management of HBV infection, and five such antiviral drugs, lamivudine, adefovir, entecavir, tenofovir, and telbivudine, are currently approved medications [15,16]. These agents are well-tolerated, safe, and very effective at suppressing viral replication. One of the major problems of NA therapy, however, is that long-term use of these drugs frequently causes the emergence of antiviral drug-resistant HBV due to substitutions at specific sites in the viral genome sequences, which often negates the benefits of therapy and is associated with hepatitis flares and death [16,17]. It is unclear whether viral clones with antiviral resistance emerge after the administration of antiviral therapy or widely preexist among treatment-naïve patients.
There has been a recent advance in DNA sequencing technology [18]. Ultra-deep sequencers allow for massively parallel amplification and detection of sequences of hundreds of thousands of individual molecules. We recently demonstrated the usefulness of ultra-deep sequencing technology to unveil the massive genetic heterogeneity of hepatitis C virus (HCV) in association with treatment response to antiviral therapy [19]. On the other hand, only a few published studies have used this technology to characterize genetic HBV sequence variations [20][21][22]. Margeridon-Thermet et al. reported that the 454 Life Science GS20 sequencing platform provided higher sensitivity for detecting drug-resistant HBV mutations in the serum of patients treated with nucleoside and nucleotide reverse-transcriptase inhibitors [20]. Solmone et al. also reported the strong advantage conferred by the same platform to detect minor variants in the serum of patients with chronic HBV infection [21]. Although low-abundant drug-resistant variants were successfully detected in these previous studies, the analyses were focused on the reverse-transcriptase region of circulating HBV in the serum, and thus the whole picture of HBV genetic heterogeneity as well as the in vivo dynamics of HBV drug-resistant variants in response to anti-viral treatment remains to be clarified. Moreover, intrahepatic viral heterogeneity in patients that achieved the clearance of circulating HBV is largely unknown. Taking advantage of the abundance of genetic information obtained by utilizing the Illumina Genome Analyzer II (Illumina, San Diego, CA) as a platform for ultra-deep sequencing, we determined the whole HBV sequence in the liver and serum of patients with chronic HBV infection to evaluate viral quasispecies characteristics. Moreover, we investigated the prevalence of rare drug-resistant HBV variants as well as detailed dynamic changes in the viral genetic heterogeneity in association with NA administration. Based on the abundant genetic information obtained by ultra-deep sequencing, we clarified the precise prevalence of HBV clones with G1896A pre-C mutations in association with HBe serostatus in chronically infected patients with or without NA treatment. We also detected a variety of minor drug-resistant clones in treatment-naïve patients and their dynamic changes in response to entecavir administration, demonstrating the potential clinical significance of naturally-occurring drug-resistant mutations.

Ethics Statement
The Kyoto University ethics committee approved the study, and written informed consent for participation in this study was obtained from all patients. The study was conducted in accordance with the principles of the Declaration of Helsinki.

Patients
The liver tissues of 19 Japanese patients that underwent living-donor liver transplantation at Kyoto University due to HBV-related liver disease were available for viral genome analyses. These individuals included 13 men and 6 women, aged 41 to 69 years (median, 55.2 years), and all but one were infected with genotype C viruses. All 19 patients had liver cirrhosis caused by chronic HBV infection, including 14 antiviral therapy-naïve cases (chronic-naïve cases) and 5 cases receiving NA treatment with either lamivudine or entecavir (chronic-NA cases) (Table 1). Serum HBV DNA levels were significantly higher in chronic-naïve cases than in chronic-NA cases (median serum HBV DNA levels were 5.6 and <2.6 log copies/ml, respectively; Table 1). Liver tissue samples were obtained at the time of transplantation, frozen immediately, and stored at −80°C until use. Serologic HBV markers, including hepatitis B surface antigen (HBsAg), antibodies to HBsAg, anti-HBc, HBeAg, and antibodies to HBeAg, were determined by enzyme immunoassay kits as described previously [23]. HBV DNA in the serum before transplantation was examined using a polymerase chain reaction (PCR) assay (Amplicor HBV Monitor, Roche, Branchburg, NJ). To examine the dynamics of viral quasispecies in response to anti-HBV therapy, paired serum samples of 14 treatment-naïve patients before and after administration of daily entecavir (0.5 mg/day) were subjected to further analyses of the viral genome.

Direct population Sanger sequencing
DNA was extracted from the liver tissue and serum using a DNeasy Blood & Tissue Kit (Qiagen, Tokyo, Japan). To define the consensus reference sequences of HBV in each clinical specimen, all samples were first subjected to direct population Sanger sequencing using the Applied Biosystems 3500 Genetic Analyzer (Applied Biosystems, Foster City, CA). Oligonucleotide primers for the HBV genome were designed to specifically amplify whole viral sequences as two overlapping fragments using the sense primer 169_F and antisense primer 2847_R to yield a 2679-bp amplicon (amplicon 1), and the sense primer 685_F and antisense primer 443_R to yield a 2974-bp amplicon (amplicon 2; Table S1). HBV sequences were amplified using Phusion High-Fidelity DNA polymerase (FINZYMES, Espoo, Finland). All amplified PCR products were purified using the QIAquick Gel Extraction kit (Qiagen) after agarose gel electrophoresis and used for direct sequencing. The serum of a healthy HBV DNA-negative volunteer was used as a negative control.
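The two amplicon sizes follow from the primer coordinates on a circular genome. The short Python sketch below illustrates that arithmetic. It assumes that the numbers in the primer names (169, 2847, 685, 443) are the outermost nucleotide positions of each amplicon and that the genome is 3215 bp long, the reference length used in this study for read alignment; the function name is illustrative only.

```python
GENOME_LENGTH = 3215  # assumed circular reference length (bp)

def amplicon_length(start: int, end: int, genome_length: int = GENOME_LENGTH) -> int:
    """Length of an amplicon from start to end (1-based, inclusive) on a circular genome."""
    if not (1 <= start <= genome_length and 1 <= end <= genome_length):
        raise ValueError("positions must lie within the genome")
    if end >= start:
        return end - start + 1
    # The amplicon runs past the last nucleotide and wraps around to position 1.
    return (genome_length - start + 1) + end

print(amplicon_length(169, 2847))  # amplicon 1 -> 2679
print(amplicon_length(685, 443))   # amplicon 2 -> 2974
```

Under these assumptions, amplicon 2 wraps around the origin of the circular map, so the two fragments together cover every position of the genome.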

Viral genome sequencing by massively-parallel sequencing
Massively-parallel sequencing with multiplexed tags was performed using the Illumina Genome Analyzer II as described [19]. The end-repair of DNA fragments, addition of adenine to the 3′ ends of DNA fragments, adaptor ligation, and PCR amplification by Illumina PCR primers were performed as described previously [24]. Briefly, the viral genome sequences were amplified by high-fidelity PCR using oligonucleotide primers as described above and sheared by nebulization using 32 psi N2 for 8 min, and the sheared fragments were then purified and concentrated using a QIAquick PCR purification Kit (Qiagen). Nucleotide overhangs resulting from fragmentation were then converted into blunt ends using T4 DNA polymerase and Klenow enzymes, followed by the addition of terminal 3′ A-residues. An adaptor containing unique 6-bp tags, such as "ATCACG" and "CGATGT" (Multiplexing Sample Preparation Oligonucleotide Kit, Illumina), was then ligated to each fragment using DNA ligase. We then performed agarose gel electrophoresis of adaptor-ligated DNAs and excised bands from the gel to produce libraries with insert sizes ranging from 200 to 350 bp. These libraries were amplified independently using a minimal PCR amplification step of 18 cycles by Illumina PCR primers with Phusion High-Fidelity DNA polymerase. The DNA fragments were then purified with a MinElute PCR Purification Kit (Qiagen), followed by quantification using the NanoDrop 2000C (Thermo Fisher Scientific, Waltham, MA) to make a working concentration of 10 nM. Cluster generation and sequencing were performed for 64 cycles on the Illumina Genome Analyzer II according to the manufacturer's instructions. The obtained images were analyzed and base-called using GA pipeline software version 1.4 with the default settings provided by Illumina.

Genome Analyzer sequence data analysis
Using the high-performance alignment software "NextGene" (SoftGenetics, State College, PA), the 64-bp reads obtained from the Genome Analyzer II were aligned with the 3215-bp reference sequences that were determined by direct population Sanger sequencing of each clinical specimen. Reads with 90% or more bases matching a particular position of the reference sequences were aligned. Furthermore, two quality filters were applied to the sequencing reads: a read was retained only if its median quality score was more than 30 and it contained no more than 3 uncalled nucleotides anywhere in the 64 bases. Only sequences that passed the quality filters, rather than raw sequences, were analyzed, and each position of the viral genome was assigned a coverage depth, representing the number of times the nucleotide position was sequenced.
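The two read filters and the per-position coverage depth can be illustrated with a minimal Python sketch. It does not reproduce the alignment software used in the study; the function names, the use of "N" for an uncalled base, and the treatment of the reference as linear with 0-based start positions are assumptions made for illustration.

```python
from statistics import median

READ_LENGTH = 64
MIN_MEDIAN_QUALITY = 30  # median quality score must exceed this value
MAX_UNCALLED = 3         # at most this many uncalled ('N') bases per read

def passes_filters(bases: str, qualities: list[int]) -> bool:
    """Apply the two quality filters described in the text to one read."""
    return (median(qualities) > MIN_MEDIAN_QUALITY
            and bases.upper().count("N") <= MAX_UNCALLED)

def coverage_depth(read_starts: list[int], genome_length: int,
                   read_length: int = READ_LENGTH) -> list[int]:
    """Count how many aligned reads cover each position (0-based, linear reference)."""
    depth = [0] * genome_length
    for start in read_starts:
        for pos in range(start, min(start + read_length, genome_length)):
            depth[pos] += 1
    return depth

depth = coverage_depth([0, 10], genome_length=100)
print(depth[0], depth[10], depth[64], depth[74])
```

In this toy example the two reads overlap between positions 10 and 63, so the depth there is 2 and falls back to 1 and then 0 further along the reference.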

Allele-specific quantitative real-time PCR and semiquantitative PCR to determine the relative proportion of G1896A pre-C mutant
To determine the relative proportion of the G1896A pre-C mutant, allele-specific quantitative real-time PCR was performed based on the previously described method [25,26]. Oligonucleotide primers were designed individually to amplify the pre-C region of wild-type and the G1896A pre-C mutant HBV. Three primers were used for this protocol: two allele-specific sense primers, 1896WT_F (for wild-type) and 1896MT_F (for the G1896A pre-C mutant), and one common antisense primer, 2037_R (Table S1). Quantification of wild-type and the G1896A pre-C mutant was individually performed by real-time PCR using a Light Cycler 480 and Fast Start Universal SYBR Master (Roche, Mannheim, Germany) [27]. The relative proportion of the G1896A pre-C mutant was determined by calculating the G1896A pre-C mutant/total HBV ratios. Performance of this assay was tested using mixtures of two previously described plasmids, pcDNA3-HBV-wt#1 and pcDNA3-HBV-G1896A pre-C mutant [28]. Semiquantitative PCR was performed using the primers described above, followed by agarose gel electrophoresis.
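The mutant/total ratio can be written as a one-line calculation. The Python sketch below assumes that total HBV is the sum of the separately quantified wild-type and G1896A signals, which the text does not state explicitly; the function name and the copy numbers are hypothetical.

```python
def mutant_fraction(wild_type_copies: float, mutant_copies: float) -> float:
    """Fraction of G1896A mutant among total HBV, assuming total = wild-type + mutant."""
    total = wild_type_copies + mutant_copies
    if total <= 0:
        raise ValueError("total copy number must be positive")
    return mutant_copies / total

# Hypothetical quantities from the two allele-specific reactions.
print(f"{mutant_fraction(9.0e5, 1.0e5):.1%}")
```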

Statistical analysis
Results are expressed as mean or median, and range. Pretreatment values were compared using the Mann-Whitney U-test or the Kruskal Wallis H-test. P values less than 0.05 were considered statistically significant.
The viral quasispecies characteristics were evaluated by analyzing the genetic complexity based on the number of different sequences present in the population. Genetic complexity for each site was determined by calculating the Shannon entropy using the following formula: where n is the number of different species identified, fi is the observed frequency of a particular variant in the quasispecies, and N is the total number of clones analyzed [12,13]. The mean viral complexity in each sample was determined by calculating the total amounts of the Shannon entropy at each nucleotide position divided by the total nucleotide number (e.g., 3215 bases) of each HBV genome sequence.
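The formula itself is not present in this text. As a hedged illustration, the Python sketch below computes the standard Shannon entropy of the bases observed at one position, H = Σ f_i ln(1/f_i), and averages it over all positions as described for the mean viral complexity. Any normalization involving N in the original formula is not reproduced here, and the function names are illustrative.

```python
import math
from collections import Counter

def site_entropy(bases: str) -> float:
    """Shannon entropy (natural log) of the bases observed at one nucleotide position."""
    counts = Counter(bases)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # sum of f_i * ln(1 / f_i), equivalent to -sum of f_i * ln(f_i)
    return sum((c / total) * math.log(total / c) for c in counts.values())

def mean_complexity(columns: list[str]) -> float:
    """Sum of per-site entropies divided by the number of positions."""
    return sum(site_entropy(col) for col in columns) / len(columns)

# Toy alignment columns: one invariant site and one site split evenly between two bases.
print(site_entropy("AAAA"), site_entropy("AATT"), mean_complexity(["AAAA", "AATT"]))
```

A position where every read shows the same base contributes zero, so the mean complexity is driven by the polymorphic positions.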

Nucleotide sequence accession number
All sequence reads have been deposited in the DNA Data Bank of Japan Sequence Read Archive (http://www.ddbj.nig.ac.jp/indexe.html) under accession number DRA000435.

Validation of multiplex ultra-deep sequencing of the HBV genome
To differentiate true mutations from sequencing errors in the determined sequences, we first generated viral sequence data from the expression plasmid, pcDNA3-HBV-wt#1, encoding wild-type genotype C HBV genome sequences [28]. For this purpose, we determined the PCR-amplified HBV sequences derived from the expression plasmid using high-fidelity Taq polymerase to take the PCR-induced errors as well as sequencing errors into consideration. Viral sequences determined by the conventional Sanger method were used as reference sequences for aligning the amplicons obtained by ultra-deep sequencing. Three repeated ultra-deep sequencing runs generated a mean of 77,663 filtered reads, corresponding to a mean coverage of 38,234-fold at each nucleotide site (Table S2). Errors comprised insertions (0.00003%), deletions (0.00135%), and nucleotide mismatches (0.037%). The mean overall error rate was 0.034% (the per-nucleotide error rate ranged from 0 to 0.13%) for the three control experiments, reflecting the error introduced by high-fidelity PCR amplification and by multiplex ultra-deep sequencing that remained after filtering out problematic sequences. We also confirmed that multiplex ultra-deep sequencing with and without the high-fidelity PCR amplification with HBV-specific primer sets showed no significant differences in the error rates of the viral sequence data (mean error rate 0.034% vs 0.043%). Accordingly, we defined the cut-off value for the current platform as 0.3%, a value nearly 1 log above the mean overall error rate.
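The relation between the cut-off and the background error rate can be checked with a short calculation. The Python sketch below uses only the two percentages reported above; the function name is illustrative.

```python
import math

MEAN_ERROR_RATE = 0.034  # percent, mean overall error rate of the control experiments
CUTOFF = 0.3             # percent, cut-off adopted for calling a variant

def log10_margin(cutoff: float, error_rate: float) -> float:
    """How many log10 units the cut-off lies above the background error rate."""
    return math.log10(cutoff / error_rate)

print(f"fold above error rate: {CUTOFF / MEAN_ERROR_RATE:.1f}")
print(f"log10 margin: {log10_margin(CUTOFF, MEAN_ERROR_RATE):.2f}")
```

The cut-off is roughly 8.8 times the mean error rate, or about 0.95 log10 units, which is what "nearly 1 log" refers to.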
Next, we performed additional control experiments to verify the detectability of low-abundant mutations present at a frequency of less than 0.3%. For this purpose, we mixed an expression plasmid carrying a single point mutation with the plasmid encoding the wild-type viral sequence at a ratio of 1:1000 and assessed the sensitivity and accuracy of quantification using high-fidelity PCR amplification followed by multiplex ultra-deep sequencing in association with the different coverage numbers (Table S3). Repeated control experiments revealed that mutations introduced at an input ratio of 0.10% among the wild-type sequences were detected at frequencies ranging between 0.11% and 0.24%, indicating that there was no significant difference in the detection rate or error rates under the different coverage conditions. Based on these results, ultra-deep sequencing on the current platform was considered accurate for detecting low-level viral mutations present at frequencies greater than 0.30%.

Viral complexity of the HBV quasispecies in association with clinical status
To clarify HBV quasispecies in association with clinical status, we performed multiplex ultra-deep sequencing and determined the HBV full-genome sequences in the liver and serum of patients with chronic HBV infection. First, we compared the sequences of the viral genome determined in the liver tissue with those in the serum and found no significant differences in the viral population between the liver and serum of the same individual. Indeed, the pattern and distribution of genetic heterogeneity of the viral nucleotide sequences in the liver tissue were similar to those observed in the serum of the same patient (Figure S1), suggesting that a similar pattern of viral heterogeneity was maintained in the liver and serum of patients with chronic HBV infection.
Next, we compared the viral heterogeneity in the liver of chronic-naïve and chronic-NA cases. A mean of 5,962,996 nucleotides in chronic-naïve cases and 4,866,783 nucleotides in chronic-NA cases were mapped onto the reference sequences, and an overall average coverage depth of 1,855 and 1,514 was achieved for each nucleotide site of the HBV sequences, respectively (Table 2). The frequencies of mutated positions and altered sequence variations detected in each viral genomic region are summarized in Table 2. The overall mutation frequency of the total viral genomic sequences was determined to be 0.87% in chronic-naïve cases and 0.69% in chronic-NA cases. Most genomic changes observed in viral variants were single base substitutions, and the genetic heterogeneity of the viral nucleotide sequences was equally observed throughout the individual viral genetic regions, including the pre-surface (preS), S, pre-core/core (preC-C), and X (Table 2). Consistent with the findings obtained from the viral mutation analyses, the overall viral complexity determined by the Shannon entropy value was 0.047 in chronic-naïve and 0.036 in chronic-NA cases, and the viral complexity was equally observed throughout the individual viral genetic regions (Figure 1A). Among chronic-naïve cases, we observed no significant differences in the viral complexity according to HBV DNA level, age, or degree of fibrosis (Figure 1B).
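The average coverage depths follow from the mapped nucleotide totals. The Python sketch below assumes that mean depth is simply the number of mapped nucleotides divided by the 3215-bp reference length used for alignment; the function name is illustrative.

```python
GENOME_LENGTH = 3215  # reference length (bp) used for read alignment

def mean_coverage(mapped_bases: int, genome_length: int = GENOME_LENGTH) -> float:
    """Average number of times each reference position was sequenced."""
    return mapped_bases / genome_length

print(round(mean_coverage(5_962_996)))  # chronic-naive cases
print(round(mean_coverage(4_866_783)))  # chronic-NA cases
```

Under that assumption the two totals give depths of about 1,855 and 1,514 per nucleotide site.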

High sensitivity of the G1896A pre-C mutant to nucleos(t)ide analogues
Emergence of the G1896A mutation in the pre-C region and of the A1762T and G1764A mutations in the core-promoter region is well known to be associated with HBe seroconversion [7][8][9]. We therefore evaluated the prevalence of these three mutations in the chronically HBV-infected liver in association with HBe serologic status and NA treatment history. Among chronic-naïve cases, 6 patients were in the pre-seroconversion and 8 in the post-seroconversion HBeAg status (Table 3). The mean prevalence of the G1896A pre-C mutant in HBeAg-positive cases was lower than that in the HBeAg-negative cases (27.4% and 46.5%, respectively). Importantly, however, 4 of 8 HBeAg-negative cases showed a relatively low prevalence of the G1896A pre-C mutant (Liver #8, #12, #13, #14), and all but one case (Liver #10) showed a high prevalence of the A1762T and G1764A mutations, irrespective of HBe serologic status and NA treatment history (Table 3). These findings suggested that mutations other than G1896A, A1762T, and G1764A were also involved in the HBeAg seroconversion status. Notably, liver tissues of all but one (Liver #17) of the chronic-NA cases showed extremely low levels of the G1896A pre-C mutant (0.0, 0.0, 0.1, and 1.1%), suggesting the high sensitivity of the G1896A pre-C mutant to NA (Table 3).
To confirm the difference in sensitivity to NA between the wild-type and the G1896A pre-C mutant, we examined the dynamic changes in the relative proportion of the G1896A pre-C mutant in the serum of 14 treatment-naïve patients before and after entecavir administration. Consistent with the findings obtained by ultra-deep sequencing, quantitative real-time PCR revealed that entecavir administration significantly reduced the proportion of the G1896A pre-C mutant in 13 of 14 cases (92.9%) irrespective of their HBeAg serostatus, while the G1896A pre-C mutant was detectable in a substantial proportion before treatment in all cases (Figure 2A, 2B and 2C; p = 0.001). These results further support the finding that HBV clones comprising the G1896A mutation were more sensitive to NA than those with wild-type sequences.

Prevalence of drug-resistant HBV clones in the liver of treatment-naïve patients
Increasing evidence suggests that drug-resistant viral mutants can be detected in the serum of treatment-naïve patients with chronic HBV infection [20,21]. Thus, we next determined the actual prevalence of spontaneously-developed drug-resistant mutants in the chronically-infected liver of treatment-naïve patients to evaluate whether NA treatment potentiates the expansion of drug-resistant clones. The drug-resistant mutations examined included two mutations resistant to lamivudine and entecavir, four mutations resistant to entecavir, and three mutations resistant to adefovir [16,17]. Based on the detection rate of the low-level viral clones determined by the control experiments, we identified the drug-resistant mutants present in each specimen at a frequency of more than 0.3% among the total viral clones. Based on these criteria, at least one resistant mutation was detected in the liver of all of the chronic-naïve cases (Table 4). The prevalence of the 9 drug-resistant mutations detected by ultra-deep sequencing in 14 chronic-naïve cases ranged from 0.3% to 30.0%, indicating that the proportion of resistant mutations substantially differed in each case. The most commonly detected mutations were M204VI (9 cases) and M250VI (11 cases), which confer resistance to lamivudine and entecavir, and to entecavir, respectively. The adefovir-resistance mutations A181TV and N236T were detected in 7 (50.0%) and 3 (21.4%) cases, respectively (Table 4). Nine (64.3%) chronic-naïve cases possessed the M204VI mutants in their liver tissues, and the proportion of mutant clones among the total viral population ranged from 0.3% to 1.1% among the M204VI mutant-positive patients. In chronic-NA cases, 4 of 5 (80.0%) liver tissues harbored the M204VI mutants, with the proportion among the total viral population ranging from 0.4% to 18.7% (Table 4), while the mean serum HBV DNA was suppressed below 2.6 log copies/ml (Table 1). These results suggest that mutant HBV clones comprising various drug-resistant mutations could latently exist even in the liver of NA treatment-naïve cases.
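The way a resistant variant is called against the 0.3% cut-off can be sketched in a few lines of Python. The frequency is taken as mutant reads over aligned coverage at the site, expressed in percent. The strict "more than" comparison follows the wording of the text, while the function names and read counts are hypothetical.

```python
CUTOFF_PERCENT = 0.3  # detection cut-off derived from the control experiments

def variant_frequency(mutant_reads: int, coverage: int) -> float:
    """Mutation frequency (%) at one site: mutant reads over aligned coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return 100.0 * mutant_reads / coverage

def is_detected(mutant_reads: int, coverage: int,
                cutoff: float = CUTOFF_PERCENT) -> bool:
    """Call a variant only when its frequency exceeds the cut-off."""
    return variant_frequency(mutant_reads, coverage) > cutoff

# Hypothetical read counts at a resistance site with about 1,855-fold coverage.
print(is_detected(12, 1855), is_detected(3, 1855))
```

At the coverage depths reported for liver tissue, a variant therefore needs to be supported by several independent reads before it is counted.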

Expansion of drug-resistant HBV clones harboring M204VI mutations in response to NA administration
To clarify the risk of latent expansion of drug-resistant mutations due to NA treatment, we next examined the early dynamic changes in the prevalence of M204VI mutants in the serum of treatment-naïve patients in response to entecavir treatment. Ultra-deep sequencing provided a mean 40,791- and 38,823-fold coverage of reads mapped to the M204VI nucleotide position at the YMDD sites of each reference sequence in patients before and after entecavir treatment, respectively.
Five of 14 (35.7%) patients harbored the M204VI mutations prior to entecavir treatment. Although the serum HBV DNA levels were significantly reduced in response to entecavir in all cases, the M204VI mutant clones were detected in 9 cases (64.3%) after entecavir administration (Table 5). Notably, one patient (Serum #3) who harbored the M204VI mutant clones at baseline had a relatively large expansion of drug-resistant clones among the total viral population in a time-dependent manner in response to entecavir treatment (Table 5). Similarly, M204VI mutant clones became detectable after entecavir administration in four patients (Serum #1, #7, #12, #13) that harbored no resistant mutants at baseline (Table 5). We found no correlation between the degree of the increase in the relative prevalence of M204VI mutant clones and that of the reduction in serum HBV DNA levels. Although only a limited number of patients exhibited a substantial increase in M204VI mutant clones after administration of anti-viral therapy, our findings might suggest that entecavir treatment latently causes selective survival of drug-resistant mutants in treatment-naïve patients with chronic HBV infection.

Table 3. The prevalence of G1896A mutation in the pre-C region, and A1762T and G1764A mutations in the core-promoter region in the liver of patients chronically infected with HBV. Values in parentheses show mutation frequency (%): the ratio of total mutant clones to total aligned coverage at each nucleotide site. NA: nucleotide analogue, pre C: precore, CP: core promoter, LAM: lamivudine, ETV: entecavir. doi:10.1371/journal.pone.0035052.t003

Discussion
Direct population sequencing is the most common method for detecting viral mutations [29]. Conventional sequencing techniques, however, are not efficient for evaluating large amounts of viral genetic information. Newly developed ultra-deep sequencing technology has revolutionized genomic analyses, allowing for studies of the dynamics of viral quasispecies as well as rare genetic variants of the viruses that cannot be detected using standard direct population sequencing techniques [30,31]. The sensitivity of ultra-deep sequencing analysis is primarily limited by errors introduced during PCR amplification and the sequencing reaction; thus, it is a challenge to distinguish rare variants from sequencing artifacts. In the present study, we optimized ultra-deep sequencing with a multiplex-tagging method and reproducibly detected variants within HBV quasispecies that were as rare as 0.3%. Based on this ultra-deep sequencing platform, we determined the abundant genetic heterogeneity of HBV at the intra- and inter-individual levels.
Because of its ability to handle abundant viral genome information, ultra-deep sequencing allowed us to evaluate low-abundant virus variants of patients with chronic HBV infection in detail. It is widely accepted that HBe seroconversion is highly associated with the emergence of G1896A pre-C and/or A1762T and G1764A core promoter mutant clones [7][8][9]. Unexpectedly, however, our results showed a diverse range of G1896A frequency (0-99.9%) in HBeAg-negative subjects and a high prevalence of core promoter mutations, irrespective of HBe serostatus. Consistent with our observation, previous studies utilizing conventional sequencing methods reported that the frequency of the G1896A pre-C mutant ranged from 12% to 85% [32]. All but one patient (Liver #10) showing a predominance of A1762T and G1764A were infected with genotype C, while patient #10 was infected with genotype B. Because A1762T and G1764A are reported to be significantly more frequent in genotype C [33], the difference in the prevalence of A1762T and G1764A in our study might be a reflection of the HBV genotype rather than HBe serostatus. Further investigation of the actual prevalence of these mutations and the elucidation of other unknown mutations involved in HBe seroconversion are necessary for a better understanding of the underlying mechanisms of HBe seroconversion. It should be noted that the majority of the chronic-NA cases had extremely low levels of the G1896A pre-C mutant in their liver tissues, even though those cases were serologically positive for anti-HBe and negative for HBeAg. Moreover, entecavir administration significantly reduced the proportion of the G1896A pre-C mutant in the serum of the majority of patients irrespective of their HBeAg serostatus, while the G1896A pre-C mutant clones were detectable in a substantial proportion before treatment in all cases. These findings suggest that the G1896A pre-C mutant has higher sensitivity to NA than the wild-type viruses. Consistent with this hypothesis, several previous studies reported that NA is effective against acute or fulminant hepatitis caused by possible infection with the G1896A pre-C mutant [34,35]. Based on these findings, early administration of NA might be an effective strategy for treating patients with active hepatitis infected predominantly with the G1896A pre-C mutant.
Ultra-deep sequencing has a relatively higher sensitivity than conventional direct population sequencing and is thus useful for detecting drug-resistant mutations not detected by standard sequencing [20,21]. Recently, we revealed that drug-resistant mutants were widely present in treatment-naïve HCV-infected patients, suggesting a putative risk for the expansion of resistant clones to anti-viral therapy [19]. Here, we demonstrated that various drug-resistant HBV variants are present in a proportion of chronically HBV-infected, NA-naïve patients. Several studies using ultra-deep sequencing provided evidence that naturally-occurring drug-resistant mutations are detectable in treatment-naïve individuals with human immunodeficiency virus-1 infection [30,36,37]. Consistent with the cases of human immunodeficiency virus-1 infection, a few studies detected minor variants resistant to NA in the plasma of treatment-naïve patients with chronic HBV infection [20,21]. It remains unclear, however, whether these minor drug-resistant mutations have clinical significance. Our observation of the relative expansion of viral clones with the M204VI mutation during entecavir therapy in some cases indicates the possibility that preexisting minor mutants might provide resistance against NA through the selection of dominant mutant clones. Future studies with a larger cohort size are required to clarify the clinical implications of the latently existing low-abundant drug-resistant mutations.
The current ultra-deep parallel sequencing technology has limitations in the analyses of viral quasispecies. First, because the massively-parallel ultra-deep sequencing platform is based on a multitude of short reads, it is difficult to evaluate the association between nucleotide sites mapped to different genome regions in a single viral clone. Indeed, potential mutational linkages between the pre-C and reverse transcriptase regions were difficult to elucidate due to the short read length of the shotgun sequencing approach. Second, accurate analysis of highly polymorphic viral clones by ultra-deep sequencing is also difficult because the identification of mutations depends strongly on the mapping to the reference genome sequences.
In conclusion, we demonstrated that the majority of patients positive for anti-HBe and negative for HBeAg lacked the predominant infection of the G1896A pre-C mutant in the presence of NA treatment, suggesting that the G1896A pre-C mutant has increased sensitivity to NA therapy compared with wild-type HBV. We also revealed that drug-resistant mutants are widely present, even in the liver of treatment-naïve HBV-infected patients, suggesting that the preexisting low-abundant mutant clones might provide the opportunity to develop drug resistance against NA through the selection of dominant mutations. Further analyses utilizing both novel and conventional sequencing technologies are necessary to understand the significance and clinical relevance of the viral mutations in the pathophysiology of various clinical settings in association with HBV infection.

Supporting Information
Figure S1 Comparison of the viral complexity between the liver and serum of the same individual. Shannon entropy values throughout the whole viral genome of the liver and serum of the representative two cases are shown (upper two panels, case #11; lower two panels, case #14). preC-C: precore/core, preS: pre-surface, P: polymerase. (TIF)
Table S1 The oligonucleotide primers for amplifying HBV sequences in each clinical specimen. (DOCX)
Table S2 Error frequency of ultra-deep sequencing for the expression plasmid encoding wild-type genotype C HBV genome sequences by the three control experiments. (DOCX)