Sequence Variations of Full-Length Hepatitis B Virus Genomes in Chinese Patients with HBsAg-Negative Hepatitis B Infection

Background: The mechanism underlying HBsAg-negative hepatitis B virus (HBV) infection is notoriously difficult to elucidate because of the extremely low DNA levels that define the condition. We used a highly efficient amplification method to overcome this obstacle and to identify specific mutations or sequence variations associated with this entity.
Methods: A total of 185 sera and 60 liver biopsies from HBsAg-negative, HBV DNA-positive subjects or known chronic hepatitis B (CHB) subjects with HBsAg seroclearance were amplified by rolling circle amplification followed by full-length HBV genome sequencing. Eleven HBsAg-positive CHB subjects were included as controls. The effects of pivotal mutations identified in regulatory regions on promoter activities were analyzed.
Results: Full-length HBV genomes were amplified from 22 HBsAg-negative subjects and 11 control subjects. HBV genotype C was the dominant strain. A higher mutation frequency was observed in HBsAg-negative subjects than in controls, irrespective of genotype. The nucleotide diversity over the entire HBV genome was significantly higher in HBsAg-negative subjects than in controls (p = 0.008) and than in 49 reference sequences from CHB patients (p = 0.025). In addition, HBsAg-negative subjects had significantly more amino acid substitutions in the four viral genes than controls (all p<0.001). Many mutations were found uniquely in HBsAg-negative subjects, including deletions in promoter regions (13.6%), abolishment of the pre-S2/S start codon (18.2%), disruption of the pre-S2/S mRNA splicing site (4.5%), nucleotide duplications (9.1%), and missense mutations in the "a" determinant region, contributing to defects in HBsAg production.
Conclusions: These data suggest that an accumulation of multiple mutations constraining viral transcriptional activities contributes to HBsAg-negativity in HBV infection.


Introduction
Hepatitis B virus (HBV) infection leads to a wide spectrum of liver injuries, ranging from acute self-limiting infection and fulminant hepatitis to chronic hepatitis, liver cirrhosis and hepatocellular carcinoma (HCC) [1]. The viral genome contains four partially overlapping open reading frames (ORFs) (pre-S/S, pre-C/C, P and X) [2]. HBV infection is usually diagnosed when the circulating HBV surface antigen (HBsAg) is detected. Spontaneous loss of HBsAg is a rare event in chronic hepatitis B (CHB) infection [3][4][5][6]. Loss of serum HBsAg is also observed in patients receiving treatment for CHB with interferon and nucleoside/nucleotide analogues [7,8].
Although the continued presence of detectable HBsAg in serum is a prominent feature of HBsAg-positive CHB infection or overt CHB infection, HBsAg seroclearance may not signify eradication of HBV. The introduction of sensitive HBV DNA detection tests has revealed the existence of HBsAg-negative HBV infection. This new entity of HBV infection is defined as "the presence of viral DNA in the liver (with detectable or undetectable HBV DNA in the serum) of individuals testing negative for the HBsAg" [9,10]. People with or without a prior medical history of HBV infection can progress to the occult phase [11]. All these scenarios are collectively termed occult hepatitis B infection (OBI) [10].
Clinical evidence reveals that there is generally no reduction in the risk for HCC in CHB patients with HBsAg seroclearance compared with those who are persistently positive for HBsAg [12][13][14]. The virological features and the mechanisms leading to OBI are still unclear, although viral mutations may be one of the significant factors. Many attempts have been made to analyze HBV genome sequences amplified from OBI patients. Most studies have focused on searching for mutations in a fragment of the HBV genome (mainly the region coding for HBsAg), because the extremely low viral DNA levels in the blood or liver tissue of these patients make amplification difficult. Mutations that may be associated with OBI have been found in various regions of the HBV genome: (1) mutations in the surface protein affecting antigen detection [15][16][17]; (2) deletions in the pre-S1 region that impair HBV packaging [18,19]; (3) various mutations in viral regulatory regions that cause a decrease in HBsAg expression and/or viral replication [20,21]; and (4) mutations affecting posttranslational protein production [20,22]. In addition to these potential mechanisms, epigenetic modifications of the HBV genome are also suggested to play a significant role in OBI [23]. However, these studies mainly evaluated viral coding sequences, in particular those of the different surface proteins. Owing to the complex organization and efficient regulation of the viral genome, it is important to study mutations in both the gene coding regions and the regulatory regions. It remains unclear which mutations and variations can lead to OBI. Thus, comprehensive whole-genome analyses are still needed.
In the present study, we adopted a highly efficient amplification method, rolling circle amplification (RCA), to amplify full-length HBV genomes from sera and liver biopsies of OBI subjects. We primarily aimed to analyze sequence variations of the complete HBV genome in subjects with OBI. The secondary aim was to analyze the effect of pivotal mutations identified in regulatory region(s) on the promoter activities of the viral genes.

Patients and Samples
A total of 185 serum samples (156 from CHB patients with HBsAg seroclearance; 29 from HBsAg-negative patients with detectable HBV DNA at the first presentation) and 60 liver biopsies (21 from CHB patients with HBsAg seroclearance; 39 from HBsAg-negative patients with detectable HBV DNA at the first presentation) were collected. Another 11 serum samples were collected from HBsAg-positive treatment-naïve CHB patients as controls. These subjects were first identified by testing their blood samples for HBsAg (Abbott Prism, Abbott Laboratories, Abbott Park, IL) and by the nucleic acid amplification test (NAT) for HBV DNA (Procleix Tigris system, Novartis Diagnostics, Emeryville, CA; 95% detection limit, 12.2 IU/ml). Serum HBV DNA levels were measured using the COBAS TaqMan HBV Monitor Test (Roche Diagnostics, Branchburg, NJ), with a lower limit of detection of 20 IU/ml (100 copies/ml). HBV DNA levels in the liver biopsies were measured by real-time PCR using the Artus HBV RG assay (Qiagen), which has a linear range of detection of 1.1 IU/ml to >4 × 10^9 IU/ml. The present study was approved by the Institutional Review Board, the University of Hong Kong, Hong Kong. Written informed consent was obtained from all patients.

Rolling Circle Amplification and Sequencing of Full-length HBV Genome
RCA, a powerful PCR-based technique for the amplification of low viral load circular DNA, has been previously described [24]. Briefly, HBV plus-strand DNA inside the virions in the serum was completed using the endogenous HBV polymerase before extraction using the QIAamp MinElute Virus Vacuum Kit (Qiagen). Nicks in the HBV DNA strands were then sealed by ligase (Epicentre, Madison, WI). For the liver biopsy samples, total DNA was extracted using the QIAamp Allprep Kit (Qiagen). The HBV DNA samples were subjected to RCA using eight primers spanning the whole HBV genome (Table S1) [24]. Single genome-length HBV DNA was either retrieved directly by SpeI digestion (New England Biolabs) or further amplified using the Expand High Fidelity PCR system (Roche, Mannheim, Germany) [25]. The amplified HBV DNA was then purified and sequenced using sequencing primers that cover the whole viral genome (Table S1). The sensitivity and conditions of the RCA reaction were tested against samples with known copy numbers of viral DNA (quantified by the Artus real-time PCR kit; Qiagen).

Nucleotide Sequence and Amino Acid Analysis
Full-length HBV sequences were assembled using the CLC Main Workbench 6.6.2 (CLC Bio, Katrinebjerg, Denmark). HBV genotype was determined using the NCBI genotyping tool and phylogenetic analysis. HBV sequences from OBI subjects were compared with either sequences from the control subjects or genotype-matched HBV sequences of Chinese CHB patients retrieved from GenBank [26]. Analysis of nucleotide and amino acid diversity (d), as well as phylogenetic analysis, was performed using the MEGA version 5 software [27]. The program SimPlot [28] was used to search for evidence of recombination. Amino acid mutation frequency was defined as the number of amino acid variations per total number of amino acid residues within a particular genomic region. Prediction of RNA secondary structure was performed using the MFOLD software [29].

In-vitro Analysis of the Effects of Genetic Mutations in Specific Regulatory Regions on HBV Promoter Activities
HBV promoter regions from selected OBI cases and wild-type controls were amplified using primers containing KpnI and NheI restriction sites. The KpnI- and NheI-digested PCR products, containing different mutated HBV promoter regions, were cloned into the reporter vector pGL3-Basic. Positive constructs were confirmed and co-transfected with the pRL-TK Renilla reporter plasmid into Huh-7 cells using Lipofectamine 2000 (Invitrogen). After 24 hours of incubation, the cells were assayed for luciferase activity using the Dual-Luciferase Reporter Assay System (Promega, Madison, WI). Promoter activities were expressed as the ratio of firefly luciferase to Renilla luciferase luminescence. Results were expressed in arbitrary units (AU).
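The two summary measures defined above can be illustrated with a minimal Python sketch. It is not the MEGA implementation: as a simplifying assumption, diversity is computed here as the mean pairwise proportion of differing sites (p-distance) with no substitution model, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence are skipped.
    """
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0


def mean_pairwise_diversity(aligned_seqs):
    """Mean p-distance over all pairs of aligned sequences (simplified d)."""
    pairs = list(combinations(aligned_seqs, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def aa_mutation_frequency(reference, sample):
    """Amino acid variations per total residues in an aligned region."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variations = sum(1 for r, s in zip(reference, sample) if r != s)
    return variations / len(reference)


# Toy (hypothetical) alignment of three sequences
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
toy_diversity = mean_pairwise_diversity(toy_alignment)
toy_frequency = aa_mutation_frequency("MQWNST", "MQWKST")
```

With real data, the inputs would be the aligned full-length genomes or the translated ORFs of each group, and a model-based distance could replace the p-distance.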
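The normalization described above can be sketched in a few lines of Python. This is only an illustration: the well readings are hypothetical numbers, the function names are not from the assay software, and the final step (expressing a mutant promoter as a percentage of its wild-type control) follows the convention used when the results are reported.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Firefly signal divided by the co-transfected Renilla signal."""
    if renilla <= 0:
        raise ValueError("Renilla luminescence must be positive")
    return firefly / renilla


def percent_of_wild_type(mutant_wells, wild_type_wells):
    """Mean normalized mutant activity as a percentage of the wild-type mean.

    Each argument is a list of (firefly, renilla) readings, one per well.
    """
    mutant = mean(normalized_activity(f, r) for f, r in mutant_wells)
    wild_type = mean(normalized_activity(f, r) for f, r in wild_type_wells)
    return 100.0 * mutant / wild_type


# Hypothetical triplicate readings in arbitrary luminescence units
wild_type_wells = [(1000, 100), (1100, 100), (900, 100)]
mutant_wells = [(400, 100), (420, 105), (380, 95)]
relative_activity = percent_of_wild_type(mutant_wells, wild_type_wells)
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency before the mutant and wild-type promoters are compared.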

Statistical Analyses
Differences between categorical variables were analyzed using Fisher's exact test or the Chi-square test. For continuous variables, Student's t-test was used. All statistical analyses were performed using GraphPad Prism 5.0 (GraphPad Software, Inc., San Diego, CA). Data are expressed as percentage or mean ± SD. A p-value of < 0.05 was considered statistically significant.
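For readers who wish to reproduce a categorical comparison without GraphPad Prism, a two-sided Fisher's exact test can be sketched in pure Python as follows. The 2 × 2 layout is an assumption: it uses the pre-S/S amino acid variation counts reported in the Results (89 of 400 in OBI vs. 33 of 400 in controls) and treats each residue as an independent observation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value sums the hypergeometric probabilities of all tables that
    are no more likely than the observed one, with the margins held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def pmf(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = pmf(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    p_value = sum(
        p for p in (pmf(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, p_value)


# Assumed layout: rows are OBI and control, columns are variant and non-variant
p_pre_s = fisher_exact_two_sided(89, 400 - 89, 33, 400 - 33)
```

The resulting `p_pre_s` can be compared against the 0.05 significance threshold stated above.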

Rolling Circle Amplification for HBV Genomes from OBI
Using known copy numbers of HBV DNA isolated from CHB patients, we demonstrated that RCA could amplify as few as 15 copies/reaction (Figure 1). Full-length HBV genomes were successfully amplified from 18/60 (30%) liver biopsies and 4/185 (2.2%) serum samples from OBI subjects using this RCA method. Seventeen of these 18 OBI subjects had detectable intrahepatic HBV DNA levels (median: 4.07 copies/cell; range: 0.07-15.34 copies/cell). The HBV viral load in all 4 serum samples was below the detection limit of the COBAS assay (20 IU/ml). Full-length HBV genomes were also amplified from all 11 HBsAg-positive controls (median: 3.51 × 10^5 IU/ml; range: 100 to 7.34 × 10^7 IU/ml).

Demographic Data and Phylogenetic Analysis
Demographic data of the study subjects are summarized in Table 1. Genotype C was dominant in both the OBI group (17/22; 77.3%; 11 with subtype C1, 4 with subtype C2 and 2 outliers) and the controls (9/11; 81.8%; all subtype C1). Genotype B was detected in the remaining OBI subjects (5/22; 22.7%; all subtype B2) and controls (2/11; 18.2%; 1 with subtype B2 and 1 outlier) (Figure S1). Evidence of intergenotypic recombination between genotypes B and C was detected in these 3 outliers, but the recombined region accounted for less than 20% of the major parental genotype. Genotyping based on the full-length HBV genomes and on multiple sub-genomic regions gave the same results: the 3 outliers (2 occult samples and 1 control) clustered into genotypes C and B, respectively. Of note, phylogenetic analysis based on the full-length HBV sequences or the 4 individual ORFs could not distinctly separate OBI from CHB controls (data not shown).

Comparison of HBV Genomic Diversity between OBI and Control Subjects
The nucleotide and amino acid diversity were evaluated with respect to genotype. The nucleotide diversity over the entire HBV genome with genotype C was significantly higher in the 17 subjects with OBI than in the 9 control subjects (diversity, d = 0.040 ± 0.002 vs. 0.026 ± 0.002, p = 0.008). The nucleotide diversity in these 17 OBI subjects was also significantly higher than that in 49 genotype C reference sequences retrieved from Chinese CHB patients [26] (d = 0.04 ± 0.002 vs. d = 0.030 ± 0.001, p = 0.025). There was no significant difference in the nucleotide diversity between OBI and control subjects with genotype B (d = 0.021 ± 0.002 vs. d = 0.016 ± 0.002, p = 0.279), which may be related to the limited number of tested subjects (n = 5 and n = 2, respectively). Similarly, the nucleotide diversity between OBI subjects and 5 genotype B reference sequences retrieved from Chinese CHB patients showed no significant difference (d = 0.016 ± 0.002 vs. d = 0.015 ± 0.002, p = 0.792) [30] (Table 2). In terms of amino acid diversity, there was no significant difference between OBI and control sequences for either genotype (data not shown).
Further analysis of the nucleotide and amino acid diversity in the 4 individual HBV ORFs indicated that OBI cases presented slightly higher nucleotide and amino acid diversity than control cases (Table 2). For genotype C cases, the nucleotide diversity in the pre-S1, pre-C and P ORFs was significantly higher in OBI than in control cases (p = 0.048, p = 0.047 and p = 0.032, respectively). For genotype B cases, OBI subjects had a significantly higher nucleotide diversity than control subjects in the pre-C/C and C regions (p = 0.045 and p = 0.031, respectively) (Table 2), and in the RNase H region (p = 0.049) of the P ORF (data not shown). Furthermore, the difference in the nucleotide diversity between OBI and control groups with genotype C in both the pre-S/S and pre-C/C regions showed a similar trend (p = 0.069 and p = 0.059, respectively) (Table 2). There was no significant difference in the amino acid diversity between OBI and control sequences in the 4 ORFs in either genotype (data not shown). In summary, OBI cases had a higher nucleotide diversity than control cases, and these changes were scattered over the entire viral genome and may not lead to coding changes. This was evidenced by the comparable amino acid diversity between the two groups.

Mutational Analysis on the pre-S/S Coding Region
Details of the mutations observed in subjects with OBI and CHB infection are shown in Table S2. The total number of amino acid variations over the entire pre-S/S region was significantly higher in OBI than in control cases [89/400 (22.2%) vs. 33/400 (8.25%), p < 0.0001] (Figure 2). The number of amino acid substitutions in the individual pre-S1, pre-S2 and S regions was also significantly higher in OBI than in control cases [pre-S1: 21 Figure 2B). Some mutation patterns were uniquely observed in OBI cases in the pre-S/S ORF, including sequence deletions (3/22, 13.6%) and abolishment of the pre-S2 start codon (ATG) by a point mutation (3/22, 13.6%). The analysis of the S ORF showed that the clinically important amino acid substitutions were mainly located in the major hydrophilic region (MHR) (residues 103-173). These included I126S, I126N, Q129N, T131N, M133T, G145A and A159V. Except for A159V, the other 6 amino acid substitutions in the MHR reside in the "a" determinant region (residues 124-147) of HBsAg, and were found more frequently in OBI than control cases [ and M133T are associated with diagnostic problems [15,31,32], whereas G145A is known as a vaccine escape mutant and is associated with OBI [33].

Mutational Analysis on the P Coding Region
The total number of amino acid substitutions over the P ORF was significantly higher in OBI than in control cases [197/843 (30.1%) vs. 63/843 (7.6%), p < 0.0001] (Table S2). Deletions in the pre-S/S region that lead to deletions and truncated proteins in the overlapping P coding region were found in 3/22 (13.6%) OBI cases. Three of the 6 (50%) mutations detected in the "a" determinant region of HBsAg also caused mutations in the overlapping rt region: sI126S to rtD134E, sQ129N to rtS137Q and sT131N to rtN139K (Figure 2B). The rtT128A mutation in the finger domain of the rt was uniquely detected in 3/22 (13.6%) OBI cases. It is speculated that this mutation might result in defective replication activity [35].
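How a single nucleotide change can alter two proteins at once is easiest to see with a toy example. The Python sketch below translates a short, hypothetical 12-nt sequence (not an actual HBV sequence) in two overlapping frames; the offset of 8 residues between S and rt numbering is inferred only from the three mutation pairs listed above, and the function names are illustrative.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built from the conventional TCAG ordering
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

S_TO_RT_OFFSET = 8  # inferred from s126/rt134, s129/rt137 and s131/rt139


def translate(seq, frame):
    """Translate a DNA string starting at the given 0-based frame offset."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3)
    )


def s_to_rt_position(s_position):
    """Map an S residue number to the overlapping rt residue number."""
    return s_position + S_TO_RT_OFFSET


# Toy sequences differing by one T-to-G change at 0-based index 5
WILD_TYPE = "ACTGATTCCGGA"
MUTANT = "ACTGAGTCCGGA"

polymerase_like = (translate(WILD_TYPE, 0), translate(MUTANT, 0))  # D to E
surface_like = (translate(WILD_TYPE, 1), translate(MUTANT, 1))     # I to S
```

In this toy case the same substitution produces an aspartate-to-glutamate change in one frame and an isoleucine-to-serine change in the other, which mirrors the kind of paired change reported for sI126S and rtD134E.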

Distinct Mutations in the Regulatory Regions
The mutation frequency within the key HBV regulatory regions was comparable between the OBI group and the control group (Table 1). However, a significant difference in nucleotide diversity over the pre-S2/S promoter was noted between the OBI and control groups with genotype C (p = 0.037; Table 2). Point mutations were the most noticeable changes and were scattered over these key regulatory elements. Most of them interfered with liver-specific transcription factor (TF) binding sites and were more frequently detected in OBI cases than in controls (Figure 2A, Figure 3A and Table S3). Large deletions at nt positions 3127-55, 728-1255 and 1754-1771 were found in 3/22 (13.6%) OBI cases. These deletions led to loss of the pre-S2/S and X promoters and disruption of the BCP promoter, respectively. It is noteworthy that unique nucleotide duplications in the promoter regions were observed in 2/22 (9.1%) OBI cases. These included a 21 nt duplication (nt 3107-3127) in the Sp1 binding sites within the pre-S2/S promoter and a 31 nt duplication (nt 1644-1674) that interrupted the essential element box b in enhancer II (Table S3) [36].
Recent studies suggested that pre-S2/S mRNA splicing is essential for HBsAg expression [20,22]. This RNA splicing event is controlled by a 5′ splice donor site (nt 426-464) and the posttranscriptional element (PRE) that contains the 3′ splice acceptor site. The nucleotide G458A mutation was found to influence HBV RNA splicing and HBsAg production [20]. No G458A mutation was detected in this study. One OBI case had both T429C and T441G mutations in the 5′ splice donor site, which were predicted by MFOLD to affect mRNA secondary structure (Figure 4) [20]. The influence of RNA secondary structure on the activity of the 5′ splice site of the S mRNA and on optimal production of HBsAg needs further functional analysis.
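Because the HBV genome is circular, a coordinate range such as nt 3127-55 wraps around the numbering origin. The short Python sketch below shows how the length of such a span can be computed; the genome length of 3215 nt is an assumption (the usual length for genotypes B and C), and the function name is hypothetical.

```python
GENOME_LENGTH = 3215  # assumption: typical length of genotype B and C genomes


def circular_span_length(start, end, genome_length=GENOME_LENGTH):
    """Inclusive length of a span on a circular genome (1-based coordinates).

    When end is smaller than start, the span is taken to wrap past the origin.
    """
    return (end - start) % genome_length + 1


# Coordinates quoted in the text above
deletion_lengths = {
    "3127-55": circular_span_length(3127, 55),
    "728-1255": circular_span_length(728, 1255),
    "1754-1771": circular_span_length(1754, 1771),
}
duplication_lengths = {
    "3107-3127": circular_span_length(3107, 3127),
    "1644-1674": circular_span_length(1644, 1674),
}
```

Under this assumption the two duplications come out at 21 nt and 31 nt, in agreement with the sizes stated in the text, and the wrapped deletion spans 144 nt.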

Functional Analysis of the Distinct Nucleotide Alterations in Regulatory Elements
We next explored whether the distinct surface promoter and core promoter (CP) mutations identified in OBI cases changed the level of transcriptional activity. Based on the mutations found in the regulatory regions in OBI cases, five mutants and two wild-type controls were constructed. These included the T2768A mutation located in the Sp1 binding sites of pre-S1 (MUT1), the 21 nt duplication between 3128-3148 on pre-S2/S (MUT2), the C3015T mutation on pre-S2/S (MUT3) and two mutations, G1677A (MUT4) and C1706T (MUT5), in CP. Luciferase activities of the plasmids containing these mutant promoters (MUT1-5) were compared with that of their corresponding wild-type promoter. The absolute luciferase activity of the control wild-type promoter was set to 100% and all the mutant constructs were compared accordingly. As shown in Figure 5, mutation significantly affected the promoter activity in MUT2 (85%, p < 0.05), but not in MUT1 (92%, p = 0.226) or MUT3 (87%, p = 0.156). The promoter activity was significantly decreased in MUT4 (72%, p < 0.05) and MUT5 (40%, p < 0.005) with respect to the control CP.

Discussion
We performed mutational analysis of the RCA-amplified full-length HBV genomes from OBI and control cases. In order to exclude natural polymorphism and/or differences related to the geographical origin of the patients, it is imperative to have a robust comparison of sequences obtained from cases with the same genotype and origin. To this end, we compared OBI sequences with both genotype-matched controls and reference sequences obtained from 49 genotype C (39 with sub-genotype C1 and 10 with sub-genotype C2) [26] and 5 genotype B2 CHB patients from Hong Kong [30].
Our full-length HBV sequence analysis revealed that HBV nucleotide diversity in OBI cases was significantly higher than control cases, indicating that more sequence variations were  (Figure 2), which affect HBsAg detection assays, immune response recognition, HBV infectivity and virion morphogenesis [15,20,23]. However, we did not identify any single prevailing mutation or genetic signature which was associated with OBI. In this study, a mixture of mutation patterns like point mutations, deletions and nucleotide duplications were identified in OBI cases. This finding of multiple mutations, rather than prevailing mutations or genetic signatures, in OBI cases, is consistent with other reports [15,16,20,21,23,37]. This is further confirmed by our mutation frequency analysis that the amino acid mutation frequency within the individual genes and regulatory regions in the OBI cases was higher than that in the control cases. Taken together, it is likely that OBI is attributed to an increased accumulation of mutations in multiple regions, including key regulatory and coding regions, which in turn disrupts HBV replication and gene expression.
Despite the lack of prevailing OBI-associated mutations, this study identified several unique mutations within the HBV regulatory regions that may affect HBV replication. Many of these mutations identified in the key regulatory regions reside in the TF binding sites, which may disrupt TF bindings and hence affect promoter functions (Figure 2A and 3A). The reduced promoter activity was demonstrated in vitro by our luciferase assay, which showed that core promoter activity was reduced with G1677A mutation in TF HNF3 binding sites and with C1706T mutation in the enhancer II region. We have also identified 2 OBI cases with duplications of HBV regulatory sequence. To our knowledge, this is the first description of nucleotide duplication in regulatory elements in OBI. This may explain the significant decrease in the pre-S2/S promoter activity caused by a 21 nt duplication between 3128-3148 by the luciferase assay.
This study also identified mutations in the "a" determinant region of the pre-S/S ORF in OBI subjects. Many mutations identified in the pre-S/S ORF, especially within the "a" determinant, have been demonstrated to affect the antigenicity and production of HBsAg, and thus possibly contribute to OBI [20,23,38,39]. This study identified an OBI case with the sG145A mutation, a vaccine escape variant which has been associated with OBI [33]. In addition, point mutations leading to the abolishment of the pre-S2/S start codon and to early stop signals, and sequence deletions resulting in truncated surface protein synthesis, were also found in OBI cases. Xu et al. first reported that expression of a pre-S1 mutant would result in intracellular retention of HBsAg and that an appropriate balance of the various HBV surface proteins was important for functional HBsAg production and secretion [40,41]. Therefore, it is possible that the novel mutation patterns identified in the present study may disrupt surface protein synthesis and secretion. Of note, the change from sW182 to a stop codon (W182*), which was recently suggested to be associated with OBI [42], was identified in an occult case.
Due to the overlapping arrangement of HBV ORFs, mutations in the pre-S/S region could lead to amino acid changes in the overlapping P gene, especially in the reverse transcriptase (rt). In particular, three immune escape mutations identified in the "a" determinant region in OBI cases also caused mutations in the overlapping rt functional domains, including sI126S → rtD134E, sQ129N → rtS137Q and sT131N → rtN139K. The effect of these rt mutations on the level of HBV replication remains to be proven by further in vitro analysis.
Mutations associated with post-transcriptional regulation of HBsAg production were detected in OBI cases. The mechanism of this newly identified post-transcriptional regulation of HBsAg expression has not yet been fully elucidated. We identified one occult case with T429C and T441G mutations at the 5′ splice donor site, which may affect the pre-S/S RNA secondary structure. As inhibition of HBsAg production could result from mutations occurring in the pre-S2/S splicing sites [20,22,43], further investigation in larger numbers of OBI samples is needed.
In conclusion, our results indicated that subjects with OBI had a higher genetic diversity and higher mutation frequency than control subjects. The mutations identified in OBI cases included point mutations and sequence deletions, which might cause premature gene termination, defective HBsAg synthesis, decreased HBV promoter activity, or reduced HBV replication. All these mutations constrain viral replication capacity, and may collectively contribute to the OBI entity among the HBsAg-negative subjects. It is likely that multiple mechanisms are involved, and a combination of mutations, rather than any prevailing genetic signatures, is responsible for the OBI status.

Figure S1 Phylogenetic analysis using the entire HBV genomes amplified from the 22 occult and 11 control HBV infected subjects. The filled circles represent the occult subjects, and the empty circles represent control subjects. Phylogenetic comparison was done by the neighbor-joining algorithm based on Kimura two-parameter distance estimation. Bootstrap values of more than 75% are indicated on the major nodes. The scale of the evolutionary distances is shown at the bottom (scale bar). Reference sequences retrieved from GenBank are indicated by their accession numbers. (DOCX)
Table S1 HBV-specific primers for rolling circle amplification (RCA), PCR amplification of full-length HBV genomes, sequencing and regulatory elements analysis. (DOCX)
Table S2 Amino acid changes in HBV coding genes from subjects with occult and overt HBV infections. (DOCX)