Hepatitis B Virus Genotype Distribution and Genotype-Specific BCP/preCore Substitutions in Acute and Chronic Infections in Argentina

Aim In order to assess the implications of Hepatitis B Virus genotype (g) and subgenotype (sg) in the course of infection, 234 HBsAg positive patients in different infection stages were characterized (66 acute infections, 63 HBeAg positive chronic infections and 105 anti-HBe positive chronic infections).
Results Overall, sgA2 (17.9%), gD (20.9%), sgF1b (34.2%) and sgF4 (19.7%) were the most prevalent. Subgenotype F1b was overrepresented in acute and HBeAg positive chronic infections (56.1%), whereas gD was the most frequent (40.0%) in anti-HBe positive chronic infections. Among chronic infections, HBeAg positivity rates were 50.0, 12.5, 62.8 and 35.3% for sgA2, gD, sgF1b and sgF4, respectively (p<0.05). A genotype-dependent bias in BCP/preCore mutations was observed. In anti-HBe positive chronic infections, sgF1b was more prone to carry the A1762T/G1764A mutation than sgA2, sgF4 and gD (75.0, 40.0, 33.3 and 31.8%, p<0.005), whereas in the pC region, gD and sgF4 were more likely to carry G1896A than sgA2 and sgF1b (81.0, 72.7, 0.0 and 31.3%, p<0.001). The unexpectedly low frequency of the G1896A mutation in sgF1b (despite carrying 1858T) prompted a further analysis to identify genotype-specific features that could explain the mutation patterns observed. A region encompassing nucleotides 1720 to 1920 showed the highest dissimilarity between sgF1b and sgF4. Genotypes and subgenotypes carrying the 1727G, 1740C and 1773T polymorphisms were less prone to mutate position 1896.
Discussion HBeAg seroconversion is a critical event in the natural history of HBV infection. Differences in the HBeAg positivity rate might be relevant since different studies have observed that delayed HBeAg seroconversion is associated with a more severe clinical course of infection, highlighting the critical role that genotypes/subgenotypes might play in the progression of HBV infection. Polymorphisms in the region spanning nucleotides 1720 to 1920 could be involved in the molecular mechanisms underlying seroconversion of each genotype/subgenotype.

Notwithstanding this, the role of mutations in the BCP and pC regions in the evolution of acute and chronic infections is still controversial.
Therefore, the aim of this study was to assess the prevalence of mutations and their relationship with the viral genotype in patients with acute and chronic HBV infections.

Patients
This cross-sectional study included 234 untreated HBsAg positive patients, admitted to the Hepatology Unit of the Hospital Italiano de Buenos Aires and to the Hospital de Infecciosas "F. Muñiz" de Buenos Aires, and recruited during 2004-2013.
The diagnostic criteria for acute infection (AHB) were as follows: acute onset of symptoms without history of chronic HBV infection, levels of serum alanine aminotransferase (ALT) >10-fold the upper reference limit, positivity for IgM antibody to the hepatitis B core antigen (anti-HBc), rapid drop of HBsAg titer, serum HBV-DNA elimination and HBeAg seroconversion at the convalescent phase. The diagnosis was confirmed by HBsAg clearance within 6 months after the initial onset; alternatively, when serum HBsAg had persisted for at least 6 months after the onset of clinically acute hepatitis, the diagnosis of acute infection was assessed by liver biopsy.
Chronic infections (CHB) met the following criteria: HBsAg positivity for more than 6 months, a history of chronic hepatitis based on a histo-pathological diagnosis and/or compatible laboratory data and ultrasonographic findings.
Patients were excluded if they had any evidence of autoimmune hepatitis or markers of hepatitis C virus, hepatitis D virus or human immunodeficiency virus.
Patients were divided into three groups: AHB, 66 patients with acute HBV infection; CHB HBeAg positive, 63 chronic patients who were HBeAg positive at baseline; CHB anti-HBe positive, 105 chronic patients who were persistently HBeAg negative.

HBV-DNA amplification (S and BCP/preCore regions)
DNA was extracted from serum samples according to the proteinase K protocol. Briefly, 200 μl of serum was added to 450 μl of a mix containing 1 mg/ml proteinase K, 5 mM Tris-HCl (pH 8.5), 2.0% sodium dodecyl sulfate (SDS) and 25 mM ethylenediaminetetraacetic acid (EDTA) and incubated at 37°C for 4 h. DNA was precipitated with 1 volume of absolute isopropanol in the presence of 20 μl of Dextran T500 and 1/10 volume of 3 M NaAc (pH 4.7). DNA was recovered by centrifugation at 20,000 g for 15 min; pellets were washed with 70% ethanol, dried, and dissolved in 20 μl of water.

HBV-DNA sequencing
PCR products covering the BCP/pC and S regions were purified with Qiagen columns (Qiagen, Germany), and direct sequencing was carried out in both directions on a 3730xl DNA Analyzer (Applied Biosystems, USA).

Amplification and sequencing of Basal Core Promoter and preCore gene
BCP and preCore regions were amplified by nested PCR using primers synthesized according to the consensus sequence of the pre-C region [32]. PCR products were purified and sequenced in both directions as described above (GenBank accession numbers: HM214716 to HM214756; HM216287 to HM216329; HM216331 to HM216348; HM216350 to HM216358; KJ810838 to KJ810908; KJ843154 to KJ843218).

HBV Typing
Genotyping was assessed by phylogenetic analysis. Seventy-one nucleotide sequences of the S and BCP/preCore regions representing the different HBV genotypes were retrieved from GenBank and used as references. The S and BCP/preCore sequences obtained in this study and the HBV sequences from the GenBank database were aligned with the ClustalX (v2.1) software [33] and edited with the BioEdit (v7.1.3.0) software [34].
Phylogenetic trees were constructed using the Maximum Likelihood method performed with the RAxML (v 8.0.24) program [35]. Evolutionary models were inferred according to the Akaike Information Criterion (AIC) statistics [36] obtained with the jModeltest (v2.1) software [37]. The robustness of the reconstructed phylogenies was evaluated by bootstrap analysis (1000 replicates).
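As a rough illustration of AIC-based model selection, the sketch below computes AIC = 2k - 2 ln L for a few candidate substitution models and picks the one with the lowest value. The log-likelihoods and parameter counts are invented placeholders, not values from this study, and the function names are hypothetical; the actual likelihood optimization is performed by the model-selection software.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def select_model(candidates):
    """Return the name of the candidate model with the lowest AIC.

    `candidates` maps a model name to a (log_likelihood, n_params) tuple.
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Placeholder values for illustration only (not results from this study).
candidates = {
    "JC": (-5230.0, 0),
    "HKY": (-5105.5, 4),
    "GTR+G": (-5080.2, 9),
}
best = select_model(candidates)
```

Because the penalty term grows with the number of parameters, a richer model is preferred only when its gain in log-likelihood outweighs the added complexity.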
In order to differentiate among subgenotypes, phylogenetic analyses were combined with the amino acid and nucleotide patterns characteristic of each subgenotype within the S, P and C open reading frames [38]; this was assessed with the VisSPA v1.6.2 program [39]. A residue was considered characteristic of a subgenotype when it was present in at least 90% of the sequences from the group analyzed and in less than 10% of the sequences in the reference group.
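The 90%/10% criterion can be sketched as follows. This is a minimal illustration assuming pre-aligned, equal-length sequences; it is not the VisSPA implementation, and the function names and toy sequences are invented for the example.

```python
def residue_frequency(sequences, position, residue):
    """Fraction of aligned sequences carrying `residue` at 0-based `position`."""
    return sum(seq[position] == residue for seq in sequences) / len(sequences)


def signature_positions(group, reference, min_in_group=0.90, max_in_ref=0.10):
    """Return (position, residue) pairs characteristic of `group`.

    A residue qualifies when it occurs in at least `min_in_group` of the group
    sequences and in less than `max_in_ref` of the reference sequences.
    """
    signatures = []
    for pos in range(len(group[0])):
        for residue in sorted(set(seq[pos] for seq in group)):
            if (residue_frequency(group, pos, residue) >= min_in_group
                    and residue_frequency(reference, pos, residue) < max_in_ref):
                signatures.append((pos, residue))
    return signatures


# Toy aligned sequences (invented for illustration).
group = ["ACLT", "ACLT", "ACLT"]
reference = ["ACMT", "ACMT", "GCMT"]
result = signature_positions(group, reference)
```

In this toy alignment only the third column separates the two groups, so it is the only position reported.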

Genetic Similarity
To determine the genetic similarity among HBV genotypes and subgenotypes, pairwise comparisons of 251 complete HBV genomes (53 genotype A, 115 genotype D, 55 subgenotype F1b and 28 subgenotype F4) retrieved from GenBank were analyzed with SimPlot software [40]. Distance plot and bootscanning analysis were performed using 200 nucleotide window size and 20 nucleotide increment steps.
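The sliding-window idea behind this kind of similarity plot can be sketched as below. This is a simplified illustration that uses plain percent identity between two aligned sequences; SimPlot itself works on groups of sequences and supports distance models, so the function here is a hypothetical stand-in rather than the program's algorithm. The defaults mirror the 200-nucleotide window and 20-nucleotide step used in this study.

```python
def window_similarity(seq_a, seq_b, window=200, step=20):
    """Percent identity between two aligned sequences in sliding windows.

    Returns a list of (window_midpoint, percent_identity) tuples, where the
    midpoint is a 0-based index into the alignment.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(x == y for x, y in zip(a, b))
        points.append((start + window // 2, 100.0 * matches / window))
    return points


# Toy example with a short window so the result is easy to check by hand.
seq_a = "ACGTACGTACGT"
seq_b = "ACGTACGAACGT"
profile = window_similarity(seq_a, seq_b, window=4, step=4)
```

A local dip in the profile marks a region where two otherwise similar sequences diverge, which is the kind of signal sought when comparing subgenotypes along the genome.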

Statistical analysis
Fisher's two-tailed exact test and the corrected χ² test were used to compare qualitative data. ANOVA and non-parametric tests (Mann-Whitney U and Kruskal-Wallis H) were used to compare quantitative variables. Results were expressed as mean ± SEM. Data analysis was performed with the statistical software package SPSS (version 10.0, SPSS, Inc., Chicago, USA). Significance was set at a p-value of less than 0.05.
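For readers unfamiliar with the test, a two-tailed Fisher's exact test on a 2x2 table can be sketched with the standard library alone. This is an illustrative implementation, not the SPSS routine; the function name is hypothetical and the counts are invented, not taken from this study.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))


# Invented counts: mutation present/absent in two hypothetical groups.
p_value = fisher_exact_two_tailed(3, 1, 1, 3)
```

With such small invented counts the p-value is far above 0.05, so the two hypothetical groups would not be considered different.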
Multivariate logistic regression analyses were used to determine the independent factors associated with the clinical course (acute/chronic) and the HBeAg status of the chronic infections (HBeAg/anti-HBe). Gender, age and genotype (A2, D, F1b, F4) were considered as variables. Because the variable 'genotype' has more than two classes, it was coded with dummy variables. Analyses were performed with the InfoStat vL software.
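Dummy coding of a four-level variable can be sketched as follows. The choice of sgF1b as the reference category is an assumption made for this example (the odds ratios in the Results are reported relative to sgF1b), and the function and variable names are hypothetical.

```python
from math import exp

GENOTYPES = ["F1b", "A2", "D", "F4"]  # first entry is the assumed reference level


def dummy_encode(genotype, levels=GENOTYPES):
    """Encode a genotype as k-1 indicator variables against the reference level."""
    if genotype not in levels:
        raise ValueError(f"unknown genotype: {genotype}")
    return {f"is_{level}": int(genotype == level) for level in levels[1:]}


def odds_ratio(coefficient):
    """Convert a logistic regression coefficient into an odds ratio."""
    return exp(coefficient)


encoded_reference = dummy_encode("F1b")
encoded_d = dummy_encode("D")
```

The reference level is encoded as all zeros, so each fitted coefficient, once exponentiated, is read as the odds ratio of that genotype relative to the reference.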

Ethics Statement
This study was carried out according to the World Medical Association Declaration of Helsinki; it was approved by the Ethics Committee of the School of Pharmacy and Biochemistry, Buenos Aires University (Permit Number: 732575/2010) and written informed consent statements were signed by all patients.

Results
In this cross-sectional study, 234 HBsAg carriers were included: 66 had acute hepatitis and 168 were chronically infected, of whom 63 were HBeAg positive and 105 anti-HBe positive (Table 1).
The mean age of this cohort was 44.0 ± 14.0 years; patients with acute infection were significantly younger than those with CHB anti-HBe positive infections (p<0.001). Regarding gender, 73 patients (31.2%) were women and 161 (68.8%) were men (Table 1). The male to female ratio differed significantly between the AHB (4.07) and CHB anti-HBe positive (1.50) stages (p<0.05).
Moreover, subgenotypes (sg) were identified within genotypes A (Fig 1A) and F (Fig 1C), whereas the phylogenetic signal of the BCP/pC and S regions was insufficient to subgenotype genotype D samples (Fig 1B); consequently, these samples were analyzed as a whole.
Subgenotyping of gF samples was based on both phylogenetic analysis and nucleotide and amino acid comparisons along both the S and P genes. The following nucleotide (nt) and amino acid (aa) patterns were characteristic of subgenotype F1b: nt T562, nt C1026, nt T1032 and aa rtL151 (P ORF); whereas those of subgenotype F4 were: nt A482, nt T493, aa L110 (S ORF), aa rtT118, aa rtH122 and aa rtN123 (P ORF). These analyses showed that genotype F isolates could be subdivided into different subgenotypes: 80 belonged to sgF1b, 46 to sgF4 and 1 sample to sgF2a. An uneven genotype distribution among AHB, CHB HBeAg positive and CHB anti-HBe positive infections was observed (Table 2).
No significant differences were observed among the ages of patients infected with different HBV genotypes (Table 2).

Multivariate analysis
In a multivariate analysis with age, gender and genotype as variables, only age and genotype were independently associated with the acute/chronic course of infection. As in the univariate analysis, older age was associated with chronic infection (Odds Ratio = 1.03, p = 0.018). Specifically, sgA2, sgF4 and gD were more strongly associated with a chronic course of infection than sgF1b (sgA2: OR = 2.17, p = 0.049; sgF4: OR = 2.52, p = 0.027; gD: OR = 35.13, p<0.001).

Basal Core Promoter and preCore mutations distribution
Mutations modulating HBeAg expression were observed in all HBV infection stages, being more prevalent in CHB anti-HBe positive patients (92.4%) than in AHB (24.2%) and CHB HBeAg positive patients (20.6%) (Table 3).
Among AHB and CHB HBeAg positive infections, mutations were more frequently found in the BCP region (21.2 and 17.5%) than in the pC region (4.5 and 3.2%). In anti-HBe positive chronic infections, mutations affecting HBeAg expression were observed in 97 out of 105 (92.4%) samples, and more than one mutation was found in 30% of them. In the preCore region, G1896A was the most common mutation (55.2%), whereas other mutations that prevent HBeAg synthesis, such as those affecting the preCore initiation codon (nt 1814-1816) and point mutations (C1817T, G1897A), insertions or deletions that create a premature stop codon, were observed at a lower frequency (33.3%).

BCP and pC mutations by genotype and infection stage
In spite of the low prevalence of mutations in AHB infections (25.7%), those subgenotypes more frequently observed in this stage, sgA2 and sgF1b, had the double mutation A1762T/G1764A, while gD and sgF4 did not mutate these positions (Table 3). In the CHB HBeAg positive stage, there was no significant difference in the frequency of BCP or pC mutations among different genotypes.
In brief, those patients infected with sgF4 and gD mutated G1896A more frequently than A1762T/G1764A (p = 0.007 and p<0.001 respectively), whereas those patients carrying sgF1b and sgA2 had the opposite mutation pattern, showing higher rates of mutations in positions 1762 and 1764 than in 1896 (p = 0.013 and p = 0.010 respectively).

Nucleotide similarity among HBV genotypes
It is widely accepted that the G1896A mutation rate is closely related to the viral genotype. This mutation is rarely selected in genotypes carrying 1858C (gA, F2, F3 and H), while it has been frequently observed in those genotypes carrying 1858T (B, D, E and G). This paradigm is based on structural principles. The HBV encapsidation signal, essential for pregenomic RNA encapsidation and viral replication, overlaps almost the entire preCore region. In the RNA, the signal forms a double stem-loop structure in which nucleotide 1896 is base-paired with nucleotide 1858. The G1896A mutation rate observed in sgF1b (carrying 1858T) was unexpectedly low, displaying a mutation pattern more similar to that of genotype A (1858C) than to those of genotype D and subgenotype F4 (1858T).
This result prompted us to perform further analysis in order to identify viral polymorphisms, other than position 1858, that may be involved in the molecular mechanisms of HBeAg seroconversion.
In order to map nucleotide similarities among the different genotypes along the whole viral genome, a SimPlot analysis was performed using a data set of 251 full-length genome sequences retrieved from GenBank, representing strains from genotype A (n = 53), D (n = 115), F1b (n = 55) and F4 (n = 28).
Along the whole genome, subgenotype F1b showed, as expected, the highest degree of similarity when compared with subgenotype F4. Nevertheless, in the region encompassing nucleotides 1820 ± 100, a higher degree of similarity and an increase in phylogenetic association between F4 and D (Fig 2), as well as between F1b and A (data not shown), were observed.
The alignment of the consensus sequences spanning the 1820 ± 100 nucleotide region showed a high degree of conservation in this region among genotypes. Nonetheless, in the BCP region there are three nucleotide positions that are identical in genotypes A and F1b (1727G, 1740C and 1773T) and differ from those present in genotypes D and F4 (1727A, 1740T, 1773C). These polymorphisms are located in the reading frame of the X protein (Fig 3).
In order to assess the role of the 1727, 1740 and 1773 polymorphisms in the mutation pattern of position 1896, the frequency of the G1896A mutation was determined in those samples carrying 1858T. Samples with 1727A, 1740T and 1773C (D and F4) were more prone to select the G1896A mutation than those with 1727G, 1740C and 1773T (F1b) (p<0.05) (Table 4).
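The stratified comparison described above can be sketched as follows. This is only an illustration of the grouping logic: each sample is represented as a hypothetical dictionary mapping genome positions to nucleotides, the pattern labels are invented for the example, and the four toy samples are not data from this study.

```python
PATTERN_A_F1B = {1727: "G", 1740: "C", 1773: "T"}  # reported for genotypes A and F1b
PATTERN_D_F4 = {1727: "A", 1740: "T", 1773: "C"}   # reported for genotypes D and F4


def classify_bcp_pattern(sample):
    """Label a sample by its nucleotides at positions 1727, 1740 and 1773."""
    if all(sample.get(pos) == base for pos, base in PATTERN_A_F1B.items()):
        return "A/F1b-like"
    if all(sample.get(pos) == base for pos, base in PATTERN_D_F4.items()):
        return "D/F4-like"
    return "other"


def g1896a_frequency(samples, pattern):
    """Fraction of 1858T samples with the given pattern that carry 1896A."""
    eligible = [s for s in samples
                if s.get(1858) == "T" and classify_bcp_pattern(s) == pattern]
    if not eligible:
        return None
    return sum(s.get(1896) == "A" for s in eligible) / len(eligible)


# Toy samples (invented for illustration).
samples = [
    {1727: "A", 1740: "T", 1773: "C", 1858: "T", 1896: "A"},
    {1727: "A", 1740: "T", 1773: "C", 1858: "T", 1896: "G"},
    {1727: "G", 1740: "C", 1773: "T", 1858: "T", 1896: "G"},
    {1727: "G", 1740: "C", 1773: "T", 1858: "C", 1896: "G"},  # excluded: 1858C
]
freq_d_f4 = g1896a_frequency(samples, "D/F4-like")
freq_a_f1b = g1896a_frequency(samples, "A/F1b-like")
```

Restricting the comparison to 1858T samples removes the known effect of position 1858, so any remaining difference can be attributed to the other three positions.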

Discussion
There is growing evidence that HBV genotypes may play a role in causing different disease profiles in chronic hepatitis B infection [41]. However, most of the information on the clinical significance of HBV genotypes has been based on studies performed in Asia or in Europe, with patients infected with genotypes B and C or A and D, respectively. Moreover, since comparisons have generally been made between only two genotypes, data on the clinical course of patients infected with genotypes other than A-D are scarce [38,42,43].
The present study highlights the differences in genotype and subgenotype distribution among subjects with acute, chronic HBeAg positive and chronic anti-HBe positive infections.
The difference in genotype distribution between HBeAg positive and anti-HBe positive chronic infections suggests a different seroconversion rate among genotypes and subgenotypes (D>>F4>A2>F1b). This might be relevant since different studies have observed that delayed HBeAg seroconversion is associated with a more severe clinical course of infection [41,44-46]. This highlights the implication of the viral genotype in the progression of the infection.
In line with our findings, the few studies that have assessed subgenotype F1b suggest that it has a worse clinical outcome than other genotypes [42,47]. The behavior of this subgenotype could be compared to that of genotype C, which, in different studies performed in Asia, showed a higher rate of HBeAg positivity and a worse clinical outcome of chronic infection than genotype B [31,48].
On the other hand, the similar genotype distribution in acute and HBeAg positive chronic infections could be explained by the fact that transmission of HBV most probably occurs during the HBeAg positive stage [49,50]. It has been previously reported that in the latter stage, viral loads are usually higher than in the anti-HBe positive stage of infection [51-54].
Regarding mutations in the BCP and pC regions, these were more prevalent in the anti-HBe positive stage than in the other two stages of infection. However, in those AHB and CHB HBeAg positive patients infected with viral variants carrying mutations in these regions, BCP mutations were more frequently found (19.4%) than pC mutations (3.9%). This is consistent with the fact that BCP mutations down-regulate HBeAg expression, while pC mutations abolish HBeAg synthesis [55].
Furthermore, in the anti-HBe positive stage, a bias of mutations among genotypes was observed. Mutations in the BCP were more frequently found in subgenotype F1b (75.0%) than in A2, D and F4 (40.0, 33.3 and 31.8%, respectively). In vitro studies have shown that the A1762T/G1764A mutation does not abrogate HBeAg expression but decreases its levels, while concomitantly increasing viral replication [15,56]. Thus, the lower seroconversion rate observed in subgenotype F1b might be explained by the higher frequency of BCP mutations in this subgenotype.
Overall, these findings suggest that intrinsic biological features of each genotype may lead to a longer HBeAg positive stage and therefore to different implications in the progression of the infection.
The difference in the mutation pattern among genotypes was initially described in the 1990s after the identification of genotypes A to D, when it was observed that the occurrence of G1896A was restricted to HBV genotypes with T at nucleotide 1858 [57-59]. Given that the pC region overlaps the encapsidation signal, which is essential for efficient replication, those genotypes with T1858 would tend to mutate G1896A in order to increase the stability of the stem loop in the encapsidation signal structure.
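The stability argument can be made concrete with a small sketch that classifies the 1858:1896 pair in the pregenomic RNA. It relies only on standard base-pairing rules (T in the DNA sequence is read as U in the RNA; G-U is a wobble pair) and ignores thermodynamic detail and neighboring bases; the function names are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def to_rna(base):
    """Convert a DNA base to its RNA equivalent (T becomes U)."""
    return "U" if base.upper() == "T" else base.upper()


def pair_type(nt1858, nt1896):
    """Classify the base pair formed between positions 1858 and 1896."""
    pair = (to_rna(nt1858), to_rna(nt1896))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


# The four combinations of 1858T/C with wild-type 1896G or mutant 1896A.
combinations = {
    ("T", "G"): pair_type("T", "G"),
    ("T", "A"): pair_type("T", "A"),
    ("C", "G"): pair_type("C", "G"),
    ("C", "A"): pair_type("C", "A"),
}
```

With 1858T, G1896A converts a wobble pair into a Watson-Crick pair, whereas with 1858C the same mutation would replace a Watson-Crick pair with a mismatch, which is consistent with the rarity of G1896A in 1858C genotypes.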
Since subgenotypes F1b and F4 carry T1858, it would be expected that the G1896A mutation would predominate in both subgenotypes; however, this was only observed in subgenotype F4. The bias in the mutation pattern between these two subgenotypes has been previously overlooked, probably because few studies have discriminated between them. This discrepancy has also been observed in genotype C; although most subgenotypes carry 1858T, there is a strong bias toward BCP mutations. In contrast, subgenotypes B2 and B3 tend to acquire codon 28 mutations, even though their core promoters and preCore/core genes are derived from genotype C [31]. Furthermore, despite the fact that all genotype D subgenotypes carry T1858, only subgenotypes D1 and D7 tend to mutate G1896A [60]. The differences in the molecular features of distinct subgenotypes from genotypes B, C, D and F [61-66] highlight the relevance of differentiating HBV subgenotypes when analyzing their implications in the progression of the infection.
In summary, those genotypes carrying 1858C seem to prevent G1896A mutation; however, T1858 polymorphism seems to be necessary but not sufficient to promote G1896A substitution.
These observations suggest that the choice of the mechanism that a given genotype uses to regulate HBeAg expression is not fully explained by the encapsidation signal structure. Furthermore, several in vitro studies have shown that there is no strict relationship between the stability of this signal and the replication capacity [14,67-69]. Overall, these findings indicate that sequences outside the encapsidation signal may influence the mechanism of choice.
On the other hand, the high prevalence of mutations other than G1896A in the preCore region in subgenotype A2 indicates that during virus-host interaction, the virus explores different molecular alternatives to regulate HBeAg expression.
The nucleotide alignment of the region encompassing nucleotides 1720 to 1920 suggests that the similarity observed between genotype A and F1b, and between F4 and D, rests on three nucleotide positions (1727, 1740 and 1773). The fact that these nucleotide polymorphisms are synonymous in the open reading frame of the X protein implies that the differences observed among genotypes might not be due to changes in protein X but rather to effects at a regulatory or RNA conformational level.
Nucleotide 1773 is in the phi region, a cis-acting element which has been proposed to be involved in minus-strand DNA synthesis, as it may mediate the translocation of the viral polymerase during replication [70]. Nucleotides in the phi region base pair with nucleotides in the 5' half of the encapsidation signal; for instance, position 1773 pairs with 1876 [71]. The implication of these positions and the mechanism by which they could be related to mutations regulating HBeAg expression, and thereby to seroconversion, warrant further investigation.
In conclusion, our results show different HBeAg positivity rates among genotypes/subgenotypes (F1b>A2>F4>>D), which could imply a difference in the duration of the HBeAg positive stage, with consequent implications for the progression of liver disease. This finding is consistent with the uneven distribution of genotypes between acute and chronic infections and its ensuing epidemiological implications.
Finally, we identified three nucleotide positions outside the encapsidation signal that could contribute to the mechanism underlying HBeAg seroconversion in those HBV subgenotypes that, despite carrying 1858T, rarely select the mutation at position 1896.