High Diversity of Hepatitis B Virus Genotypes in Panamanian Blood Donors: A Molecular Analysis of New Variants

Hepatitis B Virus (HBV) is an infectious agent that causes more than half of the cases of liver disease and cancer in the world. Globally, around 250 million people are chronically infected with this virus. Although 16% of the cases of liver disease in Central America are caused by HBV, information regarding its genetic diversity, genotypes and circulation is scarce. The purpose of this study was to evaluate the genetic variability of the HBV genotypes from HBV-DNA positive samples obtained from screening blood donors at the Social Security System of Panama and to estimate their possible origin. Of 59,696 blood donors tested for HBV infection during 2010–2012, 74 were HBV-DNA positive. Analysis of the partial PreS2-S region of 27 sequences shows that 21% of the infections were caused by genotype A, 3% by genotype D and 76% by genotype F. In addition, we were able to confirm circulation of six sub-genotypes (A1, A2, A3, D4, F3, F1) and a proposed new sub-genotype denominated F5pan. We found that sub-genotypes F1 and F5pan are confined to the western area of Panama. The tMRCA analysis suggests a simultaneous circulation with previously described sub-genotypes rather than recent introductions of the Panamanian sub-genotypes in the country. Moreover, these results highlight the need for more intensive research on the HBV strains circulating in the region at the molecular level. In conclusion, Panama has a high HBV genotype diversity that includes a new proposed sub-genotype, an elevated number of PreCore-Core mutations, and confinement of these variants to a specific geographical location.


Introduction
Hepatitis B Virus (HBV) is an infectious agent that causes more than half of the cases of liver disease and cancer in the world [1]. Globally, around 250 million people are chronically infected with this virus [2]. In Latin America, HBV prevalence ranges from low (<2%) to intermediate (2% to 8%), depending on the population analyzed [3]. In this region, the virus is also the etiologic agent of around 14% of the cases of hepatocellular carcinoma (HCC) [4]. HBV can be classified in nine genotypes (5)(6)(7)(8)(9), according to a nucleotide intergroup divergence greater than 8% of the whole genome, and in twenty sub-genotypes, if a nucleotide intergroup divergence of 4% to 7.5% is considered [5,6]. The study of HBV in several clinical settings has shown important differences between genotypes in the observed clinical outcomes [7]. In addition, the study of sub-genotypes has allowed a more accurate determination of HBV geographical distribution patterns; an example is the recent estimation of the migration path of HBV genotype A1 in and out of Africa [8].
Information regarding the genetic diversity of HBV in Central America is scarce. The most concise reports of HBV diversity in Central America were conducted 15 years ago [9,10]. Moreover, previous studies describing HBV diversity used partial, discrete regions of the HBV genome [11]. Further evidence of the scarcity of HBV studies in the region is the small number of whole genome sequences of Central American origin deposited in international databases (n<100) [12].
In Panama, HBV diversity was first described in the Chinese population residing in the metropolitan area. That study showed that only genotypes from Asia (genotypes B and C) are circulating in this group [13]. Still, knowledge of the diversity and molecular characteristics of this virus in the country remains limited. Since 2010, centralized continuous screening of blood donors by a molecular technique, the Nucleic Acid Test (NAT), has been implemented in Panama. The use of this technique created a great opportunity to evaluate the circulating HBV genotypes in blood donors from different regions of the country. The purpose of this study was to evaluate the genetic diversity of the HBV genotypes from HBV-DNA positive samples obtained from screening blood donors at the Social Security System of Panama from 2010 to 2012. In addition, we estimated the possible regional origin of these genotypes.

Ethics Statement
The study cohort included positive samples, analyzed anonymously, from selected blood donors who were screened by the Social Security Health System of Panama. The study protocol was submitted to and approved by The Gorgas Memorial Institutional Ethics Review Board.

Population studied
The Panamanian Social Security Health System has a nationwide network of blood banks that receives approximately 20,000 blood donors annually. Plasma samples from blood donors from seven cities of Panama (Colon, Panama City, Panama Oeste, Penonome, Los Santos, Chiriqui, and Changuinola) are sent for HBV-DNA screening at the centralized nucleic acid test (NAT) unit of the Complejo Hospitalario Dr. Arnulfo Arias Madrid (CHDrAAM). Samples were collected from January 2010 to December 2012. All HBV positive samples were included in this study if they fulfilled the following requirements: no less than 300 µL of plasma and storage at −80 °C.

Nucleic acid test
Blood donors were screened for HBV-DNA using pools of six samples for a total volume of 1 mL. Each pool was evaluated using the Cobas TaqScreen MPX Test (Roche Molecular Systems, USA). Reactive pools were tested individually to define the positive samples.
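The two-stage pooling logic described above can be sketched as follows. This is only an illustration of the pool-then-resolve idea, not the Cobas TaqScreen workflow: `is_reactive` is a hypothetical stand-in for the assay, and the donor IDs are invented.

```python
from typing import Callable, List, Sequence, Tuple

POOL_SIZE = 6  # pools of six samples, as stated in the text


def make_pools(samples: Sequence[str], pool_size: int = POOL_SIZE) -> List[List[str]]:
    """Split sample IDs into consecutive pools of at most `pool_size`."""
    return [list(samples[i:i + pool_size]) for i in range(0, len(samples), pool_size)]


def screen(
    samples: Sequence[str],
    is_reactive: Callable[[List[str]], bool],
    pool_size: int = POOL_SIZE,
) -> Tuple[List[str], int]:
    """Return (positive sample IDs, number of assay runs).

    `is_reactive` is a hypothetical assay: it receives the IDs that were
    pooled (or a single ID) and reports whether the reaction is reactive.
    """
    positives: List[str] = []
    n_tests = 0
    for pool in make_pools(samples, pool_size):
        n_tests += 1                      # one run for the whole pool
        if not is_reactive(pool):
            continue                      # non-reactive pool: all members cleared
        for sample in pool:               # reactive pool: test each member alone
            n_tests += 1
            if is_reactive([sample]):
                positives.append(sample)
    return positives, n_tests


# Invented example: 12 donors, one of them truly positive.
donors = [f"D{i:02d}" for i in range(1, 13)]
truth = {"D08"}
positives, n_tests = screen(donors, lambda ids: any(i in truth for i in ids))
```

In this toy run, two pool tests plus six individual tests identify the single positive donor, compared with twelve tests if every sample were tested individually.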

DNA Extraction and PCR amplification
DNA was extracted from 300 µL of plasma using commercial DNA/RNA extraction kits (Qiagen, USA). The presence of HBV-DNA was confirmed and quantified using a previously described method [14]. For the preliminary genotyping, samples with detectable viremia were subjected to a nested PCR using a high-fidelity proofreading enzyme mixture (Platinum PCR Super Mix High Fidelity, LifeTech, USA) that yielded an 879 bp fragment. Sequencing was performed in an automated DNA sequencer (ABI 3130xl, LifeTech, USA) using previously described primers [13].
The full-length genomes of Panamanian sequences clustering in particular clades of genotype F were amplified using two previously published protocols [15,16]. When a sample failed to yield a good quality product with the one-step protocol, the nested approach was used.
The sequences obtained in this study were deposited in GenBank under the accession numbers KJ638656-KJ638679.

Genotype determination by phylogenetic analysis
Hepatitis B genotypes were determined by comparing published reference sequences (n = 281) with the nested PCR product (879 bp) obtained from 27 Panamanian sequences. The HBV sequences were aligned with Muscle v3.8.31 [17] and manually inspected; the final dataset is available upon request. The phylogenetic tree was inferred under a Maximum Likelihood (ML) method, and the nucleotide substitution model (GTR+I+Γ) was selected using the jModelTest2 program [18]. The ML tree was reconstructed with the PhyML program using an online web server [19]. A heuristic tree search was performed using the SPR branch-swapping algorithm, and the reliability of the obtained topology was estimated with the approximate likelihood-ratio test (aLRT) based on the Shimodaira-Hasegawa-like procedure [20]. The ML trees were visualized using MEGA v.5.2 [21].
The phylogenetic analyses of the whole genome, the PreS2-S region and the PreC-Core gene were performed separately. The best-fit nucleotide substitution model (GTR+I+Γ for the whole genome and PreS2-S; HKY+I+Γ for the PreC-Core gene) was determined with jModelTest2 [18], and the ML tree was inferred as above. The reliability of the tree topology was also calculated by bootstrapping (1000 replicates).
The genetic distances among the HBV genotype F sequences obtained and the sub-genotype F reference sequences (n = 68) were calculated as p-distances; standard error estimates were calculated by bootstrapping (1000 replicates). Evolutionary analysis was conducted in MEGA 5.2 [21].
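To make the distance calculation concrete, the following is a minimal sketch of a p-distance with a bootstrap standard error. It is not MEGA's implementation: the pairwise-deletion handling of gaps and ambiguous bases, the function names and the toy sequences are all assumptions made for illustration.

```python
import random
from statistics import stdev

VALID_BASES = set("ACGT")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in VALID_BASES and y in VALID_BASES:
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def bootstrap_se(seq_a: str, seq_b: str, replicates: int = 1000, seed: int = 0) -> float:
    """Standard error of the p-distance from resampling alignment columns."""
    rng = random.Random(seed)
    n = len(seq_a)
    estimates = []
    for _ in range(replicates):
        columns = [rng.randrange(n) for _ in range(n)]  # sample sites with replacement
        resampled_a = "".join(seq_a[i] for i in columns)
        resampled_b = "".join(seq_b[i] for i in columns)
        estimates.append(p_distance(resampled_a, resampled_b))
    return stdev(estimates)


# Toy aligned sequences (invented, not HBV data): 2 differences in 10 sites.
a = "ACGTACGTAC"
b = "ACGTACGTTT"
distance = p_distance(a, b)
standard_error = bootstrap_se(a, b)
```

Multiplying `distance` by 100 gives the percentage divergence scale used for genotype and sub-genotype thresholds.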
Amino acid substitutions in polymerase, PreS2-S, HBsAg and X-gene were determined by translating the Panamanian nucleotide sequences and comparing them against reference subgenotype F amino acid sequences [16,23], using MEGA 5.2 [21].

Bayesian analysis
Full-length genome and partial polymerase gene (nt 177 to 990) datasets were constructed including published reference sequences for sub-genotypes of genotype F [16]. An uncorrelated lognormal relaxed clock (UCLN) [24] was used to evaluate the time of the most recent common ancestor (tMRCA) of each node. Sample collection date was used as a calibration to calculate the substitution rates; additionally, a fixed medium rate of 1.5 × 10⁻⁵ [16] was used to calculate the tMRCA of each node. The best-fit demographic tree model was selected using the path sampling (PS) and stepping-stone sampling methods [25]. MCMC chains were run in BEAST v1.8 [26] for 50 × 10⁶ generations for the polymerase datasets and for the complete genome dataset. Convergence of chains was assessed by calculating the effective sample size (>200), and uncertainty in parameter estimates with the 95% highest posterior density (HPD) interval; all values were calculated after excluding an initial 10% of burn-in using Tracer v1.6.
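As an illustration of the post-processing step, the sketch below discards a 10% burn-in and computes a 95% HPD interval as the shortest interval containing 95% of the remaining samples. It is a generic sketch under that definition, not Tracer's code; the function names and the toy chain are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def discard_burn_in(samples: Sequence[float], fraction: float = 0.10) -> List[float]:
    """Drop the first `fraction` of the chain as burn-in."""
    start = int(len(samples) * fraction)
    return list(samples[start:])


def hpd_interval(samples: Sequence[float], mass: float = 0.95) -> Tuple[float, float]:
    """Shortest interval that contains at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples")
    k = max(1, math.ceil(mass * n))          # number of samples inside the interval
    best = (ordered[0], ordered[k - 1])
    for i in range(1, n - k + 1):            # slide a window of k sorted samples
        low, high = ordered[i], ordered[i + k - 1]
        if high - low < best[1] - best[0]:
            best = (low, high)
    return best


# Toy chain (invented): 100 evenly spaced values standing in for tMRCA samples.
chain = list(range(100))
kept = discard_burn_in(chain)
interval = hpd_interval(kept)
```

For a skewed posterior, this shortest-interval definition generally differs from an equal-tailed percentile interval.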
The tMRCA of the whole HBV genome of the Panamanian samples was calculated using a partitioned model [27], in which the genome was partitioned into the corresponding coding genes and overlapping regions and a different substitution model was applied to each partition (Table S1) [27].

Low prevalence of HBV-DNA and HBeAg in blood donors
During the period of 2010 to 2012, 59,697 blood donor samples were analyzed at the NAT unit in the Complejo Hospitalario Dr. Arnulfo Arias Madrid (CHDrAAM), and 79 (0.13%) of them were HBV-DNA positive. A total of 57 (72%) HBV-DNA positive samples were included in this study; 33 of these samples had an HBV-DNA level greater than 500 copies/ml (mean = 1.15 × 10⁷, standard error of the mean = 7.8 × 10⁶). Twenty-seven of them were successfully sequenced and used for the phylogenetic analysis of the partial PreS2-S region.
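The percentages reported in this paragraph follow directly from the counts; a short check, using only the figures stated above (the helper name `percent` is an assumption):

```python
def percent(part: int, whole: int, digits: int = 2) -> float:
    """Return `part` as a percentage of `whole`, rounded to `digits` decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, digits)


screened = 59_697          # donor samples analyzed, 2010 to 2012
hbv_dna_positive = 79      # HBV-DNA positive samples
included = 57              # positive samples included in the study

prevalence = percent(hbv_dna_positive, screened)                # share of screened donors
included_share = percent(included, hbv_dna_positive, digits=0)  # share of positives included
```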

High genetic diversity of Hepatitis B virus genotypes in Panamanian blood donors
The phylogenetic analysis of the partial PreS2-S region showed that the prevalent circulating HBV genotypes in Panama are A (n = 6), D (n = 1), and F (n = 20) (Fig. 1). For genotype A, sub-genotypes A1, A2 and A3 were found in 3 (11%), 2 (7.5%) and 1 (3.7%) samples, respectively. For genotype F, 9 (33%) sequences were closely related to the branch of sub-genotype F1, but in a monophyletic clade separated from the previously reported F1a and F1b clades. Another group of sequences (n = 7, 25.9%) belonged to sub-genotype F3, and four sequences (14%) fell within the genotype F branch but formed a new monophyletic cluster of sub-genotype F (Fig. 1).

New variants of Sub-genotypes of HBV genotype F are circulating in blood donors, in a region of the country
Sub-genotypes of genotype F were further determined using HBV whole genome sequences of the Panamanian samples clustering in genotype F1 (n = 6) and in the newly formed cluster (n = 2). The sequences of sub-genotype F1 circulating in Panama again form a monophyletic clade, supported by an aLRT value of 0.99 and a 91% bootstrap value. The two sequences of the new cluster still form a monophyletic cluster separated from the other HBV-F sub-genotypes, with a high aLRT value (1.0). The phylogenetic analysis of two HBV genome regions shows slight differences in the relationships of the proposed F1c sub-genotype in the PreS2-S region, although the Core region shows a tree topology similar to that of the whole genome (Fig. 2 a-c).
The recombination analysis gave no evidence of recombination in any of the sequences analyzed.
Pairwise analysis of nucleotide diversity was performed over the complete genomes of two of the sequences forming the new cluster and six sequences of the new clade of sub-genotype F1. The mean percentage of nucleotide divergence between the new cluster found in this study and the reference HBV sub-genotypes F ranged from 5.35 to 7.41 (Table 1). This range lies within the 4% to 7.5% intergroup divergence used to define sub-genotypes, and therefore supports the classification of these sequences as a new sub-genotype, F5. The nucleotide diversity among the clade formed by the Panamanian F1 sequences was 2.22 to 2.71, which falls within the definition of a clade [5].
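The threshold reasoning can be written out as a small rule. This is a sketch of the cut-offs quoted in the Introduction (greater than 8% for genotypes, 4% to 7.5% for sub-genotypes), not a published classifier; the function name and labels are assumptions, and values above 7.5% up to 8% are reported as unresolved because the stated thresholds do not cover them.

```python
def classify_divergence(percent_divergence: float) -> str:
    """Map a mean whole-genome nucleotide divergence (%) to a taxonomic level."""
    if percent_divergence < 0:
        raise ValueError("divergence cannot be negative")
    if percent_divergence > 8.0:
        return "different genotype"
    if 4.0 <= percent_divergence <= 7.5:
        return "different sub-genotype"
    if percent_divergence < 4.0:
        return "same sub-genotype (clade level)"
    return "unresolved (above 7.5% and up to 8%)"


# Ranges reported in the text for the new cluster and the Panamanian F1 clade.
new_cluster = [classify_divergence(v) for v in (5.35, 7.41)]
f1_clade = [classify_divergence(v) for v in (2.22, 2.71)]
```

Both ends of the 5.35 to 7.41 range map to the sub-genotype level, while 2.22 to 2.71 stays at the clade level, matching the interpretation given above.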
The amino acid characterization of PreS2-S, HBsAg, Reverse transcriptase (RT) and X gene sequences showed six unique amino acids for the proposed sub-genotype F5: 124T and 172V in PreS2-S, 157M, 163V and 231L in RT protein and 44P in X protein (Table S2).
The geographic location of the HBV sub-genotypes was diverse (Fig. 3). In the most populated provinces of the country, Panama and Colon, the genotypes most frequently found were A1 (n = 3, 30%), A2 (n = 1, 10%) and F3 (n = 4, 40%). The four sequences that form the new sub-genotype F5 cluster and 8 sequences (89%) of sub-genotype F1, were geographically located in the most western area of the country: provinces of Chiriqui and Bocas del Toro (in the district of Changuinola).
Because of the low prevalence of HBeAg positive samples (n = 2, 3.5%) in Panamanian blood donors, the basal core promoter and PreCore mutations were evaluated. Nine sequences of genotype F were analyzed; mutations 1762T and 1764A were found in six and seven sequences, respectively, and mutation 1896A in 6 of 8 sequences.

Old population history of HBV genotype F observed in Panamanian samples
We employed a Bayesian framework to evaluate the time of the most recent common ancestor (tMRCA) of the Panamanian samples. The model supported by the Bayes factor analysis differed according to the substitution rate used (fixed or calculated). The Bayesian skyline plot (BSP) model was chosen as the best-fit demographic model when the time-stamped dataset was used, whereas the constant size model was chosen when fixed substitution rates were applied (Table S3). This result reflects the flexibility of the BSP model to fit a wide range of demographic scenarios [26]. The Bayes factor support for the constant size model over an exponential model under fixed substitution rates agrees with the epidemiological information showing that new HBV infections have remained constant in recent years in Panama [28].
The tMRCA analysis of the fixed dataset shows a long history of HBV genotype F in the Americas, with minor differences among HBV-F sub-genotypes (Table 2). The Panamanian F1 sequences (F1pan, Table 2) have a tMRCA of 703 years (95% HPD: 438-1034); the new proposed sub-genotype F (F5pan, Table 2) has a tMRCA of 845 years (95% HPD: 366-1455). Compared with the rest of the sub-genotype F sequences, sub-genotype F2 shows the oldest tMRCA. When the F1c and F5 sequences were compared with sub-genotypes F1a, F1b, F3 and F4, the Panamanian sequences (F1pan, F5pan) had a slightly older tMRCA. However, the 95% HPD estimates for these sub-genotypes show considerable overlap.
The estimates of tMRCA with the time-stamped dataset show a more recent history of HBV genotype F in the Americas. The tMRCA for the Panamanian sequences (F1pan, F5pan) falls in the period from 1892 to 1911 (Fig. 4). The oldest sub-genotype was F2 (1796), followed by F4 (1887) and F3 (1892); these have a tMRCA similar to that of the F5pan sequences.

Discussion
The diversity of HBV genotypes in Panama is greater than expected (Fig. 1). We determined circulation of three major genotypes (A, D, F), and at least five previously described sub-genotypes (A1, A2, A3, F1pan, F3). Interestingly, a new diversification of the genotype F cluster was also observed in the phylogenetic tree analysis (see Fig. 1, bottom). This new cluster is entirely composed of strains isolated from Panama. Since this new cluster has high statistical support according to the aLRT and bootstrap values obtained, we propose a new sub-genotype, F5pan. This result revises previous reports, which indicated only the circulation of genotype F3 in the Panamanian population [16,29].
The results of the study are also in agreement with the observed migration of distinct genotypes across the globe. For example, genotype A1 was recently traced in the Americas [8]; it was reported in an Afro-descendant population of Colombia [30] and in several regions of Brazil [31-33]. Sub-genotype A2 has been recently linked to high-risk groups: in Argentina to drug users [34] and men who have sex with men (MSM) [35], and in Japan to MSM groups [36]. Because of the experimental design of this study, we did not have the opportunity to evaluate risk behaviors of HBV infected blood donors. More studies are needed to address this important issue. In addition, studies intended to evaluate HBV prevalence and genotype circulation in specific groups, such as female sex workers (FSW), MSM and others, are missing. Additionally, the only sub-genotype A3 sample circulating in blood donors was closely related to sequences of Haitian origin.
Genotype F was the most diverse HBV genotype in Panama. This genotype includes a monophyletic clade (F1c?), which is separated from the previously reported F1a and F1b clades and has a relatively high divergence among its members (p-distance = 2.50 ± 0.19). This diversity most likely results from a long circulation history of this sub-genotype (F1c) in Panama. Interestingly, the fact that almost all the F1c samples came from the northwestern region of Panama (Chiriqui province and Changuinola district) indicates a probable historical presence of this virus in Panama.
Additionally, we have observed that a group of four genotype F sequences forms a monophyletic cluster with a high support value (aLRT = 1.0, bootstrap = 100%), separated from the rest of the HBV sub-genotype F sequences. The molecular analysis of the complete genome of these sequences shows a nucleotide divergence greater than 4 percent when compared with the rest of the available HBV-F sub-genotype sequences. Furthermore, the four sequences have specific amino acid changes in the preS2-S, X and polymerase proteins. These results fulfill the requirements for the designation of this group of sequences as a novel HBV sub-genotype F (F5) [6]. Sequences of sub-genotypes F5 and F1c were located in the same geographical area of Panama, despite the great genetic distance between them (p-distance = 7.41 ± 0.40). This probably suggests different introductions of these variants in the region.
A Bayesian approach was used to estimate the population history of the Panamanian HBV-F sub-genotypes (F1pan, F5pan). These sub-genotypes have an estimated time of the most recent common ancestor similar to that of the rest of the HBV-F sub-genotypes (F1-F4). The high variability of the HBV genotypes and their distinctive distribution across the continent imply a long history of migration and isolation of the common ancestors of the sub-genotypes currently circulating in the Americas. The results of this study suggest that the common ancestors of the Panamanian sub-genotypes (F1pan, F5pan) were part of the diversification history of genotype F in the Americas, regardless of the tree priors used (fixed substitution rate or time-stamped data) or the genome region employed (whole genome, polymerase gene) to perform the analysis. Another important result is the evidence of the confinement of these genotypes (F1pan, F5pan) to the westernmost region of Panama. This region is characterized by a high incidence of HBV infection among adults.
Table 1. Mean percentage nucleotide divergences among the whole genome of HBV sub-genotype F.
It has been hypothesized that the HBV sub-genotype F sequences described thus far do not represent the complete history of HBV genotypes in the Americas. The tMRCA estimates for the Panamanian sub-genotype F sequences support this hypothesis. The results of this study suggest a simultaneous circulation of the Panamanian samples with previously described sub-genotypes, rather than a recent origin of the Panamanian sub-genotypes in the country.
In conclusion, Panama has a high HBV genotype diversity that includes a new proposed sub-genotype, an elevated number of PreCore-Core mutations, and confinement of these variants to a specific geographical region. These results emphasize the importance of studying the molecular epidemiology of HBV in Central America.

Supporting Information
Table S1 Partitions of the HBV Whole genomes.