Global and regional dispersal patterns of hepatitis B virus genotype E from and in Africa: A full-genome molecular analysis

Description of the spatial characteristics of viral dispersal is important in understanding the history of infections. Nine hepatitis B virus (HBV) genotypes (A-I), and a putative 10th genotype (J), with distinct geographical distributions, are recognized. In sub-Saharan Africa, (sub)-genotypes A1, D3 and E circulate, with E predominating in western Africa (WA), where HBV is hyperendemic. The low genetic diversity of genotype E (HBV/E) suggests its recent emergence. Our aim was to study the dispersal of HBV/E using full-length, non-redundant and non-recombinant sequences available in public databases. The HBV/E genotype was confirmed, and phylogeny reconstruction was performed using maximum likelihood (ML) with bootstrapping. Phylogeographic analysis was conducted by reconstruction of ancestral states using the criterion of parsimony on the estimated ML phylogeny. Of the HBV/E sequences, 46.5% were found within monophyletic clusters. Country-wise analysis revealed the existence of 50 regional clusters. Sequences from WA were located close to the root of the tree, indicating that the HBV/E epidemic most probably originated in this region and expanded to other geographical regions, within and outside of Africa. Localized dispersal was observed for sequences from Nigeria and Guinea as compared to other WA countries. Based on the sequences available in the databases, the phylogenetic results suggest that European strains originated primarily from WA, whereas a majority of American strains originated in Western Central Africa. The differences in regional dispersal patterns of HBV/E suggest limited cross-border transmissions because of restricted population movements.


Introduction
Hepatitis B virus (HBV) is a common cause of liver disease and the prototype member of the family Hepadnaviridae [1]. Despite the availability of an effective vaccine, HBV infections continue to be a public health problem [2,3]. In 2015, the World Health Organization (WHO) been derived from HBV carriers of African origin regardless of their country of residence since genotype E is rarely found outside Africa. All sequences present in the databases as of August 2020 were accessed. Duplicate sequences (N = 253) from the two public repositories, identified by their identical accession numbers, were removed from the analyses, as were sequences lacking metadata (N = 42). Simplot v3.5

Country grouping, phylogenetic and phylogeographic analysis
The available HBV/E sequences from different countries (N = 318) were classified into geographical regions according to the Global Burden of Disease classification system (http://www.who.int) [48]. The global distribution of these sequences per country as shown in Monophyletic clusters were defined as those having bootstrap values higher than 70%, within which 70% of HBV/E strains share the same geographic area of sampling (country or region). Trees were midpoint rooted using the FigTree v1.4.3 program (http://tree.bio.ed.ac.uk/software/figtree/) [51]. The origin of genotype E was inferred by character reconstruction using the criterion of parsimony on the estimated ML phylogeny using Mesquite v3.2 [52]. We conducted two kinds of phylogeographic analyses: one grouping sequences according to the country of sampling and another grouping them according to the geographic regions as defined by the Global Burden of Disease classification system [48].
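To illustrate what parsimony-based reconstruction of ancestral states does, the following is a minimal sketch of the bottom-up pass of Fitch's algorithm for unordered characters on a strictly bifurcating tree. It is not the Mesquite implementation, whose exact settings are not given in this section; the toy tree, the tip names (`seqA` to `seqE`) and the function name `fitch_sets` are hypothetical and chosen only for illustration.

```python
# Sketch of Fitch parsimony (bottom-up pass) for one discrete character,
# here the geographic region of sampling. Assumes a binary tree written as
# nested 2-tuples, with tip names as strings. All names are hypothetical.

def fitch_sets(node, tip_states):
    """Return (candidate ancestral states, minimum number of state changes)."""
    if isinstance(node, str):              # a tip: its state is observed
        return {tip_states[node]}, 0
    left, right = node
    left_set, left_cost = fitch_sets(left, tip_states)
    right_set, right_cost = fitch_sets(right, tip_states)
    shared = left_set & right_set
    if shared:                             # children agree: no change needed
        return shared, left_cost + right_cost
    # children disagree: keep both options and count one change
    return left_set | right_set, left_cost + right_cost + 1

# Toy rooted tree and made-up region labels for the tips.
tree = ((("seqA", "seqB"), "seqC"), ("seqD", "seqE"))
tip_states = {
    "seqA": "West Africa",
    "seqB": "West Africa",
    "seqC": "Europe",
    "seqD": "West Africa",
    "seqE": "Western Central Africa",
}

root_set, changes = fitch_sets(tree, tip_states)
print(root_set, changes)
```

In this toy example the only state retained at the root is West Africa, with two inferred changes of region along the tree. A complete reconstruction would add a top-down pass to assign states to every internal node.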

Results
We studied 318 complete genome sequences sampled from 29 countries around the world, which showed a mean nucleotide diversity of 1.95%, ranging between 0% and 3% (S1A Table). Nearly 93% of all sampled sequences were collected in Africa. Specifically, 54.5% of the HBV sequences were isolated from four African countries, namely Guinea (24.5%), Nigeria (16%), Cameroon (9.1%) and the Central African Republic (9.1%) (Table 1, Fig 1). However, the highest mean nucleotide diversity of ~3% was observed for sequences sampled from the United Kingdom and Belgium (S1B Table). In addition, the highest intergroup sequence divergence of ~3% was observed between the Central African Republic and the United Kingdom, between the Central African Republic and Belgium, and between the United Kingdom and Belgium (S1C Table).
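As a rough illustration of what a mean nucleotide diversity figure summarizes, the sketch below averages the uncorrected proportion of differing sites (p-distance) over all pairs of aligned sequences. This is an assumption: the distance measure and software used for the values in the S1 Tables are not stated in this section, and the sequences and function names below are invented.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping positions with a gap or N."""
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0

def mean_pairwise_diversity(sequences):
    """Average p-distance over every unordered pair of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

# Made-up 10-nucleotide alignment, for illustration only.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
toy_diversity = mean_pairwise_diversity(toy_alignment)
print(f"{100 * toy_diversity:.2f}%")
```

The same function applied to the sequences of a single country gives a within-group value, and applied to pairs drawn from two different countries it gives an intergroup divergence.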
Phylogeographic analysis of the HBV/E sequences grouped in geographic regions revealed the existence of local dispersal in Africa (Fig 2). In addition, sequences from West Africa were located close to the root of the ML tree indicating that the HBV/E epidemic probably originated in West Africa and expanded to other geographical regions, within and outside of Africa (Fig 2). There are also some indications that the European strains originated primarily from West Africa whereas Western Central Africa was the source of the majority of viral strains dispersed to the Americas (Fig 2).
Country-wise phylogeographic analysis showed that 46.5% (148) of the total number of HBV/E sequences (N = 318) were found within 50 monophyletic clusters (Table 1). HBV/E sequences formed regional clusters at different percentages according to their geographic origin (Table 1). Specifically, all the sequences sampled from the Democratic Republic of the Congo formed a single monophyletic cluster. The same pattern was observed for Colombia, Egypt and South Africa. High levels of local dispersal, where > 50% of sequences showed monophyletic clustering, were found for Cameroon, Ghana, Liberia, Namibia, Nigeria, Sudan and the United Kingdom (Table 1).
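The cluster criterion used here (bootstrap support above 70% and 70% of the strains in a clade sharing one sampling area) can be sketched as a simple check. The function names, the reading of the second threshold as "at least 70%", the minimum clade size of two and the example clades are assumptions made for illustration; the final line only reproduces the reported proportion from the counts given in the text.

```python
from collections import Counter

def dominant_area(areas):
    """Most frequent sampling area in a clade and its share of the tips."""
    area, count = Counter(areas).most_common(1)[0]
    return area, count / len(areas)

def is_regional_cluster(bootstrap, areas, min_bootstrap=70.0, min_share=0.70):
    """Bootstrap above the threshold and enough tips from a single area."""
    if bootstrap <= min_bootstrap or len(areas) < 2:
        return False
    _, share = dominant_area(areas)
    return share >= min_share

# Hypothetical clades: three of four tips from Nigeria (75%), and a mixed clade.
nigeria_clade = ["Nigeria", "Nigeria", "Nigeria", "Cameroon"]
mixed_clade = ["Nigeria", "Nigeria", "Cameroon", "Ghana"]

well_supported = is_regional_cluster(85, nigeria_clade)    # passes both tests
weakly_supported = is_regional_cluster(65, nigeria_clade)  # fails on bootstrap
too_mixed = is_regional_cluster(85, mixed_clade)           # only 50% from one area

# Share of sequences inside clusters, from the counts reported in the text.
clustered_percent = round(100 * 148 / 318, 1)
print(well_supported, weakly_supported, too_mixed, clustered_percent)
```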
A number of sequences from Guinea and Nigeria formed 14 and 9 monophyletic clusters, respectively, whereas for Belgium, Cameroon, the Central African Republic, Colombia, the Democratic Republic of the Congo, Egypt, Ghana, Liberia, Namibia, Niger, South Africa, Sudan and the United Kingdom, a limited number of clusters, ranging from one to six, was detected (Table 1). The sequences sampled from two semi-isolated rural communities in North and Central Nigeria clustered in a single, separate clade, indicative of localized intra-country dispersal. Fewer than 50% of the sequences from Belgium, the Central African Republic, Guinea and Niger clustered monophyletically, indicating the lowest regional dispersal. None of the sequences from Angola, Argentina, Benin, Burkina Faso, Cape Verde, Cuba, Ethiopia, Japan, Madagascar, Martinique, Mexico, Saudi Arabia, Senegal and Somalia formed monophyletic clusters (Table 1).

Discussion
Wide-range full-genome phylogenetic and phylogeographic analyses of the dispersal patterns of HBV/E were performed. As HBV/E is predominantly found in West Africa, there was an over-representation of some countries/geographical regions, probably introducing a sampling bias that cannot be avoided. Nonetheless, despite these limitations, the full-length HBV/E sequences analyzed showed a conspicuously low genetic diversity of 1.95%, similar to earlier studies that reported an intragenotypic nucleotide divergence of 1.73% [15,18,26]. The low nucleotide diversity suggests its relatively recent introduction into the population [26]. This coincides with reports that concluded that the recent origin and wide The present analyses of the limited number of sequences available in the databases suggest that HBV/E sequences found in the European region and in the Americas were disseminated mostly from the West African region. Considering that HBV/E is only intermittently found in the Americas and rarely found outside Africa except in individuals of African descent [20], this analysis is based on a small number of sequences, which limits our ability to reach firm conclusions.
Various times to the most recent common ancestor (tMRCA) of HBV/E have been calculated using Bayesian inference, with a median tMRCA of 130 years [28], whereas in Nigeria a more recent tMRCA was estimated at the year 1948 (95% HPD: 1924-1966), with an increase in the HBV/E-infected population over the last ~40 to 50 years [53]. These times differ from the estimated tMRCA of 6,000 years [29]. However, as previously suggested, HBV/E may have existed in indigenous African populations and been recently re-introduced [15]. HBV/E has previously been isolated in individuals from Colombia [54] and India [55], in Pygmies [56] and in the Khoi San (Kramvis, unpublished data), with no history of travel to or from Africa. Nonetheless, resolving the variance in the estimated age of HBV/E will be difficult without accurate determination of the nucleotide substitution rate of HBV [13]. In contrast, the presence of subgenotype A1 in Brazil and Haiti [27,57] coincides with the present dominance of this subgenotype in southeast Africa, which was the source of the ~400,000 captives taken to South and Central America in the middle of the 19th century. The fact that HBV/E did not cause an epidemic in the Americas could be because of the absence of HBV/E infection in the founding population of slaves or limited secondary onward transmission within this population.
The observed pattern of regional dispersal for sequences sampled from Nigeria and Guinea (Fig 3) suggests limited population movements associated with cross-border transmissions. In addition, the clustering of sequences sampled from Nigeria in a separate clade supports limited cross-border transmission. The rapid spread of HBV/E within a short period, observed in large parts of Africa, can be associated with a sudden change in the route of transmission. It is plausible that such a change [20], for example the use of contaminated vaccine preparations [27], was responsible for the spread. Furthermore, numerous mass injection campaigns against smallpox, yaws [27,58,59] and sleeping sickness [60], using multiple injections with the same needles, were undertaken in the West African region. In addition, socio-cultural practices like facial or body scarification, traditional birth attendance and shaving by local barbers using unsterilized sharp instruments are alternative routes of transmission of blood-borne pathogens [61,62]. A study conducted in Egypt linked the transmission of HCV to unsafe mass injection campaigns against schistosomiasis that continued until the 1980s [63]. Because HBV is more transmissible than HCV [64], similar unsafe injection practices may partly explain the rapid spread of HBV/E in West Africa [63]. The big puzzle to be solved is why HBV/E spread rapidly in West Africa and predominated over genotype A, which was dispersed from Africa to the Americas by the slave trade [14,65].
Perinatal transmission is possibly another mode of HBV transmission that might have led to the rapid spread of HBV/E in sub-Saharan Africa. In this vertical route, HBeAg easily crosses the placenta to infants born to HBeAg-positive mothers infected with HBV/E [66]. This can lead to HBe/HBcAg tolerance in utero and perinatally [37], and thus to a high probability of chronic carrier status later in life [14,26,34,35,37,67,68]. In addition, horizontal, community-based transmission is another mode of transmission; it arises from children coming into contact with open wounds, from behavioral factors (biting of fingernails and scratching the backs of carriers) and from sharing of bath towels and dental cleaning materials [69,70]. Extensive studies have been done to further identify the factors that influence perinatal transmission, but with limited focus on West Africa. Although perinatal HBV transmission may explain, in part, the explosive spread of virtually identical viruses within a community, it is critical to understand whether it also explains the similarity of viruses across the vast expanses of the HBV/E crescent.
A study conducted by Jayaraman and colleagues linked the rapid spread of HBV and HIV infection in sub-Saharan Africa to risky practices, including blood transfusion and socio-cultural practices [64]. Most of the sequences sampled from the different geographical regions were obtained from asymptomatic carriers, blood donors or patients with end-stage liver disease (ESLD) infected with HBV/E. The progression of chronic HBV to cirrhosis, ESLD and hepatocellular carcinoma (HCC) is more rapid in HIV-positive individuals than in those with HBV alone [71]. The onset of the HIV epidemic in the 1950s might have played a role in the explosive transmission and dispersal of HBV/E in West Africa [72], where the frequency of HBV/HIV co-infection is high [73].

Conclusion
Taken together, our findings suggest considerable differences in the pattern of HBV/E regional dispersal, with the HBV/E epidemic probably originating in West Africa and expanding to other regions, within and outside Africa. The observed strong patterns of regional and localized dispersal suggest that the population movements associated with cross-border transmissions were limited. This could be explained by the late introduction of HBV/E into the population as well as a sudden change in the route of transmission, such as extensive use of unsafe needles in mass immunization campaigns, and by socio-cultural practices. In addition, the onset of the HIV epidemic in the 1950s might have played a role in the explosive transmission and dispersal of HBV/E in West Africa, where the HBV/HIV co-infection rate is high.
Supporting information S1