Fig 1.
Phylogeny of primate papillomaviruses.
A maximum likelihood phylogenetic tree was inferred from the concatenated nucleotide sequence alignment of 4 open reading frames (E1-E2-L1-L2) of 141 papillomavirus types representing 132 species (see PV list with hosts in S2 Table). The majority of analyzed primate papillomaviruses cluster into three distinct clades, the Alpha-, Beta- and Gamma-PV genera, corresponding predominantly to the anatomical sites (e.g., mucosal vs. cutaneous epithelium) where the viruses were originally isolated, rather than to the distinct host species. The branches represented by non-human primate papillomaviruses are highlighted in red. Non-primate papillomaviruses are collapsed and joined by grey lines (see comprehensive tree in S1 Fig). The dot sizes are proportional to the bootstrap support percentages from RAxML.
Fig 2.
Schematic model of virus-host codivergence.
Strict virus-host codivergence requires the evolutionary history of the pathogen to mirror that of its hosts. Clustering of viruses according to the host from which they were isolated should be observed. In addition, the divergence times of hosts and parasites should be similar (different colors highlight viruses infecting different primate host ancestors). Intrahost divergence can be defined according to specific phylogenetic criteria, such as niche adaptation prior to coevolution in primate papillomaviruses, as opposed to clustering by hosts.
Table 1.
Permutational multivariate analysis of variance using primate papillomavirus pairwise distance.
Fig 3.
Divergence time estimation of primate papillomaviruses to their most recent common ancestors (MRCAs).
A Bayesian MCMC method was used to estimate divergence times as described in the methods. Times were calculated separately for each genus: Alpha- (A), Beta- (B) and Gamma-PVs (C). Branch lengths are proportional to divergence times. The branches in red refer to non-human primate papillomaviruses. Numbers above the nodes with circles are the mean estimated divergence times in million years (M) between human and non-human papillomavirus clades. The bars in grey represent the 95% highest posterior density (HPD) interval for the divergence times (see details in S5 Fig, S6 Fig and S7 Fig, respectively). Panels B and C show time on the Y-axis and phylogeny on the X-axis.
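As an illustration of what a 95% HPD interval summarizes, the following minimal sketch computes the shortest interval containing a given fraction of posterior samples. It assumes a unimodal posterior and a plain list of sampled node ages. The function name `hpd_interval` and the toy values are hypothetical and are not taken from the study's analysis, which reports HPD intervals from its own Bayesian MCMC runs.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Return the shortest interval containing `mass` of the samples.

    Illustrative only: assumes a unimodal posterior, so the HPD region
    is a single contiguous interval.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples the interval must cover (small tolerance guards
    # against floating-point error in mass * n).
    k = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of k consecutive sorted samples; keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[best], ordered[best + k - 1]

# Toy node ages (made-up numbers, not data from the paper).
toy_ages = [0.0, 10.0, 11.0, 30.0]
low, high = hpd_interval(toy_ages, mass=0.5)
```

Unlike an equal-tailed credible interval, the HPD interval is the narrowest one with the requested coverage, so for a skewed posterior it shifts toward the region of highest density.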
Table 2.
Divergence time estimation of Alphapapillomavirus and Dyoomikronpapillomavirus types.
Fig 4.
Schematic model of virus-host codivergence of primate papillomaviruses.
(A) A schematic topology of representative primate papillomaviruses. The branch colors represent viruses with specific host niche adaptation (brown: isolated from mucosal tissues; blue: isolated from cutaneous tissue). (B) Model of phylogeny and divergence of primate papillomaviruses. In this model, one or more primate papillomavirus ancestors evolved to colonize distinct host ecosystems prior to the speciation of a primate ancestor. A process of further viral adaptation to colonize more specific host ecosystems (represented by black circles at the nodes) may have followed upon host speciation, resulting in the radiation observed in the extant primate papillomavirus tree. The broken lines in grey (starting from open circles) represent clades for which specific HPV species lack detectable non-human primate counterparts. The three-dimensional structure represents the host phylogeny.
Fig 5.
Geographic distribution of HPV16 variants.
(A) A total of 3256 HPV16 variants with known geographic origin from 22 countries/regions (see details in Table 3) were assigned to lineages/sublineages and summarized by geographic group in the pie charts. (B) Principal component analysis using a weighted UniFrac algorithm clustered the study cohorts into three distinct groups, namely African, Eurasian (Asian and Caucasian) and South/Central American, mainly reflecting the predominant population from which viruses were sampled. (C) Relative frequency of HPV16 lineages/sublineages in four major geographic populations (African, Asian, Caucasian, and South/Central American).
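The relative frequencies summarized in the pie charts and in panel C amount to a per-population tally of lineage assignments. A minimal sketch of that tally is given below. The function name `lineage_frequencies` and the toy records are hypothetical, the counts are made up for illustration, and the sketch does not reproduce the weighted UniFrac analysis of panel B.

```python
from collections import Counter

def lineage_frequencies(assignments):
    """Map each population to the relative frequency of each lineage.

    `assignments` is an iterable of (population, lineage) pairs, one per
    sampled variant.
    """
    counts = {}
    for population, lineage in assignments:
        counts.setdefault(population, Counter())[lineage] += 1
    frequencies = {}
    for population, counter in counts.items():
        total = sum(counter.values())
        frequencies[population] = {
            lineage: count / total for lineage, count in counter.items()
        }
    return frequencies

# Made-up records for illustration only (not data from Table 3).
toy_records = [
    ("African", "B1"),
    ("African", "B1"),
    ("African", "C1"),
    ("Asian", "A4"),
]
toy_freqs = lineage_frequencies(toy_records)
```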
Table 3.
HPV16 variant assignment with known geographic origin.
Fig 6.
Divergence time estimation of HPV16 complete genome variants.
A Bayesian MCMC method was used to calculate the divergence times of HPV16 complete genome variants from their most recent common ancestors, as described in the methods. The nodes highlighted with red circles indicate divergence times of the split between HPV16 A and non-A lineages, between A1-3 and A4 sublineages, and between C and D lineages. Branch lengths are proportional to divergence times scaled in thousands of years (K). Grey bars indicate the 95% highest posterior density (HPD) for the corresponding divergence age (see details in S10 Fig). Colors in branches represent distinct HPV16 variant lineages/sublineages. The plot on top of the tree is a Bayesian skyline estimation based on 311 present-day human mtDNA sequences (without the loop region) from geographically diverse populations and 212 HPV16 complete genome variants with a similar geographical distribution. The median posterior estimates (the product of the effective population size Ne and the generation length g in years) throughout the given time period are illustrated with lines in black. The dark blue (humans) and dark red (HPV16) areas give the 95% HPD interval of these estimates.
Table 4.
Divergence time estimation of HPV16 variant lineages.
Fig 7.
Ancient HPV variant codivergence with archaic hominins.
The plot shows the correlation between divergence time (X-axis) of HPV variants from the most recent common ancestors and genomic diversity (Y-axis) of HPV variants. The Alpha-3 (HPV61), Alpha-5 (HPV26, 51, 69, 82), Alpha-6 (HPV30, 53, 56, 66), Alpha-7 (HPV18, 39, 45, 59, 68, 70, 85, 97), Alpha-9 (HPV16, 31, 33, 35, 52, 58, 67), Alpha-10 (HPV6, 11), Alpha-11 (HPV34, 73), and Alpha-13 (HPV54) variants from previous publications were included. The adjusted R² value indicating the correlation between sequence diversity and divergence time of HPV type variants was calculated using the linear model (lm) function in R.
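For readers unfamiliar with the statistic, the sketch below shows how an adjusted coefficient of determination is obtained from a simple one-predictor least-squares fit. It is written in Python rather than R, the function name `adjusted_r_squared` is hypothetical, and the input values are made up; it is not the authors' script and does not use the study's data.

```python
def adjusted_r_squared(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, adjusted R squared) for one predictor.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give the ordinary R squared.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    # Penalize for the single predictor: (n - 1) / (n - p - 1) with p = 1.
    adjusted = 1.0 - (1.0 - r_squared) * (n - 1) / (n - 2)
    return slope, intercept, adjusted

# Made-up divergence times (x) and diversity values (y), illustration only.
toy_slope, toy_intercept, toy_adj = adjusted_r_squared(
    [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 2.0]
)
```

The adjustment lowers the ordinary R squared in proportion to the number of predictors relative to the sample size, which matters when only a handful of HPV types contribute points to the regression.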
Fig 8.
Schematic illustration of HPV16 codivergence with archaic hominins.
The model is based on HPV16 variant divergence time estimation, phylogenetic topology, and geographic distribution, and superimposes an ancestral viral transmission between Neanderthals/Denisovans and modern human populations. The early divergence event among deeply separated HPV16 variant lineages (A vs. BCD) suggests ancient virus-host codivergence following the speciation of modern humans and archaic hominins (e.g., Neanderthals and Denisovans) from their most recent common ancestors. Gene flow through interbreeding between archaic hominins and modern humans allowed viral transmission from Neanderthals/Denisovans to modern humans. t_{h-n/d} denotes the time of the split between Neanderthals/Denisovans and modern humans; t_h represents the speciation of modern humans; t_{af} indicates the era of population expansion of modern humans out of Africa; t_g indicates the time of gene flow (f) that may have occurred between modern humans and Neanderthals/Denisovans; and t_n estimates the extinction of Neanderthals/Denisovans. The arrows indicate the out-of-Africa migration events of archaic and modern human populations. The broken lines indicate potential extinction of viral variants. Branch lengths and widths are not drawn to scale.
Table 5.
Estimation of divergence time (thousand years ago, kya) of HPV variants from the most recent common ancestor (MRCA).