Table 1.

Start and end nucleotide positions (genomic coordinates) for early (E1–E7) and late (L1–L2) genes for each HPV genotype analyzed in this study. Gene boundaries were obtained from reference genomes curated in the Papillomavirus Episteme (PaVE) database and were used to map gene regions onto multiple sequence alignments for entropy-based variability analyses. “N.A.” indicates that the gene was not annotated or not present in the corresponding reference genome.
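
The coordinate mapping described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `gene_span_in_alignment`, the assumption that coordinates are 1-based and inclusive, and the use of `-` as the gap character are all hypothetical choices made for the example.

```python
def gene_span_in_alignment(aligned_ref, start, end, gap="-"):
    """Map reference coordinates (assumed 1-based, inclusive) to
    0-based alignment columns, using the gapped reference row of the MSA.

    Returns (first_col, last_col), or None when the gene is not
    annotated ("N.A."), which is represented here as start/end = None.
    """
    if start is None or end is None:
        return None
    first = last = None
    ref_pos = 0  # number of ungapped reference bases seen so far
    for col, base in enumerate(aligned_ref):
        if base == gap:
            continue  # gap columns do not advance the reference coordinate
        ref_pos += 1
        if ref_pos == start:
            first = col
        if ref_pos == end:
            last = col
            break
    if first is None or last is None:
        raise ValueError("gene coordinates fall outside the reference sequence")
    return first, last


# Toy gapped reference row; real input would be the reference genome
# as it appears in the multiple sequence alignment.
aligned = "AC--GTA-C"
span = gene_span_in_alignment(aligned, 2, 5)
```

Gap columns that fall between the first and last mapped base stay inside the returned span, so insertions relative to the reference are still counted as part of the gene region.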


Fig 1.

Shannon entropy (H) distributions across genome positions for each HPV genotype.

Most sites cluster at low H (high intragenotype conservation), with sparse right-tail bins indicating variable regions. These profiles provide a baseline to identify conserved genomic regions and variable windows that may be informative for genotype differentiation.
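
For readers who want to reproduce a profile of this kind, per-position Shannon entropy can be sketched as below. The caption does not state how H was scaled or how gaps were handled, so two assumptions are made here: entropy is divided by log2(4) so that it ranges from 0 to 1 for nucleotides, and gaps and ambiguous bases are ignored. The function names are illustrative.

```python
import math
from collections import Counter


def column_entropy(column, alphabet="ACGT"):
    """Normalized Shannon entropy of one alignment column.

    Assumptions: characters outside `alphabet` (gaps, ambiguity codes)
    are ignored, and H is divided by log2(len(alphabet)) to lie in [0, 1].
    Returns None for a column with no countable bases.
    """
    counts = Counter(b for b in column.upper() if b in alphabet)
    total = sum(counts.values())
    if total == 0:
        return None
    h = sum((n / total) * math.log2(total / n) for n in counts.values())
    return h / math.log2(len(alphabet))


def entropy_profile(sequences):
    """Entropy for every column of equal-length aligned sequences."""
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# Toy alignment of four sequences.
profile = entropy_profile(["ACGT", "ACGA", "ACTC", "AGTG"])
```

A fully conserved column gives H = 0, and a column in which all four bases are equally frequent gives H = 1 under this normalization.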


Fig 2.

Heatmap of mean Shannon entropy (Mean H; 0–1) by gene (E2, E4, E5, E6, E7, L1, L2) and HPV genotype.

Each cell shows the average H computed over all positions annotated for that gene within the genotype. Color scale: low values (purple) = higher conservation; high values (green/yellow) = greater variability. White cells indicate missing data/lack of gene annotation.
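
One cell of such a heatmap could be computed roughly as follows. This is a sketch under assumptions: `profile` is a list of per-position H values for one genotype, `gene_spans` maps gene names to inclusive 0-based column spans, and an unannotated gene is stored as None (the white cells). None of these names come from the original study.

```python
from statistics import mean


def mean_entropy_by_gene(profile, gene_spans):
    """Average H over the positions of each gene for one genotype.

    A gene whose span is None (not annotated) yields None, as does a
    gene whose positions all lack an entropy value.
    """
    result = {}
    for gene, span in gene_spans.items():
        if span is None:
            result[gene] = None
            continue
        first, last = span
        values = [h for h in profile[first:last + 1] if h is not None]
        result[gene] = mean(values) if values else None
    return result


# Toy profile and spans for a single genotype.
row = mean_entropy_by_gene(
    [0.0, 0.5, 1.0, 0.0, None, 0.2],
    {"E6": (0, 2), "L1": (3, 5), "E5": None},
)
```

Repeating this for each genotype gives one heatmap row per genotype and one column per gene.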


Fig 3.

Inter-genotype metrics per gene based on Shannon entropy.

Rows = genes (E2, E4, E5, E6, E7, L1, L2); columns = metrics: Mean H (mean entropy across all annotated positions of the gene across genotypes); Median H (median entropy); IQR H (interquartile range, Q3 − Q1), which quantifies the central dispersion of H and is robust to extreme values; % Conserved (percentage of positions with H = 0); and % High (percentage of positions with H > 0.5). The color scale is normalized by column (0–1); the numerical labels show the raw (unnormalized) values. In this summary, high Mean/Median H values together with high % High and low % Conserved indicate genes with inter-genotype divergence, whereas low Mean/Median H values together with high % Conserved identify genes that are relatively conserved across genotypes.
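
The five metrics and the column-wise color scaling can be sketched as below. Two points are assumptions rather than details from the caption: quartiles are computed with the "inclusive" method of `statistics.quantiles`, and the 0–1 normalization is taken to be min-max scaling within each metric column. Function names are illustrative.

```python
from statistics import mean, median, quantiles


def gene_metrics(values, high=0.5):
    """Summary metrics for the per-position H values of one gene."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    n = len(values)
    return {
        "mean_h": mean(values),
        "median_h": median(values),
        "iqr_h": q3 - q1,
        "pct_conserved": 100 * sum(h == 0 for h in values) / n,
        "pct_high": 100 * sum(h > high for h in values) / n,
    }


def normalize_columns(table):
    """Min-max scale each metric across genes (assumed scheme).

    `table` maps gene -> {metric: raw value}; a constant column maps to 0.0.
    """
    metrics = next(iter(table.values())).keys()
    scaled = {gene: {} for gene in table}
    for m in metrics:
        column = [row[m] for row in table.values()]
        lo, hi = min(column), max(column)
        for gene, row in table.items():
            scaled[gene][m] = (row[m] - lo) / (hi - lo) if hi > lo else 0.0
    return scaled


# Toy per-position entropies for two genes.
raw = {
    "E6": gene_metrics([0.0, 0.0, 0.2, 0.6, 1.0]),
    "L1": gene_metrics([0.0, 0.0, 0.0, 0.1, 0.1]),
}
colors = normalize_columns(raw)
```

Keeping `raw` and `colors` separate mirrors the figure, where colors come from the normalized values while cell labels show the raw ones.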


Fig 4.

Intergenotype variability per gene in the consensus MSA.

(A) Percentage of positions by Shannon entropy category (Conserved: H = 0; Intermediate: 0.1–0.5; High: H > 0.5) for each gene (E2, E4, E5, E6, E7, L1, L2). A higher proportion of “High” indicates greater divergence between genotypes. (B) Mean entropy (0–1) per gene, with points labeled by their corresponding values. Early genes (E5, E6, and E7) show the highest means and therefore greater variability, whereas capsid genes (L1 and L2) display the lowest mean entropy values, indicating greater relative conservation.
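
The binning in panel A can be sketched as follows. The caption gives the Intermediate range as 0.1–0.5 and does not say how values between 0 and 0.1 are treated; this example assumes, purely for illustration, that every nonzero H up to 0.5 counts as Intermediate. The function names are hypothetical.

```python
from collections import Counter

CATEGORIES = ("Conserved", "Intermediate", "High")


def categorize(h, high=0.5):
    """Assign one position to an entropy category.

    Assumption: any 0 < H <= `high` is treated as Intermediate.
    """
    if h == 0:
        return "Conserved"
    if h > high:
        return "High"
    return "Intermediate"


def category_percentages(values):
    """Percentage of positions of one gene falling in each category."""
    counts = Counter(categorize(h) for h in values)
    n = len(values)
    return {c: 100 * counts[c] / n for c in CATEGORIES}


# Toy per-position entropies for one gene.
shares = category_percentages([0.0, 0.0, 0.3, 0.5, 0.9])
```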


Fig 5.

Maximum likelihood tree inferred from the alignment of consensus sequences of the HPV genotypes analyzed, representing their overall sequence divergence.

The genotypes in the tree were grouped by hierarchical clustering (Ward.D2) into four main clusters (colors) to contextualize intergenotypic variability. Cluster 1 (green): HPV 156 genotype, showing the greatest basal divergence. Cluster 3 (blue): the densest and most recent group, comprising closely related oncogenic genotypes (HPV 16, 31, 33, 35, 52, 58, 67, 73). Clusters 2 (orange) and 4 (pink): intermediate groups containing the remaining oncogenic genotypes. Given the evolutionary proximity between high-risk genotypes, this phylogenetic structure supports the need for highly specific differential markers.
