Fig 1.
Sequence and structure information on the Spike variants is extracted from the Exascalate4Cov Consortium Database. Structural analysis compares variants using the TM-score and the net charge of the Spike protein. Protein Contact Networks (PCNs) are used to detect communities, to find similarities among variants, and to evaluate residue centrality values. Sequence analysis computes distances among sequences and thus between variants.
Table 1.
SARS-CoV-2 variants, lineage classification and mutations on the Spike protein sequence.
Table 2.
Centrality measures definition.
Table 3.
P-values of the t-tests for the comparison of average node centrality values of mutated nodes in the Omicron1 variant only.
EC: Eigenvector Centrality; BC: Betweenness Centrality; KC: Katz Centrality; DC: Degree Centrality; CC: Closeness Centrality.
Table 4.
P-values of the t-tests for the comparison of average node centrality values of all nodes in the Omicron1 variant.
P-values were corrected for multiple testing, and corrected values below 0.05 were considered significant. EC: Eigenvector Centrality; BC: Betweenness Centrality; KC: Katz Centrality; DC: Degree Centrality; CC: Closeness Centrality.
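The caption does not say which multiple-testing correction was applied. As an illustration only, the sketch below assumes a Bonferroni correction, one common choice, and flags which corrected p-values fall below 0.05. The function names and the input values are hypothetical and are not taken from the tables.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a list of p-values (assumed method, not confirmed by the source)."""
    m = len(p_values)
    # Multiply each p-value by the number of tests and cap the result at 1.
    return [min(1.0, p * m) for p in p_values]


def significant(p_values, alpha=0.05):
    """Return True for each test whose adjusted p-value is below alpha."""
    return [p < alpha for p in bonferroni(p_values)]


if __name__ == "__main__":
    # Illustrative p-values only, one per centrality measure.
    raw = {"EC": 0.004, "BC": 0.03, "KC": 0.2, "DC": 0.009, "CC": 0.6}
    flags = significant(list(raw.values()))
    for (name, p), flag in zip(raw.items(), flags):
        print(name, p, "significant" if flag else "not significant")
```

A less conservative procedure such as Benjamini-Hochberg could be substituted without changing the calling code.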
Fig 2.
Centrality analysis pipeline: starting from a Spike protein variant, we compute the PCN.
We then calculate centrality values for each node and map them onto the protein structure using a color scale.
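The pipeline in this caption can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes one coordinate per residue (for instance the C-alpha atom), links residues closer than a hypothetical 8 Å cutoff, and computes degree and eigenvector centrality with a plain power iteration. The function names, the cutoff, and the toy coordinates are all assumptions.

```python
import math


def build_pcn(coords, cutoff=8.0):
    """Link residues i and j when their distance is at most `cutoff`.

    `coords` holds one (x, y, z) tuple per residue; the 8 angstrom cutoff
    is an assumed value, not one stated in the source.
    """
    adj = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is connected to."""
    n = len(adj)
    return {i: len(nb) / (n - 1) for i, nb in adj.items()}


def eigenvector_centrality(adj, iterations=200, tol=1e-9):
    """Power iteration on (A + I), normalised to unit Euclidean length.

    Adding the identity keeps the eigenvectors of A and avoids
    oscillation on bipartite graphs.
    """
    x = {i: 1.0 for i in adj}
    for _ in range(iterations):
        new = {i: x[i] + sum(x[j] for j in adj[i]) for i in adj}
        norm = math.sqrt(sum(v * v for v in new.values())) or 1.0
        new = {i: v / norm for i, v in new.items()}
        if sum(abs(new[i] - x[i]) for i in adj) < tol:
            return new
        x = new
    return x


if __name__ == "__main__":
    # Toy chain of five residues spaced 5 angstroms apart (not real data).
    toy_coords = [(5.0 * i, 0.0, 0.0) for i in range(5)]
    pcn = build_pcn(toy_coords)
    print(degree_centrality(pcn))
    print(eigenvector_centrality(pcn))
```

On the toy chain the central residue receives the highest eigenvector centrality. In a real analysis the resulting values would be written back per residue so that a molecular viewer can color the structure by them.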
Fig 3.
Node eigenvector centrality boxplots.
From top to bottom: (a) boxplots of the average centrality values of all nodes of the S proteins of the selected variants; (b) eigenvector centrality values of the nodes of the functional domain (Omicron1 variant); (c) eigenvector centrality values of the nodes of the RBD for all the selected variants.
Fig 4.
Amino acid eigenvector centrality values mapped onto the protein structure of the following Spike variants: a) Omicron1; b) Wild Type; c) Delta.
Eigenvector centrality values are shown on a color scale from blue (lower values) to red (higher values). The decrease in eigenvector centrality in the RBD of the Omicron1 variant appears as a larger number of blue nodes.
Table 5.
Confidence intervals (CI) of average centrality values (from left to right: Eigenvector, Betweenness, Katz, Degree, and Closeness).
Fig 5.
Starting with a Spike protein variant, for example the Omicron1 variant, we use the corresponding PDB file to compute a PCN.
Then, communities are identified and plotted by applying the Louvain community detection algorithm on the PCN. Finally, communities are mapped on the protein structure to relate communities and functional domains of the protein.
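The Louvain algorithm searches for a partition of the network that maximizes modularity. The sketch below does not implement Louvain itself; it only evaluates the modularity of a given partition, which is the quantity Louvain optimizes, on a toy graph. The function name, the adjacency-dict representation, and the toy graph are assumptions made for illustration.

```python
def modularity(adj, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    `adj` maps each node to the set of its neighbours; `communities` is an
    iterable of node collections that together cover every node once.
    """
    m = sum(len(nb) for nb in adj.values()) / 2  # total number of edges
    q = 0.0
    for community in communities:
        members = set(community)
        # Each intra-community edge is seen from both endpoints, hence / 2.
        intra = sum(1 for i in members for j in adj[i] if j in members) / 2
        degree_sum = sum(len(adj[i]) for i in members)
        q += intra / m - (degree_sum / (2 * m)) ** 2
    return q


if __name__ == "__main__":
    # Toy graph: two triangles joined by a single edge (not a real PCN).
    toy = {
        0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
        3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
    }
    print(modularity(toy, [{0, 1, 2}, {3, 4, 5}]))
    print(modularity(toy, [{0, 1, 2, 3, 4, 5}]))
```

Splitting the toy graph into its two triangles scores higher than keeping all nodes in one community, which is the kind of partition a modularity-maximizing method is expected to prefer.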
Fig 6.
Community detection analysis comparing Spike of the Wild Type, Delta, and Omicron1.
Communities mapped directly on the protein structure of a) Wild Type, b) Delta, and c) Omicron1 variant to visualize functional domains predicted by the Louvain algorithm.
Fig 7.
Community detection analysis comparison between a) Delta and b) Omicron1.
Visualization of the mutations and the communities on the protein structure. Mutated residues are displayed as red spheres. In Omicron1 and Omicron5, the mutations appear to fall inside communities that share the same function in the Spike protein. We found that Omicron1 has more than ten mutations that fall inside the same community.
Table 6.
Summary of communities and their mutations.
As some mutations do not belong to any community, we report them here as belonging to a virtual community referred to as (UnModelled).
Fig 8.
Structural, sequence, and network similarity results: a) heatmap of the structural similarity of variants computed from TM-scores; b) heatmap of the similarity between the PCNs of variants; c) sequence similarity represented as a phylogenetic tree. The figure shows no direct correlation between sequence changes and structural modifications. For instance, the Iota1 and Zeta variants have dissimilar sequences but very similar structures.
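For readers unfamiliar with the TM-score used in panel a), the sketch below evaluates the standard TM-score formula for one fixed superposition, given the distances between aligned residue pairs. A full TM-score additionally maximizes this value over superpositions, which is omitted here. The function names and the input distances are hypothetical.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 of the TM-score, in angstroms."""
    if target_length < 19:
        # Below this length the formula yields a non-positive d0;
        # this sketch does not handle that case.
        raise ValueError("target_length must be at least 19 in this sketch")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score_fixed(distances, target_length):
    """TM-score for one fixed superposition.

    `distances` holds the distance (angstroms) between each aligned residue
    pair; unaligned target residues contribute nothing to the sum.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Illustrative distances only, not derived from any Spike structure.
    print(tm_score_fixed([0.5, 1.0, 2.0, 4.0] * 25, target_length=100))
```

The score lies in (0, 1], and identical structures score 1, so the heatmap can be read directly as a similarity matrix.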
Fig 9.
RBD and NTD domain net charges for all the selected variants.
Table 7.
Net charge of the Spike protein variants in RBD and NTD domains.
Here e denotes the elementary charge, equal to 1.602 × 10⁻¹⁹ C.
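As a rough illustration of how a domain net charge in units of e can be estimated from sequence, the sketch below counts charged side chains, assuming neutral pH, with Lys and Arg counted as +1 and Asp and Glu as -1, and ignoring histidine, the termini, and pKa shifts. This simplified counting rule is an assumption and the source does not state how the reported charges were computed. The names and the example sequence are hypothetical.

```python
ELEMENTARY_CHARGE_C = 1.602e-19  # coulombs, value quoted in the caption

# Assumed side-chain charges at neutral pH (simplified model).
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(sequence):
    """Net charge of a one-letter amino acid sequence, in units of e."""
    return sum(SIDE_CHAIN_CHARGE.get(aa, 0) for aa in sequence.upper())


def to_coulombs(charge_in_e):
    """Convert a charge expressed in units of e to coulombs."""
    return charge_in_e * ELEMENTARY_CHARGE_C


if __name__ == "__main__":
    # Made-up peptide, not a real Spike fragment.
    peptide = "GKKRDAEL"
    q = net_charge(peptide)
    print(q, "e =", to_coulombs(q), "C")
```

Under this model a substitution such as a neutral residue replaced by lysine raises the net charge of the domain by one unit of e.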
Fig 10.
Worldwide number of COVID-19 vaccinations and confirmed cases.
Variants are shown as points at their first detection date.