Fig 1.
Continent-level SARS-CoV-2 lineage dynamics and pandemic curves.
Lines show a 14-day rolling average of reported SARS-CoV-2 cases. Bars show the biweekly proportions of common lineages and are coloured by lineage. The white space shows the proportion of sequences from other (non-majority) lineages.
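The two quantities plotted in this figure can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the rolling window is assumed to be trailing (the caption does not say whether it is trailing or centred), and the input is assumed to be a plain list of daily case counts and a list of lineage labels for one biweekly bin.

```python
from collections import Counter


def rolling_mean(values, window=14):
    """Trailing rolling mean (assumed alignment).

    The first window-1 positions average over the shorter history available.
    """
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that has just left the window.
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


def lineage_proportions(lineages, common):
    """Share of sequences per common lineage within one biweekly bin.

    Whatever is not assigned to a common lineage is reported as 'other',
    which corresponds to the white space in the bars.
    """
    total = len(lineages)
    if total == 0:
        return {name: 0.0 for name in common} | {"other": 0.0}
    counts = Counter(lineages)
    props = {name: counts[name] / total for name in common}
    props["other"] = 1.0 - sum(props.values())
    return props
```

For example, `lineage_proportions(["B.1.1.7", "B.1.1.7", "B.1.617.2", "X"], ["B.1.1.7", "B.1.617.2"])` assigns half of the bin to B.1.1.7 and a quarter each to B.1.617.2 and "other".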
Fig 2.
Association of SARS-CoV-2 infection rates and predictor variables globally.
A. Pearson’s correlation matrix of infection rate and predictor variables. Positive correlations are shown in orange and negative correlations in blue; colour intensity scales with the magnitude of the coefficient. B. Model fitting using multiple linear regression. Black solid lines show a 14-day rolling average of adjusted SARS-CoV-2 cases. Pink solid lines show fitted mean response values of infection rates with predictor values as input.
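The two analyses named in this caption can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the regression is assumed to be ordinary least squares with an intercept (the caption says only "multiple linear regression"), and the normal equations are solved by plain Gauss-Jordan elimination, which is adequate for a handful of well-conditioned predictors but is not what a statistics package would typically use.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """columns: {name: values}. Returns a nested dict of pairwise coefficients."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def fit_ols(X, y):
    """Least-squares fit with intercept; returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y).
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(M[k][c]))
        M[c], M[piv] = M[piv], M[c]
        pivot_value = M[c][c]
        M[c] = [v / pivot_value for v in M[c]]
        for r in range(p):
            if r != c:
                factor = M[r][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [M[i][p] for i in range(p)]


def predict(coefs, X):
    """Fitted mean response for each row of predictor values."""
    return [coefs[0] + sum(b * v for b, v in zip(coefs[1:], r)) for r in X]
```

The values returned by `predict` play the role of the fitted curve in panel B, with one row of predictor values per time point.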
Fig 3.
Mutational signatures extracted from the SARS-CoV-2 genome sequences by non-negative matrix factorisation.
Signatures are patterns of probabilities for each category of substitution in a three-nucleotide context. Each bar represents a context and is coloured by the substitution category of the mutation that occurs there. Each signature may represent a distinct mutational process. Signature 1 is heavily biased towards cytosine to thymine (C→T) mutations, particularly in 3’ CpG contexts TCG, CCG and ACG. Signature 2 from SARS-CoV-2 is predominantly adenine to guanine (A→G), guanine to adenine (G→A) and thymine to cytosine (T→C) mutations. Signature 3 is strongly guanine to thymine (G→T), a pattern that is thought to be caused by oxidation of guanine by reactive oxygen species. Signatures are shown normalised against the tri-nucleotide composition of the SARS-CoV-2 genome. Non-normalised forms in the context of the SARS-CoV-2 genome composition are shown in S5 Fig.
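One plausible way to normalise a signature against tri-nucleotide composition is sketched below. The caption does not give the exact procedure, so this is an assumption: each context probability is divided by the genome-wide frequency of its tri-nucleotide and the result is rescaled to sum to 1. The function names and the `(context, alt_base)` key layout are hypothetical.

```python
from collections import Counter


def trinucleotide_counts(genome):
    """Count overlapping tri-nucleotides, skipping any that contain ambiguous bases."""
    genome = genome.upper()
    valid = set("ACGT")
    return Counter(
        genome[i:i + 3]
        for i in range(len(genome) - 2)
        if set(genome[i:i + 3]) <= valid
    )


def normalise_signature(signature, tri_counts):
    """Rescale a signature by how often each context occurs in the genome.

    signature: {(trinucleotide_context, alt_base): probability}
    Contexts absent from the genome are given weight 0 (an assumption).
    """
    total_tri = sum(tri_counts.values())
    weighted = {}
    for (context, alt), prob in signature.items():
        freq = tri_counts.get(context, 0) / total_tri
        weighted[(context, alt)] = prob / freq if freq > 0 else 0.0
    scale = sum(weighted.values())
    if scale == 0:
        return weighted
    return {key: value / scale for key, value in weighted.items()}
```

With this scheme, a context that is rare in the genome but mutated as often as a common one receives a proportionally taller bar, which is the point of showing normalised and non-normalised forms side by side.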
Fig 4.
Signature exposure plots showing the activities of the extracted mutation signatures over the duration of the COVID-19 pandemic.
A. Percentage activity of each signature in a given week of the pandemic, with each colour representing a different signature. B. Absolute signature activities at each epidemic week.
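The relationship between the two panels can be illustrated with a short sketch: panel B corresponds to the absolute exposures, and panel A to the same values rescaled so that each week sums to 100%. The function name and the nested-dictionary layout are hypothetical, and weeks with no activity are assumed to map to 0% for every signature.

```python
def percentage_activity(exposures):
    """Convert absolute exposures to within-week percentages.

    exposures: {week: {signature: absolute_exposure}}
    """
    result = {}
    for week, by_signature in exposures.items():
        total = sum(by_signature.values())
        result[week] = {
            sig: (100.0 * value / total if total > 0 else 0.0)
            for sig, value in by_signature.items()
        }
    return result
```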
Fig 5.
A. Counts of unique SARS-CoV-2 mutations for each epidemic week, with colours representing which continent the mutations came from. B. Counts of unique mutations per week that are part of the mutational signature substitution-context features (i.e., no indel mutations included). Colours represent which lineage/group of lineages the mutations belong to. C. Ridgeline plot showing the exposure of mutational signatures in SARS-CoV-2 variant-defined subsets. Exposures are coloured by the signature they have been attributed to. D. Ridgeline plot showing the exposure of mutational signatures in SARS-CoV-2 continent-defined subsets.
Fig 6.
A. Exposures for each of the SARS-CoV-2 mutational signatures for both synonymous and non-synonymous stratified datasets. Synonymous exposures are below 0 on the y-axis, while non-synonymous exposures are above 0. Each area represents signature exposures across epidemic weeks, with colours representing which signature the exposures are attributed to. B. Non-synonymous and synonymous mutations in the tree-based references of identified variants of concern. Signature 1 produces the majority of both synonymous and non-synonymous substitutions in all lineages. Signature 3 mutations are more often non-synonymous substitutions in the lineages of concern, with most lineages having few to no changes. Signature 2 non-synonymous mutations appear to have increased in the Omicron lineages (BA.1 and BA.2). C. Non-synonymous mutations associated with variants of concern, coloured by the mutational signature with the greatest likelihood of causing the change. D. Synonymous mutations in variants of concern, coloured by the putative mutational process that caused the change.
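The synonymous/non-synonymous stratification used throughout this figure rests on a simple test: does the substitution change the encoded amino acid? A minimal sketch using the standard genetic code follows. The function name is hypothetical, the codon is assumed to be in frame and free of ambiguous bases, and this is not necessarily how the authors annotated their mutations.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(codon, position, alt):
    """Classify a single-base change within one codon.

    codon: reference codon (DNA or RNA alphabet)
    position: 0-based index of the changed base within the codon
    alt: the substituted base
    """
    codon = codon.upper().replace("U", "T")
    alt = alt.upper().replace("U", "T")
    mutated = codon[:position] + alt + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "non-synonymous"
```

For instance, a T→C change at the third position of GAT leaves aspartate unchanged and is synonymous, whereas a G→A change at the first position turns it into asparagine and is non-synonymous.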
Fig 7.
A. Signature exposures per month from wastewater sequences show similar trends in mutational processes to the global data, although at a lower resolution and, interestingly, with a lower Signature 2 exposure. B. Substitutions in SARS-CoV-2 consensus sequences from infections of immunocompromised individuals contain mutation types corresponding to the patterns observed in the distinct signatures. There are more synonymous mutations present in the chronic infection data than in the global sequences, although the sample size for immunocompromised infections is low. C. Mutation counts in wastewater sequences for six-month time periods. Highly mutated sequences cluster to the right, especially during July-December 2021, as would be expected when Omicron was emerging.