Table 1.
Type-specific genomic regions and gene content.
Fig 1.
Phylogenetic tree of all M. pneumoniae isolates in the current study (n = 107).
Maximum likelihood phylogenetic tree of 107 M. pneumoniae isolates based on core protein sequences (n = 464) identified through orthologous clustering. Bootstrap values above 50 are shown on the tree.
Fig 2.
Phylogenetic tree of closed M. pneumoniae genomes in the current study (n = 34).
Maximum likelihood phylogenetic tree of 34 M. pneumoniae isolates based on core protein sequences (n = 642) identified through orthologous clustering. Bootstrap values above 50 are shown on the tree.
Fig 3.
Clusters of M. pneumoniae isolates sharing unique SNPs.
(A) Number of shared unique SNPs in isolate clusters ranging from 1 to 106 isolates, relative to reference genome FH. Only the group of isolates sharing the largest number of SNPs is shown. (B) Number of shared SNPs in each subtype relative to type 2 reference FH, identified among all genomes (black bars) or closed genomes only (grey bars). *No closed genomes were available for Type 1N.
Fig 4.
Hierarchical Bayesian Analysis of Population Structure (hierBAPS) of type 1 and type 2 groups separately.
Three subpopulations were identified in (A) Type 1 and (B) Type 2 genome groups. Green, T; blue, A; red, G; yellow, C.
Fig 5.
High-importance features for classification of M. pneumoniae subtypes based on a Random Forest model.
(A) Representative plot of the mean decrease in Gini value for the top 50 feature lists. Each list consisted of ≥ 1 variant position. (B) Heatmap of presence/absence of the 25 features with the highest relative importance for separating the six subtypes across all iterations of the model (n = 40). Other variant sites also contributed to the final model.
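As a reading aid for panel (A), the following minimal sketch shows how a Gini decrease can be computed for a single presence/absence feature at one split. It is not the authors' pipeline. A Random Forest's mean decrease in Gini aggregates quantities of this kind over the splits and trees in which a feature is used. The subtype labels (`S1`–`S3`), feature names, and toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(presence, labels):
    """Impurity decrease from splitting isolates on one binary feature."""
    present = [y for x, y in zip(presence, labels) if x]
    absent = [y for x, y in zip(presence, labels) if not x]
    n = len(labels)
    # Child impurities are weighted by the fraction of isolates in each child.
    weighted = (len(present) / n) * gini(present) + (len(absent) / n) * gini(absent)
    return gini(labels) - weighted


# Hypothetical toy data: six isolates assigned to three made-up subtypes.
subtypes = ["S1", "S1", "S2", "S2", "S3", "S3"]
features = {
    "variant_x": [1, 1, 1, 1, 0, 0],  # separates S3 from S1 and S2
    "variant_y": [1, 0, 1, 0, 1, 0],  # carries no subtype information
}

scores = {name: gini_decrease(col, subtypes) for name, col in features.items()}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In this toy case the informative variant yields a positive decrease and the uninformative one yields zero, which is the kind of contrast that the ranking in panel (A) summarizes.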