Multi-omics data integration reveals metabolome as the top predictor of the cervicovaginal microenvironment
Fig 4
Metabolites (particularly xenobiotics, carbohydrates, amino acids and peptides) and the inflammatory cytokine MIF can accurately predict Lactobacillus dominance.
Integrated vaginal metabolome and immunoproteome profiles were used as predictive features to train cross-validated Random Forest classifiers that predict whether a subject’s vaginal microbiota is Lactobacillus dominant (LD; ≥ 80% relative abundance of Lactobacillus ASVs) or non-Lactobacillus dominant (NLD; < 80% relative abundance of Lactobacillus ASVs). The combined measurements predicted Lactobacillus dominance with an overall accuracy of 86.1%, a 1.6-fold improvement over baseline accuracy. Receiver operating characteristic (ROC) analysis shows the true and false positive rates for each group and indicates excellent predictive accuracy for both the LD (AUC = 0.93) and NLD (AUC = 0.93) groups (A). The confusion matrix shows the proportion of times samples from each group received the correct classification when the classifier was evaluated at a threshold of 0.5 (B). The graphs depict the 25 most strongly predictive features, ranked by their mean Gini importance score across all 10 trained classifiers, a measure of their overall contribution to classifier accuracy (C).
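The evaluation metrics named in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the labels and probabilities are hypothetical toy values, the function names are invented for illustration, and "baseline accuracy" is assumed to mean the accuracy of always predicting the most frequent class, which the caption does not state.

```python
# Toy labels and predicted LD probabilities (hypothetical values, not study data).
# 1 = Lactobacillus dominant (LD), 0 = non-LD (NLD).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
p_ld = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.45, 0.30, 0.20, 0.10]


def confusion_matrix(y_true, scores, threshold=0.5):
    """Row-normalised confusion matrix: outer key = true class, inner key = predicted class."""
    counts = {1: {1: 0, 0: 0}, 0: {1: 0, 0: 0}}
    for truth, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        counts[truth][predicted] += 1
    return {
        truth: {pred: n / sum(row.values()) for pred, n in row.items()}
        for truth, row in counts.items()
    }


def accuracy(y_true, scores, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the true label."""
    hits = sum((score >= threshold) == (truth == 1) for truth, score in zip(y_true, scores))
    return hits / len(y_true)


def baseline_accuracy(y_true):
    """Accuracy of always predicting the most frequent class (assumed definition)."""
    n_pos = sum(y_true)
    return max(n_pos, len(y_true) - n_pos) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


acc = accuracy(y_true, p_ld)
base = baseline_accuracy(y_true)
ratio = acc / base  # fold improvement over baseline
auc_ld = roc_auc(y_true, p_ld)
# Score the NLD class with the complementary probability 1 - p.
auc_nld = roc_auc([1 - t for t in y_true], [1 - p for p in p_ld])
matrix = confusion_matrix(y_true, p_ld)

print(f"accuracy={acc:.3f} baseline={base:.3f} ratio={ratio:.2f}")
print(f"AUC(LD)={auc_ld:.3f} AUC(NLD)={auc_nld:.3f}")
print(matrix)
```

On these toy values the accuracy is 0.9 against a baseline of 0.6, a 1.5-fold ratio. In a two-class problem where the NLD score is the complement of the LD probability, the two AUCs are necessarily equal, which is consistent with both groups reporting the same AUC in panel A.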