Single-cell network biology characterizes cell type gene regulation for drug repurposing and phenotype prediction in Alzheimer’s disease
Fig 6
AD gene prioritization and clinical phenotype prediction using network-based machine learning.
(A) Boxplots showing the distribution of balanced accuracies (y-axis; obtained from 10 independent runs of five-fold cross-validation) in predicting known AD genes using interaction patterns in cell type GRNs as features (x-axis). (B) Gene ontology biological process terms enriched within the top 20% of predictions in the microglia AD machine learning (ML) model. The terms are depicted along the y-axis, and the FDR-corrected p-values are shown along the x-axis. (C) Genes were sorted according to their probability of being associated with AD in the microglia ML model, and the top 5% of the sorted list was used as features to predict AD phenotypes in an independent dataset (ROSMAP). The boxplots show the distribution of balanced accuracies (y-axis) obtained from testing four AD phenotypes (see Methods) and a set of randomly selected samples (x-axis). (D) Average feature importance scores of TFs at the three hierarchy levels in the microglia AD network. (E) Visualization of the subnetwork connecting the top 10% of TFs with the highest feature importance scores in the microglia AD network. Each grey node depicts a TF, with border color set along a red gradient according to the disease-gene association score given in the DisGeNET database (based on preliminary evidence collected from independent studies). (F) Feed-forward loops observed within top-ranked TFs in the microglia ML model (red line) and their distribution in 1000 random networks (grey bars). (G) Regulatory logics observed within top-ranked TFs in microglia. (H) Enrichment of top-ranked genes within coregulated gene modules in microglia (*permutation p-value < 0.001).
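
The comparison in panel F can be illustrated with a minimal sketch. It counts feed-forward loops (a regulates b, b regulates c, and a also regulates c) in a directed TF network and compares the count with 1000 random networks. The caption does not state how the random networks were generated, so the null model here (random directed networks with the same nodes and the same number of edges) is an assumption, as are the function names `count_ffl`, `random_edges`, and `ffl_empirical_p` and the toy edge list.

```python
import random


def count_ffl(edges):
    """Count feed-forward loops: ordered triples (a, b, c) with a->b, b->c, a->c."""
    succ = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            succ.setdefault(u, set()).add(v)
    total = 0
    for a, targets in succ.items():
        for b in targets:
            # c must be a target of both a and b
            total += len(targets & succ.get(b, set()))
    return total


def random_edges(nodes, m, rng):
    """Assumed null model: m distinct directed edges drawn uniformly, no self-loops."""
    edges = set()
    while len(edges) < m:
        u, v = rng.sample(nodes, 2)  # two distinct nodes
        edges.add((u, v))
    return edges


def ffl_empirical_p(edges, n_random=1000, seed=0):
    """Return the observed FFL count, the null counts, and an empirical p-value."""
    rng = random.Random(seed)
    edges = {(u, v) for u, v in edges if u != v}
    nodes = sorted({n for e in edges for n in e})
    observed = count_ffl(edges)
    null = [count_ffl(random_edges(nodes, len(edges), rng)) for _ in range(n_random)]
    # Add-one correction keeps the estimate strictly above zero.
    p = (1 + sum(c >= observed for c in null)) / (n_random + 1)
    return observed, null, p


# Hypothetical toy network, not taken from the study.
toy = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("A", "D")]
observed, null, p = ffl_empirical_p(toy)
```

In this sketch `observed` plays the role of the red line and `null` the grey bars. A degree-preserving rewiring scheme would be a stricter null model than the uniform one assumed here.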