Unveiling gene perturbation effects through gene regulatory networks inference from single-cell transcriptomic data
Fig 3
IGNITE predictions of single and triple KO perturbations in mPSCs and comparison with experimental benchmarks.
A. Scaled KO–WT difference for Rbpj, Etv5, and Tcf7l1 single knockouts, computed from experimental data [39] and from simulations with IGNITE, SCODE, and CellOracle. For each gene, the simulated scaled KO–WT difference was calculated as the scaled difference between the average fraction of active cells in the knockout and wild-type conditions. For the experimental data, scaled log2FC values from [39] were used. All quantities were scaled between −1 and +1 to facilitate comparison across datasets (see Methods for details). B. Scaled KO–WT difference for the triple knockout, computed from experimental data [38] and from simulations with IGNITE, SCODE, and CellOracle. The simulated and experimental values were calculated and scaled as in A, with the experimental scaled log2FC values taken from [38]. C. Hierarchical clustering of the IGNITE-generated triple KO gene activity (GA), performed with Ward’s method. Each row represents a gene and each column an individual cell. Color indicates inactive (−1, yellow) or active (+1, blue) gene activity. A total of 9547 cells were simulated. D. PCA scatter plot of the GA of the input dataset (scRNA-seq data with LogNorm, PST, and MB) and the IGNITE-generated triple KO GA. The color gradient represents the cell density within each square area, with separate scales for the input cells (blue) and the generated triple KO cells (orange). E. Spearman’s correlation between simulated and experimental scaled KO–WT differences for Rbpj, Etv5, Tcf7l1, and the triple KO. Results are based on the 10 GRNs inferred with IGNITE that had the lowest CMD values (out of 250 tested models).
Boxes indicate the interquartile range, the horizontal line marks the median, and whiskers represent the full range of non-outlier data.
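The quantities in panels A, B, and E can be illustrated with a minimal sketch. It is not the authors' implementation: the exact scaling is described in the Methods, and dividing by the largest absolute difference is an assumption made here only to map the values into [−1, +1]. The function names, the toy GA matrices, and the "experimental" values are all hypothetical, and the Spearman helper assumes no tied values.

```python
import numpy as np

def active_fraction(ga):
    """Per-gene fraction of active cells; ga is genes x cells with entries in {-1, +1}."""
    return (np.asarray(ga) == 1).mean(axis=1)

def scaled_ko_wt_difference(ga_wt, ga_ko):
    """KO - WT difference in active fraction, scaled into [-1, +1].

    Assumption: scaling divides by the largest absolute difference;
    the paper's Methods may define the scaling differently.
    """
    diff = active_fraction(ga_ko) - active_fraction(ga_wt)
    max_abs = np.max(np.abs(diff))
    return diff / max_abs if max_abs > 0 else diff

def spearman_no_ties(x, y):
    """Spearman correlation as the Pearson correlation of ranks (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]

# Hypothetical toy data: 3 genes x 4 cells, binarized gene activity.
ga_wt = np.array([[1, 1, 1, -1],
                  [-1, -1, 1, 1],
                  [1, -1, -1, -1]])
ga_ko = np.array([[-1, -1, -1, -1],
                  [1, 1, 1, 1],
                  [1, 1, -1, -1]])

simulated = scaled_ko_wt_difference(ga_wt, ga_ko)

# Hypothetical scaled experimental log2FC values, for illustration only.
experimental = np.array([-0.9, 1.0, 0.2])

rho = spearman_no_ties(simulated, experimental)
```

In this toy case the simulated and experimental values rank the three genes in the same order, so the rank correlation is 1. With real data containing ties, a rank routine that averages tied ranks would be needed instead.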