Fig 1.
Illustration of the PROB framework for inferring the causal gene regulatory network from cross-sectional transcriptomic data.
(a) Illustration of cross-sectional transcriptomic data, taking three genes (i.e., A, B, and C) as an example. Each sample was labeled with staging information (e.g., S1, S2, S3, and S4). (b) Similarity graph-based random walk approach for cancer progression inference. A scale-free temporal progression distance (TPD) is defined by analytically summing the transition probability between patients over all random walk lengths. Patients are thus ordered according to the TPD with respect to the root identified with the aid of staging information. (c) The expression dynamics of each gene according to the latent-temporal progression are then recovered. (d) A Bayesian Lasso method is developed to infer the causal GRN based on the temporal data of gene expression. Unlike existing correlational network methods, PROB infers not only edge directions but also the signs of the interactions (activation or inhibition).
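The summation over all random walk lengths in panel (b) can be illustrated with a short sketch. The caption does not specify the kernel or normalization, so the following assumes a Gaussian similarity kernel with a hypothetical bandwidth `sigma`, a symmetrically normalized transition matrix, and removal of the stationary component so that the geometric series converges to a closed form. The function name is illustrative and this is not the PROB implementation.

```python
import numpy as np

def temporal_progression_distance(X, root, sigma=1.0):
    """Distance of every sample (row of X) to `root`, accumulated over all walk lengths.

    Illustrative sketch only: kernel, bandwidth and normalization are assumptions.
    """
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian similarity graph (assumed kernel).
    W = np.exp(-sq / (2.0 * sigma ** 2))
    d = W.sum(axis=1)
    # Symmetrically normalized transition matrix D^-1/2 W D^-1/2.
    T = W / np.sqrt(np.outer(d, d))
    # Remove the stationary component (eigenvalue 1) so the series converges.
    psi0 = np.sqrt(d / d.sum())
    T_tilde = T - np.outer(psi0, psi0)
    n = len(d)
    # Closed form of sum_{t>=1} T_tilde^t = (I - T_tilde)^-1 - I.
    M = np.linalg.inv(np.eye(n) - T_tilde) - np.eye(n)
    # Euclidean distance between each sample's row of M and the root's row.
    return np.linalg.norm(M - M[root], axis=1)
```

Given a root sample chosen with the aid of staging information, `np.argsort(temporal_progression_distance(X, root))` would give one possible ordering of the samples along the inferred progression.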
Fig 2.
Demonstrating the robustness of PROB using synthetic datasets with different levels of variability.
A set of expression data for 6 genes in 100 cancer patients was simulated. Different levels of technical variability (coefficients of variation (CVs) of 0%, 5%, 10% and 15%, respectively) were introduced into the progression-dependent gene expression dynamics. (a) Simulated cross-sectional gene expression data. The sample IDs of the synthetic data were randomized and the staging information was retained. (b) Comparison of the inferred latent-temporal progression with the true progression in the synthetic dataset, evaluated using Spearman’s rank correlation coefficient (rho). (c) Recovered gene expression dynamics according to the inferred progression trajectory. (d) Accuracy of the GRN inference, evaluated using the area under the ROC curve (AUC).
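The two evaluation metrics in panels (b) and (d) can be computed as in the following minimal sketch, which uses only the Python standard library. The function names are illustrative. The AUC is obtained from the rank-sum identity rather than by tracing the ROC curve explicitly, and edge labels are assumed to be coded as 0 (absent) or 1 (present).

```python
def rankdata(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    rx, ry = rankdata(x), rankdata(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    ranks = rankdata(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

For example, `spearman_rho(inferred_order, true_order)` would compare an inferred progression with the simulated one, and `roc_auc(true_edges, edge_scores)` would score predicted edge confidences against the known network.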
Fig 3.
Comparison of PROB with other existing pseudotime inference methods and GRN inference methods using a real dataset.
We employed a set of scRNA-seq data of dendritic cells (DCs) for benchmarking, since a gold standard is available for this dataset. The cells were sequenced at 1, 2, 4 and 6 h after LPS stimulation. (a) The estimated latent-temporal progression of cells recapitulated the real progression, with R² = 0.851 with respect to the capture times. (b) Benchmarking PROB against other pseudotime inference methods (Slice, Slicer, PhenoPath, Wishbone, PAGA, Monocle2, DPT, Tscan), evaluated by Kendall’s tau and R² (S4 Fig). (c) A TF network inferred by PROB. (d) Benchmarking PROB against eight existing GRN inference methods (PCOR, LASSO, GENIE3, ARACNe, CLR, MRNET, SCODE and LEAP) based on an experimentally defined TF network [37], evaluated by the AUC of the ROC. (e) PROB correctly revealed the ordering of the outgoing causality scores (on a log10 scale) for the known regulators and targets [38] on the DC scRNA-seq dataset. (f) Comparison of the capabilities of different methods in predicting network links, regulatory directions and signs, as well as gene expression dynamics.
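The agreement measures used in panels (a) and (b) can be sketched as follows. The caption does not state how R² was computed or which variant of Kendall's tau was used, so two assumptions are made here: R² is taken as the squared Pearson correlation (the R² of a simple linear fit), and the tau-b variant is used because capture times contain many ties. Function names are illustrative.

```python
import math

def r_squared(x, y):
    """Squared Pearson correlation between x and y (assumed definition of R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov * cov / (vx * vy)

def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties in either variable."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # pair tied in both variables contributes nothing
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    # Pairs untied in x times pairs untied in y, under the square root.
    denom = math.sqrt((untied + only_y_tied) * (untied + only_x_tied))
    return (concordant - discordant) / denom
```

Here `x` would hold the inferred pseudotime of each cell and `y` its capture time in hours.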
Fig 4.
Reconstructing EMT regulatory networks during bladder cancer progression.
(a) Expression patterns of the EMT regulatory genes along the inferred latent-temporal progression from conventional urothelial carcinoma (UC) to aggressive sarcomatoid urothelial bladder cancer (SARC). (b) UC-specific network with edges unique to the UC network. (c) SARC-specific network with edges unique to the SARC network. Different colors of nodes in the network denote genes in different pathways (S1 Table). (d) Reconstructed expression dynamics of ACSS1, PTPN12 and CDH1. ACSS1 and PTPN12 have the largest out-degree values in the UC-specific network and the SARC-specific network, respectively. CDH1 is a marker gene of the epithelial state during EMT. (e) A decrease in EMT score indicated a transition from the epithelial to the mesenchymal state during the progression of UC to SARC. The EMT score for each tumor sample was calculated as a weighted sum of the expression levels of 73 EMT-signature genes, as introduced in [39]. A positive EMT score corresponds to the epithelial phenotype and a negative score to the mesenchymal phenotype. A one-tailed Wilcoxon rank sum test p value was calculated to assess statistical significance.
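The scoring and testing steps in panel (e) can be illustrated with a minimal sketch. The 73 signature genes and their weights come from [39] and are not reproduced in the caption, so the two-gene weight table in the usage note is a hypothetical toy. The rank sum test uses the normal approximation without a tie correction, which is an assumption made for brevity.

```python
import math

def emt_score(expression, weights):
    """Weighted sum of signature-gene expression levels for one sample."""
    return sum(w * expression[gene] for gene, w in weights.items())

def rank_sum_p_greater(x, y):
    """One-tailed Wilcoxon rank sum p value for 'x tends to be larger than y'.

    Normal approximation, no tie correction in the variance (assumed here).
    """
    n1, n2 = len(x), len(y)
    # Mann-Whitney U: number of (x, y) pairs with x > y, ties counted as half.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    # Upper-tail probability of a standard normal at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

With toy weights such as `{"CDH1": 1.0, "VIM": -1.0}` (hypothetical values, not those of [39]), a sample with high CDH1 and low VIM receives a positive score. Passing the UC scores as `x` and the SARC scores as `y` tests whether the score decreases from UC to SARC.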
Fig 5.
Experimental validation of the predicted role of ACSS1 in EMT of bladder cancer.
(a-b) Expression levels of ACSS1 and CDH1 in 5637 cells when ACSS1 was overexpressed (a) or inhibited (b), measured by qPCR. (c) Protein expression levels of ACSS1 and CDH1 in 5637 cells when ACSS1 was overexpressed or inhibited, measured by Western blotting. (d) Quantification of the relative protein expression. (e) Examples of immunohistochemical expression of ACSS1 and E-cadherin in conventional UC and SARC. Statistical significance was assessed by Student’s t test. **P<0.01; ***P<0.001; ****P<0.0001. OE-ACSS1: overexpression of ACSS1; si-NC: small interfering RNA negative control; si-ACSS1: small interfering RNA targeting ACSS1.
Fig 6.
FOXM1 was revealed as a key gene underlying breast cancer progression by PROB.
The gene expression data of 196 patients with clinical information (e.g., grade) were extracted from the GEO database (GSE7390 [40]). (a) Heatmap showing the expression profile of 100 selected genes that were most sustainably ascending (blue group) or descending (purple group) during cancer progression. (b) Gene set enrichment analysis for the descending genes (upper panel) and ascending genes (lower panel). The descending genes were enriched in local movement processes, and the ascending genes were mainly enriched in cell cycle and cell division processes. (c) The inferred GRN for the 100 genes. FOXM1 was found to be a hub gene in the network. (d-f) Clinical relevance of FOXM1 for breast cancer patients with respect to distant metastasis-free survival (DMFS) (d), relapse-free survival (RFS) (e) and overall survival (OS) (f). (g) Significance test of the prognostic power of FOXM1 using a bootstrapping approach. The p value from the permutation test was 0.0146, verifying the statistical significance of the prognostic power of FOXM1.
Fig 7.
Validation of the predicted FOXM1 subnetwork.
(a) The subnetwork of FOXM1 with predicted target genes. (b) Validation of the expression changes of the predicted target genes of FOXM1 with perturbation experiments. MCF-7 cells were treated with DMSO (control) or thiostrepton (a FOXM1 inhibitor) for 48 hours. Except for SCCBP1 and STIL, the other 6 genes were significantly down-regulated after FOXM1 inhibition. (c-e) ChIP-seq analysis of FOXM1 in the MCF-7 cell line with four biological replicates, showing that FOXM1 binds ASPM, CDCA8 and KIF2C. (f-h) ChIP-seq analysis of FOXM1 in the MDA-MB-231 cell line with two biological replicates, showing that FOXM1 binds ASPM, CDCA8 and KIF2C.