Fig 1.
Schematics of the random circuit perturbation (RACIPE) method.
(A) The gene regulatory network for a specific cellular function is decomposed into two parts–a core gene circuit modeled by chemical rate equations, and the remaining peripheral genes, whose contribution to the network is treated as random perturbations to the kinetic parameters of the core circuit; (B) RACIPE generates an ensemble of models, each of which is simulated with the same rate equations but with randomly sampled kinetic parameters. For each model, multiple simulations are run from different initial conditions to identify all possible stable steady states; (C) The in silico gene expression data derived from all of the models are subjected to statistical analysis.
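The workflow in panels (A) to (C) can be sketched for a two-gene toggle switch as follows. This is a minimal illustration, not the authors' implementation: the function names, the forward-Euler integrator, the shifted Hill form of the regulation term, and every parameter range are assumptions made for the example (the actual ranges are those of S1 Table, which is not reproduced here).

```python
import random

def shifted_hill(x, x0, n, lam):
    """Shifted Hill function: close to 1 for x << x0 and close to lam for x >> x0."""
    return lam + (1.0 - lam) / (1.0 + (x / x0) ** n)

def sample_model(rng):
    """One random parameter set for the circuit A -| B, B -| A.
    All ranges below are illustrative assumptions, not the values of S1 Table."""
    return {
        "gA": rng.uniform(1.0, 100.0),    # maximum production rates
        "gB": rng.uniform(1.0, 100.0),
        "kA": rng.uniform(0.1, 1.0),      # degradation rates
        "kB": rng.uniform(0.1, 1.0),
        "thrA": rng.uniform(2.0, 180.0),  # threshold of A for the link A -| B
        "thrB": rng.uniform(2.0, 180.0),  # threshold of B for the link B -| A
        "nA": rng.randint(1, 6),          # Hill coefficients
        "nB": rng.randint(1, 6),
        "lamA": 1.0 / rng.uniform(1.0, 100.0),  # inhibition: lam = 1 / fold change
        "lamB": 1.0 / rng.uniform(1.0, 100.0),
    }

def integrate(p, a, b, dt=0.02, steps=5000):
    """Forward-Euler integration of the two rate equations for a fixed time span."""
    for _ in range(steps):
        da = p["gA"] * shifted_hill(b, p["thrB"], p["nB"], p["lamB"]) - p["kA"] * a
        db = p["gB"] * shifted_hill(a, p["thrA"], p["nA"], p["lamA"]) - p["kB"] * b
        a, b = a + dt * da, b + dt * db
    return a, b

def stable_states(p, rng, n_init=10, rel_tol=0.01):
    """Simulate from several random initial conditions and keep distinct end points."""
    found = []
    for _ in range(n_init):
        a0 = rng.uniform(0.0, p["gA"] / p["kA"])
        b0 = rng.uniform(0.0, p["gB"] / p["kB"])
        a, b = integrate(p, a0, b0)
        is_new = all(
            abs(a - fa) > rel_tol * max(a, fa) or abs(b - fb) > rel_tol * max(b, fb)
            for fa, fb in found
        )
        if is_new:
            found.append((a, b))
    return found

def racipe_sketch(n_models=20, seed=0):
    """Ensemble of models: same equations, randomly sampled parameters."""
    rng = random.Random(seed)
    return [stable_states(sample_model(rng), rng) for _ in range(n_models)]
```

The returned list holds, for each model, the distinct end points reached from the sampled initial conditions; pooling these across models gives the in silico expression data that panel (C) passes to statistical analysis. A fixed integration time is the simplest choice here, and a convergence check on the derivatives would be a natural refinement.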
Fig 2.
Randomization scheme to estimate the ranges of the threshold parameters.
(A) Schematic of the procedure to estimate the ranges of the threshold parameters, so that the level of a regulator has a 50% chance of being above or below the threshold level of each regulatory link (“half-functional rule”). First, for a gene A without any regulator, the RACIPE models are generated by randomizing the maximum production rate and the degradation rate according to S1 Table. The distribution of the A level is obtained from the stable steady state solutions of all the RACIPE models (top left panel, yellow histogram). Second, for a gene A in a gene circuit, the distribution of the A level is estimated only on the basis of the inward regulatory links (i.e., the B to A activation and the C to A inhibition in the bottom left panel). The levels of the inward regulators B and C are assumed to follow the same distribution as that of a gene without any regulator (bottom left panel, blue and red distributions); the threshold levels for these inward links are chosen randomly from (0.02M to 1.98M), where M is the median of their gene expression distributions. Finally, the distribution of the A level is obtained by randomizing all the relevant parameters: the levels of B and C, the strengths of the inward regulatory links (i.e., the threshold level, the Hill coefficient and the fold change), and the maximum production rate and the degradation rate of A. The threshold for any regulatory link starting from A is then chosen randomly from (0.02M to 1.98M), where M is the median of the new distribution of the A level (orange in the bottom panel). The same procedure is followed for all other genes. (B) Tests on several simple toggle-switch-like circuit motifs and the Epithelial-to-Mesenchymal Transition (EMT) circuit show that the “half-functional rule” is approximately satisfied with this randomization scheme.
For each RACIPE model, we computed the ratio (x/x0) of the level of each gene X at each stable steady state (x) to the threshold (x0) for each outward regulation from gene X. The yellow region shows the probability of x/x0 > 1 for all the RACIPE models, and the green region shows the probability of x/x0 < 1.
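The first step of the scheme, together with the x/x0 check of panel (B), can be sketched for a gene without any regulator. For such a gene the rate equation dx/dt = g - kx has the single steady state x = g/k. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the uniform sampling ranges for g and k stand in for S1 Table, which is not reproduced here. Only the (0.02M to 1.98M) threshold range is taken from the caption.

```python
import random
import statistics

def unregulated_levels(rng, n_models=20000):
    """Steady-state level g/k of a gene with no regulator, one value per model.
    The ranges for g and k are illustrative assumptions."""
    return [rng.uniform(1.0, 100.0) / rng.uniform(0.1, 1.0) for _ in range(n_models)]

def sample_threshold(rng, median):
    """Threshold drawn uniformly from (0.02 M, 1.98 M), M being the median level."""
    return rng.uniform(0.02 * median, 1.98 * median)

def fraction_above_threshold(seed=0, n_models=20000):
    """Return the median level M and the fraction of models with x/x0 > 1."""
    rng = random.Random(seed)
    levels = unregulated_levels(rng, n_models)
    median = statistics.median(levels)
    above = sum(x > sample_threshold(rng, median) for x in levels)
    return median, above / n_models
```

Calling `fraction_above_threshold()` returns the estimated median M and the fraction of models in which the gene level exceeds the sampled threshold of an outward link. Under these assumed ranges the fraction should fall near one half rather than exactly at it, because the distribution of g/k is skewed while the threshold range is symmetric about M. The later steps for regulated genes would repeat the same estimate with the inward links included.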
Fig 3.
RACIPE identifies robust features of toggle-switch-like motifs.
RACIPE was tested on three circuits–a simple toggle-switch (TS, top left) which consists of genes A and B that mutually inhibit each other (solid lines and bars), a toggle-switch with one-sided self-activation (TS1SA) which has an additional self-activation link on gene A, and a toggle-switch with two-sided self-activation (TS2SA) which has additional self-activation links on both genes. (A) Probability distributions of the number of stable steady states for each circuit. (B) Probability density maps of the gene expression data from all the RACIPE models. Each point represents a stable steady state from a model. For any RACIPE model with multiple stable steady states, all of them are shown in the plot. (C) Average linkage hierarchical clustering analysis of the gene expression data from all the RACIPE models using the Euclidean distance. Each column corresponds to a gene, while each row corresponds to a stable steady state from a model. The analysis shows that the gene expression data could be clustered into distinct groups, each of which is associated with a gene state, as highlighted by different colors on the right of the heatmaps.
Fig 4.
The gene states of the toggle-switch motif are robust against different types of distributions used to sample the parameters.
(A) Uniform distributions in three different ranges were used to sample the kinetic parameters of the RACIPE models. The top panels show the range of the distribution (left panel: the full range; middle panel: half; right panel: one-fourth). The bottom panels show the probability density maps of the gene expression data from all the RACIPE models. Similarly, panels (B) and (C) show the results for a Gaussian distribution and an exponential distribution, respectively. For the Gaussian distribution (B), the standard deviation was reduced by a factor of two from left to right. For the exponential distribution (C), the mean was reduced by a factor of two from left to right. The means of the distributions are indicated by red arrows.
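The three sampling schemes can be sketched with a common `shrink` factor (1, 2 or 4 from left to right). The function names are hypothetical, and two details are assumptions that the caption does not specify: the narrowed uniform window is centred on the midpoint of the full range, and non-positive Gaussian draws are rejected and redrawn.

```python
import random

def sample_uniform(rng, lo, hi, shrink=1.0):
    """Uniform draw from a window of width (hi - lo) / shrink centred on the midpoint."""
    mid = 0.5 * (lo + hi)
    half = 0.5 * (hi - lo) / shrink
    return rng.uniform(mid - half, mid + half)

def sample_gaussian(rng, mean, sd, shrink=1.0):
    """Gaussian draw with standard deviation sd / shrink, redrawn until positive."""
    while True:
        x = rng.gauss(mean, sd / shrink)
        if x > 0.0:
            return x

def sample_exponential(rng, mean, shrink=1.0):
    """Exponential draw with mean equal to mean / shrink."""
    return rng.expovariate(shrink / mean)
```

Each sampler would replace the plain uniform draw of a kinetic parameter when generating a model, so the rest of the simulation pipeline stays unchanged across the nine panels.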
Fig 5.
Application of RACIPE to coupled toggle-switch circuits.
RACIPE was tested on coupled toggle-switch circuits, as illustrated at the top of the figure. (A) 2D probability density map of the RACIPE-predicted gene expression data projected onto the first and second principal component axes. (B) Average linkage hierarchical clustering analysis of the gene expression data from all the RACIPE models using the Euclidean distance. Each column corresponds to a gene, while each row corresponds to a stable steady state. The clustering analysis allows the identification of several robust gene states, whose characteristics are illustrated as circuit cartoons to the right of the heatmaps. The expression level of each gene in these gene states is illustrated as low (grey), intermediate (blue), or high (red). See S6 and S7 Figs for the definitions.
Fig 6.
RACIPE identifies multiple EMT cell states from gene network analysis.
(A) A proposed Epithelial-to-Mesenchymal Transition (EMT) circuit is constructed according to the literature; the circuit consists of 13 transcription factors (circles), 9 microRNAs (red hexagons) and 82 regulatory links. The blue solid lines and arrows represent transcriptional activation, the red solid lines and bars represent transcriptional inhibition, and the green dashed lines and bars represent translational inhibition. The two readout genes, CDH1 and VIM, are shown as green circles, while the other transcription factors are shown in blue. (B) Average linkage hierarchical clustering analysis of the gene expression data from all the RACIPE models using the Euclidean distance. Each column corresponds to a gene, and each row corresponds to a stable steady state. Four major gene states were identified and highlighted by different colors. According to the expression levels of CDH1 and VIM, the four gene states were associated with epithelial (E in red), mesenchymal (M in grey) and two hybrid epithelial/mesenchymal (E/M I in purple and E/M II in brown) phenotypes. (C) The gene expression distribution of each gene state. The gene expression distribution of each gene for all of the RACIPE models is shown in blue, while that for each gene state is shown in red (50 bins are used to calculate the histogram of each distribution). For clarity, each distribution is normalized by its maximum probability. Each row represents a gene and each column represents a gene state. (D-F) Gene expression data were projected onto the CDH1/VIM, miR-200b/ZEB1, or miR-34a/SNAI1 axes. Different gene states are highlighted by the corresponding colors and enclosed by the ellipses. (G-I) Transcriptomics data from the NCI-60 cell lines were projected onto the CDH1/VIM, miR-200b/ZEB1, or miR-34a/SNAI1 axes. The NCI-60 cell lines have been grouped into E, E/M and M phenotypes according to the ratio of the protein levels of CDH1 and VIM.
Different gene states are highlighted by the corresponding colors and enclosed by the ellipses.