Rare-event sampling of epigenetic landscapes and phenotype transitions

doi:10.1371/journal.pcbi.1006336

Fig 1.

Computational pipeline for rare-event sampling of epigenetic landscapes and phenotype transitions.

The input to the computational pipeline is a reaction network model of gene regulatory network dynamics. Stochastic simulations are performed using the Stochastic Simulation Algorithm (SSA) [33] and Weighted Ensemble (WE) rare-event sampling [45]. The WE method can be run in two modes. Rate Mode computes the rate of transitioning between two user-defined regions of interest with high accuracy. Transition-Matrix Mode computes the pairwise transition probabilities among N_bins adaptively defined sampling bins that span the system state-space. The transition matrix can then be visualized and analyzed further, including automatic designation of metastable phenotypes via the coarse-graining framework [42] and identification of likely transition paths [36].
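As an illustration of the SSA step, the following is a minimal sketch of a Gillespie-type simulation for a single protein species with constant production and first-order decay. The reaction scheme, the rate values (`k_prod`, `k_deg`), and the helper names are assumptions chosen for illustration; they are not the gene network models or parameters used in the paper, and the Weighted Ensemble resampling layer is not shown.

```python
import random

def gillespie_birth_death(k_prod, k_deg, n0, t_end, rng):
    """Simulate 0 -> X (propensity k_prod) and X -> 0 (propensity k_deg * n)."""
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # production propensity
        a_deg = k_deg * n        # decay propensity (zero when n == 0)
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                # no reaction can fire
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

def time_average(times, counts, t_end):
    """Time-weighted mean copy number over [0, t_end]."""
    total = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += n * (t_next - times[i])
    return total / t_end

# Hypothetical parameters: the stationary mean should be close to k_prod / k_deg.
rng = random.Random(0)
times, counts = gillespie_birth_death(k_prod=20.0, k_deg=1.0, n0=0, t_end=200.0, rng=rng)
mean_copy_number = time_average(times, counts, 200.0)
print(mean_copy_number)
```

A conventional simulation of this kind visits states in proportion to their probability, which is why rare states and transitions are sampled poorly without an enhanced-sampling layer such as WE.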

Fig 2.

Simulation results show good agreement with a theoretical benchmark for the 2-gene ExMISA (mutual inhibition, self-activation) cell-decision circuit.

The Chemical Master Equation (CME) for the 2-gene ExMISA model was solved numerically (top; see Methods) and compared with simulation results from the computational pipeline presented in this paper (bottom). Shown for each are the Quasipotential Landscape (A), Eigenvalue Spectrum (B), and Markov State Model (C). (A) Quasipotential landscapes of the ExMISA network projected onto the two protein coordinates. Deep blue regions denote low potential (high probability) and yellow regions denote high potential (low probability). The four basins visible in both rows correspond to combinations of lo/hi expression for the two genes A and B. (For both rows, quasipotential surfaces estimated over discrete states/bins are smoothed for visualization.) WE sampling captured both the basin structure and the low-probability edge and barrier regions. (B) Eigenvalue spectra and corresponding computed global transition timescales. Gaps in the eigenvalue spectrum indicate separation of timescales, i.e., the presence of metastability. (C) Four-phenotype coarse-grained models automatically generated by the clustering algorithm (see Methods). Each colored circle represents a cell phenotype, sized proportionally to its probability. Edges are inter-phenotype transitions (colored by source state, with width proportional to probability). The full CME and the simulation pipeline identify similar metastable phenotype networks (see S11 Fig for details).
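The quantities in panels (A) and (B) can be illustrated on a toy transition matrix. The sketch below assumes the conventions commonly used for Markov state models: quasipotential U_i = -ln(π_i) and timescale t_k = -τ / ln|λ_k| for lag time τ. The 4-state matrix, the lag time, and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def stationary_distribution(T):
    """Left eigenvector of the row-stochastic matrix T with eigenvalue 1."""
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmin(np.abs(evals - 1.0))
    pi = np.real(evecs[:, idx])
    return pi / pi.sum()

def implied_timescales(T, lag=1.0):
    """Timescales -lag / ln|lambda_k| for the non-stationary eigenvalues."""
    magnitudes = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(magnitudes[1:])  # skip the stationary eigenvalue (1)

# Hypothetical 4-state chain: states {0, 1} and {2, 3} form two basins
# that exchange probability only rarely.
T = np.array([
    [0.949, 0.050, 0.001, 0.000],
    [0.100, 0.899, 0.001, 0.000],
    [0.000, 0.002, 0.898, 0.100],
    [0.000, 0.002, 0.050, 0.948],
])

pi = stationary_distribution(T)
quasipotential = -np.log(pi)            # low potential = high probability
timescales = implied_timescales(T)

print("stationary distribution:", pi)
print("quasipotential:", quasipotential)
print("timescales:", timescales)
```

In this toy chain the slowest timescale is much longer than the next one, which is the kind of spectral gap that signals two metastable basins.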

Fig 3.

Pluripotency network model and simulation results (Parameter Set I).

A) Wiring diagram for the eight-gene pluripotency network model, adapted from [28]. Arrowheads represent positive interactions, while flat lines denote repression. B) Simulation results: state-transition graph of sampled network states. Circles represent aggregate gene-expression states sampled during the Weighted Ensemble simulation. Circle areas are scaled with the steady-state probability π_i of each state according to ln(γπ_i) with scaling factor γ = 3.4. States are colored according to the expression levels of three of the genes; red, green, and blue correspond to high NANOG, GATA6, and CDX2 expression, respectively, while black corresponds to low or no gene expression. Edges connecting the states indicate possible state transitions, colored according to the originating state. The graph was produced in Gephi [54] with a force-directed layout algorithm (Force Atlas); short inter-state distances therefore reflect a higher probability of transitioning. C) Full protein compositions of two representative states, with either high CDX2 expression (blue) or high NANOG expression (red). States in (C) correspond to yellow circles in (B).

Fig 4.

Simulation results for the pluripotency network (Parameter Set I).

The Computational Pipeline Uncovers Six Metastable Phenotypes and Irreversible Phenotype Transitions. A) Computed eigenvalue spectrum and global timescales indicating the presence of metastability in the network. The gap in the eigenvalue spectrum after the sixth eigenvalue suggests that the network can be partitioned into six metastable phenotypes. B) The coarse-grained network showing six algorithmically identified phenotypes, designated Low NANOG 1 (LN1), Low NANOG 2 (LN2), Stem Cell (SC), Primitive Endoderm (PE), Trophectoderm (TE), and the Intermediate Cell (IM) state. C) The averaged gene expression levels (copy numbers) of each transcription factor for each phenotype and their respective steady-state probabilities. D) The four most probable transition pathways from the SC state to the TE state (differentiation) and from the TE state to the SC state (dedifferentiation). E) The highest-probability transition paths projected onto three protein coordinates: NANOG, GATA6, and CDX2. Differentiation from SC to TE is visibly irreversible, i.e., the system returns by a separate route.

Table 1.

Computed mean first passage times (MFPTs) of phenotype transitions in the pluripotency network.

MFPTs are shown for transitions between the pluripotency (high NANOG) state (SC) and low NANOG expression states (LN(1)) (left columns) and for transitions between the pluripotency state (SC) and the trophectoderm state (TE) (right columns), in units of the inverse transcription factor decay rate, k⁻¹. Transitions for Parameter Set I were computed using the WE method in rate mode, while transitions for Parameter Set II were estimated from the sampled transition matrix. The definitions of SC and LN(1) are analogous to the high NANOG production (N^hi) and low NANOG production (N^lo) states between which transitions are measured in experiments [8, 9]. Increasing the adiabaticity (i.e., the rates of DNA binding and unbinding, h and f) leads to rarer inter-phenotype transitions. The simulations also show that, within the same gene network and for a given parameter set, inter-phenotype transition times span four orders of magnitude.
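One standard way to estimate MFPTs from a discrete-time transition matrix is to solve the linear system m_i = τ + Σ_j T_ij m_j for all states i outside the target set, with m_i = 0 inside it. The sketch below illustrates this on a hypothetical two-state chain; the function name, the matrix, and the lag time τ are assumptions for illustration, and this is not necessarily the exact procedure used to produce the table.

```python
import numpy as np

def mfpt_to_target(T, target, lag=1.0):
    """Mean first passage time from every state to the target set.

    T is a row-stochastic transition matrix at lag time `lag`;
    states in `target` have an MFPT of zero by definition.
    """
    n = T.shape[0]
    target = set(target)
    others = [i for i in range(n) if i not in target]
    # Restrict T to non-target states and solve (I - Q) m = lag * 1.
    Q = T[np.ix_(others, others)]
    rhs = lag * np.ones(len(others))
    m_others = np.linalg.solve(np.eye(len(others)) - Q, rhs)
    m = np.zeros(n)
    m[others] = m_others
    return m

# Hypothetical two-state chain with asymmetric switching probabilities.
p, q = 0.001, 0.1
T = np.array([
    [1.0 - p, p],
    [q, 1.0 - q],
])

mfpt_0_to_1 = mfpt_to_target(T, [1])[0]  # equals lag / p for this chain
mfpt_1_to_0 = mfpt_to_target(T, [0])[1]  # equals lag / q for this chain
print(mfpt_0_to_1, mfpt_1_to_0)
```

Even in this two-state toy example, the forward and reverse MFPTs differ by two orders of magnitude, which shows how transition times within one network can span a wide range.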

Fig 5.

The rare-event sampling pipeline makes rare states and transitions accessible to simulation.

A) The global state-transition graph computed with the computational pipeline for the Pluripotency Network with rare transitions (Parameter Set II). The states are colored according to the coarse-grained (algorithmically identified) phenotypes. In this parameter regime (f = 50), the differentiated (TE, PE, IM) and pluripotent phenotypes are cleanly separated, reflecting exceedingly rare transitions between them (O(10⁹), see Table 1). B) States visited in conventional SSA simulation (using the same initialization, definitions, and placement as in (A)). In the conventional simulation, a transition out of the IM phenotype was never observed.

Fig 6.

Simulation results for the pluripotency network (Parameter Set II).

Changing DNA-Binding Kinetics Alters the Epigenetic Landscape. A) Computed eigenvalue spectrum and global timescales. B) The coarse-grained Markov State Model showing five phenotypes corresponding to the LN1, SC, PE, TE, and IM phenotypes of Parameter Set I. The majority of the steady-state probability is in the IM phenotype (0.98). C) The gene expression levels for each phenotype and their respective steady-state probabilities. D) The four most probable differentiation pathways between the SC and TE phenotypes. E) The dominant pathways of (de)differentiation projected onto the GATA6, CDX2, and NANOG coordinates. The change in DNA-binding kinetics produces transition dynamics different from those of Parameter Set I: here, the forward and reverse paths are the same.
