Rare-event sampling of epigenetic landscapes and phenotype transitions

Stochastic simulation has been a powerful tool for studying the dynamics of gene regulatory networks, particularly in terms of understanding how cell-phenotype stability and fate-transitions are impacted by noisy gene expression. However, gene networks often have dynamics characterized by multiple attractors. Stochastic simulation is often inefficient for such systems, because most of the simulation time is spent waiting for rare, barrier-crossing events to occur. We present a rare-event simulation-based method for computing epigenetic landscapes and phenotype-transitions in metastable gene networks. Our computational pipeline was inspired by studies of metastability and barrier-crossing in protein folding, and provides an automated means of computing and visualizing essential stationary and dynamic information that is generally inaccessible to conventional simulation. Applied to a network model of pluripotency in Embryonic Stem Cells, our simulations revealed rare phenotypes and approximately Markovian transitions among phenotype-states, occurring with a broad range of timescales. The relative probabilities of phenotypes and the transition paths linking pluripotency and differentiation are sensitive to global kinetic parameters governing transcription factor-DNA binding kinetics. Our approach significantly expands the capability of stochastic simulation to investigate gene regulatory network dynamics, which may help guide rational cell reprogramming strategies. Our approach is also generalizable to other types of molecular networks and stochastic dynamics frameworks.


Introduction
In multicellular organisms, differentiation of pluripotent stem cells into tissue-specific cells was traditionally considered to be an irreversible process. The discovery of cell reprogramming revealed that a the identity of a cell is not irreversibly stable, but rather plastic and amenable to control by perturbation of gene regulatory interactions-for example, through over-expression of key transcription factors [1]. Cellular plasticity has also been observed in other contexts, where cells appear to spontaneously transition among phenotypically distinct states. For example, in embryonic stem cells, expression levels of key transcription factors show dynamic heterogeneity, which is thought to enable diversification of the population prior to lineage commitment [2][3][4][5][6]. This heterogeneity may result at least in part from stochastic state-transitions between functionally distinct, metastable subpopulations [4,[7][8][9]. Stochastic state-transitions have also been proposed to play a role in cancer, by enabling cancer stem cells to arise de novo from non-stem subpopulations [10], or by enabling cells to reversibly transition to a drug-tolerant phenotype [11]. In microbial systems, stochastic phenotype switching impact the structure of epigenetic landscapes, motivating the use of Master Equation-based approaches. That is, the number and stability of phenotype-states accessible to a given GRN varies depending on the kinetic parameters governing these fluctuations [23,24,29]. Furthermore, ODE or "mean-field" models that average over these fluctuations can show qualitatively different landscape features [30][31][32].
Master Equation approaches face the well-known challenge of the "Curse-of-Dimensionality", as solving them requires enumeration of a state-space that grows exponentially with the number of molecular species in the network. For this reason, discrete stochastic models of GRNs are often studied by stochastic Monte Carlo simulation, via the Gillespie algorithm [33]. However, stochastic simulation can also be problematic: in systems with metastability, such as GRNs, stochastic simulation becomes highly inefficient. Transitions between metastable states are rare events (i.e., rare relative to the timescale of fluctuations within a metastable attractor basin), and thus difficult or impossible to observe. Often, these rare events are precisely the events of interest, such as in GRNs where infrequent state-transitions represent critical cell-fate transitions.
Rare-event sampling algorithms are designed to overcome these challenges, by redirecting computational resources towards events of interest, while maintaining statistical accuracy to global system dynamics [34,35]. In this work, we present a rare-event simulation-based method for computing and analyzing epigenetic landscapes of stochastic GRN models. We combine rare-event methods with coarse-graining and analysis by Transition Path Theory-adopted from the field of Molecular Dynamics of protein folding [36]-and show that this unified framework provides an automated approach to map epigenetic landscapes and transition dynamics in complex GRNs. The method quantifies the number of metastable phenotype-states accessible to a GRN, calculates the rates of transitioning among phenotypes, and computes the likely paths by which transitions among phenotypes occur. We apply the method to a model of pluripotency in mouse Embryonic Stem Cells. Our results reveal rare sub-populations and transitions in the network, demonstrate how global landscape structure depends on kinetic parameters, and reveal irreversibility in paths of differentiation and reprogramming. Our approach is not limited to gene regulatory networks; it is generalizable to other stochastic dynamics frameworks and is thus a potentially powerful tool for computing global dynamic landscapes in areas such as signal-transduction, population dynamics, and evolutionary dynamics.

Methods
A graphical overview of the computational pipeline presented in this paper can be found in Fig 1. 3/45  [33] and Weighted Ensemble rare-event sampling [37]. The WE method can be run in two modes: Rate Mode computes the rate of transitioning between two user-defined regions of interest with high accuracy. Transition-Matrix Mode computes the pairwise transition probabilities among N bins adaptively defined sampling bins that span the system state-space. Further visualization and analysis of the transition-matrix can be performed, including automatic designation of metastable phenotypes via the coarse-graining framework [38] and identification of likely transition paths [36].
reactions and parameters can be found in the Supplement, File S1 and Table S1. The model encompasses stochastic birth/death processes for transcription factor production and degradation, and stochastic binding and unbinding of transcription factors to DNA regulatory/promoter regions; the binding-states of these regions governs the production rate. Each transcription factor is assumed to bind to DNA as a homodimer, giving cooperative regulation. In the "exclusive" network variant, transcription factors compete for binding sites on DNA (only one transcription factor dimer can be bound to a gene's promoter at a time). The discrete state-vector, which completely describes the state of the system, is given by x = [A ij , B ij , n a , n b ]. A ij and B ij represent the three possible promoter binding-states for each gene (i.e., A/B 00 , A/B 10 , A/B 01 denote unbound, activator-bound, or repressor-bound states). The copy-numbers of expressed protein transcription factors are denoted by n a and n b for products of gene A and B, respectively, and may in principle take any nonnegative integer value. All processes related to transcription, translation, and assembly are subsumed into a single protein birth reaction. For genes in state A/B ij , this production occurs with rate constant g ij . The production rate is high when the promoter is bound only by the activator (its own product). Otherwise, if unbound or repressor-bound, a low "basal" rate of expression is assumed, i.e. g 00 = g 01 < g 10 . Degradation of protein products occurs with rate k, and stochastic binding/unbinding of transcription factors to DNA occur with h and f , respectively. The model is symmetric, with equivalent parameters for the two genes.

Pluripotency Network Model
The pluripotency network model of mESCs was developed by Zhang and Wolynes [28] on the basis of experimental literature and previous models. The 8-gene network shares the same stochastic reaction framework as the ExMISA model. The genes (NANOG, OCT4, SOX2, GCNF, KLF4, PBX1, GATA6, and CDX2) suppress and activate each other through homo-and heterodimers of their encoded transcription factors (OCT4 and SOX2 form a heterodimer; all other regulatory interactions occur via homodimers). Binding of transcription factors to promoters is not exclusive. The model has five kinetic parameters: g on , g off , h, f , and k, corresponding to the rate of gene expression in the activated state, the rate of gene expression in the un-activated state, binding of transcription factors to DNA, unbinding of transcription factors from DNA, and transcription factor degradation (or exit from the nucleus). Genes are expressed at the basal rate g off except when bound by at least one activator and no repressor, in which case they are expressed with rate g on . The exception to this logic rule is NANOG, which must be bound by the the KLF4 and PBX1 transcription factor homodimers and the heterodimer OCT4-SOX2 to be activated. Overall, these interactions lead to a total of 396 biochemical reactions, with a total of 88 "species" (counting 80 distinct gene promoter configurations and 8 protein species). The complete logic rules and list of reaction rate parameters can be found in the Supplement (File S1, Table S2, and Table  S3).

Theoretical Background: the Chemical Master Equation and Stochastic Transition-Matrix
The mathematical framework of the network models is the discrete Chemical Master Equation (CME) [33], which gives the time-evolution of the probability to observe the system in a given state. In vector-matrix form, the CME can be written

5/45
where p(x, t) is the probability over the system state-space (x) at time t, and K is the reaction rate-matrix containing stochastic reaction propensities (diagonal elements k jj = − i k ij , i.e., columns sum to 0). Equation 1 assumes a well-mixed system of reacting species, and assumes that the technically infinite state-space described by x (containing molecular species numbers/configurations) may be limited to some finite number of "reachable" states, (i.e., with non-negligible probability) for an enumeration of N states of the system, K ∈ R N ×N . The steady-state probability π(x) ≡ p(x, t → ∞) over N states satisfies Thus, π(x) can be obtained from K as the normalized right-eigenvector corresponding to the zero-eigenvalue. It is sometimes desirable to work with the time-dependent stochastic transition-matrix T(τ ) rather than the time-independent stochastic rate matrix K [38]. For example, T(τ ) may be more amenable to estimation by sampling (as we demonstrate in this work for the pluripotency network, for which K is impractical to enumerate). For a CME with rate matrix K, T(τ ) is given by where exp denotes the matrix exponential. T(τ ) ∈ R N ×N 0≤x≤1 then gives the conditional probability for the system to transition between each pair of states within a lagtime τ . That is, the elements t ij give the probability that the system, if found in state i, will then be found in state j at a time τ later, and rows sum to 1. Using T(τ ), the evolution of probability over discrete intervals of the lagtime τ is given by the Chapman-Kolmogorov equation: Eigenvectors corresponding to dominant eigenvalues of the stochastic transition-matrix are associated with slow system processes. By Perron-Frobenius, for an irreducible stochastic matrix T(τ ) with eigenvalues λ i , there exists λ 1 = 1, and all other eigenvalues satisfy |λ i | < 1. Analogous to Equation (2) for K, the steady-state probability can be obtained directly from T(τ ) according to π T (x) = π T (x)T(τ ), i.e., as the normalized left-eigenvector corresponding to λ 1 . Eigenvalues λ i are related to global system timescales t i by (with t 1 giving the infinite-time, stationary result) [38]. Additionally, the Mean First Passage Time (MFPT X,Y ) where X and Y are individual states can be computed using the matrix elements T i,j by [42,43]:

Weighted Ensemble Stochastic Simulation
Stochastic reaction kinetics can be simulated by the Stochastic Simulation Algorithm (SSA) [33], which produces numerically exact realizations of the CME (Eq 1). Simulation circumvents the need for enumerating the exceedingly large system state-spaces typical of gene network models, but suffers from inefficiency due to rare 6/45 events. The Weighted Ensemble (WE) rare-event sampling algorithm [37] redistributes computational resources from high-probability regions of state-space to low-probability regions, which tend to be under-sampled in conventional simulation. The method thereby reduces computational effort in sampling rare transitions and improves accuracy of estimating probability density in, e.g., barrier-regions or tails of distributions. The method can be applied to any stochastic dynamics framework; in recent years, it has been widely applied to atom-scale Molecular Dynamics. Details of the methodology are discussed in a recent review [35] and references therein. Briefly, the algorithm works as follows: state-space is divided up into bins that span transitions of interest. The number of bins, N bins , is typically O(100), and a variety of binning procedures can be used (we use an adaptive procedure described below). Initially, a single simulation trajectory, or "replica", is assigned a weight of 1 and allowed to freely move within and between bins for a user-defined lagtime τ WE . After each iteration of τ WE , a splitting and culling procedure divides and/or combines replicas and their associated weights in such a way as to reach and maintain an equal target number of weighted replicas, M targ , in each bin. Over the course of the simulation, the combined weights of the replicas in a bin (averaged over successive iterations) will evolve toward the probability of the system to reside in that bin. By maintaining the same number of replicas in each bin (M targ ), with weights proportional to probability, the algorithm devotes comparable computational time to low-and high-probability regions. Effectively, the algorithm computes long-time processes on the basis of many short-time simulated trajectories.

Adaptive Binning Procedure
As with other enhanced sampling methods, the WE algorithm requires dividing of state-space into defined sampling regions or "bins". For high-dimensional systems, discretization poses a challenge because, for an N -dimensional, evenly spaced grid, the number of required sampling bins increases exponentially with the number of degrees of freedom. To address this challenge, a variety of Voronoi-polyhedra-based procedures have been developed [44][45][46]. These methods balance the need to focus simulation toward regions with non-negligible probability, while still enabling capture of rare transitions of interest. In addition to efficiently discretizing high-dimensional spaces, the methods have the benefit of requiring little to no a priori knowledge of system dynamics (e.g., of the locations of regions of interest, or of appropriate progress coordinates for transitions). We utilize an adaptive binning procedure from ref. [46]. Each bin (of user-defined number N bins ) is a Voronoi polyhedron with a generating node; the bin is defined as the region of state-space encompassing all points closer to the generating node than to nodes of any other region. After each lagtime τ WE , new Voronoi regions are generated by successively selecting N bins node-positions from the current replica positions in a way that maximizes the Euclidean distance between them. By this procedure, over the course of the simulation, bins spread to encompass all areas of state-space reached by any simulated trajectory. After sufficient iterations, the bin positions stop spreading to new areas but continue to fluctuate. The procedure is shown by representative simulations in Movement of Voronoi Centers during weighted ensemble sampling. Starting from the left are shown three successive iterations of the adaptive WE simulation for a representative network..

Computation of Transition Rates
One important output of WE sampling is the quantitative rate of transitions between regions of interest, which may be difficult or impossible to estimate from conventional simulation. WE sampling may be run in different modes, depending on whether the 7/45 sought-after information concerns a specific transition of interest, or a more global picture of system dynamics, i.e., encompassing approximate rates of transitions among many system states. We term the two modes "rate" mode and "global transition-matrix" mode. The former can deliver a more accurate estimate for a particular state-transition, while the latter can yield a more comprehensive, but approximate, measure of global system dynamics.
In rate mode, the user specifies two regions of interest, X and Y , The flux of probability into/out of regions of interest can be estimated by recording the amount of weight transferred at the end of each simulation iteration. The mean first passage time of transitions from X to Y (MFPT X,Y ) is given in general by the inverse of probability flux from X to Y . In practice, we apply a"labeling" scheme [47,48], where each replica is labeled as belonging to either set S X or S Y according to its history, i.e., whether it most recently visited region X or Y , respectively. The summed weight of all replicas in S X is given by P S X , and P S X + P S Y = 1 satisfies probability conservation. Then, where Φ SS (Y |S X ) is the average probability flux from S X into Y at steady-state, which is measured by the weight of S X -labeled replicas entering Y during the simulation after convergence to steady-state. The labeling scheme enables accurate estimates, including for non-Markovian transitions. For Markovian transitions well-described by a single rate-constant, k X,Y = 1/MFPT X,Y .

Computation of Network Transition-Matrix
Running WE in transition-matrix mode enables visualization and analysis of global system dynamics on the basis of a single simulation, and requires no designation of regions of interest. In this mode, the previously-converged Voronoi bins are fixed, and simulations are used to estimate a coarse-grained stochastic transition-matrix T(τ ) of size N bins × N bins . The coarse-grained T(τ ) approximates the true dynamics over the full state-space, as given by T(τ ). Thus, the procedure enables estimation of the global transition-matrix (and subsequent analysis) in systems where enumeration of states is not feasible. To estimate T(τ ), the weight transferred between bins is recorded at each iteration, and the elements of the transition-matrix are estimated according to [47]: where w i,j 2 is the average weight transferred from bin i to bin j over the iteration time τ W E (counting only after at least 2 transitions, and averaging over multiple iterations) and w i is the average population (summed weight) in bin i. By construction, this is a row-stochastic transition-matrix with state-space "resolution" determined by N bins (each state in the full state-space sampled by the simulation is assigned to its nearest neighboring Voronoi node). The lagtime τ of the transition-matrix corresponds to the sampled WE-time τ WE . However, use of T(τ ) to compute system dynamics imposes a Markovian approximation, by which equilibration of replicas within bins is assumed to be rapid on the timescale of τ , and hops between states (i.e. bins) are memoryless. As such, while this mode of simulation has the advantage of acquiring a holistic view of global system dynamics, it has the disadvantage of introducing a Markovian approximation.

8/45
Coarse-Graining Procedure to Classify Phenotype-States While the sampled N bins × N bins transition-matrix provides a global approximation of the epigenetic landscape and state-transitions, we apply a method to further coarse-grain dynamics, known as the Markov State Model framework [29,36,38]. This automated procedure produces a highly simplified representation of global dynamics in terms of a few (generally < 10) clustered sets and the transitions among them. Such highly-reduced models can be beneficial in terms of human intuition of system dynamics, comparison to experiments, and-in this application-automated designation of dynamic phenotype-states. The method utilizes the concept of metastability, i.e., system states that experience relatively fast transitions among them are clustered together into the same coarse-grained set. Collectively, the coarse sets experience relatively rare inter-cluster transitions and frequent intra-cluster transitions. We employ the metastability concept as a definition of cell phenotype, reasoning that a phenotype should be a relatively stable attribute of a cell, and stochastic inter-phenotype transitions should be relatively rare. In practice, we employ the Markov State Model framework to further reduce the sampled row-stochastic transition-matrix T(τ ) from size N bins × N bins down to C × C, where C is the number of coarse-grained clusters. As the Markov State Model (MSM) is itself a stochastic transition-matrix on a coarse-grained space, it implies a more severe Markovian approximation. It provides a way to describe global system dynamics in a highly simplified way while maintaining high accuracy to the slowest system dynamics as sampled by T(τ ). In previous work, we demonstrated the application of this coarse-graining approach to automatically designate phenotypes in small gene networks [29]; here, we extend the applicability of the coarse-graining to large, complex networks by combining it with rare-event sampling. The coarse-graining procedure is a spectral clustering method based on the Perron Cluster Cluster Analysis (PCCA+) algorithm [49], which optimizes the (nearly)-block-diagonal structure of T(τ ) for systems with metastability. The signature of such metastability is a separation-of-timescales for intra-and inter-basin dynamics, which may be seen as gaps in the eigenvalue spectrum [38]. As noted above, T(τ ) (or its sampled counterpart, T(τ )) has λ 1 = 1, corresponding to the infinite time-limit. If a set of m dominant eigenvalues exists, such that for decreasing eigenvalues λ i 1, i ∈ {2, ..., m}, and a gap is present, λ j << λ m for j > m, this indicates the presence of m slow-timescale processes in the system, and further indicates that T(τ ) may be re-ordered to give m nearly-uncoupled blocks. In practice, the algorithm attempts to find a coarse-graining onto C clusters, where C may be user-defined, or may be determined algorithmically, e.g., according to the spectral gap [49]. Here, we choose C clusters, where the last significant gap in the spectrum is seen between λ C and λ C+1 . For the GRNs studied here, this corresponds to choosing C such that λ C /λ C+1 > 10.

Transition Path Analysis
The coarse-grained model of system dynamics given by the MSM enables estimation of the ensemble of dominant transition paths among phenotypes, along with their relative probabilities. We adopt methods from Transition Path Theory according to Noe, et al. [36] (details therein). Briefly, T(τ ) can be used to compute the effective flux of trajectories, along any edge in the coarse-grained network, contributing to transitions between states X and Y (where these designated states correspond to one or more coarse-grained phenotype-states produced by the MSM). A pathway decomposition algorithm on the matrix of effective fluxes for X → Y transitions then yields a set of dominant pathways and the relative contribution of each to the overall flux. Each state in the MSM is analogous to a cell phenotype, and transition path analysis is used to identify parallel phenotype transition paths and the relative rates of transitioning 9/45 between phenotypes.

Visualization of Epigenetic Landscapes
Both the sampled transition-matrix T(τ ) and the coarse-grained MSM encode stationary and dynamic information about global dynamics-that is, they quantify the epigenetic landscape. For visualization, we use Gephi graph visualization software [50] using the Force Atlas algorithm. Every circle (or node) in the graph corresponds to a sampling bin or to a coarse-grained phenotype, and the area of a circle is proportional to its relative steady state probability according to ln(γP SS ), where P SS is the steady state probability of the node and γ is a constant chosen to improve visibility of low probability regions of the landscape. Lines between circles (edges) correspond to transitions between sampling regions or coarse-grained phenotype. Their thickness and coloring correspond to their relative transition probability and source state, respectively.

Validation: Numerical Solution of the Chemical Master Equation
To validate the simulation method, we compare the simulated dynamics to the numerical solution to the CME. We choose the parameters of the ExMISA model in such a way as to restrict the effective state-space, so that a numerical solution of the CME is tractable. Building the reaction rate matrix K ∈ R N ×N requires enumeration of N system states. In general, if a system of S molecular species has a maximum copy number per species of n max , then N ≈ n S max . In the ExMISA model, the state-vector is given by For enumeration, we neglect states with protein copy-numbers larger than a cutoff value which exceeds g 10 /k (corresponding to the average number of transcription factors maintained in the system from a gene while in its active state). For example, with model parameters g 10 = 18 and k = 1, we truncate at n a,max = n b,max = 41 and assume that probability flux between states with n a , n b ≤ 41 and states with n a , n b > 41 is assumed to be 0 (i.e., the boundaries of the state-space are reflective). Including the gene-binding states, this gives N = 3 × 3 × 42 × 42 = 15876 states. This size is tractable for complete solution of the CME using matrix methods in MATLAB [51]. This truncation of the state-space introduces a small approximation error (see Error in computed steady-state probability as a function of N , the number of protein states retained in the state-space truncation. N corresponds to the maximum allowed copy-number of transcription factors a and b in the ExMISA network. For a truncation to N , probability flux between states with n a , n b ≤ N and states with n a , n b > N is assumed to be 0 (i.e., the boundaries of the state-space are reflective). The error where i runs over all enumerated states of the state-space with truncation to N + 1 (all states outside the boundary have probability 0). That is, the error is computed as the sum of the absolute difference between steady-state probabilities for each state, comparing π[N ] (steady-state probability computed with truncation to N ) to π[N + 1] (truncated to N + 1).).
The pluripotency network has 8 genes with copy numbers of O(10 3 ) (determined by the parameters g on /k = 3900). The number of distinct binding-promoter states for each gene are 16,32,8,8,2,8,4, and 2 for GATA6, NANOG, CDX2, OCT4, SOX2, KLF4, GCNF, PBX1, respectively (see Table S2). Together these combinations enumerate a state-space of N > 10 30 ≈ 1000 This size precludes solution of the CME, and we instead estimate the dynamics by WE sampling. Where possible, we validate the WE-sampling results by "conventional", i.e., by direct simulation using SSA.

Validation of Coarse-Grained Models
To check the validity of the coarse-grained MSM as a representation of the global dynamics, we use the Chapman-Kolmogorov test to compare the relaxation curves of the coarse-grained system to those found through direct SSA following Equation 4 [38]. If the coarse-graining is appropriate, the relaxation curves of the MSM probabilities will match the relaxation profile of long conventional (direct SSA) simulations initiated within each coarse-grained phenotype. Transition paths through the coarse-grained phenotype network are validated, where possible, against conventional SSA simulation.

Implementation and Software
Stochastic Gillespie (SSA) simulations were carried out using BioNetGen [52]. WE sampling was implemented with in-house software code written in MATLAB. Simulations were run on the high performance computing cluster (HPC) at the University of California, Irvine, and parallelization of BioNetGen SSA simulations was performed using the Sun Grid Engine scheduler. The coarse-graining procedure and transition path analysis was implemented in python scripts, adapted from MSMBuilder [53] and Pyemma [42], respectively. Transition-matrix and MSM visualization was carried out using Gephi software and the Force Atlas layout [50]. All simulation parameters can be found in the supplement Table S4.

Rare States and Transitions in Gene Regulatory Networks are Accessible by Rare-Event Sampling
We first apply the computational pipeline to a small two-gene model (the exclusive Mutual Inhibition, Self-Activation model, ExMISA, see Methods), exhibiting an archetypal motif for cell fate-decisions [39,40]. The model is tractable for computation of full, discrete stochastic dynamics to within a small approximation error using matrix methods. Thus, the model provides a numerical benchmark for assessing the accuracy of the simulation method, before extension to larger systems where solution of the Chemical Master Equation (CME) is intractable. For the chosen parameters, the ExMISA model shows four peaks in the steady-state probability distribution (projected onto protein copy numbers, n a and n b ). Peaks in probability correspond to basins in the so-called quasipotential landscape, defined by U = −ln(π(x)) (Fig 2). The four peaks/basins corresponds to four possible combinations of binarized A/B gene expression: hi/hi, hi/lo, lo/hi, and lo/lo. These four phenotype-states arise due to the combination of balanced repression and self-activation in the network, and the slow kinetic parameters (Supplementary Table S1) for transcription factor binding and unbinding to promoters that effect changes in individual gene-activity states between low and high expression rates [29,54]. i.e., the presence of metastability. C) Four-phenotype coarse-grained models automatically generated from the clustering algorithm (see Methods). Each colored circle represents a cell phenotype, sized proportionally to its probability. Edges are inter-phenotype transitions (colored by source-state, with width proportional to probability). The full CME and simulation pipeline identify similar metastable phenotype networks (see Difference in Coarse-Grained clustering for the 2-gene ExMISA cell decision network studied through the numerical benchmark (top) and the WE sampling pipeline (bottom). The color of each coarse-grained phenotype cluster corresponds to the expression level of protein a/b: lo/lo (black), hi/lo (red), lo/hi (blue), hi/hi (magenta). All enumerated state phenotypes are sized proportionally to their probability for each of the nine gene configurations for the numerical benchmark, while only the centers of each sampling region are shown for the WE sampling computational pipeline. While the centers of the sampling regions are mostly well separated according to their gene configuration, the phenotype states assigned to the sampling region extend across multiple gene configurations due to the choice of euclidean distance metric in assigning phenotype states to sampling regions. for details).

12/45
The WE-based simulation method enabled estimation of global dynamics of the ExMISA model. By redistributing computational resources from relatively high-probability to low-probability regions (see Methods), the WE method enabled uniform sampling of the quasipotential landscape, i.e., mapping basins (high-probability regions) along with high barriers (low probability regions) (Fig 2a). The simulation estimated individual steady-state bin-probabilities as low as 1.3 × 10 −6 and showed good global agreement with the numerical CME benchmark (see Fig 2 and Supplement, Convergence of the flux of the transition between the polarized phenotype-states in the ExMISA network. The 5% and 95% confidence intervals for the long conventional simulation are shown in dotted blue lines. The flux between the a/b hi/lo and lo/hi phenotypes was calculated using WE sampling with parameters: τ = 200, 300 bins, and 50 replicas per bin. The system was sampled for 1100 iterations of τ . ).
In addition to sampling global dynamics, the WE method can be used to estimate rate constants for individual, rare transitions of interest. The Mean First Passage Time of the global network switch from the center of one polarized phenotype-state to another, i.e., MFPT X→Y from protein a/b expression level hi/lo to lo/hi was estimated from WE to be 1.82 × 10 5 (see Table Computed Mean First Passage Times in the ExMISA Network-Comparison of Different Methods), in agreement with the CME result.

Phenotype Transitions can be Approximated by Markovian Jumps, Enabling Construction of Coarse-Grained Models
A network transition-matrix T(τ ) over sampled bins (N bins = 300) was constructed from WE sampling for ExMISA and used for subsequent analysis of global system dynamics. By comparison, a full network transition-matrix T(τ ) over the enumerated system state-space was constructed from the CME (N = 15876, see Methods). The full, computed (T(τ )) and simulated ( T(τ )) transition-matrices showed qualitatively similar eigenvalue spectra with four dominant eigenvalues, indicating the presence of metastability (separation-of-timescales between intra-basin and inter-basin transitions) (Fig 2b). The slow system-timescales predicted by the full CME model corresponding to eigenvalues λ 2 , λ 3 , λ 4 were t 2 , t 3 , t 4 = 6.8 × 10 4 , 4.2 × 10 4 , 1.0 × 10 4 respectively, in units of k −1 where k is the protein degradation rate (the Perron eigenvalue λ 1 = 1 is associated with the infinite-time (stationary) distribution). The corresponding values given by the WE-simulated T(τ ) were 6.1 × 10 4 , 3.5 × 10 4 , 9.4 × 10 3 , respectively. These numbers demonstrate how the sampled T(τ ) enables global approximation of slow system timescales to < 20% relative error. Quantitative error in these values depends on both "spectral" (lagtime) and discretization error [38] (see Convergence of the slowest implied timescale t 2 with increasing number of sampling regions (bins) and increasing lagtime τ . The lagtime calculated using the truncated CME is shown in gray. The accuracy of the WE approximation increases monotonically with increasing bin number and lagtime. ). In contrast, WE sampling in "rate mode" (see Methods) enabled highly accurate estimation of MFPT X→Y to within 2% error (Computed Mean First Passage Times in the ExMISA Network-Comparison of Different Methods).
According to the Markov State Model framework, the presence of timescale separation indicates that a simplified model, retaining a few coarse-grained metastable states with Markovian transitions among them, can reasonably approximate the full system dynamics. Using this approach, we label the metastable sets as phenotypes accessible to the network, reasoning that a useful classification of cell phenotypes should be one that gives relatively stable, rather than transient, cell types. We apply the Markov State Model coarse-graining procedure to both the full T(τ ) and simulated T(τ ), yielding similar results. The coarse sets (or metastable phenotype-states) in the reduced models for both cases are generated automatically, and map directly onto the four basins seen in the quasipotential landscape (i.e., the gene A/B expression hi/hi, hi/lo, lo/hi, and lo/lo cell phenotypes). The reduced models are visualized by network graphs, in which node sizes are proportional to steady-state probability, and the thicknesses and lengths of edges are proportional to the transition probability between them (on lagtime τ ) (Fig 2c). Numerical values for the reduced models can be found in Transition Matrices of Metastable Phenotype Clusters (MSMs). The network graph can be considered to be an alternative representation of the global epigenetic landscape, which contains both stationary and dynamic information. (In contrast, the epigenetic landscape plotted as a quasipotential function does not explicitly contain dynamic information, due to non-gradient dynamics [16]).
Validation of the coarse-grained model can be carried out according to the Chapman-Kolmogorov test [38], which tests how well the relaxation dynamics initialized in the metastable phenotypes approximate the dynamics that are predicted either by the full model (CME) or simulated trajectories. According to this test, relaxation dynamics out of metastable phenotypes from WE sampling was predicted with error values between 0.02 and 0.12 for all phenotypes (The Chapman-Kolmogorov test on the four Markov State Model phenotypes of the sampled ExMISA network. The relaxation curves and variance of a 1000τ trajectory are shown in red. The relaxation curve predictions from the MSM transition matrix is shown in black. The total error σ is measured as the 2-norm containing the differences between the two estimates of the relaxation curve. ). Together, these results indicate (i) that a Markovian model of phenotype transitions is a good approximation of the full system dynamics for the ExMISA model, and (ii) that the WE-simulation based computational pipeline predicts a quantitatively similar coarse-grained phenotype-network to the full CME model.

The Method Maps the Epigenetic Landscape and Identifies Dominant Phenotypes in a Pluripotency Network Model
We apply the computational pipeline to a pluripotent fate-decision network from mouse Embryonic Stem Cells (mESCs) introduced by Zhang et al. [28] (Fig 3A). The network comprises eight interacting genes: NANOG, GATA6, CDX2, SOX2, OCT4, GCNF, and PBX1. Three of these genes, NANOG, SOX2, and OCT4 have been suggested to maintain pluripotency [55], and NANOG inhibits the expression of differentiation markers [56]. The GATA6 and CDX2 genes have been used in experiments as markers of differentiation, with the GATA6 transcription factor being a marker of the primitive endoderm cell lineage, and the CDX2 transcription factor being a marker of the trophectoderm lineage [57]. state-transition graph of sampled network states. Circles represent aggregate gene-expression states sampled during the Weighted Ensemble simulation. Circle areas are proportional to the steady-state probability π i in each state according to ln(γπ i ) with scaling factor γ = 3.4. States are colored according to the gene expression levels of three of the genes; red, green, and blue correspond to high NANOG, GATA6, and CDX2 expression respectively, while black corresponds to low or no gene expression. Edges connecting the states indicate possible state-transitions, colored according to the originating state. The graph is produced using Gephi [50] using a force-directed layout algorithm (Force Atlas), therefore short inter-state distances reflect higher probability of transitioning. C) Full protein compositions of two representative states, with either high CDX2 expression (blue) or high NANOG expression (red). States in (C) correspond to yellow circles in (B).
Using the WE-based computational pipeline, we estimate T(τ ) with a resolution of N bins = 250. To visualize the global landscape as a graph network at this resolution, we plot the converged T(τ ) using a force-directed automated graph layout [50] (Fig 3B). The barbell shape of the network reflects the broad antagonism between pluripotency and differentiation genes, which is a general feature of the overall network topology. At the same time, each "pole" comprises multiple distinct patterns of gene expression (seen in the graph as different colors with full compositions in Fig 3C), hinting at the existence of multiple phenotypes associated with both pluripotency and lineage-specification. Moreover, the network representation reveals numerous links between pluripotent and differentiated states, pointing to both direct and indirect transitions, through a network of relatively transient intermediate states.
To further analyze the global dynamics of the pluripotency network, we apply the Markov State Model coarse-graining framework. The simulated T(τ ) shows gaps in the eigenvalue spectrum after four and after six eigenvalues (Fig 4a). The corresponding approximate timescales are given by t 2 , t 3 , t 4 , t 5 , t 6 = 1.1 × 10 5 , 95, 51, 12, 12 (k −1 ), respectively. These values, though only approximate, indicate the presence of a single long timescale process (t 2 ) corresponding to transfer between differentiated and 15/45 pluripotent states, while transitions within those basins (t 3 , etc.) occur at least four orders of magnitude more quickly. Applying the coarse-graining algorithm to achieve six clusters results in a reduced model (Fig 4b), with the clusters representing metastable phenotypes. The phenotypes can largely be distinguished in the subspace of NANOG, GATA6, and CDX2 expression levels; the differentiated phenotypes show expression of either GATA6 (primitive endoderm, PE), CDX2 (trophectoderm, TE), or both (denoted an intermediate cell type, IM). Phenotypes associated with pluripotency do not express high levels of GATA6 or CDX2, and may express high levels of NANOG (stem cell, SC). The coarse-grained model reveals two separate pluripotent phenotypes that are low in NANOG expression: one which expresses other pluripotent factors OCT4, SOX2, and KLF4 ("Low NANOG 1" LN1), and one which has low expression of all factors ("Low NANOG 2" LN2) (Fig 4c). Overall, these phenotypes broadly match experimentally-determined categories, coincide with steady-states of the stochastic model computed previously by a CME-approximation method [28], and coincide with phenotype-states identified in related pluripotency GRN models [58]. The steady-state probabilities associated with the phenotypes are highly nonuniform, with 95% of the population divided nearly evenly between the IM and LN1 phenotypes, which are associated with differentiation and pluripotency, respectively. The LN2 state is rarest, comprising only 8 × 10 −4 % of the population. Together, these results indicate that the clustering method identifies both common and exceedingly rare phenotypes in the in silico cell population modeled by simulation trajectories. Furthermore, the automated method identifies both expected and novel phenotypes. C) The averaged gene expression levels (copy numbers) of each transcription factor for each phenotype and their respective steady-state probabilities. D) The four most probable transition pathways from the SC state to the TE state (differentiation) and from the TE state to the SC state (dedifferentiation). E) The highest probability transition paths projected onto three protein coordinates, NANOG, GATA6, and CDX2. Differentiation from SC to TE is visibly irreversible.

The Method Reveals Multiple, Irreversible Pathways for Phenotype Transitions in the Pluripotency Network
Previously, Markov State Models constructed on the basis of Molecular Dynamics simulations were used to analyze the ensemble of distinct pathways of protein-folding [36]. Here, we utilize the coarse-grained model of phenotype transitions in the pluripotency GRN in a similar manner, to analyze pathways of cell differentiation and dedifferentiation. Using Transition Path Theory, the method identifies the pathways that carry the greatest fraction of net probability flux, among sequences associated with successful SC→TE transitions (and reverse) (Fig 4d,e). Transition paths between the stem cell (SC) and PE phenotypes can be found in Pathway decomposition for the SC → PE transition for f = 10. For Parameter Set I, the method identifies three pathways encompassing > 98% of the probability flux for both forward and reverse transitions. While the SC→ TE transition is most likely to occur directly through the LN1 state (i.e., NANOG expression will shut off, followed by turning on CDX2), the reverse transition shows a different route through the IM and PE states (i.e., GATA6 expression turns on, then CDX2 turns off, then GATA6 turns off, and finally NANOG turns on).
Dynamic analysis of the coarse-grained model, including analysis of transition paths, relies on the Markovian approximation for inter-phenotype transitions. In the pluripotency network, stochastic transitions between pluripotency (SC, LN1, LN2) and differentiation (TE, IM, PE) basins are infrequent relative to transitions within those basins, justifying the Markovian assumption, since the system equilibrates within those basins much more rapidly than inter-basin transitions occur. However, the Markovian assumption may be less accurate for describing intra-basin transitions between phenotypes, which occur much more frequently. Despite the coarse-grained model encompassing transitions on highly disparate timescales, the qualitative results of transition path analysis were validated by collected conventional simulation trajectories (not subject to any Markovian assumption), which identified the same dominant transition paths (Validation of the SC → TE transition pathway calculated through weighted ensemble sampling. The parallel transition pathways are compared against those calculated from a single long conventional simulation.). Overall, these results indicate that a stochastic excursion of a cell from the SC to TE phenotypes and back maps a cycle in gene-expression space, echoing previous studies indicating nonequilibrium dynamics in GRNs [16,23]. The results further indicate that the Markov State Model, while a highly coarse-grained approximation, can provide an accurate estimation of inter-phenotype transition dynamics.

Cell Phenotype Landscape and Transition Dynamics are Sensitive to Kinetic Parameters
We applied the computational pipeline to the pluripotency network using two different rate parameters sets (see File S1), which differ in rates of transcription factor binding and unbinding to DNA. In line with previous studies [23,24,29], we found that increasing the so-called adiabaticity (i.e., increasing h and f , or the rates of TF-binding relative to protein production and degradation, Parameter Set II) led generally to rarer inter-phenotype transitions (see Table 1). For example, in Parameter Set I, the Mean First Passage Time (MFPT) for transitions from SC → TE was calculated to be 1.36 × 10 5 in units of k −1 , as compared to 8.13 × 10 8 for Parameter Set II. The MFPTs of the reverse transition TE → SC for each set were 2.70 × 10 5 and 5.82 × 10 9 , respectively (see Table 1 and Computed Mean First Passage Times of Inter-Phenotype Transitions in the Pluripotency Network (Parameter Set I)). These differences in magnitude broadly reflect that moving toward the adiabatic regime leads to increased epigenetic barriers between phenotypes.  (1)) (left columns) and for transitioning between the pluripotency state (SC) and the trophectoderm state (TE) (right columns), in units of the inverse transcription factor decay rate, k −1 . Transitions for Parameter Set I were computed using the WE method in rate mode while transitions for Parameter Set II were estimated from the sampled transition matrix. The definitions of SC and LN(1) are analogous to the high NANOG production (N hi ) and low NANOG production (N lo ) transitions measured in experiments [8,9]. Increasing the adiabaticity (i.e., the rates of DNA-(un)binding, h, f ), leads to rarer inter-phenotype transitions. The simulations also show that, within the same gene network for a given parameter set, inter-phenotype transition times span four orders of magnitude.

18/45
In addition to generally slowing transitions, the increased adiabaticity of Parameter Set II gives rise to an epigenetic landscape structure that is distinct from that of Parameter Set I, with altered steady-state phenotype probabilities (Fig 5a). The eigenvalue spectrum shows qualitatively distinct features as well, with a gap after five values (Fig 6a). As such, the Markov State Model framework identifies five dominant phenotypes in the network, which correspond broadly to those of Parameter Set I, except that only a single Low-NANOG (LN) phenotype is identified (Fig 6b). Most of the steady-state probability is contained in the IM state (Fig 6c). In addition to altering the transition rates and relative phenotype probabilities, the kinetic parameters altered the dynamics of differentiation and dedifferentiation. The two likeliest pathways of forward (and reverse) SC → TE transitions follow the same route through LN and IM phenotypes (Fig 6d,e). Alternative differentiation pathways of forwards (and reverse) SC → PE transitions can be found in Pathway decomposition for the SC → PE transition for f = 50. These results indicate that, while the same GRN model with different kinetic parameters may give rise to qualitatively similar phenotypes, they differ in quantitative stationary and dynamic features, including relative steady-state probabilities, transition times, and likeliest transition pathways.  Table 1). (B) States visited in conventional SSA simulation (using the same initialization, definitions, and placement as in (A)). In the conventional simulation, a transition out of the IM phenotype was never observed.

Efficiency of Rare-Event Sampling Compared to Conventional SSA
Phenotype transitions that are relatively rare can be difficult to observe with conventional SSA simulation. We compared simulated landscapes (based on estimated T(τ )) from the computational pipeline to those obtained from an equivalent (large) number of SSA simulation steps (Fig 5a,b). Additional comparisons of synthetic cell populations using tSNE visualization reflect the rarity of phenotypes and phenotype 21/45 transitions (Fig 7, tSNE plot of replica populations with rescaled weight (ln(P s ) + 30)). This comparison revealed that the WE-based method uncovers multiple phenotypes and associated transitions that are invisible to conventional simulation due to the rarity of exiting metastable basins. Quantitative estimates of efficiency gains for WE have often been based on comparing the number of simulation steps required to estimate a desired quantity (such as a rate constant) using WE versus conventional simulation [59]. Treating T(τ ) as the desired output (as it contains holistic dynamic information for the system), we estimate the efficiency gain of our pipeline by computing: Sim. steps to estimate T(τ ), Conv.
However, it is often difficult to acquire the required number of steps for conventional simulation, so an approximate lower bound for the denominator can be estimated according to: Sim. steps to estimate T(τ ), Conv.
where simulation steps and transition times are measured using the same time-unit (here, k −1 ). That is, the denominator is the sum over the MFPTs of transitions between each pair of states (bins), i, j, where i, j = 1...N bins . This approximation is based on the rationale that one requires simulation time O(MFPT) to observe at least one transition between a given pair of states. From the WE-estimated transition-matrix T(τ ), estimates of the MFPT for transitions between any pair of states (bins) can be obtained using Eq 6. According to Eq 9, we estimate that our pipeline provided efficiency gains of 3000 for ExMISA (Fig. 2), 200 for Pluripotency Parameter Set I (Fig. 3), and 4 × 10 7 for Parameter Set II (Fig. 6). These numbers show that the pipeline affords a significant speedup over conventional simulation in providing global dynamic information. The numbers further show that the efficiency gain is most pronounced for the Pluripotency network with exceedingly rare inter-phenotype transitions.

Discussion
In this work, we present a method for efficient, automated computation of epigenetic landscapes, metastable phenotypes, and phenotype-transition dynamics of stochastic GRN models. Our computational pipeline was inspired by studies of metastability and barrier-crossing in Molecular Dynamics, and our application of the pipeline to cell-scale networks addresses a number of current challenges for stochastic GRN dynamics. First, it overcomes the curse-of-dimensionality of complex models, by leveraging available rule-based modeling tools for stochastic biochemical networks [52]. Second, it overcomes the challenge of efficiently simulating stochastic systems with rare events, by using enhanced Weighted Ensemble rare-event sampling [37]. Third, it addresses the challenge of extracting and interpreting essential dynamics of complex systems on the basis of simulated trajectories, by using the Markov State Model framework [36] to automatically generate a compact, approximate representation of global system dynamics. Combining these tools into a unified pipeline provides an automated means of computing and visualizing essential stationary and dynamic properties of stochastic GRNs, including the number and identities (i.e. state-space mapping) of metastable phenotypes, their steady-state probabilities, and most-likely pathways of inter-phenotype transitions and their transition rates. By advancing the capability to compute and interpret hypothesized or experimentally-derived stochastic GRN models, the method can yield insight into how "local" stochastic, molecular processes involved in epigenetic regulation affect "global" dynamics such as phenotypic stability and fate-transitions in cells. Moreover, it can help close the gap between dynamic, molecular-detailed models of gene regulation and cell-population level experimental data, to inform rational cell reprogramming strategies.

Insights from the Pluripotency Network Simulations
We used the pluripotency network as a model system to develop and demonstrate the simulation approach, but the results also yielded biological insights. For example, the simulations revealed a hierarchical structure of the epigenetic landscape. The network-exhibiting 5-6 metastable phenotypes-occupies a limited subspace from the vast possible gene combinations (e.g., 2 8 = 256 possible distinct on/off combinations of gene expression states). The dominant feature of the global landscape is a high barrier/slow timescale between pluripotent and differentiated phenotypes. Within each of these categories, further sub-states were identified. The model revealed multi-timescale dynamics of phenotype transitions; the pluripotency network showed relatively rapid transitions between phenotype-states that differed in the expression-level (high vs. low) of a single gene, e.g. the high NANOG to low NANOG transition, whereas phenotype transitions involving a change in expression level of seven genes, e.g. the SC macrostate to the TE macrostate, occurred five orders of magnitude more slowly on average. While the accessible phenotypes appear broadly similar across parameter sets, the relative stability and transition dynamics among phenotypes were sensitive to kinetic parameters governing transcription factor binding/unbinding. A global change in these parameters (affecting all individual transcription factor-DNA interactions equally) changed the shape of the landscape, altering the relative steady-state probabilities of different phenotypes and the likely transition pathways linking them. The DNA binding parameters capture the local epigenetic mechanisms that enable/disable transcription factors from accessing regulatory elements. A global rate change nevertheless has a varying influence on different genes because the number of regulators differs, as does the molecular logic by which activators and repressors exert combinatorial control on different genes. These results echo findings that global modification of chromatin regulators often have lineage-specific effects [60]. These results highlight both the need for, and the challenge, of informing cell reprogramming strategies with quantitative network models, as they suggest that the dynamic response of cellular networks to perturbations is governed by the detailed kinetics of molecular regulatory mechanisms, which are generally difficult to parameterize.

Dynamic Definition of Cell Phenotype
The Markov State Model framework implicitly imposes a dynamic definition of cell phenotypes; the number of phenotypes was determined using spectral gap-analysis, and the coarse-graining algorithm automatically identified metastable aggregates (i.e., grouped sampled network states into larger clusters). This is different from the classifications of phenotypes that are generally used in analyzing experimental data, where gene expression or marker levels are often used to categorize cells. However, experiments have also revealed the potential need for a dynamic definition of cell phenotype, based not only on single-timepoint measurements of gene expression or phenotype-markers, but also on information from past or future timepoints [4,8]. For example, Filipczyk et al. [8] identified distinct subpopulations within a compartment of NANOG-negative cells in mESCS, which differed in their propensity to re-express NANOG. At the same time, fluctuations between low-and high-NANOG expressing cells were not necessarily associated with any functional state change. The Markov State Model approach, based on kinetic/dynamic coarse-graining, thus provides a quantitative approach for classifying phenotype-states that is both completely generalizable rather than ad hoc (it requires no a priori knowledge or designation of markers/genes) and is in line with these recent experiments revealing the need for a dynamic definition of phenotype.

Timescales of Stochastic Phenotype Transitions
Markovian transitions (i.e., memoryless "hops") among cell phenotypes have been observed experimentally: examples include transitions among phenotypes in cancer cells, as measured by flow cytometry [10], and among pluripotency-states in mESCs, as measured by time-lapse microscopy of fluctuating gene expression [7][8][9]. The compact nature of these data-inferred networks-showing hops among a limited set of broad phenotypes-suggests that the computed MSM framework advanced in this study provides an appropriate level of resolution at which to analyze GRN dynamics and may serve as a useful tool for comparing models to experimental data.
Experimental studies have quantified the timescales of Markovian transitions between NANOG-high and NANOG-low states in mESCs [8,9]. From Hormoz et al., the probability of transitioning from NANOG-high to NANOG-low in mESCs is 0.02 per cell cycle, while that of the reverse transition is 0.08. These values represent a relatively rapid transition rate, since NANOG expression is known to be particularly dynamic [56]. Similarly, plasticity has been observed in cancer cells where quantitative estimates of stochastic cell transitions between a stem cell cancer cell phenotype to a basal cancer cell phenotype were observed to be roughly on the order of 0.01 to 0.1 per cell cycle [10]. We can translate our model results to approximate biological timescales: the degradation rate, which sets the timeunit for model results (i.e., k is taken to be 1) was experimentally determined to be on the order of a few hours (in the E14 mouse embryonic stem cell line, the half-lives of NANOG, OCT4, and SOX2 are approximately 4.7, > 6, and 1.6 hours, respectively [61]). Assuming that degradation is unimolecular, k = ln(2)/t [N AN OG]1/2 , and the half-life of NANOG, t [N AN OG]1/2 = 5 hours, the degradation rate is k = 0.1. Using a mESC cell cycle time of 12 hours [62], the simulations for Parameter Set I then predict NANOG-high to NANOG-low transitions occurring with a rate of 0.03 per cell cycle, and of 3 × 10 −3 for the reverse. For Parameter Set II, the computed rates were 8 × 10 −6 and 5 × 10 −5 , respectively. Comparison of these computed and experimental rates of NANOG transitions indicates that Parameter Set I (f = 10) is more in line with experimental observations, while Parameter Set II (f = 50) gives transition rates that are three orders of magnitude too slow. These results are in agreement with previous findings from theoretical studies that GRNs in pluripotency networks operate in a so-called "weakly-adiabatic" regime [24,27,28], in which the timescale of DNA-binding by transcription factors is on the order of transcription factor production and degradation.

Comparison to Other Models and Computational Approaches
A number of theoretical studies have elucidated dynamics of stochastic molecular-detailed GRN models (i.e., models that include molecular fluctuations and regulatory mechanisms, in contrast to Boolean models [63]). These studies have largely focused on small 1-or 2-gene motifs[ [21-25, 32, 41]], but recent years have seen extension of stochastic methods to studies of more complex, experimentally derived GRN models encompassing O(10) genes. For example, determination of global dynamic properties of such networks has been achieved by combining information from long stochastic simulations of discrete models [27,58], or of continuum SDE models, in combination with path integral approaches [54,64]. The pluripotency network studied herein was developed by Zhang and Wolynes [28]; in their work, the authors developed a continuum approximation to the Chemical Master Equation that enabled quantitative construction of the epigenetic landscape. Here, we present an alternative approach that is unique in two major aspects: (1) the use of stochastic simulations (i.e., SSA [33]), which is enabled by use of the WE rare-event sampling algorithm, and (2) the automated Markov State Model framework for designating phenotypes and constructing a coarse-grained view of the epigenetic landscape. While we utilize a different framework (that of coarse-grained, discrete stochastic models) from Zhang and Wolynes to approximate and interpret dynamics, our results are broadly consistent with theirs. For example, the dominant identified phenotypes we found are the same as in their work (the only exception being the exceedingly rare LN2 phenotype identified by the coarse-graining algorithm for Parameter Set I).

Current Challenges and Future Directions
Our approach is uniquely suited to extracting global dynamics information for stochastic systems with metastability, using simulations. An advantage of this approach is that both the WE and coarse-graining algorithms are"dynamics-agnostic" [59], meaning that they can be applied to any type of stochastic dynamics framework. In the context of computational biology, our pipeline could be extended to other types of stochastic biochemical systems, such as systems with hybrid discrete-continuum dynamics [65], systems with spatial heterogeneity [66], or multi-level models [67]. In addition to this flexibility, simulation-based methods have the advantage of being able to leverage existing, widely-used open-source packages, which in turn facilitate model specification and model sharing. For example, BioNetGen [52] can interpret models specified in the Systems Biology Markup Language.
Several challenges and potential weaknesses with the pipeline exist, both with regard to sampling rare events, and in determining an appropriate coarse-grained model. Potential challenges with the WE algorithm have been described elsewhere [35,66], and include the difficulty of determining a binning that captures slow degrees of freedom and the existence of time-correlations between sampled iterations of the simulation, which can impede unbiased sampling. The Voronoi-based binning procedure we employ here is related to a number of similar approaches [24,[44][45][46], and has the advantage of effectively tiling a high-dimensional space without the need for a priori knowledge. However, in practice, according to others and our own studies, the method is effective up to about 10 degrees of freedom. These challenges are the subject of continued study.

Supporting information
For more information, see File S1 and File S2.
File S1 Description of network models, kinetic parameters, and weighted ensemble parameters File S2 Pseudo-code for the computational pipeline.

ExMISA Network
Two-gene network with Mutual Inhibition, Self-Activation, and exclusive transcription factor binding.

Pluripotency network
There are eight genes (encoding transcription factors) in the pluripotency network. Transcription factors bind as homodimers with the exception of the OCT4-SOX2 heterodimer. Only three transcription factors interact with their own gene, CDX2, NANOG, and GATA6. Transcription factors bind as dimers with the rate h and unbind with the rate f . When a gene is bound by any activator and no repressors, it expresses at a rate g on , otherwise, it expresses at a rate g of f . The only exception is NANOG, which must be bound by all three of its activators and no repressors to be activated.  Error in computed steady-state probability as a function of N , the number of protein states retained in the state-space truncation. N corresponds to the maximum allowed copy-number of transcription factors a and b in the ExMISA network. For a truncation to N , probability flux between states with n a , n b ≤ N and states with n a , n b > N is assumed to be 0 (i.e., the boundaries of the state-space are reflective). The error S S[N ] is defined by i |π[N + 1] − π[N ]|, where i runs over all enumerated states of the state-space with truncation to N + 1 (all states outside the boundary have probability 0). That is, the error is computed as the sum of the absolute difference between steady-state probabilities for each state, comparing π[N ] (steady-state probability computed with truncation to N ) to π[N + 1] (truncated to N + 1).  Convergence of the slowest implied timescale t 2 with increasing number of sampling regions (bins) and increasing lagtime τ . The lagtime calculated using the truncated CME is shown in gray. The accuracy of the WE approximation increases monotonically with increasing bin number and lagtime.      Difference in Coarse-Grained clustering for the 2-gene ExMISA cell decision network studied through the numerical benchmark (top) and the WE sampling pipeline (bottom). The color of each coarse-grained phenotype cluster corresponds to the expression level of protein a/b: lo/lo (black), hi/lo (red), lo/hi (blue), hi/hi (magenta). All enumerated state phenotypes are sized proportionally to their probability for each of the nine gene configurations for the numerical benchmark, while only the centers of each sampling region are shown for the WE sampling computational pipeline. While the centers of the sampling regions are mostly well separated according to their gene configuration, the phenotype states assigned to the sampling region extend across multiple gene configurations due to the choice of euclidean distance metric in assigning phenotype states to sampling regions. OCT4-SOX2 -Interaction rules for genes in the pluripotency network.   1.01 × 10 −8 9.99 × 10 −1 Markov State Models of metastable phenotype-cluster transitions found through the computational pipeline for all three simulated networks. There are four different combinations of a/b protein expression levels in the coarse-grained phenotype network of the ExMISA network: lo/lo, hi/hi, lo/hi, and hi/lo. The steady-state probabilities of the lo/lo, hi/hi, lo/hi, and hi/lo cell phenotypes are predicted by the computational pipeline to be 1.71 × 10 −2 , 7.67 × 10 −2 , 3.71 × 10 −1 , 3.80 × 10 −1 , respectively. The steady state probabilities of the six and five coarse-grained phenotype networks in the pluripotency network Parameter Set I and Parameter Set II can be found in figures 4 and 6, respectively.  2.20 × 10 5 The Mean First Passage Times of NANOG Fluctuation and (De)differentiation in the pluripotency network calculated using τ = 10 and f = 10. The MFPT reported for the WE is a block average over the last 500 iterations. The MFPT and standard deviation are found for transitioning between the stem cell phenotype (SC) and the pluripotent phenotype with low NANOG expression (LN1) (analogous to high NANOG production (N hi ) and low NANOG production (N lo ) transitions measured in experiments) and for transitioning between the stem cell phenotype (SC) and the trophectoderm phenotype (TE), calculated on the timescale of the protein degradation rate k. The SC, TE, and LN1 regions of interest (ROI) are defined as the SC, TE, and LN1 phenotypes derived from the MSM reduction of the sampled transition matrix.