From Understanding the Development Landscape of the Canonical Fate-Switch Pair to Constructing a Dynamic Landscape for Two-Step Neural Differentiation

Abstract Recent progress in stem cell biology, notably cell fate conversion, calls for novel theoretical understanding for cell differentiation. The existing qualitative concept of Waddington’s “epigenetic landscape” has attracted particular attention because it captures subsequent fate decision points, thus manifesting the hierarchical (“tree-like”) nature of cell fate diversification. Here, we generalized a recent work and explored such a developmental landscape for a two-gene fate decision circuit by integrating the underlying probability landscapes with different parameters (corresponding to distinct developmental stages). The change of entropy production rate along the parameter changes indicates which parameter changes can represent a normal developmental process while other parameters’ change can not. The transdifferentiation paths over the landscape under certain conditions reveal the possibility of a direct and reversible phenotypic conversion. As the intensity of noise increases, we found that the landscape becomes flatter and the dominant paths more straight, implying the importance of biological noise processing mechanism in development and reprogramming. We further extended the landscape of the one-step fate decision to that for two-step decisions in central nervous system (CNS) differentiation. A minimal network and dynamic model for CNS differentiation was firstly constructed where two three-gene motifs are coupled. We then implemented the SDEs (Stochastic Differentiation Equations) simulation for the validity of the network and model. By integrating the two landscapes for the two switch gene pairs, we constructed the two-step development landscape for CNS differentiation. Our work provides new insights into cellular differentiation and important clues for better reprogramming strategies.


Two-gene developmental landscape: the two remaining constructions
A caveat of Wang's construction is that the extreme points on landscape deviate from the locality of equilibrium points of equation (1) in main text because the Itô stochastic integral is used in his construction. Ao proposed a framework based on symmetric and antisymmetric matrix transformation to decompose the SDEs into the gradient part and curl part, which ensures the extreme points on landscape match the stable points (Ao, 2003). However, a SDEs decomposition by this transformation is not easy even though Ao developed a gradient expansion algorithm (Ao, 2004). There also existed another effective method for landscape construction (Bhattacharya et al , 2011). But it is only a numerical method and doesn't consider the influence of noise. Considering the above situations, we mainly applied Wang's method of landscape construction. Here we summarize the main mathematical basis for landscape construction by Ao and Bhattacharya's methods.
For Ao's transformation, let us first consider a process defined by following SDEs (Xing, 2010): where g relates to diffusion matrix D by gg T = 2D, the independent Gaussian white noise satisfies < ς j (t), ς j (τ ) >= δ ij δ(t − τ ) and ϵ is used to characterize the intensity of noise. Here G i corresponds to the F i (the force driving gene i) in the main text while dx i /dt describes the change of rate in gene expression of gene i. In order to obtain a scalar function corresponding to the potential function for the SDEs system, one can always construct a symmetric matrix S and antisymmetric matrix T and transform Eq. S1 into: where ϕ is the scalar potential function. By introducing an additional condition, g ′ g ′T = 2S, we can uniquely determine the matrixes of S and T by To obtain M, a gradient expansion algorithm is developed: Based on the solution of ϕ(x), the landscape for the system can be mapped. According to Bhattacharya's method of landscape construction (Bhattacharya et al , 2011), for a two genes system described by Eq. 1 in main text, we can define a quasi-potential term V q whose increment is By aligning all the trajectories under the assumption of the continuity of potential, we can then map the landscape for the above system.
We first constructed the stage-specific and developmental landscape by Ao's framework and Bhattacharya's method ( Figure S2) with the same treatment of parameter a in Wang's framework (Wang et al , 2011). We found that the stage-specific landscapes constructed are almost topologically identical. This may result from the fact that both constructions tightly associate with the theory of Lyapunov stability (Bhattacharya et al , 2011, Yuan et al , 2010. By similar operation stated in the main text, we further quantified the development landscape. We found that the quasi-potential on Bhattacharya's developmental landscape increases from stem cells to mature cell types which is contrary to the original Waddington landscape. On the other hand, the quasi-potential on Ao's developmental landscape firstly decreases from the stem cell valley then increases slightly.

Mean first passage time under different noise influence
To characterize the stability of cell types under different noise amplitude, we use the concept of mean first passage time (MFPT) which is defined as (van Kampen, 2007) , where F 1 , F 2 , x 1 , x 2 is the same terms as in the equation (1) from the main text. τ is the the mean first passage time for a cell stays at a specific valley (corresponds to a cell type) on landscape (Wang et al , 2010). From Figure S4, we find that the MFPT decreases as the noise amplitude increases, implying that the cell types become less stable under large noise influence.

Confirmation of stochastic simulation by public microarray data
We used two different models (the first one has an enforced decrease of P ax 6 and another has the negative feedback from mature cell type to P ax 6) to simulate the development dynamics of CNS genesis and obtained similar results ( Figure S5A, B). We used the first one as our default model. In addition to a qualitative consistency between the stochastic simulation of neurongenesis and the microarray data, as shown in the main text, the stochastic simulation for astrogenesis and oligodendrogenesis also partially matches qualitatively with the public microarray time series.
The expression of Scl, Olig 2 and Hes 5 increases from E11 to E14 in the microarray data (GDS10796) for astrogenesis ( Figure S6). This may correspond to the starting part of our simulation. The expression of Stat 3, Aldh1L1 and Sox 8 increases simultaneously in the microarray data, identifying with the temporal increase both of Scl and Olig 2 (before the bifurcation of their expression) caused by Hes 5 activation in the starting part of the simulation. It also indicates that the astrogenesis may begin at about E11 ( Figure  S6). The expression of Aldh1L is relative low, comparing to T uj 1 and Sox 8, which may result from the fact that Aldh1L is not the optimal marker for astrocytes (Cotterell and Sharpe, 2010).
As confirmed by the microarray data (GDS2379), the expression level of Scl increases while that of Olig 2 decreases during oligodendrogenesis in the later part of our simulation ( Figure S7). The decrease of M ytlL and Sox 8 is at odds with our simulation, possibly because that they are also not the optimal markers of oligodendrocytes.
We also conducted reprogramming simulation after confirming our model and network by the above stochastic simulation (Figure. S5C).

Three-gene motif as the network design for stemness maintenance
In the main text, we proposed that the three-gene motif in Figure 4b is the network design for stemness maintenance and cellular development. To model this motif, we consider to couple the upstream gene's positive regulation and self-activation and using following equations to define its expression dynamics: , where x 1 , x 2 , x 3 represent the gene expression level of b, c, a of the three-gene motif in Figure 4b of main text. We set a 1 = a 2 = 1, b 1 = b 2 = 1, k 1 = k 2 = 1, S 1 = 0.5, S 2 = 2, n = 4 for this three-gene model. Figure S8 is bifurcation graph and the corresponding development landscape calculated based on equations of x 1 , x 2 under the change of x 3 (assumed to represent the developmental process). From this figure, we can conclude that the two side terminal cell type valleys can be reprogrammed to center stem cell valley by the enforced increase of gene A.