An enriched network motif family regulates multistep cell fate transitions with restricted reversibility

Multistep cell fate transitions with stepwise changes of transcriptional profiles are common to many developmental, regenerative and pathological processes. The multiple intermediate cell lineage states can serve as differentiation checkpoints or branching points for channeling cells to more than one lineages. However, mechanisms underlying these transitions remain elusive. Here, we explored gene regulatory circuits that can generate multiple intermediate cellular states with stepwise modulations of transcription factors. With unbiased searching in the network topology space, we found a motif family containing a large set of networks can give rise to four attractors with the stepwise regulations of transcription factors, which limit the reversibility of three consecutive steps of the lineage transition. We found that there is an enrichment of these motifs in a transcriptional network controlling the early T cell development, and a mathematical model based on this network recapitulates multistep transitions in the early T cell lineage commitment. By calculating the energy landscape and minimum action paths for the T cell model, we quantified the stochastic dynamics of the critical factors in response to the differentiation signal with fluctuations. These results are in good agreement with experimental observations and they suggest the stable characteristics of the intermediate states in the T cell differentiation. These dynamical features may help to direct the cells to correct lineages during development. Our findings provide general design principles for multistep cell linage transitions and new insights into the early T cell development. The network motifs containing a large family of topologies can be useful for analyzing diverse biological systems with multistep transitions.


Introduction
Cell fate transition, including differentiation, de-differentiation and trans-differentiation, is a fundamental biological process in which the function of a cell gets specialized, reprogrammed or altered. The process often involves significant changes of multiple cellular properties, including the morphology, the self-renewal capacity and the potentials to commit to alternative lineages [1,2]. These changes are controlled by the dynamics of interacting transcription factors (TFs) and the modulation of chromatin structures, which in term are governed by complex regulatory networks in the cells [3][4][5]. Interestingly, the fate transitions in many systems are achieved by sequential commitments to a series of cellular states with stepwise changes in their transcriptional profile towards the final stage of the program (Fig 1) [6][7][8][9][10][11]. The intermediate states between the initial state (e.g. the undifferentiated state in the case of cell differentiation) and the final state may be important for multiple purposes, such as facilitating 'checkpoints' that ensure appropriate development of cellular behaviors, or allowing the cells to make correct decisions at the lineage branching points [11][12][13][14][15].
One example of these stepwise cell lineage transitions is the development of T lymphocytes in the thymus. The differentiation from multipotent pre-thymic progenitor cells to committed T cells involves multiple cellular states with stepwise changes of their cellular properties and the transcriptional profiles (Table 1) [16][17][18][19]. Several lines of evidence suggest that the transition states at an early phase of the differentiation can serve as stable checkpoints for sequential lineage commitments. The progress through these intermediate states is accompanied by stepwise loss of their potentials to differentiate into other cell types: pre-thymic progenitor cells can be converted to a few types of cells, including B cells, natural killer (NK) cells, dendritic cells (DCs) etc., whereas the multipotency of the intermediate cell types is more limited but not completely lost [20][21][22][23][24][25][26]. In addition, the stability of these intermediate states is substantial because the loss of differentiation signals does not result in de-differentiation of some intermediate states [20], suggesting restricted reversibility (or complete irreversibility) of the multiple transitions. In addition, the lymphoid progenitor cells need to divide for a certain number of times at an intermediate state before committing to the T cell lineage, and the stable activities of the lineage defining transcriptional program at the intermediate stages may be important for the proliferations [27]. Finally, the loss of certain transcription factors (e.g. BCL11B) can lead to the termination of the differentiation at some intermediate states, which is often associated with diseases such as leukemia [18,20,28]. This further suggests that the intermediate states are cellular 'attractors' between the initial and the final stages of the differentiation (Fig  1, bottom panel). Similar stable intermediate states during cell lineage transitions are observed in other systems, such as the epithelial-mesenchymal transition, and the skin development (Table 1), and those states also serve as regulatory stages for altering cellular properties including self-renewal and migration [10,[29][30][31][32][33][34][35][36][37]. Therefore, the multiple intermediate states are involved in diverse normal development and pathological conditions. Understanding the regulatory programs for the sequential cell lineage commitments is a key step towards the elucidation of mechanisms underlying various biological processes involving multistep lineage transitions. Despite the accumulating data and observations on these stepwise lineage commitments, general mechanisms governing these differentiation processes with multiple intermediate cellular states remain unclear.  In this study, we explored the strategies in terms of the transcriptional network design that gives rise to stepwise transitions during cell differentiation. We first used a generic form of networks containing three interacting TFs to find network motifs that can produce four attractors (the minimum number of attractors in the examples of T cell development, epithelial-mesenchymal transition and skin development) with stepwise changes of transcriptional factor levels. We found two types of network motifs, both involving interconnections of positive feedback loops, which can generate the four-attractor systems. These motifs constitute a large family of gene regulatory networks. We found that there is an enrichment of these motifs in a network controlling the early T cell development. We built a specific model using known interactions among key transcription factors in developing T cells, and the model shows that the transcriptional network governs multistep and irreversible transitions in the development process.
To investigate the stochastic dynamics for early T cell development, we mapped out the quasi-energy landscape for the early T cell development. This landscape characterizes the four attractors representing four stages of early T cell development quantitatively. In addition, by calculating the minimum action paths (MAPs) between different attractors, we quantified the dynamics of the key factors in response to Notch signal with fluctuations, which are in good agreement with experimental observations. Finally, we identified the critical factors influencing T cell development by global sensitivity analysis based on the landscape topography. Overall, our model for early T cell development elucidates the mechanisms underlying the stepwise loss of multipotency and multiple stable checkpoints at various stages of differentiation. The network topologies for multiple attractors found in this study and our motif discovery strategy combined with the landscape methodology can be useful for analyzing a wide range of cell differentiation systems with multiple intermediate states.

Networks in a large motif family govern systems with four attractors with stepwise transcriptional modulation
To find transcriptional network topologies that can generate multiple intermediate states during cell fate transition, we first performed random parameter sampling with a network family containing up to 3 nodes (Fig 2A). In this framework of network topology, each node represents a transcription factor (TF) that can potentially influence the transcription levels of other two TFs and itself. Topology searching with a 3-node network was used for motif discovery for various performance objectives in previous studies [38,39]. We performed exhaustive search for topologies with up to 6 regulations from a total of 9 regulations of the network family, and constructed a mathematical model for each topology (see Methods for details). For each model, we performed random sampling in the parameter space from uniformly distributed values (S1 Table). We selected topologies containing at least one parameter set that is able to generate four attractors with stepwise changes of transcriptional levels. We define the system with four attractors with the stepwise changes of transcriptional levels as the scenario in which there are four stable steady states and they can be consistently ordered by the concentrations of any pairs of TFs. In other words, one TF always monotonically increases or decreases with another TF in these four states, and we term these states 'ordered' attractors in this paper. Among the 2114 network topologies that we searched, we found 216 topologies that can produce such behavior. In addition, we found 417 topologies that can only produce four unordered steady states (TF concentrations are non-monotonically correlated among the states) (S11 Fig, S12 Fig).
To visualize the relationships among these topologies, we constructed a complexity atlas (Fig 2B), in which the nodes represent the network structures that gave rise to four attractors, Complexity atlas for selected topologies. Closed circles denote minimum motifs. Open circles denote topologies containing more regulations than those in the minimum motifs. Each arrow denotes the difference by one regulation in the network. Examples of minimum motifs are shown at the bottom. Red: Type I motif. Blue: Type II motif. Green: Hybrid motif. C. Overlaid four attractors for each of the 29 minimum topologies. Factor A denotes the TF on the left of the network diagram. Factor B denotes the TF on the right of the network diagram. In some topologies A and B and positively correlated (left panel), whereas they are negatively correlated in other topologies (right panel). Colored dots denote the stable steady states. Colored lines connect states of their corresponding topologies. The colors of the cell states match the illustration in Fig 1. The colors of the lines denote different representative models. The z-score is calculated by shifting the mean of each four attractors to 0 and then normalizing the four data points to unit variance data. D. Example phase planes for two minimum topologies (Type I and Type II respectively). In each case, four out of the seven steady states (intersections denoted by solid dots) are stable. Network structures and phase planes for all 29 minimum motifs are included in S1 and S2 Figs. All models shown in this figure are built with additive form of Hill functions. https://doi.org/10.1371/journal.pcbi.1006855.g002 Network motifs regulating multistep cell fate transitions and the edges connect pairs of topologies that differ by a single regulation (addition or removal of a transcriptional interaction) [40]. We define the minimum topologies as those of which the reduction of complexity, or the removal of any regulation from the network, will abolish its capability to generate four attractors (solid nodes in Fig 2B and examples in Fig 2C). We found 29 such minimum topologies which represent the non-redundant structures for producing the four-attractor system.
Interestingly, all of the 216 topologies obtained from our search contain three distinct positive feedback loops (including double-negative feedback loops), and they can be categorized into two types of motifs ( Fig 2B, bottom panel). The Type I motif contains three positive feedback loops that are closed at a single TF (red nodes and edges in Fig 2B). The Type II motif contains three connected positive feedback loops, two of which do not share any TF but are connected via the third loop (blue nodes and edges in Fig 2B). There is a remarkable diversity of each of the motif types because the interconnected positive feedback loops can share multiple TFs (S1 and S2 Figs). Based on the complexity atlas (Fig 2B), we found that Type II motifs contain 4-6 regulations, and Type I motifs contain 5-6 regulations. Some of the networks with 6 regulations contain subnetworks of both Type I and Type II motifs (Hybrid type, green nodes). The four attractors in the space of two TFs exhibit a variety of patterns of nonlinear monotonic correlations (Fig 2C, S3 Fig), which are governed by intersections of highly nonlinear nullclines in the state space containing the two TFs ( Fig 2D, S1 and S2 Figs). The definitions of various types of motifs are listed in Table 2, and the statistics of the topologies discovered are summarized in Table 3 (also see S11 Fig for an illustration). Overall, this motif family represents a large number of networks that can produce a common type of behaviors: multiple stable intermediate states in terms the transcriptional activity.
We next asked whether there is a difference between Type I and Type II motifs in terms of their ability to generate systems with four ordered attractors. We found that with the same number of sampled parameter sets, Type II motifs have greater fractions of parameter sets that  3C). This suggests that as compared to Type II motifs, Type I motifs has higher specificity in generating four ordered attractors, which is more relevant to the stepwise cell fate transitions than the unordered ones. Moreover, we observed that the inter-attractor distances between neighboring attractors in the gene expression space were generally more variable with Type I motifs than those with Type II motifs ( Fig  3D, magenta boxes). In particular, among the three inter-attractor distances for each model, Type I motifs generated smaller minimum distance than Type II motifs did (Fig 3D, orange boxes. p-value < 0.0001, Mann-Whitney U test). We did not observe any significant difference between Type I and Type II motifs in terms of the stabilities of the attractors and the kinetic paths that they generate (S13 Fig See Methods for calculation of quasi-energy landscape and kinetic path). In addition to the effects of motif types, we also asked whether the fractions of Inter-attractor distances for each parameter set that generates four ordered attractors were calculated and summarized. For each set of parameters associated with the four attractors, the minimum, the maximum and the standard deviation of the distances were analyzed. Minimum Type I and Type II motifs were compared using these statistics. https://doi.org/10.1371/journal.pcbi.1006855.g003 Network motifs regulating multistep cell fate transitions positive or negative regulations in the network can influence the function of generating fourattractor systems. We found that the fraction of positive regulations has a weak but significant correlation with the fraction of successful parameter sets generating four-attractor systems (S14A and S14B Fig). Although negative regulations are slightly less favorable, they might be important to ensure that the levels of some TFs are inversely correlated during multistep cell fate transitions, which is necessary for having at least one highly expressed TFs in each of the initial and the final cell states (S14C Fig).
In summary, we found two types of network motifs that generate four attractors with stepwise changes of the transcriptional profile. Two of these attractors represent the multiple intermediate states observed in various biological systems. This exploratory analysis elicits several interesting questions: what are the biological examples of such network motifs? Can the conclusions with respect to the two types of motifs be generalized to networks with more than three TFs? Is there any advantage of combining both types of motifs? How are the transitions among these states triggered deterministically and stochastically? To provide insights into these questions in a more biologically meaningful context, we will use a specific biological system to describe more detailed analysis of these motifs and their underlying gene regulatory networks in the following sections.

Type I and Type II network motifs are enriched in a transcriptional network controlling early T cell development
We asked whether the motifs that we discovered can be found in any known transcriptional network that potentially control multistep cell differentiation. We used the early T cell differentiation in the thymus as an example to address this question. The differentiation from multipotent lymphoid progenitor cells to unipotent early T cells involves multiple stages at which the cells have differential potentials to commit to non-T lineages and other cellular properties such as proliferation rates. At the early phase of this process, four stages of development T cells (ETP/DN1, DN2a, DN2b, DN3) were identified experimentally, and the progression through these stages is controlled by a myriad of transcription factors including four core factors, TCF-1, PU.1, GATA3 and BCL11B. These TFs form a complex network among themselves (see Fig  4A and supporting experimental observations in S3 Table), and the stepwise changes in the levels of these TFs were observed in the four developmental stages of T cells [20,28]. The interactions involving these core TFs were shown to be critical for the irreversible commitment to the T cell lineage by forming a bistable switch [41]. Among these factors, PU.1 level decreases as the cells commit to later stages, whereas the levels of other three factors increase in this process. It is unclear, however, whether this transcriptional network can serve as a regulatory unit that governs the multistep nature of the T cell differentiation.
We noticed that this T cell transcriptional network contains the motifs that we found in our analysis using the generic form of networks, we therefore hypothesized that the models based on this network can have four attractors with sequential changes of the four TFs. Indeed, using random sampling we were able to find parameter sets that give rise to four-attractor systems similar to what we obtained with the generic 3-node framework. To find the functional components that generate this behavior, we analyzed the subnetworks of the complex T cell regulatory network [42]. We removed the regulations from the network systematically, and we found that out of the non-redundant 1553 topologies (2047 subnetworks), there are 568 topologies (701 subnetworks) that can generate four attractors with stepwise changes of the TFs ( Fig  4B). We used a complexity atlas to visualize the relationships among these subnetworks ( Fig  4C). We found that the network can be reduced to one of the 66 minimum topologies (97 minimum subnetworks) which retains the four-attractor property (solid nodes in Fig 4C). Notably, these networks can be classified into the two types of motifs described earlier (Fig 2B). Similar to the networks that we obtained through the generic framework, the two types of minimum motif have 4-6 regulations. Subnetworks with both types of motifs (green nodes and edges) start to appear when the number of regulations reaches six. The numbers of motifs and subnetworks obtained for the generic framework and the T cell model are summarized in Table 3.
We next quantified the enrichment of the two motif families in the early T cell transcription network. We first generated random networks by perturbing the existing regulations in the network model and computed the empirical p-values for observing the numbers of different types of network motifs. The T cell network contains a large number of positive feedback loops and the two types of motifs that we described earlier ( To exclude the possibility that this differential significance was observed due to the way we generate random networks which gives low p-values (<10 −4 ) in general, we used another method to generate random networks with an augmented number of regulations ( Fig 5, middle panel, blue bars). Each pair of TFs were assigned with a pair of random regulations (positive, negative or none). Consistent with the previous method, the T cell transcriptional network is enriched with positive feedback loops overall, but the enrichment is more significant for Type I motifs or for the combination of Type I and Type II motifs. Interestingly, motifs that are similar to Type I motif but have higher complexity (more positive feedback loops) do not show more significant enrichment than Type I motif does (S15 Fig). In addition, we found that networks controlling switch-like behaviors, but not multistep cell fate transitions, are not enriched with Type I or Type II motif [43][44][45][46]. These results suggest the possibility that the network has been evolved to reach more complex performance objectives than those enabled by simple positive feedback loops alone.
Since the minimum motifs alone can generate the four-attractor system, we asked whether the combination of these motifs enhances the ability of the network to produce the system. We therefore compared a subnetwork containing only one minimum Type I motif with another one containing multiple such motifs in terms of the performance to generate a particular fourattractor system (Fig 6A See Methods and S1 Text for details). We found that the subnetwork with multiple Type I motifs (Network 1) outperforms the one with only one motif (Network 2) in that Network 1 can give a better fit to a hypothetical four-attractor system (Fig 6A-6C). In this hypothetical 'target' system, the four attractors are assumed to be determined by dynamics of PU.1 with multiple feedback loops. The assumed degradation (Fig 6B, gray curve) and production rate functions of PU.1 (Fig 6B, green curve) are specified. The curves of these two functions have 7 intersections, four of which represent attractors. The optimized production function obtained from Network 1 (Fig 6B, purple curve) has more robust intersections with the degradation curve than one obtained from Network 2 ( Fig 6B, red curve). This difference was observed for production functions of these two categories from multiple runs of optimization ( Fig 6C). This suggests the advantage of combining multiple motifs with similar functions We next asked whether the topologies that contain both Type I and Type II motifs have greater probabilities to generate the four-attractor system than the topologies with one type of motifs do. When we explored the parameter space randomly for each topology with a fixed number of samples, a larger number of parameter sets that can generate the four-attractor system were found with the topologies containing both motifs than with those containing either  Blue: Type II motif. Green: Hybrid motif. B. Performances of the two subnetworks are compared. Performance was quantified with the sum of squared distance (SSD) from a predefined hypothetical continuous production function (gray curve) of PU.1 level that have 7 intersections with the degradation function, which generates four attractors (see details in supplementary text). The purple and red curves represent the optimized functions fitted to the gray curve. The gray curve is closer to the purple curve than to the red curve, suggesting a better fit with Network 1. C. SSD values obtained from 500 optimization runs for each of Network 1 and Network 2. Each value was calculated using the procedure shown in B. D. Numbers of parameter sets that generate the four-attractor systems per 10 6 random samples per topology. Three types of network motifs with 7 regulations are compared.
Type I or Type II motifs only (S5 Fig and Fig 6D). This suggests that the combination of both motifs might be a robust strategy to generate the four-attractor system. This pattern was observed for all the topologies in the complexity atlas (S6 Fig) as well as those with the same degree of complexity ( Fig 6D, networks with 7 regulations were chosen because they have comparable fractions of the three types of motifs).
In summary, we found that the core transcriptional network controlling early T cell differentiation are enriched with Type I and Type II network motifs. The network composed of these two types of motifs governs a dynamical system containing four attractors, corresponding to four known stages in the early T cell development. The networks with both types of motifs and greater number of such motifs have more robust capability of generating the fourattractor systems than those networks with fewer types of numbers of motifs do.

Stepwise transitions with restricted reversibility provide robustness to fluctuating differentiation signal to multiple intermediate states
We next characterized the dynamical features of the four-attractor system of the T cell development model in response to differentiation signals. For this and subsequent analysis, we focused on a model describing the network topology shown in Fig 4A (the full model). We first performed bifurcation analysis of the system to the changes of Notch signaling (Fig 7A). With the increasing Notch signal, the system undergoes three saddle-node bifurcations, at which the stability of the proceeding cellular states is lost (Fig 7A, black arrows). These bifurcation points therefore represent the cell state transitions from one stage to the next. The structure of the bifurcation diagram shows a remarkable robust multistep commitment program governed by the T cell transcription network: the commitment to each stage of the program has restricted reversibility in that the attenuation or withdrawal of the Notch signaling does not result in de-differentiation of the developing T cells (i.e. the return of the transcription profile to earlier stages that may have greater multipotency). It was previously shown that the commitment from DN2a to DN2b is an irreversible process with respect to Notch signaling, and this transition eliminates developing T cells' potential to be diverted to any other lineages when Notch signaling is abolished [20,41]. However, simple toggle-switch models do not explain the observation that the multipotency of the early T cells is lost in a stepwise manner. For example, cells at ETP can be differentiated into B cells, macrophages, dendritic cells (DCs), granulocytes, natural killer (NK) cells and Innate lymphoid cellsubset2 (ILC2), whereas the potentials to commit to many of the lineages are blocked even in the absence of Notch signaling at the DN2a stage, at which the cells can only be differentiated into NK cells and ILC2 [20]. Therefore, the stepwise, irreversible transcriptional transitions revealed by our model is consistent with the experimental observations with respect to the loss of multipotency in the stepwise manner.
Although the absence of Notch signal does not allow the reversal of lineage progression, it was previously shown that the absence of BLC11B in lymphoid progenitor cells blocks its ability to progress to DN2b stage, whereas the Cre-controlled knockout of Bcl11b in committed T cells (e.g. DN3 cells) reverts its transcriptional profile to DN2a-like cells [28]. Upon blocking the production of BCL11B in our model, we observed the loss of attractors of DN2b and DN3, and the DN2a state is the only stable stage even in the presence of the strong Notch signaling (Fig 7B). As a result, increasing Notch signaling only triggers one saddle-node bifurcation, representing the transition from ETP to DN2a cell (Fig 7B black arrow), whereas the decrease of the BCL11B production triggers the transition back to DN2a instead of ETP (Fig 7C). These results are in agreement with the previous experimental findings [28], and they further support the importance of the multistep differentiation system revealed by our model. The bifurcation analysis shows how the lineage progression is influenced by stably increasing or decreasing Notch signal strengths. We next asked how the duration of Notch signal may control the multistep lineage transition. By inducing the differentiation with varying durations of the Notch signaling, we found that cells experiencing transient Notch signals may only commit to intermediate stages of differentiation ( Fig 8A). In addition, the system is able to integrate the information of the signal intensity and duration to make decision on the lineage progression. These results suggest that the multistep lineage transition can be triggered by the increasing strength of the signal, the increasing duration of the signal, or the combination of both types of signal dynamics. Earlier experimental studies have shown that transient Notch signaling can irreversibly drive the cells to an intermediate, but committed stage with a definitive T cell identity (DN2b) [28,41,47]. This is in agreement with our results, and our model further suggests that the commitment to other intermediate states is also irreversible with respect to the lineage progression (note that this irreversibility does not refer to the establishment of T cell identity).
One possible advantage of the multi-stable system is its robustness of response in facing fluctuating signals. We therefore performed numerical simulations of the dynamical system under increasing Notch signaling with significant fluctuations. Under this condition, transient reduction of Notch signaling halted the progress of the lineage commitment but did not trigger the de-differentiation (Fig 8B). Our model suggests that the design of transcriptional network allows the system to stop at intermediate stages before proceeding to the next ones. This strategy has several potential physiological benefits: 1) it protects the cell lineage progression against sporadic fluctuations of Notch signaling; 2) it facilitates the 'checkpoints' before lineage commitment in the middle of the developmental process and 3) it allows the stable storage of differentiation intermediates which can be differentiated into mature T cells rapidly when there is an urgent need of new T cells with a diverse T cell receptor repertoire.

Quantitative analysis of the energy landscapes and minimum action paths delineates the patterns of the multiple-attractor system in T cell differentiation
With the deterministic modeling and bifurcation approaches, we described the local stability for multi-stable T cell model. However, the global stability is less clear from the bifurcation Network motifs regulating multistep cell fate transitions analysis alone. In addition, it is important to consider the stochastic dynamics for T cell development model because the intracellular noise may play crucial roles in cellular behaviors [48,49]. The Waddington landscape has been proposed as a metaphor to explain the development and differentiation of cells [50]. Recently, the Waddington epigenetic landscape for the biological networks has been quantified and employed to investigate the stochastic dynamics of stem cell development and cancer [51][52][53][54][55][56][57].
Following a self-consistent approximation approach (see Methods), we calculated the steady state probability distribution and then obtained the energy landscape for the model of the early T cell development (full model in Fig 4A). For visualization, we selected two TFs (PU.1 and TCF-1) as the coordinates and projected the 4-dimensional landscape into a twodimensional space by integrating the other 2 TF variables (Fig 9A). Here TCF-1 is a representative T cell lineage TF, and PU.1 is a TF for alternative cell fates. We also displayed the landscape in a four-dimensional figure, where the three axes represent three TFs (TCF-1, BCL11B and PU.1), and the color represents the energy U (Fig 9B). Note that our major conclusions do not depend on the specific choice of the coordinate (see S7 and S8 Figs for landscapes with PU.1/BCL11B and PU.1/GATA3 as the coordinates).
In the case without Notch signal (N = 0), four stable cell states emerge on the landscape for the T cell developmental system (Fig 9). On the landscape surface, the blue region represents lower potential or higher probability, and the yellow region represents higher potential or lower probability. The four basins of attraction on the landscape represent four different cell states characterized by different TF expression patterns in the 4-dimensional state space ( Fig  9A and 9B provide two types of projections of the whole 4-dimensional landscape). These states separately correspond to ETP/DN1 (high PU.1/low TCF-1/low BCL11B/low GATA3 expression), DN3 state (low PU.1/high TCF-1/high BCL11B/high GATA3 expression), and two intermediate states (DN2a and DN2b, intermediate expression for the four TFs). The existence of four stable attractors is consistent with experiments [16][17][18][19]. As the Notch signal (N) increases, the landscape changes from a quadristable (four stable states coexist), to a tristable (DN2a, DN2b and DN3), to a bistable (DN2b and DN3) and finally to a monostable DN3 state (S9 Fig). These results provide a straightforward explanation for the irreversibility observed in experiments for the stepwise T cell lineage commitment.
To check whether our modelling results match experimental data quantitatively, we acquired two sets of gene expression data of the four core TFs for T cell development from previous publications [17,47], and mapped the normalized values (see Methods) to the landscape (Fig 9A and 9B, S7 and S8 Figs). Here, the golden balls represent the four steady states (characterizing four stages of T cell development) from the models, the red balls represent the gene expression data points (Data1) from Zhang et al. [47], and the green balls represent the gene expression data points (Data2) from Mingueneau et al. [17]. We found that these gene expression data agree well with our landscape models in the sense that each data point is almost positioned in the corresponding basin (Fig 9A and 9B, S7 and S8 Figs). We found that the landscapes give a better fit to Data2 (green points), since each green data point can be well positioned in one of the four basins, corresponding to four stages of T cell development. In fact, the two sets of data points are not very close to each other or to the steady states (golden points) from the models. This is reasonable because these two sets of experimental data are measured separately, probably in different conditions, and these data usually delineate the average of multiple measurements from different samples. Also, the gene expression fluctuations are common in biological systems. Therefore, our landscape pictures provide a natural way to reconcile the two different experimental data, i.e. the gene expression data do not have to be at the positions of steady states. Instead, the gene expression data for each individual To examine the transitions among individual cell types, we calculated kinetic transition paths by minimizing the transition actions between attractors [58,59], obtaining minimum action paths (MAPs). The MAPs for different transitions are indicated on the landscape (Fig  9). The white MAPs from the ETP state to the DN3 state, correspond to the T cell developmental process while the magenta MAPs from the DN3 state to the ETP state, correspond to reprogramming process. The lines represent the MAPs, and the arrows denote the directions of the transitions. The MAP for T cell developmental process and the MAP for the backward process are irreversible, since the forward and reverse kinetic paths are not identical. This irreversibility of kinetic transition paths is caused by the non-gradient force, i.e. the curl flux [60,61]. Here, the solid white lines represent three stepwise transitions from ETP to DN2a, DNa2 to DN2b, and DN2b to DN3, whereas the dashed white line represents the direct transition path from ETP to DN3. From the MAPs for T cell development, we found that the direct transition path is very similar to the stepwise transition path (the white solid line is similar to the white dashed line, Fig 9, S7 and S8 Figs), which indicates that the T cell developmental process needs to go through the two intermediate states (DN2a and DN2b). This confirms the critical roles of the intermediate states for the T cell differentiation. It is worth noting that the MAPs here quantify the most probable transition paths, which suggest the optimal path (with least transition action) for cells to switch from one state to another. However, in a realistic gene regulatory system, usually a signal is needed to induce rapid cell state transitions (e.g. the Notch signaling is used here to induce T cell development).
To investigate the dynamical developmental process of T cell for multiple TFs, we visualized the 4-dimensional MAP from the ETP to the DN3 state by discretizing the levels of the four TFs. We found that for T cell development, TCF-1 is upregulated first, followed by the activation of GATA3. This leads to the complete inactivation of the alternative fate TF PU.1 and the activation of BCL11B (Fig 10). Interestingly, this temporal order is in good agreement with experimental observations [62] (Also see the gene expression data of four core TFs for four stages in Fig 9, S7 Fig and S8 Fig). These results suggest that the sequence of switching on or off for different TFs can be critical for the lineage commitment of T cell development. Moreover, under the Bcl11b knockout condition (kB = 0), the landscape changes from a quadristable (four stable states coexist), to a bistable (ETP and DN2a) state (S10 Fig), which is consistent with the bifurcation analysis (Fig 7) and experimental observations [28].

Global sensitivity analysis based on landscape topography reveals the critical factors for T cell development
To identify the critical factors (regulations and TFs) which determine T cell development, we performed a global sensitivity analysis based on the landscape topography. Specifically, we use the transition action between attractors as a measure to quantify the feasibility of a transition between different attractors. A smaller transition action, corresponding to a smaller energy barrier, means a more feasible transition from one attractor to another. In this way, by changing the parameters each at a time we can identify the critical parameters for T cell development three TFs (TCF-1, BCL11B and PU. 1), respectively, and the fourth dimension (color) represents the energy U. The normalized gene express data (including TCF-1, BCL11B, GATA3, and PU. 1) of four stages for T cell development are mapped to the landscape, where the golden balls represent the four steady states (four stages of T cell development) from the models, the red balls (Data1) represent the data from Zhang et al. [47], and the green balls represent the data from Mingueneau et al. [17]. See S7 Fig and S8 Fig for the  (we use the transition from ETP to DN3 as an example). To do this, we constrict the models within the parameter region corresponding to the four-attractor system, so that we can make comparisons for the changes of transition actions as parameters are varied.
We identified some critical parameters of which the variations caused significant changes of transition actions between ETP and DN3 attractor. These parameters include the effective degradation rate of PU.1, (rdP), the regulated production rate of PU.1 (kP), the basal production rate of PU.1 (kP0), the threshold of the self-activation of PU.1 (KPP), and the threshold for the repression of PU.1 on GATA3 (KGP) (Fig 11). In particular, the increase of the selfactivation strength of PU.1 (i.e. decreased KPP) reduces the transition action from DN3 to DN2b (Fig 11B), indicating a less stable DN3 state and a more stable ETP state. This is reasonable because the PU.1 is a major TF for alternative cell fates (B-cell, dendritic-cell, and myeloid cell), and silencing of PU.1 is operationally important for T cell commitment [28]. Additionally, the increase of the repression strength of PU.1 on GATA3 (decreased KGP) raises the transition action from ETP to DN2a (Fig 11B), indicating a more stable ETP state and a less stable DN3 state, which is consistent with the observation that GATA3 is a critical TF promoting T cell development. Overall, these results from sensitivity analysis indicate that the PU. 1 synthesis/degradation related parameters, the GATA3 synthesis related parameters, and the regulations between PU.1 and GATA3 are critical to the dynamics and the cell fate decisions of T cell development. This indicates that the regulatory circuit between PU.1 and GATA3 plays critical roles for the cell fate determinations during T cell development.

Discussion
In this study, we identified two types of network motif families that are responsible for generating a four-attractor dynamical system commonly observed in stepwise cell differentiation. Some instances of these motifs were previously described and analyzed in the context of binary or ternary switches during lineage transitions [63][64][65][66][67], but the systematic analysis for these motifs was not performed to our knowledge. In addition, the design principle for multiple Network motifs regulating multistep cell fate transitions intermediate states was not clear. Our approach provides a comprehensive framework for analyzing systems with a complex dynamical property, a four-attractor system with stepwise transcriptional modulation, and we illustrate the intricate relationships among these motifs with an intuitive visualization method.
Previous studies on biological circuits governing irreversible transitions focused on the analysis of toggle switches which generate none-or-all type of responses [68,69]. Our work suggests that multistep or graded responses can be associated with irreversible transitions as well. Given the importance of graded response in various biological scenarios [70][71][72], we expect the design strategy that we found can be useful for discovery of natural-occurring irreversible graded responses or construction of synthetic biological circuits producing these responses. Our work also suggests that the response to signals, or the progression of lineage transition, may be proportional to the intensity and/or the duration of the signal. This is consistent with the previous observations that the duration of the morphogen signal can be critical for cell Fig 11. Global sensitivity analysis for T cell developmental model. Sensitivity analysis was performed for the 39 parameters in the T cell model. The transition actions between different states (S ETP->DN2a and S DN3->DN2b ) were calculated to quantify the sensitivity of parameters on the landscape. The Y-Axis represents the 39 parameters. The X-Axis represents the percentage of the transition action (S) changed relative to S without parameter changes. Here, S ETP-lineage choice [73,74]. Of note, when signal strength is converted to digital (none-or-all) response in early phases of signal transduction, its duration can play an essential role in determining the graded response [75].
In our systematic exploration in the network topology space, we took the assumption that network structure is correlated with its function, i.e. assuming the existence of functional motif structure in transcription regulatory networks. The notion of network motifs is very helpful for understanding many complex biological systems [76,77], but the richness of dynamic behaviors of these motifs is beyond their structures-distinct kinetic rates in the same motif can produce diverse responses [78]. Therefore, it is expected that the motifs that we discovered may be able to generate dynamical behaviors different from the four-attractor system (we will discuss some of them in the following paragraphs). We also expect that some of network motifs can be responsible for multiple functions by themselves, and this multifunctionality may explain the diverse motifs that we found for the four-attractor systems in the biological examples. Future work is warranted to examine the distributions of the diverse functions in the parameter space of the motifs that we found. Nonetheless, it is important to understand the capacity of the network motifs in terms of their functional outputs. Our work provides a holistic view of the potential network motif structures governing multistep cell lineage transitions.
Although network motifs with three positive feedback loops closing at a single factor (Type I motifs discussed in this study) were not systematically analyzed in previously studies to our knowledge, some simpler versions of Type I motif, e.g. a pair of interconnected positive feedback loops, have been described in various systems such as the epithelial-mesenchymal transition and the cancer progression [65,79]. These systems typically govern ternary switches with a single intermediate state. These studies and ours suggest a correlation between the number of positive feedback loops and the number of the intermediate states the system may be able to generate. In fact, early studies on multistable systems have shown the requirement of positive feedback loops for generating multiple steady states [80], which was proved mathematically [81]. Intriguingly, an ultrahigh feedback system similar to the Type I motifs was shown to govern irreversible transitions with low differentiation rates for adipocytes [82]. It would be interesting to examine whether controlling the low differentiation rate through cell-to-cell variability and controlling the number of intermediate states suggested by our model can be achieved in the same system. Our findings are consistent with the earlier work in that they highlight the importance of this type of signaling motifs in controlling cell differentiation by preventing the direct and homogeneous transition from the initial state to the final one.
Near symmetrical parameters in models based on a particular instance of the Type II motif class (the one with mutually inhibiting TFs) have been widely used to explain stochastic lineage choice observed in embryonic stem cells, developing hematopoietic cells and CD4 + T cells [83,84]. Our findings with Type II motifs complement these studies with newly identified functions of these motifs for cell differentiation. Instead of the stochasticity that breaks the symmetry of this motif, the Notch signal may be responsible for switching the system from one side (PU.1 high) to another (PU.1 low) in a stepwise fashion, and the intermediate states mark the stable stages where the system is relatively balanced in terms of two groups of competing TFs.
It was previously suggested that the network consisting of four core transcription factors governs a bistable switch with irreversible transition [41]. Our models based on this network provide explanations for additional experimental observations with respect to the multistep feature of the early T cell development (see S1 Text for a comparison between our model and the bistable model). Although it is possible that interconnection of multiple positive feedback loops simply enhances the robustness of the bistable switches, the observation that several important irreversible transitions in cell cycle progression are primarily controlled by two positive feedback loops implies that the enrichment of the positive feedback loops in the T cell transcriptional network is unlikely due to the intrinsic biophysical limits of positive feedback loops in generating bistable switches [46,69]. Instead, other cellular functions, such as generating the multiple intermediate states, might be the performance objectives for the design of this network. Future work with more systematic analysis is warranted to compare the parameter regions corresponding to the bistable, tristable and quadristable systems with ordered and unordered attractors.
Our model of early T cell development suggests that the differentiation program may be stopped at multiple locations in the state space of transcription levels of key factors. These multiple attractors may correspond to the lineage branching points at which the progenitor cells are given opportunities to be converted to T cell as well as other types of lymphocytes. As such, it is possible that this dynamical property is exploited to achieve a better control for the fate determination of the lymphoid progenitor cells at systems level. Given that subpopulations of NK cells and DCs are generated by the thymus [85][86][87], the multistep lineage transition provides a basis for channeling the lymphoid progenitor to multiple lineages in a precise manner.
Based on the recent landscape-path theory and the T cell gene regulatory network model, we investigated the stochastic dynamics of T cell development. We identified four stable cell states characterized by attractors on the landscape including ETP/DN1, DN3, and two intermediate states (DN2a and DN2b). These attractors (cell states) match two published datasets of gene expression values in a reasonable way [17,47]. We also calculated the kinetic transition paths between different cell states from minimum action path approaches. Importantly, from the MAPs of T cell development, we found that different TFs are switched on or off in different orders. For example, TCF-1 needs to be first activated, and then GATA3 is activated, leading to the inactivation of PU.1 and activation of BCL11B. These predictions agree well with experiments [28,62], which provides further validations for our mathematical model.
In our models, we only considered four core factors based on previous published T cell gene regulatory network for simplicity [41]. In the realistic biological system, there are more factors critical to T cell development [28]. It would be interesting to incorporate other important factors into the network and construct a more realistic model for T cell development. By studying the landscape of more comprehensive T cell development network, we will better understand the underlying regulatory machinery and obtain more insights into the intricate mechanisms for T cell development.
In summary, we identified a large family of network motifs that can generate four attractors that are observed in various biological systems involving cell lineage transition. We built a mathematical model for transcriptional network controlling early T cell development, and we found that the network underlying this developmental process is enriched with the motifs that we identified. The system with the four attractors has a remarkable irreversibility for transitions to multiple intermediate states when the differentiation signal is varied. We suggest that this multistep process may be useful for precise control of the differentiation of lymphoid progenitor cells towards T cell and other cell types. Our T cell model provides new insights into the complex developmental or regeneration processes, and our combined approaches of comprehensive analysis of network motifs for generating multistable systems and landscape-path framework provide a powerful tool for studying a wide range of networks controlling cell lineage transitions.

Framework of mathematical modeling
We used ordinary differential equations (ODEs) to describe the dynamics of the concentrations of transcription factors (TFs). We used Hill function to describe the transcriptional regulation by TFs. Each ODE has the following form: Here, X i represents the concentration of a transcription factor (TF). k 0;X i is the basal production rate of the TF in the absence of any regulator. k X i is the maximum production rate under the control of the transcriptional activators and inhibitors of this TF. β i,j denotes the weight of the influence of the TF j on i. The sum of the Hill functions determines the regulation of the production of this TF by other TFs. In each term of the summation, θ = 1 when the regulating TF (X j ) is an activator. θ = 0 when the regulating TF is an inhibitor. K i,j is the apparent dissociation constant of the regulating TF binding to its regulatory element of the promoter, and it describes the effectiveness of the regulation in terms of the concentration of the TF. n is the total number of regulating TFs. r d;X i is the effective degradation rate constant. The production rate of the proteins is assumed to be linearly correlated with mRNA production rate. Similar generalized forms of Hill function were previously used for analysis of a variety of gene regulatory networks [52,88]. One time unit of our model corresponds to 6 hours, and all the parameters are dimensionless.
To exclude the possibility that our conclusions are sensitive to the choice of the form of equations, we used an alternative form of ODE to describe the regulatory networks: In these ODEs, multiplication of Hill functions was used instead of addition. Similar forms of Hill function were also used for modeling a variety of gene regulatory networks [66,89]. With this form, the two types of network motifs that generated the four-attractor behavior are the same as those discovered with the additive form of Hill functions (S3 Fig). In fact, using both forms of equations gave rise to the same number of network topologies (216 topologies with the steady states shown in both S3 Fig and S4 Fig). Therefore, our conclusions are robust in terms of the choice of equation form.
During topology searching, random parameters values were chosen from defined ranges (S1 Table, see below).

Topology searching for four-attractor systems
Network topology searching was first performed for all possible topologies involving up to 3 nodes (TFs) and 6 regulations that are able to generate four-attractor systems with stepwise changes of TF levels. We sampled n of the 9 arrows in Fig 2A, where 1�n�6. The procedure for the exhaustive search is the following: 1) Choose n arrows from the 9 arrows in Fig 2A (9  choose n), which gives rise to 465 topology templates. 2) For each topology template containing n regulations, enumerate all 2 n networks (the number 2 represents positive and negative regulations), which gives 12258 networks in total. 3) Remove redundant networks that are different in terms of labels of nodes but are otherwise identical in terms of topology (isometric topologies), which produces 2114 non-redundant topologies. Three-node networks were previously used to explore several types of functional dynamics of network motifs [38,90]. For each topology, we performed random sampling of 10 6 parameter sets. For each parameter set, we selected 125 initial conditions in the three-dimensional state space ((0, 3.3) for each variable) using Latin Hypercube sampling, and then solved the ODEs numerically. We stopped the simulations at time point 500 and checked if the 125 ODE systems are stabilized at four or more distinct steady states. We next checked if the changes of the TFs are monotonically coupled. We first ordered the steady states by the levels of one TF, and then we looked for scenarios in which all other TFs monotonically increase or decrease with the ordered TF (i.e. the attractors with stepwise changes of the TFs). We excluded the scenarios in which one TF is not monotonically correlated with others in terms of their levels at the four attractors. Models that generated oscillations at the final time point were also excluded. The parameter sets which produced the stepwise changes of steady state were accepted and their associated network topologies were analyzed. Parameter values for the minimum topologies are listed in S2 Table.

Topology classification
To determine whether a network topology contains Type I and/or Type II motifs, we developed a simple algorithm for motif identification. We first enumerated all positive feedback loops (PFLs) in the network. A PFL is defined as any unbranched closed loop structure that has even number of negative regulations. We then followed the scheme described in Table 2 to check if Type I and/or Type II motifs exist in a specific network: we enumerated all combinations of three PFLs in the network. If any three PFLs share one or more TFs, then the network contains a Type I motif. If two of any three PFLs do not share any TF, but they (i.e. at least one TF from PFL 1 and at least one TF from PFL 2) are connected via the third PFL, then the network contains a Type II motif.
Complexity atlas was plotted for the obtained network topologies as described previously by Jiménez et al [40] (Fig 2B and Fig 4C).

Transcriptional network model for early T cell development
We built a model for early T cell development based on the regulations that were previously shown experimentally [91][92][93][94][95][96][97][98][99][100][101][102][103][104]. Information about experimental evidence is described in S3 Table. The form of equations is similar to Eq (2). We chose this multiplicative form of Hill functions because earlier experimental study suggested that regulations of Bcl11b gene are combined via an 'and' logic gate [105], which favors the use of multiplication. Although similar detailed information is not available for other TFs, we have shown that our main conclusions with respect to the multistep transitions controlled by a network motif family do not depend on the choice of the form of equations (Fig 2 and S4 Fig). Full list of equations is included in S1 Text. The parameter values were obtained by random searching described above with the four-attractor property as the selection criterion followed by minor manual adjustments. All parameter values are dimensionless. To our knowledge, there is no published experimental measurement that would allow us to directly constrain the ranges of these values except for the degradation rates, which have a unit of the inverse of time. These degradation rates were estimated from a previous study that measured the half-lives (approximately 4 hours) of the transcription factors [41]. The parameter values are listed in S4 Table. To explore the subnetworks of the T cell development model that are essential for the four-stage transition, we performed similar exhaustive search in a set of 1553 non-redundant topologies (2047 subnetworks) to find functional circuit in the model. We obtained 568 topologies (701 topologies) from the search, and we analyzed them with complexity atlas. Isometric topologies were removed in the simulations, but they are included in the complexity atlas so that we do not mix isometric topologies with possibly differential biological meanings specific to certain genes.
In the bifurcation analysis, the value of the parameter N (Notch signal strength) or k BCL11B (maximum production rate of BCL11B) is varied and the changes of the steady states of the system were analyzed. We let k BCL11B = 0 to simulate the Bcl11b knockout condition. To simulate the system under various scenarios of Notch signaling, we first varied the strength and/or duration of the Notch signal and checked the steady state distribution of the system under the varying strengths and durations. We tested 200X200 combinations of strengths and durations of Notch signals and obtained the phenotypes of the cells at the steady state. To simulate the fluctuating Notch signals, we divided the time window of the simulation into small intervals (0.1 unites of time). For each interval, we used a random number with a specified mean and an additive noise. The mean of the Notch signal first increased overtime and then became attenuated.

Enrichment analysis of the four-attractor motifs in the T cell model
To quantify the enrichment of various types of motifs, we used the generic definition of pvalue: the p-value for a particular motif is the probability of obtaining at least n number of motifs from a random network population, where n is the observed number of such motif in the T cell network. To compute the p-values, we first counted the frequencies of the positive feedback loop, Type I motif and Type II motif in the T cell model (i.e. n 1 , n 2 , n 3 , n 4 representing the numbers of positive feedback loops, Type I motifs, Type II motifs, and the sum of the Type I and Type II motifs respectively). Random networks were generated using two methods: 1) for each regulation in the existing T cell model, we randomly reassign its source and target TFs (referred to as 'permuted regulations'), and 2) for each pair of TFs from the network, we randomly assign a regulation (positive, negative or none) (referred to as 'permuted regulations'). For each of the two methods, we generated 10 5 networks, and we calculated the empirical p-values by counting the number of the random networks with the numbers of motifs not less than those of respective motifs in the T cell network. The method with permuted regulations is more biologically relevant because the number of the positive and negative regulations are retained in the random networks. We used the second approach as an alternative to exclude the possibility that the conclusion based on the trend of the p-values is due to the low number of networks containing the extreme amount of the motifs.

Optimization for performance comparison of two subnetworks
Due to the difficulty to compare the performances of regulatory circuits with different complexities in general, we selected two specific instances of Type I network motif for comparison. One of them contains only one Type I motif, whereas the other one contains multiple motifs. For each topology, we reduced the system to one ODE with quasi-steady state assumption and defined a continuous production rate function that can produce four attractors as a surrogate function (see S1 Text). Multiple runs of optimization using differential evolution algorithm was used, and 500 converged parameter sets for each circuit were used for comparison. This optimization method was previously used for finding optimum parameter sets and for comparing the performances of regulatory circuits [64,106,107].

Self-consistent mean field approximation for the quantification of energy landscape
The temporal evolution a dynamical system was determined by a probabilistic diffusion equation (Fokker-Planck equation). Given the system state P(X 1 ,X 2 ,. . .,X N ,t), where X 1 ,X 2 ,. . .,X N , represent the concentrations of molecules or gene expression levels, we have N-dimensional partial differential equation, which are difficult to solve because the system has a very large state space. Following a self-consistent mean field approach [52,61,108], we split the probability into the products of the individual probabilities: PðX; tÞ ¼ PðX 1 ; X 2 ; . . . ; X N; tÞ ¼ Q N i P i ðX i ; tÞ and solve the probability self-consistently. In this way, we effectively reduced the dimensionality of the system from M N to MN (M is the number of possible states that each gene can have), and thus made the computation of the high-dimensional probability distribution tractable.
Based on the diffusion equations, when the diffusion coefficient D is small, the moment equations can be approximated to [109,110]: Here, � xðtÞ; σ(t) and A(t) are vectors and tensors. σ(t) denotes the covariance matrix and A (t) is the jacobian matrix of Fð� xðtÞÞ. A T (t) is the transpose of A(t). The elements of matrix A are specified as: A ij ¼ @F i ðXðtÞÞ @x j ðtÞ . By solving these equations, we can acquire � xðtÞ and σ(t). Here, we consider only the diagonal elements of σ(t) from the mean field approximation. Then, the evolution of the probability distribution for each variable can be acquired from the Gaussian approximation: P x; t ð Þ ¼ 1 ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi 2psðtÞ The probability distribution acquired above corresponds to one stable steady state or the basin of attraction. If the system has multiple stable steady states, there should be several probability distributions localized at each basin with different variances. Thus, the total probability is the sum of all these probability distributions with different weights. From the self-consistent approximation, we can extend this formulation to the multi-dimensional case by assuming that the total probability is the product of each individual probability for each variable. Finally, with the total probability, we can construct the potential landscape by: U(x) = −lnP ss (x). Here, P ss is the steady state probability distribution, and U is dimensionless potential energy. In this work, we define two quantities based on the landscape theory. One is the energy barrier height, which is defined as the energy difference between the local minimum and the corresponding saddle point. Another quantity is the transition action, which is defined as the minimum action from one attractor to the other. These two quantities both measure the difficulty of the transitions. However, the transition actions are suggested to provide a more accurate description for the barrier crossing between attractors or the transition rate [111]. Therefore, we used the transition actions to quantify the difficulty of the transitions between attractors in this work (see the following section for the approach of calculating minimum action paths).

Minimum action paths from optimization
Following the approaches based on the Freidlin-Wentzell theory [58,112,113], for a dynamical system with multistability the most probable transition path from one attractor i at time 0 to attractor j at time T, � � ij ðtÞ, t2[0,T], can be acquired by minimizing the action functional over all possible paths: Here F(ϕ ij ) is the driving force. This optimal path is called minimized action path (MAP). We calculated MAPs numerically by applying minimum action methods used in [58,112].

Comparison between simulated attractors and experimental data
Gene expression data for the four core TFs were obtained from two previous studies on T cell development (Zhang et al. and Mingueneau et al. [17,47]). We rescaled these data to the range of the attractors obtained from our models by linearly transforming the expression values so that they match the attractors approximately. This rescaling is necessary because some corresponding expression values from these two datasets differ from each other by more than 10-fold. The corresponding ratios between these two datasets vary significantly as well.
Supporting information S1 Text. Model equations, model reductions and evaluation procedure through optimization.
(DOCX) S1  The blue regions represent higher probability or lower potential and the yellow regions represent lower probability or higher potential. Four basins of attractions characterize four different cell states (ETP, DN2a, DN2b, and DN3). White solid lines represent the MAP from ETP state to DN2a, DN2b, and DN3 states. Magenta solid lines represent the MAP from DN3 to DN2b, DN2a, and to ETP state. Dashed lines represent the direct MAP from ETP to DN3 and the direct MAP from DN3 to ETP states, respectively. The normalized gene express data (BCL11B and PU. 1) of four stages for T cell development are mapped on the landscape, where the golden balls represent the four steady states (four stages of T cell development) from the models, the red balls (Data1) represent the data from Zhang et al. [47], and the green balls represent the data from Mingueneau et al. [17]. (TIF) S8 Fig. Landscape and corresponding minimum action paths (MAPs) for the T cell developmental network in the GATA3-PU.1 state space. The blue regions represent higher probability or lower potential and the yellow regions represent lower probability or higher potential. Four basins of attractions characterize four different cell states (ETP, DN2a, DN2b, and DN3). White solid lines represent the MAP from ETP state to DN2a, DN2b, and DN3 states. Magenta solid lines represent the MAP from DN3 to DN2b, DN2a, and to ETP state. Dashed lines represent the direct MAP from ETP to DN3 and the direct MAP from DN3 to ETP states, respectively. The normalized gene express data (GATA3 and PU. 1) of four stages for T cell development are mapped on the landscape, where the golden balls represent the four steady states (four stages of T cell development) from the models, the red balls (Data1) represent the data from Zhang et al. [47], and the green balls represent the data from Mingueneau et al. [17]. (TIF)

S9 Fig. Landscape changes as Notch signal increases.
On the landscape, the blue regions represent higher probability or lower potential and the yellow regions represent lower probability or higher potential. As the Notch signal (N) increases (from 0.1 to 0.2, 0.3, and 0.4), the landscape changes from a quadristable (four stable states coexist), to tristable (DN2a, DN2b and DN3 coexist), to bistable (DN2b and DN3 coexist) and finally to a monostable DN3 state. The landscape change shows how the Notch signal promotes the transition from ETP to DN3, quantitatively. (TIF) S10 Fig. Quasi-energy landscape for the Bcl11b knockout condition. With the Bcl11b knockout (kB = 0), the landscape changes from a quadristable (four stable states coexist), to a bistable (ETP and DN2a) state. On the landscape, the blue regions represent higher probability or lower potential and the yellow regions represent lower probability or higher potential. Two basins of attractions characterize two different cell states (ETP and DN2a). The landscape change shows that the Bcl11b knockout will make ETP/DN2a states more stable and DN2b/ DN3 less stable. (TIF) S11 Fig. Venn diagram of four types of network motifs that can produce four attractors with up to three TFs. Red and blue areas correspond to Type I and Type II motifs shown in Fig 2B. Green area corresponds to motifs that contain both Type I and Type II networks. Orange area corresponds to motifs that can only produce four unordered attractors, in which the concentrations of the TFs are non-monotonically correlated. Numbers in the diagrams denote the total numbers of non-redundant topologies for each type. The Type II (blue) and Hybrid (green) motifs can produce both ordered and unordered 4-attractor systems, depending on the choice of parameters. (TIF)