A plausible accelerating function of intermediate states in cancer metastasis

Epithelial-to-mesenchymal transition (EMT) is a fundamental cellular process and plays an essential role in development, tissue regeneration, and cancer metastasis. Interestingly, EMT is not a binary process but instead proceeds with multiple partial intermediate states. However, the functions of these intermediate states are not fully understood. Here, we focus on a general question about how the number of partial EMT states affects cell transformation. First, by fitting a hidden Markov model of EMT with experimental data, we propose a statistical mechanism for EMT in which many unobservable microstates may exist within one of the observable macrostates. Furthermore, we find that increasing the number of intermediate states can accelerate the EMT process and that adding parallel paths or transition layers may accelerate the process even further. Last, a stabilized intermediate state traps cells in one partial EMT state. This work advances our understanding of the dynamics and functions of EMT plasticity during cancer metastasis.


Author summary
Epithelial-mesenchymal transition (EMT) is a basic biological process, in which epithelial cells undergo multiple biochemical changes, lose cell-cell junctions and polarization, and become a mesenchymal phenotype with migratory and invasive properties. Recent studies have illustrated the existence and importance of the partial EMT states. It has become increasingly apparent that the EMT has strong differentiation plasticity. This plasticity is heavily implicated in cancer cell invasion and metastasis. However, it is still unclear how the number of intermediate states changes the EMT process. Here, we use a hidden Markov model to describe the EMT process. By fitting with the experimental data, we find that unobservable microstates exist within the observable macrostates: epithelial, partial EMT, and mesenchymal. Additionally, we find that increasing the number of states between the start and end of EMT or including alternative transition avenues via parallel paths or transition layers can accelerate the EMT process. This study suggests a non-trivial function of the EMT plasticity during cancer metastasis. Introduction Epithelial-to-mesenchymal transition (EMT) is a fundamental cellular process in which polarized epithelial cells lose various cell-cell junctions and adhesion and gain migratory and invasive properties to become mesenchymal cells [1,2]. EMT is very important in embryonic development, tumorigenesis, metastasis, tumor stemness, and therapy resistance [3,4]. Remarkably, EMT is not a binary process but instead proceeds with multiple partial intermediate states, collectively known as partial or hybrid EMT states [3,[5][6][7][8][9][10][11]. The partial EMT state retains some characteristics of epithelium but also shows features of mesenchymal cells [12][13][14]. One partial EMT state was predicted through mathematical modeling of the EMT core regulatory network and was verified with quantitative experiments by our previous works [5,6]. Thereafter, many different partial EMT states were proposed [8,9,[15][16][17]. More and more experimental data shows a different number of partial EMT states in various cancer cell lines [18][19][20][21][22][23]. Recently, several partial EMT phenotypes were found during cancer metastasis in vivo in a skin cancer mouse model [24,25] and prostate cancer [26]. While many partial EMT states have been found, their functions are still not fully understood during cancer metastasis [4,[27][28][29].
Currently, the function of partial EMT states has being studied in the context of coupling with other cellular processes. For example, acquisition of stem-like properties dictates its coupling with cancer stemness [11,[30][31][32][33][34], circulating tumor cells (CTCs) [35,36], and drug resistance [37]. Thus, the partial EMT cells hold the highest metastatic potential. Instead of full EMT, partial EMT is found to be critical for renal fibrosis [38][39][40]. There are many potential couplings of partial EMT and other biological processes, such as cell cycle [40], renal fibrosis [41] and metabolisms [42]. The cross-talk among the regulators of EMT and other cellular processes guides the coupling mechanism and full functional characteristics of partial EMT states.
While it is important to investigate specific functions of each partial EMT state in the context of additional cellular processes, an interesting question about the general function is whether the number of partial EMT states affect the transition itself. Given that different cell lines may have different partial EMT states, epithelial cells in different systems may also undergo different steps to become mesenchymal cells. Here, we use a general hidden Markov model to describe cell fate transitions during the EMT process. Our analysis makes several non-intuitive predictions. First, several microstates exist within one macrostate, which makes EMT a non-Markov process. Second, increasing the number of intermediate states can accelerate the EMT process. Third, transition in parallel and layer modes can further accelerate the process. Lastly, the existence of a stabilized intermediate state traps cells within their current phenotype. We conclude with discussions on how the dynamics and functions of EMT plasticity are controlled by the number of states during cancer metastasis.

Transition rate and energy barrier
In this work, we assume that the total energy barrier one epithelial cell has to cross to transition to the mesenchymal state is ΔE. If there is no intermediate state, then following the Arrhenius equation, the transition rate of EMT is k ¼ k 0 e À DE=k BT , where the value of k 0 is a pre-exponential factor, which determines the time scale of the system. For simplicity, we define the energy unit as k BT , thus we can simplify the equation to k = k 0 e −ΔE . Suppose there are N int intermediate states between the epithelial and mesenchymal states and that the energy is divided into N int + 1 steps; the transition rate for step i is k i ¼ k 0 e À dE i where δE i is the energy barrier for this step and ∑δE i = ΔE. We considered two scenarios: 1. The energy is divided evenly into all the steps, and thus the transition rate for each step is same:

Estimation of k 0 and the energy barrier ΔE
The total energy barrier from epithelial to mesenchymal, ΔE and k 0 , were estimated by the following equations: where N fit = 9 is the number of intermediate states and the transition rate k fit = 3.4261 is from the best fitting of the experimental data. For each ΔE, we can calculate k 0 and the transition rates for all cases.

Parameter fitting
Based on one intermediate state and an irreversible EMT process. First, we considered one intermediate state in the model to fit the parameters. We assumed EMT to be an irreversible process under high dose of inducer TGF-β.p ðtÞ ¼ ½p E ðtÞ; p P ðtÞ; p M ðtÞ� is the vector that represents the probability of a cell belonging in each state during the process of EMT. Thus, the equation governing the dynamics ofp is where k 1 is the transition rate from the epithelial state to the pEMT state, and k 2 is the transition rate from the pEMT state to the mesenchymal state. The equations are solved and the solution is: Based on a flexible number of intermediate states and an irreversible EMT process. The above model does not accurately describe the experimental data. Thus, we extended the model by considering N int + 1 microstates; this is based on an assumption that each macrostate, including the epithelial state and pEMT state, consist of multiple microstates. Suppose that EMT is an irreversible process where the cells transition independently from one microstate to the next at the same rate k. Based on the assumption that the ΔE is evenly divided into these steps, the following equations can be used to represent the dynamics of the probability of the microstates: Based on a flexible number of intermediate states and a reversible EMT process. EMT has some probability to be reversible even under high dose of inducer TGF-β, which may result from the noise in the cell or other stochasticity. We also considered a case where the EMT process is reversible with the cells transitioning independently between one microstate and the next at the forward rate k 1 and backward rate k −1 . The following equations can be used to represent the dynamics of the probability of the microstates: For the one-path scenario. When there is a total of N int intermediate states, the probability of a cell being in the n-th intermediate state is given by: The corresponding mean first arrival time (MFAT) to the mesenchymal state is which indicates that the MFAT non-linearly depends on N int .
For the parallel path scenario. The probability of a cell in the n i -th intermediate state of the i-th path is given by: The FAT distribution is given by: where N pth is the number of paths and N i is the number of intermediate states in the i-th path. The MFAT to the mesenchymal state is where k A = ∑k i and k i is transition rate in the i-th path.
For the layered scenario. The probability of a cell in the i-th layer is given by: and the FAT distribution is given by: and the MFAT to the mesenchymal state is given by: where N L1 is the number of states in the first layer, N ly is the number of layers, and k is the transition rate from one intermediate state to the next (the same for all steps in the layered case).

The cases with one stabilized state
For any case that has one stabilized state at S, such that 1 < S < N int + 1, the probability of a cell in each state goes as follows: Here, k 1 denotes the transition rate from the non-stabilized state and k 2 the transition from the stabilized intermediate state.
In order for the state to be stabilized as described, it must be true that k 2 < k 1 . The MFAT to the stabilized state. The MFAT to the stabilized state is Thus, the acceleration effect of increasing the number of intermediate states or adding paths or layers also applies to the stabilized state.
Mean dwelling time. The mean dwelling time distribution is used to quantify the time cells spend within the stabilized intermediate state for various cases. The ODE of the probability of cells in the stabilized intermediate state is where k 1 P S−1 represents the transition into the stabilized state and k 2 P S represents the transition out. The mean dwelling time (D S (t)) is defined as the average time that cells spend within this intermediate state, which only depends on k 2 . D S (t) is as follows:

Stochastic model
To simulate the stochasticity of the cell state transitions during EMT at the single-cell level, we also developed a stochastic model using the Gillespie algorithm.
1. Define the system according to the cell state transitions during EMT, using the following matrices: S is the cell state the transition comes from, P is the cell state that the transition is to, and vector K denotes the transition rates. For example, for the system with only one path and N int intermediate states, S and P have N int + 2 columns and N int + 1 rows while K has one column and N int + 1 rows, with the pattern: 6. Determine the step in which the cell state transition occurs, r, which satisfies: aðiÞ < rn 2 � a 0 < P i¼r i¼1 aðiÞ. 7. Update the system: x 0 = x 0 − S(r, :) + P(r, :) and t = t + dt.
8. Repeat step 3-7 until t is more than the maximum time, T max .

EMT is a non-Markov process
Consistent with theoretical prediction [5], three steady states were found in TGF-β induced EMT in the MCF10A cell line with quantitative measurements of E-cadherin and Vimentin at the single cell level [6]. That is, EMT progresses through three functional cell states in this cell line: from the initial epithelial state to the partial EMT state, where cells lose some of the cellcell adhesion, and then to the full mesenchymal state ( Fig 1A). As shown in Fig 1B (left), this system can be represented with a metaphorical three-well landscape in one dimension along the mesenchymal marker (M-Marker) axis. Each well represents a stable or metastable cell phenotype during the EMT process and the transitions among them are indicated with arrows. With high concentration of TGF-β, the cell will be driven from the epithelial state (the first well) to the mesenchymal state (last well) through the partial EMT state (middle well), indicated by solid arrows. It is noted that the energy barrier for the forward transition is much smaller than the energy barrier for the reverse transition of each step at high concentration of TGF-β. Thus, the forward transition rate (Fig 1B, solid arrows) is much larger than the reverse transition rate (Fig 1B, dash arrows). The dynamics of the M-Marker (measured by the Vimentin protein) at the single cell level shows two steps with cells staying in the intermediate state temporally (Fig 1B, right) [5]. The underlying mechanism is that during EMT the phenotype changes of one cell are controlled by complex genetic regulatory networks and stepwise activation of multiple feedback loops, as demonstrated by Ref. [5,6].
Due to the cell-cell variability, the cells do not synchronize the transition through these states but make the transitions in a stochastic manner. However, the fraction of cells in each state shows deterministic dynamics, as seen in the quantified experimental data (Fig 1C) from Ref. [6]. To better understand EMT at the population level, we first build a simple model by assuming that cells are initially held in the epithelial state and will be driven by a high dose of TGF-β, as used in the experiments, to the partial EMT state before ending in the mesenchymal state, all with constant average rates (k 1 and k 2 ) (Fig 1C, top). We fit this simple model with the experimental data and find that the model does not describe the data with strong accuracy (Fig  1C, bottom). An inaccurate fitting was also found using a two-state model without any partial EMT states (S1 Fig, panel A). With three-state or two-state models, the transition from the partial EMT state to the mesenchymal state shows a quicker dynamic than the observed data. Thus, three-state or two-state models are not a good representation of the EMT process at the population level, and the cells still pass through other intermediate states temporarily before reaching the final mesenchymal state, which is the only attractor at high concentration of TGF-β.
To find out the underlying mechanism for the observed dynamics, we extend the simple model by adding several microstates for each cell macrostate (Fig 1D) by assuming that some unmeasured variables can further distinguish the epithelial state and partial EMT state into several microstates, with the epithelial and partial EMT states being the macrostates used in describing the overall EMT process. While we do not know the actual number of the epithelial and partial EMT microstates (N E and N P , respectively), a fitting of the extended model is done with all the combinations of N E and N P (Fig 1E) by assuming an equal transition rate from one state to the following state (k). According to the fitting score (root mean squared error), the best fit is located when N E = 5 and N P = 5 (white square in Fig 1E). The best fitting displayed in Fig 1F shows the optimal conditions in which our model reproduces the EMT dynamics at the population level with better accuracy than that of the assumption of only one microstate in both the epithelial and partial EMT states. It is noted that the three macrostates used in our description are defined by E-Cadherin and Vimentin singe-cell data in Ref. [6], not including other makers. In general, the macrostates are defined by the corresponding steady states observed in the experiment with the defined markers. This implies that the number of macrostates observed may differ depending on the markers used. Four macrostates were defined based on additional makers such as Ovol2 [9]. Here, we limited the fitting of the Markov model to the temporal dynamics of the three macrostates as defined in Ref. [6]. Currently, there is no experimental evidence for the existence of nine intermediate states in this cell line, suggesting the existence of microstates during EMT. These results suggest that EMT is a non-Markov process with many hidden epithelial microstates and partial EMT microstates, which could further be distinguished by measuring other variables in the system. This conclusion is further confirmed with a stochastic model to demonstrate the evolution of cell transitions in these microstates (see Materials and methods for more details). Fig 1G  shows 100 typical simulated stochastic trajectories of cells to complete the EMT process. With all cells beginning in the epithelial state, prominent cell-to-cell variation is shown in the time it takes a cell to convert to different microstates, indicating unsynchronized transition at the single cell level. At the population level, the statistic results with 10,000 simulated cells show the dynamics of the distribution of cells in the microstates (Fig 1H). Few reach the mesenchymal state (red bar) by day two. However, most cells will reach the final EMT state by day four. This is also consistent with the experimental data [6] (S1 Fig, panel B). Furthermore, fitting the experimental data with a model by considering the reversible state transitions suggests more microstates (S2 Fig, panel A). Taken together, our analysis based on fitting a general Markov model with the experimental data suggests that EMT is a non-Markov process, where several microstates exist within the epithelial and partial EMT states.

Increasing the number of intermediate states can accelerate EMT
Our analysis with a Markov model for EMT at the population level based on the experimental data of MCF10A cells suggests that multiple microstates exist in each macrostate during the EMT process. However, the number of microstates and macrostates can be cell-type specific. Different numbers of observable steady states have been found during EMT and cancer metastasis [9,[15][16][17][18][19][20][21][22][23], which leads us to wonder how the number of micro-and macro-states affects the EMT dynamics and cancer metastasis. For this reason, we now consider all states between the initial epithelial and final mesenchymal as "intermediate" states in order to generalize our results and allow for application into cancer cell lines. To compare the systems with different numbers of intermediate states, we made one assumption: the total energy barrier for the full transition from epithelial (E) to mesenchymal (M) is the same despite the total number of states needed to complete EMT. As shown in Fig 2A (left), if the energy barrier from E to M is ΔE without any intermediate states (blue curve), adding one additional state makes the transition two steps and the barrier becomes ΔE/2 (red curve) for each step. Similarly, the barrier is ΔE/3 for each step if two intermediate states are considered (Fig 2A, right, red curve). This assumption comes from the hypothesized monotonical energy gradient based on epigenetic changes between different EMT states [43] and existence of 'checkpoints' in the EMT continuum [44]. This assumption is also consistent with the cascading bistable switches mechanism in which EMT proceeds through step-wise activation of multiple feedback loops [5,6,9,45]. The cells need to change the profiles of gene expression to make the full EMT transition. Under this assumption with one intermediate state, the cells can make changes on the expression of some genes as the first step for a partial transition, and then make changes on the rest of the genes for the second step to complete the transition. The profiles of the genes that control the eukaryotic cell phenotypes are usually sophisticated by positive feedback loops to avoid undesired random phenotype switching from noises or short pulse stimulus. These feedback loops can be mutual inhibitions between two groups, one of them controls cell phenotype I while the other group controls cell phenotype II. Thus, the cell fate transitions through EMT intermediate states are analogous to crossing a series of energy barriers controlled by these positive feedback loops.
First, we systematically study how the EMT process changes with fixed ΔE and the number of intermediate states, N int . The distributions of the first-arrival time (FAT), defined as the time that cells take to arrive to the mesenchymal state, is used to measure how fast the EMT process is completed. As shown in Fig 2B, with increase of N int , the FAT distribution first shifts to the left and then to the right, becoming increasingly more narrow. To see this affect more clearly, we further calculate the mean first arrival time (MFAT) and find a non-monotonic dependence of MFAT on N int ( Fig 2C). As N int increases, the MFAT first shows a rapid decrease. However, after a certain N int , the MFAT begins to slowly increase again. These data suggest that the partial intermediate states have a potential function of accelerating the EMT process. Furthermore, we analyze how ΔE affects the EMT process. As shown in Fig 2C, the MFAT displays the same non-monotonic pattern for three different ΔE values, although the position of the minimum varies. At these minimum points, the EMT process is completed the fastest. This interesting phenomena is further confirmed by the analysis of the MFAT in the space of ΔE and N int (Fig 2D). The same trend of MFAT can be observed for all ΔE > 2 (threshold shown by the dash line). This result suggests that the EMT process is not a continuous process but proceeds with a finite number of discrete microstates.
For a ΔE value lower than the threshold, MFAT constantly increases with N int (Fig 2D). The distribution of the first-arrival time for one case with small ΔE is shown in S3 Fig. One can see that the transition time is already very small in this case and that increasing intermediate states barely increases the transition rate for each step, thus causing more time to complete full EMT. This is generally not the case for the EMT process in a real system, as the minimal MFAT is less than one day, but it demonstrates the underlying mechanism of the non-monotonic dependence of MFAT on N int . That is, increasing N int after the minimum does not help to increase the transition rate for each step, it just simply increases the number of steps. Thus, there is a trade-off to accelerate EMT through increasing the number of intermediate states. The dependence of the minimum MFAT alongside the intermediate state in which it occurs (N min int ) is shown in Fig 2E. With increase of ΔE, both the minimal MFAT and N min int increase, although a small decline of minimal MFAT is observed for large ΔE. Taken together, a potential function of multiple partial EMT states is to accelerate EMT, and the number of intermediate states necessary to achieve fastest EMT increases with the total energy barrier.

Adding parallel paths or transition layers further accelerates the EMT process
We have discussed the scenario with only one path in which epithelial cells proceed through a step-wise transition of intermediate states to become mesenchymal cells. Recently, parallel EMT paths were reported [46,47]. Here, we consider multiple paths in parallel ( Fig 3A) and study how the number of paths (N pth ) affects EMT. For  (Fig 3A).
To demonstrate the effect of multiple paths, we first study the system with nine intermediate states (N int = 9, the best fit from Fig 1E). The FAT distribution to the mesenchymal state is shown in Fig 3B under various N pth and three levels of energy barrier, ΔE. For small ΔE (left panel), the FAT distribution shifts to the left as N pth increases, implying that more paths create a faster transition. However, for large ΔE (right panel), the FAT distribution reflects the opposite, implying that more paths create a slower transition. These also suggests a non-monotonic dependence of FAT on N pth for moderate ΔE. As shown in Fig 3B (middle panel), the FAT distribution peak shifts to the left but the distribution width becomes larger with increase of N pth . This suggests that the monotonic/non-monotonic relationship between the FAT and N pth is dependent on the energy barrier.
To further confirm this conclusion, we systematically study how the MFAT of the mesenchymal state changes in the space of ΔE and N pth (Fig 3C). The MFAT dependence on N pth indeed shows three trends according to the value of ΔE. For small (below the bottom white dashed line) or large (above the top white dashed line) values of ΔE, the dependence is monotonic, either decreasing or increasing respectively as exemplified in Fig 3D (blue and yellow  curves). On the other hand, for a bounded range of ΔE values between the two white dashed lines (Fig 3C), it is non-monotonic as exemplified in Fig 3D (red curve). The non-monotonic dependence of MFAT on N int can change depending on N pth (S4 Fig). For N pth = 1, MFAT shows a non-monotonic dependence on N int , the same as Fig 2C. However, the position of the minimum shifts to the right as N pth is increased to 2*3. The non-monotonicity is then lost with a continuous increase of N pth . The minimum of MFAT decreases monotonically with the increase of N int . In order to maintain this dynamics, the number of parallel paths must also increase with N int (Fig 3E). That is, increasing N pth makes the dependence of MFAT on N int more monotonically decreasing. These data suggest that increasing both N int and N pth together accelerates the EMT process.
We have now discussed the scenario in which there is only one path and the scenario of multiple paths in parallel and found that increasing the number of the intermediate states or parallel paths could accelerate the EMT process. Here, another scenario is considered by assuming that the epithelial cell has to pass a fixed number of intermediate states to become mesenchymal. In other words, the epithelial cell must pass N ly of layers to become mesenchymal (Fig 4A). This is similar to the parallel paths scenario but has one difference; it is possible that two paths converge at one intermediate state in the layered scenario. The difference between two topologies is exemplified in S5 Fig, panel A. One can see that a single extra transition step exists in the layered case compared with the parallel case. In the layered topology, the transition rates in all steps are the same, which depends on the number of layers N ly . The MFAT to the mesenchymal state as a function of ΔE is not equivalent for the two cases (S5 Fig,  panel B). Interestingly, the two curves cross at one specific ΔE. For ΔE values below this crosspoint, EMT is completed faster in the parallel topology. However, for higher ΔE values, the parallel topology completes EMT at a slower rate than the layered topology.
Similar to the parallel path scenario, the first-arrival time distribution depends on ΔE ( Fig  4B). If ΔE is small, the FAT distribution shifts to the right with increase of N ly (Fig 4B, left  panel), while it shifts to the left as ΔE becomes large (Fig 4B, right panel). The FAT shows a non-monotonic dependence on N ly with the moderate ΔE (Fig 4B, middle panel). Fig 4C  shows the MFAT in the space of N ly and ΔE for the system with nine intermediate states. In general, a higher energy barrier gives a slower mean time arrival regardless of the number of layers. However, the trend of the MFAT versus N ly depends on ΔE. MFAT increases with N ly for small ΔE (below the bottom white dashed line), but decreases for large ΔE (above the top

PLOS COMPUTATIONAL BIOLOGY
white dashed line) and shows a non-monotonic dependence for moderate ΔE (between the two white dashed lines), as exemplified by the three cases in Fig 4D. Similar to the parallel path scenario, the minimum of MFAT decreases monotonically with N int as long as more layers are provided for the system with large number of N int (Fig 4E). This is also confirmed with various values of ΔE (Fig 4E and S6 Fig).
Taken together, adding more parallel paths or transition layers can further accelerate the EMT process. In both scenarios, a non-monotonic dependence of the mean first arrival time on the number of paths or layers is found. The MFAT increases with the number of intermediate states after the minimal if the system only has one path, but can further decrease with multiple paths or transition layers. Thus, a combination of increasing the number of intermediate states and parallel paths (or transition layers) can always accelerate the EMT process.

Stabilized intermediate state traps a cell within its current phenotype
We have discussed the scenarios in which the same energy barrier is considered for each step during the EMT process. However, it is very possible that one of the intermediate states is stabilized by specific regulators for some cancer cells [48], making the transition from this step more difficult. While mesenchymal cells contribute to cancer metastasis in many cases, the partial EMT states were found to be more aggressive [30,49]. Here, we consider this scenario where one of the intermediate states is more stable than the others; we call this state the "stabilized" state, which is related to certain states within the more aggressive forms of cancer cells. As shown in Fig 5A, our metaphoric landscape now has one well that is deeper than the others. In this stabilized state, more energy is needed for the transition to the next state. That is, the energy barrier for the transition at this step (ΔE 2 ) is larger than the energy barrier for other steps (ΔE 1 ). Two cases are considered here: 1) ΔE 2 increases and ΔE 1 decreases in order to maintain the overall energy barrier ΔE (Fig 5A, "constant ΔE case"), and 2) ΔE 2 increases as ΔE 1 remains the same to change the overall energy barrier (Fig 5A, "varying ΔE case"). Fig 5B  represents the diagram of the EMT process with one stabilized state, which includes two transition rates, k 1 as the transition rate from the non-stabilized states, and k 2 as the transition from the stabilized state.
Similar to the mechanism for the accelerated transition to the mesenchymal state in Figs 2-4, increasing the number of intermediate states before the stabilized state and adding layers or paths should also accelerate the transition to the stabilized state, given that the formula of the MFAT to the stabilized state is similar to the one to the mesenchymal state (see Materials and methods). To further understand the impact of a stabilized state on EMT, the distribution of cells at the stabilized state and the mesenchymal state is analyzed with different ΔE 2 : ΔE 1 ratios. As the ratio increases, the dwelling time inside the stabilized state grows and it takes more time to arrive to the mesenchymal state for both the constant and varying ΔE cases (Fig 5C). That is, the greater the ratio, the more time the cancer cells can stay in the aggressive stabilized state for cancer metastasis. It also implies that cells stay longer in the macrostate where the stabilized intermediate state belongs (Fig 5D, top panel) and that the placement of the stabilized intermediate state within this macrostate does not affect the macrostate behavior (Fig 5D, bottom panel). The dependence of the mean dwelling time within the stabilized state and the MFAT is further analyzed. As shown in Fig 5E and 5F, with increase of either the ratio ΔE 2 : ΔE 1 or the overall energy barrier, the mean dwelling time within the stabilized state and the MFAT increase. The same dependence is found for the varying ΔE case (S7 Fig, panels A-B) but the scale of changes significantly increases. Another difference between two cases is the reverse dependence of MFAT on ΔE 2 : ΔE 1 as shown in S7 Fig panels C-D when ΔE is small. Taken together, a greater overall energy barrier or greater intermediate state energy barrier ratio will cause for a longer stay in the aggressive stabilized state as well as a slower transition into the mesenchymal state; this may contribute to cancer metastasis.

Discussion
Multiple partial EMT states have been proposed and verified to exist in the EMT process, and the number of states involved depends on the cell line. Here, we focused on the dynamics and functions of EMT plasticity and provided microstate and macrostate concepts for EMT. Based on the parameter fitting of a Markov model of EMT with the experimental data, we proposed a statistical mechanism for EMT in which many unobservable microstates exist within the observable macrostates. That is, EMT is a process with many microstates, which can be mapped into different observable macrostates. The microstates may encompass other dimensions of the cell that are coupled to the EMT process, such as the cell cycle, stemness, and metabolism. How the number of microstates is determined is not clear but is believed to largely depend on the cell line, which may be revealed with analysis of single-cell time-course data [50].
Here, we performed a systematical analysis on how the number of intermediate states changes cell transformation. We considered several scenarios including a single transition path, parallel transition paths, and layered transitions. We found that in the single transition path scenario, increasing the number of intermediate states can accelerate EMT, but too many intermediate states can also potentially decelerate EMT, thus showing non-monotonic behavior. However, adding parallel paths or transition layers can further accelerate the EMT process, especially for a system with a large number of intermediate states.
Our results suggest that the number of intermediate EMT states may function as an indicator for the malignancy of the cancer. That is, the more intermediate EMT states in one cancer type, the better the chance it will metastasize. A complementary feature of multiple intermediate phenotypes was reported to stabilize the cancer stem cell population [51]. Thus, the malignancy level can be quantified by the number of potential intermediate EMT states. It is estimated that metastasis causes 90% of deaths related to cancer [49]. If we can stop or slow down the transition, we have a great chance of treating cancer. Our results suggest one potential strategy targeting on the EMT spectrum, i.e., reducing the number of the potential partial EMT states.
Lastly, the existence of a stabilized state within the EMT process causes a slower transition. It is reported that cells in the partial EMT state contribute more to cancer metastasis than the mesenchymal state [52]. Further more, some partial EMT states can be stabilized by the cancer cells [53]. Our simulation shows that stabilizing one intermediate state will trap the cells there for a much longer time period, slowing down full EMT. Multiple intermediate states emerge due to various interlinked positive feedback loops formed in the regulatory network of EMT, which has been demonstrated in many previous works that combine mathematical modeling and quantitative experiments [5-11, 45, 54, 60]. It also has been shown that removal of counteracting determinants traps cells in the rare myeloid transition state [55]. Similarly, if we can tune the strength of the feedback loops that are responsible for the generation of these partial EMT states, the partial EMT states will destabilize and even disband. This will make the first step of metastasis mediated by EMT or partial EMT more difficult and thus give us more time to treat the cancer.
Here in this work, we did not consider the molecular mechanism of the intermediate states but focused on the how the number of the intermediate states affect the EMT process. Further work is needed to verify the prediction derived here and the detailed molecular mechanism needs to be determined for the potential therapeutic targets. Our results suggest that a small perturbation of the signaling pathway could change the EMT process significantly and thus EMT-related diseases, such as cancer metastasis and renal fibrosis. Our previous work on renal fibrosis suggests that knockout of the EMT genes might not be the optimal treatment design for acute kidney injury and long-term fibrosis [41]. Thus, future works can search control strategies to slow down or accelerate EMT with dynamic perturbation of the signaling pathway toward optimal treatment of cancer or fibrosis.
The accelerating function of intermediate states on EMT is analogous to the phenomena in protein folding [56]. It is becoming increasingly apparent that many other cellular processes have multiple intermediate states instead of just one binary or continuous process. For example, stem cell differentiation consists of many intermediate states [57][58][59]. Thus, it will be exciting to analyze one biological process in different perspectives, including statistical mechanics, mathematical modeling, and single-cell analysis.