Phenotypic equilibrium as probabilistic convergence in multi-phenotype cell population dynamics

We consider the cell population dynamics with n different phenotypes. Both the Markovian branching process model (stochastic model) and the ordinary differential equation (ODE) system model (deterministic model) are presented, and exploited to investigate the dynamics of the phenotypic proportions. We will prove that in both models, these proportions will tend to constants regardless of initial population states (“phenotypic equilibrium”) under weak conditions, which explains the experimental phenomenon in Gupta et al.’s paper. We also prove that Gupta et al.’s explanation is the ODE model under a special assumption. As an application, we will give sufficient and necessary conditions under which the proportion of one phenotype tends to 0 (die out) or 1 (dominate). We also extend our results to non-Markovian cases.


Introduction
With the same genetic background, cell population may have different cellular phenotypes. This has been one of the major topics in the research of cell population dynamics [1,2]. Very recently much attention has been paid to the stochastic conversions between different phenotypes [3,4]. For example, we know that cancer stem cells can give rise to cancer non-stem cells, but cancer non-stem cells can also transform back to cancer stem cells [5,6]. Generally, we can use a branching process (stochastic model) [7][8][9][10][11] or an ODE system (deterministic model) [12] to describe the dynamics of such cell population with multiple phenotypes. However, in many experimental settings, it is difficult or even impossible to count the total cell population [10,11,13]. Thus in the last fifty years, people began to consider the proportions of cell individuals with distinct phenotypes instead of the absolute numbers of cells of various phenotypes [7].
We know that through multistep accumulation gene mutations, healthy cells gradually transform to malignant cancer cells, which is the current view of cancer progression [14]. Among those mutations, most of them are neutral ("passenger mutations") and have no effect a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 on cell proliferation. Only a small portion of mutations will bring growth advantage ("driver mutations") [15]. Since the emerging of mutations can be regarded as purely stochastic, we can model such procedure with multitype branching processes (cf. Bozic et al.'s model [15,16]). There have been some results about the emerging time for certain number of driver mutations, and the relation between the number of emerged driver mutations and emerged passenger mutations [15,17].
In the experiments on breast cancer cell lines, Gupta et al. [4] found that the proportion of each phenotype will tend to a certain constant regardless of the initial population states ("phenotypic equilibrium"). They built a Markovian model, assuming that the evolution of the phenotypic proportions satisfies an n-state Markov chain, and used the ergodicity of the Markov chain to explain this phenomenon [4]. However, we find that the Markovian model is just the ODE system model under a special condition. We determine this condition and its biological meaning. Furthermore, we try to remove this condition and explain the experimental phenomenon in [4] under more general context.
In the deterministic model (ODE system), we only consider the average behavior of cell population dynamics (which requires a large initial population). However, using the stochastic model (branching processes), we can study the trajectory behavior. We prove that the proportions will converge not only on average, but also almost surely. This implies that even with a small initial population, we can still observe "phenotypic equilibrium".
In the theory of multitype branching processes, people have observed similar proportion convergence phenomenon and proved such phenomenon in several limit theorems under different conditions [18][19][20]. Those are possible ways to explain "phenotypic equilibrium", but those required conditions may not be satisfied in experiments. Thus we improve those limit theorems by dropping redundant conditions. We will see that the conditions we need are all biologically reasonable. Therefore, we give a stochastic explanation of "phenotypic equilibrium". This result may also be of interests to probabilists.
Generally we only consider Markovian branching processes, but sometimes the biological process is not memoryless, thus we need to consider non-Markovian branching processes. We show that under some conditions, the non-Markovian branching processes can be transformed into Markovian branching processes. Using this trick, we demonstrate similar results for non-Markovian branching processes.
In Section 2, we will define some notations, and give the mathematical description of our models, which is based on [8] and [20]. In Section 3, we will describe under which condition the deterministic model becomes the Markovian model in [4]. In Section 4, we will prove that under some mild conditions, the "phenotypic equilibrium" phenomenon will always happen in the Markovian branching process model. Specifically, we will improve a limit theorem about proportion convergence in multitype branching processes. We will also apply our results to Bozic et al.'s model. In Section 5, as an application of our conclusions, we will investigate under what conditions one of the phenotypes will die out or dominate. In Section 6, we will show that the above conclusions are still valid in more general cases.

Notations and model description 2.1 Notations
Boldface letter, like A, represents matrix. A 0 means the matrix transpose of A. I is the identity matrix. Letter with arrow above, likeũ, represents row vectors.1 and0 represent all ones and all zeros vectors. Consider the population of cells with n phenotypes: Y 1 , Y 2 , Á Á Á, Y n . In the stochastic model,XðtÞ ¼ ðX 1 ðtÞ; X 2 ðtÞ; . . . ; X n ðtÞÞ is the population at time t, where X i (t) is the population of phenotype Y i . P i ðtÞ ¼ X i ðtÞ= P n i¼1 X i ðtÞ is the proportion of phenotype Y i , as long as the denominator is not zero.PðtÞ ¼ ðP 1 ðtÞ; P 2 ðtÞ; . . . ; P n ðtÞÞ. In the deterministic model,xðtÞ ¼ ðx 1 ðtÞ; x 2 ðtÞ; . . . ; x n ðtÞÞ is the expected population at time t. We only consider the case where each x i (t) is nonnegative, and at least one of them is positive (to guarantee jxj > 0). jxj ¼ P n i¼1 x i ðtÞ is the total population.p ¼x=jxj is the proportions of different subpopulations among the total population.

Stochastic model
Assume that the population of cells have n phenotypes: Y 1 , Y 2 , Á Á Á, Y n . Assume that all the cells evolve independently. (During the exponential growth period, this assumption is almost true [21]) We can present the generalized cell divisions, death and phenotypic conversions as the following reaction form: It means that for an Y i cell, it will live an exponential time (we will consider non-exponential lifetime in Section 6) with expectation 1/α i and turn into d i1 Y 1 cells, d i2 Y 2 cells, Á Á Á, d in Y n cells, where d i1 , d i2 , Á Á Á, d in are random variables taking nonnegative integer values. d i1 , d i2 , Á Á Á, d in are not necessarily independent, but they are assumed to be independent with the exponential reaction time of any cell.
For example, an asymmetric division . So the probability distribution on the possible reactions gives the joint probability distribution of d i1 , d i2 , Á Á Á, d in .

The relation between the Markovian model and the deterministic model
In the Markovian model [4], it is assumed that the population proportionsp satisfies the Kolmogorov forward equations of an n-state Markov chain: where Q is the transition rate matrix, satisfying1Q 0 ¼0. In this section, we will discuss whether such assumption can be satisfied in the deterministic model. From Eq (1), we have dp dt Ifb ¼ k1 for some constant k, then Eq (3) becomes dp=dt ¼pðA À kIÞ, and 1 0 ðA À kIÞ 0 ¼b 0 À k1 0 ¼0 0 . Thus Eq (3) has the same form of Eq (2). Ifb 6 ¼ k1, there are non-zero quadratic terms of p i (t) in Eq (3), implying that Eq (3) does not have the same form of Eq (2).
Thus we have Theorem 1. Eq (4) is the sufficient and necessary condition for that the proportions of different phenotypes in the deterministic model (1) satisfy the Kolmogorov forward equations of an nstate Markov chain.
Now we know that the Markovian model is a special case of the deterministic model. In biology, Eq (4) means that the growth rates (average number of descendants produced per unit time) of different phenotypes are the same. This condition might be well satisfied for breast cancer cells, which explains why the data fitting in [4] is satisfactory.

Asymptotic behavior in general cases
In general cases, Eq (4) is not satisfied since different phenotypes may differ in cell cycling time [22,23], then the Markovian model is invalid. Thus we need other methods to study the asymptotic behavior of the population dynamics. In this section, we will prove that under some mild conditions, the proportions of different phenotypes will tend to some constants regardless of initial population states.
From Perron-Frobenius theorem [24,25], we know that A has a real eigenvalue λ 1 (called Perron eigenvalue), such that for any eigenvalue μ 6 ¼ λ 1 , Re μ < λ 1 . λ 1 has a left eigenvector u = (u 1 , u 2 , Á Á Á, u n ) (called Perron eigenvector), satisfying u i ! 0, 8i and P n i¼1 u i ¼ 1. When λ 1 is simple, suchũ is unique. We know that the set of all n-order real square matrices with repeated eigenvalue has measure 0 (as a subset of R n 2 ) [26]. Thus it is reasonable to assume that λ 1 is simple.

Deterministic model
We have proved the following theorem in Appendix B of [26].
Theorem 2. Assume that λ 1 is simple. Starting from any initial value except for the point in some zero-measure set, we have (x 1 (t), x 2 (t), Á Á Á, x n (t))/exp(λ 1 t) !cũ as t ! 1, where c > 0 is a constant. In this case, the solution of Eq (4) will tend toũ as t ! 1. Thus Eq (4) has one and only one stable fixed pointũ and no stable limit cycle.

Stochastic model
Since 1960s, probabilists proved that for a continuous-time multitype branching process, XðtÞ=e l 1 t ! Wũ under different conditions, where W is a nonnegative random variable. In [19,27] and [28], it is required that λ 1 > 0 and A is irreducible (this implies λ 1 is simple). In [19] it is proved that W = 0 or W > 0 according to whether the population will become extinct. In [18], it is required that the branching process is discrete in time. In [10,11] it is required that the initial population tends to infinity. Janson [20] requires that λ 1 > 0, λ 1 is simple, and assumes a special condition about communicating classes structure (see Remark 2). Based on [20] and [19], we will prove the convergence theorem without Janson's last assumption (Theorem 3). We can see the benefit of this improvement in Section 5.

Preliminaries.
In this section, we assume that λ 1 is simple and positive. λ 1 > 0 means that the total cell population is increasing.
Sometimes, the transformation from one phenotype to another phenotype is not reversible. For example, a mature human red blood cell (which loses its nucleus) cannot transform back to a zygote. Thus we need to classify phenotypes according to communicating behaviors. In mathematical language, we need to study communicating classes of A when A is reducible.
If a cell of phenotype Y i can produce (directly or indirectly) a cell of phenotype Y j and vice versa, then we say phenotype Y i communicates with phenotype Y j (Y i $ Y j ). Since "$" is an equivalent relation, we can divide the n phenotypes into several disjoint sets (called communicating classes) according to A [29]. Then we can order the classes and rearrange the phenotypes suitably to make A block-triangular. (Each diagonal block corresponds to a communicating class). Thus the eigenvalues of A consist of all eigenvalues of diagonal blocks. Every eigenvalue corresponds to a diagonal block, and then corresponds to a communicating class. (See [20] and [18] for details).
Denote the communicating class corresponding to the Perron eigenvalue λ 1 by T. , where each W represents a different nonnegative matrix (not 0). Assume that D 3 has the Perron eigenvalue λ 1 , then D 3 corresponds to the communicating class T. Denote the other three communicating classes by C 1 , C 2 , C 4 . For two communicating classes C i and C j , we write C i ) C j if there exist phenotype X k i 2 C i and X k j 2 C j such that a k j , ki > 0. For two communicating classes C and D, we write C ! D if there exist communicating classes C = C 1 , Then we can illustrate the communicating classes in the example above as Fig 1. For a communicating class C, defineĈ ¼ fY i jY i 2 C j ; C j ! Cg. In other words,Ĉ is the set of all phenotypes that can produce (directly or indirectly) phenotypes in C. In the example For a communicating class C, define " C ¼ fY i jY i 2 C j ; C ! C j g. In other words, " C is the set of all phenotypes that can be produced (directly or indirectly) by phenotypes in C. In the example above " For the Markovian branching processXðÁÞ, we say that a cell V with phenotype inT becomes "essentially extinct" if at some time no cell of any phenotypes inT is V or its descendants. In other words, V and its descendants become extinct insideT . We say that a trajectory of the branching processXðÁÞ becomes "essentially extinct" if at some time no cell of any phenotypes inT remains. This means that we can never get a cell with phenotypes in T any more. If so, we cannot have the desired convergence property (Theorem 3), since the proportions of phenotypes in T should be positive (Lemma 2). Let the branching processXðÁÞ start at any initial populationXð0Þ as long as it has some cells with phenotypes inT .

Results and proofs.
We now state the main result of this paper and then give the proof of it.
Lemma 2. Assume that λ 1 is simple. Then u i > 0 , Y i 2 " T . Proof. Apply the Perron-Frobenius theorem to A " T , the restriction of A on " T , and letw be its Perron eigenvector. w T , the restriction ofw on T cannot be0, otherwise λ 1 is an eigenvalue of A " T nT , a contradiction. From Lemma 1 we know thatw is positive. Set u i = w i if Y i 2 " T , and T , thenũ is the Perron eigenvector of A. Thus u i > 0 , Y i 2 " T . Lemma 3. (Lemma 9.8 in [20]) Assume that λ 1 is simple and positive. Then we have almost surely e À l 1 tX ðtÞ ! Wũ as t ! 1, where W is a nonnegative random variable, and PðW > 0Þ > 0. Lemma 4. (Lemma 9.7 (ii) and (iii) in [20], originated from Theorem V.7.2 in [19]) Assume that λ 1 is simple and positive, and " T contains all phenotypes, then W = 0 if and only if the branching process becomes essentially extinct almost surely.
Remark 2. Janson's paper [20], Section 2] has six fundamental assumptions (A1)-(A6). Assumptions (A1)-(A5) have been satisfied in this paper (regarding (A5) as "the process is not essentially extinct at time 0"). Assumption (A6) " " T contains all phenotypes" is only used in Lemma 4. In fact, we will prove (in Lemma 7) that Lemma 4 is still correct without Assumption (A6). Thus we can drop Assumption (A6) in the main result. Assumption (A6) implies all phenotypes should have the same exponential growth rate λ 1 , and no phenotype will die out or dominate (see Section 5), which is not necessarily satisfied in experiments. For example, in Bozic et al.'s paper [15], they consider tumor cells which gradually cumulate mutations which accelerate cell growth. Since tumor cells with more accelerating mutations cannot switch back to tumor cells with less such mutations, they must have different growth rates. Also, the proportion of tumor cells with less accelerating mutations will gradually decrease to 0, which contradicts with assumption (A6).
The following lemma is a modification of the second Borel-Cantelli lemma. We base our proof on Theorem 2.3.6 in [30].
Lemma 5.  Lemma 3, the set of such trajectories has positive probability), we can find an essentially non-extinct cell with phenotype in T within finite time. If we can find such cell at time t, then we can find such cell at any time τ > t.
Proof. If at some time t all cells with phenotypes inT nT die out, then at least one of the remaining cells with phenotypes in T is not essentially extinct.
Otherwise, at each time t = k (k 2 Z þ ), there exists one cell E k with phenotype inT nT. (For different k, E k may be the same cell). Let B k (k 2 Z þ ) be the event that during the time interval [k, k + 1), the cell E k produces (directly or indirectly) at least one cell with phenotype in T.
If B k happens, choose one such cell with phenotype in T and put it in a special set S. Consider any two cells F and G in S, and assume F is produced in the time interval [i, i + 1), G is produced in the time interval [j, j + 1), and i < j, where i; j 2 Z þ . Then E j is the ancestor of G.
Since E j has phenotype inT nT, and F has phenotype in T, F cannot be the ancestor of E j . Since E j is still alive at time t = j, when F has been produced, E j cannot be the ancestor of F. Thus F cannot be the ancestor of G. Since G is produced after F, G cannot be the ancestor of F. In sum, one cell in S cannot be the ancestor of another cell in S. Thus all cells in S are independent.
Consider two phenotypes Y i and Y j , and assume a cell with phenotype Y i can produce a cell with phenotype X j directly, namely Pðd ij > 0Þ > 0. Because of Markovian property, within a time span of 1/n, the probability for a cell with phenotype Y i to produce a cell with phenotype Y j directly is Z ij ¼ ½1 À exp ðÀ a i =nÞPðd ij > 0Þ > 0. Let Z ¼ min i;j fZ i;j : Pðd ij > 0Þ > 0g. For a cell with phenotype inT nT, it can produce a cell with phenotype in T within n steps. Thus the probability of B k is no less than η n , regardless of what happens before time t = k. Now we can use Lemma 5 with = η n , and there will be an infinite number of cells in S, except for a zero-measure set of trajectories. According to Lemma 3, the probability for one cell in S to become essentially extinct is less than 1, thus the probability for all cells in S to become essentially extinct is 0, and at least one cell in S is not essentially extinct, except for a zero-measure set of trajectories.
Lemma 7. Assume that λ 1 is simple and positive, then W = 0 if and only if the branching process becomes essentially extinct almost surely.
Proof. (: For a trajectoryXðÁÞ outside the zero-measure exclusion set of Lemma 3, assume that at some time τ ! 0 (dependent on the trajectory), X i (τ) = 0 for all Y i 2T . For any Y j 2 T, 0 = lim t ! 1 e − λ 1 t X j (t) = Wu j . From Lemma 2, u j > 0. Thus W = 0 almost surely.
): Assume that P(W = 0 & the trajectory is not essentially extinct) = P 0 > 0. According to Lemma 6, we can find time t 0 > 0 large enough such that P(W = 0 & the trajectory is not essentially extinct & there exists an essentially non-extinct cell with phenotype in T at time t 0 ) ! P 0 /2 > 0. On this set, only consider this essentially non-extinct cell and its descendants from time t ! t 0 , then the population is restricted on " T and we can use Lemma 4. Now we have W > 0 except for a zero-measure set of trajectories, which is a contradiction.
From Lemma 3 and Lemma 7, we can obtain Theorem 3.

Remark 3.
The assumption of λ 1 > 0 is not too strong. If λ 1 < 0, then from Theorem 2, the expected populations decay to0. Therefore this process will become extinct almost surely. For λ 1 = 0, consider an example that each cell always have exactly one child, and the child can be any phenotype with equal probability. Then the total population is fixed, and the proportions will always fluctuate, so there is no convergence [20]. For Gupta el al's experiment, the initial cell population is very large in cancer cell lines, thus the probability of essential extinction is quite small. Therefore, the proportions will almost always tend to the same constants. This gives a satisfactory stochastic explanation of the phenotypic equilibrium phenomenon reported in [4]. The deterministic model only reflects the average behavior of many trajectories (or equivalently a large initial population). When the cell number is relatively small, the stochasticity is not negligible, and it is not very reasonable to assume the cell number changes continuously. So the stochastic model is more effective than deterministic model. That is why we also prove the same result for stochastic model. [15]). Cells can acquire driver mutations, but cannot lose them. Also we do not consider cell death, so there is no essential extinction. If we only consider cells with no more than n driver mutations, then the population of cells with exactly n mutations will grow with exponential growth rate 1 − (1 − s) n /2, which is larger than that of cells with less driver mutations. So cells with n driver mutations will dominate exponentially fast. Generally, the population of cells with n driver mutations will grow exponentially with rate 1 − (1 − s) n /2, and as long as the next driver mutation emerges, its proportion will decay to 0 exponentially fast. Also, we should notice that such results are valid for almost every trajectories. Here we do not consider the difference in passenger mutations, since they have no effect on cell growth, and considering them will make the Perron eigenvalue not simple. The model in [15,16] is discrete-time, but we can see from Section 6 that we can still apply our results.

When will one proportion tend to 0 or 1?
In population dynamics, we are also concerned about when one phenotype dies out or dominates. In terms of the notations in this paper, we need to consider when P i (t) ! 0 or P i (t) ! 1 as t ! 1.
In this section, we will still assume that the Perron eigenvalue λ 1 of A is simple and positive. Then from Theorem 3, we have ðP 1 ðtÞ; P 2 ðtÞ; . . . ; P n ðtÞÞ !ũ ¼ ðu 1 ; u 2 ; . . . ; u n Þ almost surely in the stochastic model. Thus we can get the following corollaries from Lemma 2.
Corollary 4. P i ðtÞ ! 0 , Y i = 2 " T . Corollary 5. P i ðtÞ ! 1 , " T ¼ T ¼ fY i g. Remark 5. From Corollary 4 we can see that the sufficient and necessary condition under which no phenotype dies out, namely 8i; P i ðtÞ ↛ 0, is that " T contains all phenotypes. This is just Janson's last assumption.
Remark 6. If we find that P i (t) ! 0, P j ðtÞ ↛ 0 in an experiment, then we know that the phenotype Y j will never transform to Y i in any way. If we find that P i (t) ! 1, then we know that the phenotype Y i will never transform to any other phenotypes.

Model generalization: non-exponential lifetime
In the previous sections, we assumed that the lifetime of a cell is exponentially distributed and independent of the type and number of its descendants. However, in real biological system, the lifetime distribution should be more like lognormal, gamma, Weibull, or exponentially modified Gaussian distribution [31,32]. Furthermore, the time needed for division and conversion have different distributions [32]. In this way the process is a multitype Bellman-Harris branching process (also called age-dependent branching process) [33], no longer Markovian.
We can use the "device of stages" method to approximate a non-exponential random variable with several exponential random variables [34]. This indicates that through adding supplementary sub-phenotypes, we can simulate a non-Markovian branching process with a Markovian branching process. See the example illustrated in Fig 2. Here ðY 1 1 Þ; . . . ; ðY 9 1 Þ are supplementary sub-phenotypes. We artificially assume such supplementary sub-phenotypes exist just by technical reasons. They do not have biological meanings. When we count Y i 1 as Y 1 , the process has the same distribution with the original one. If we have convergence with this new process, then we also have convergence for the original process. An Y 1 cell has probability a 1 1 =ða 1 1 þ a 5 1 þ a 9 1 Þ to divide into Y 1 + Y 1 , and probability ða 5 1 þ a 9 1 Þ=ða 1 1 þ a 5 1 þ a 9 1 Þ to convert into Y 2 . Here we set a 1 1 , a 5 1 , and a 9 1 to be large enough while keeping their proportions, so that the time needed for the first step is ignorable (exponential random variable with expectation 1=ða 1 1 þ a 5 1 þ a 9 1 Þ).  [34], any non-negative random variable can be approximated to any accuracy by such combination of convolutions of exponential random variables. Thus we can simulate such non-Markovian branching processes to any precision with Markovian branching processes. Here the lifetime of a cell can be non-exponential, and the lifetime of a cell can depend on the type and number of its descendants. Now we can apply Theorem 3 to those sub-phenotypes. The proportion of each sub-phenotype converges to a constant. Thus the proportion of each phenotype (including all its sub-phenotypes) converges to a constant. This proves the "phenotypic equilibrium" phenomenon in a more realistic stochastic model. In addition, the conclusions in Section 5 are still valid.
Remark 7. The most unrealistic aspect of exponential lifetime is that the density function reaches maximum at 0, but one cell cannot divide right after its birth. For the sum of several independent exponential variables, the density function at 0 is 0. By the law of large numbers, the density function of the sum of n independent exponential variables with parameter nλ will have a sharp peak near λ when n is large.
Remark 8. The proportion convergence theorem for non-Markovian (age-dependent) branching processes can be proved directly, but under stronger conditions [33].

Conclusion and discussion
We have presented a unified stochastic model for the population dynamics with cellular phenotypic conversions. We have given the sufficient and necessary condition under which the dynamical behavior of our model can be described by an n-state Markov chain. In general case, we have proved that the proportions of different phenotypes will tend to constants regardless of their initial values, and we have investigated the sufficient and necessary conditions under which one phenotype will die out or dominate. We also extend our model to non-Markovian case while keeping the above conclusions valid. In this way we explain experimental phenomenon in [4].
As remarked in Section 4.2, we improve a limit theorem in branching processes, which may be of theoretical interests.
Our results can also apply to cancer progression models with gene mutations, such as [15][16][17].
Since the phenotypic conversions have been reported in various cellular systems, such as E. coli [35] and cancer cells [6,36], we hope that our model here could be applied as a general framework in the study of multi-phenotypic populations of cells.
With the improvement of experiment methods, we will accumulate more and more data for single cell. In single cell level, the stochasticity is not negligible anymore. Thus we should build detailed stochastic models (branching process would be a good framework) for cell population dynamics. Such models will provide new insights and predictions. In the meanwhile, stochastic process related theories should attract more attention.
There are some possible improvements about this research. First, we assume that the branching process is time homogeneous, namely the birth and death rates keep the same for all time. However, as time goes on, the cell density increases, and the birth and death rates should change [21]. Thus a possible improvement is to have time-dependent or density-dependent d ij . Second, we only prove the convergence for t ! 1, but in experiments we only have finite observation time. Thus it is meaningful to estimate the convergence rate. Third, we only consider finite many phenotypes. We find that there is essential difficulty to build similar theory for infinite-type branching processes. However, there have been some works considering infinite many mutation states with multitype branching processes, such as [17].