Intra-tumor heterogeneity, turnover rate and karyotype space shape susceptibility to missegregation-induced extinction

The phenotypic efficacy of somatic copy number alterations (SCNAs) stems from their incidence per base pair of the genome, which is orders of magnitudes greater than that of point mutations. One mitotic event stands out in its potential to significantly change a cell’s SCNA burden–a chromosome missegregation. A stochastic model of chromosome mis-segregations has been previously developed to describe the evolution of SCNAs of a single chromosome type. Building upon this work, we derive a general deterministic framework for modeling missegregations of multiple chromosome types. The framework offers flexibility to model intra-tumor heterogeneity in the SCNAs of all chromosomes, as well as in missegregation- and turnover rates. The model can be used to test how selection acts upon coexisting karyotypes over hundreds of generations. We use the model to calculate missegregation-induced population extinction (MIE) curves, that separate viable from non-viable populations as a function of their turnover- and missegregation rates. Turnover- and missegregation rates estimated from scRNA-seq data are then compared to theoretical predictions. We find convergence of theoretical and empirical results in both the location of MIE curves and the necessary conditions for MIE. When a dependency of missegregation rate on karyotype is introduced, karyotypes associated with low missegregation rates act as a stabilizing refuge, rendering MIE impossible unless turnover rates are exceedingly high. Intra-tumor heterogeneity, including heterogeneity in missegregation rates, increases as tumors progress, rendering MIE unlikely.

ter cells, causing an aneuploid state.
• Aneuploidy := a chromosome number that is not a multiple of the haploid complement.
An euploid cell carries a multiple of the same set of chromosomes. Haploids (1 copy of each chromosome), diploids (2 copies) and polyploids (> 2 copies) are all examples of euploidy. • Karyotype := a numeric vector with one entry per chromosome type (e.g. we have 22 types when considering the set of human autosomes). Each entry indicates the copy number of that type of chromosome. Each karyotype corresponds to a compartment of a multi-compartment ODE. • Ploidy := the total number of chromosomes occurring in the nucleus of a cell. Also referred to as DNA content in the continuous setting. • Karyotype composition := the relative frequency of all karyotypes in a population at a given time. • Set of viable karyotypes := all possible karyotypes associated with non-negative growth rates. This can either be a single interval (e.g. 22 -88 chromosomes) or multiple intervals of karyotype viability interspersed between nonviable regions.

Relationship between timescale and characteristic amount of DNA shifted per missegregation
Let T p be the time it takes for a mitotic cell to segregate its chromosomes among daughter cells. Given that cells divide at a rate and that only dividing cells missegregate (at a rate ), we expect the timescale of missegregations to be proportional to T p . We approximate the shift in DNA content over time as Fickian di↵usion, which gives us L p , the characteristic amount of DNA content shifted during a missegregation event, as: L p ⇠ p T p [1]. From here we can approximate the timescale on which missegregations happen as: T p ⇠ L 2 p /( ).

Numerical methods for calculating the critical curve
Critical curves are derived by leveraging the fact that the critical turnover rate for a given is when the rate of missegregation induced cell death equals the growth rate ( µ). Let J be the Jacobian defined in (10). The system reaches a steady state when n i J = 0. Nontrivial solutions can be found by choosing functions for the mis-segregation rate and death rate µ (which parameterise the matrix J) such that the dominant eigenvalue is zero. A robust method for determining pairs of values (B, M ) on the critical curve (which can cope with heterogeneous functions for , µ), is to: 1) choose an input value B for the function (i, B), then 2) find the value of M parameterizing µ(i, M ) such that the dominant eigenvalue of J is equal to zero. The eigenvector corresponding to this eigenvalue is the steady state of the system. Although this method is robust, it requires multiple generations of the matrix J with di↵erent values of M and so can become slow when J is a large matrix. A considerable time saving can be achieved when death rate does not depend on copy number state, i.e. µ = M . In that case, J can be expressed in terms of the adjusted matrix J ⇤ : where J ⇤ represents the transition matrix with µ = 0 and I is the identity matrix. It can be shown that the eigenvectors of J are identical to those of J ⇤ , therefore the steady state population distribution (i.e. the karyotype composition) does not depend on the death rate µ and is given by the dominant eigenvector of J ⇤ . Denoting this eigenvectorn i , after normalisingn i such that the entries sum to one, P in i J ⇤ gives the net growth rate of the cell population when µ = 0 and the population distribution has reached steady state (although net population growth still occurs when µ = 0). Therefore, the true steady state (and therefore the critical curve) is found by choosing the value of µ which exactly negates this growth, i.e: µ = P in i J ⇤ . We note that if death rates are heterogeneous then J ⇤ is no longer independent of the death rate µ, i.e. the approach relies on the assumption of homogeneous rates.
Both of the above approaches work when a single chromosome type has a finite viable copy number interval as well as when multiple chromosomes have finite intervals (i.e. n i becomes n~i). However both become infeasible when we have k 1 chromosome types, each with multiple copies, as the matrix J becomes too large. However the critical curve can still be generated by extrapolating from the single chromosome case. Summing over the rows of the matrix q ij (eq. (7)) gives a probability vector q ⇤ i , which contains the probability that a daughter of a cell with i chromosome copies receives a viable number of copies. The overall probability that a daughter cell from the steady state population receives an unviable number of copies for a single chromosome type is therefore: q d = 1 q ⇤ i ·n i . The probability that a daughter receives viable copy numbers for all k chromosome types is: The rate of change of the alive cell population is given by: , where n a is the total number of alive cells. Setting to zero, we have µ = 2q a 1.

MIE is impossible for infinite compartments -continuum argument
To appeal to the continuum model we have to assume that only small changes in DNA content (small jumps) happen during a given cell division. Under this assumption equation (9) can be expanded as: Defining p = i/K, and p = 1/K, for K ! 1 we have: This leads to: where p 2 (0, 1) and n(0) = n(1) = 0. We make use of the Ansatz, n = e ↵t U (p), which results in the eigenvalue problem: For simplicity, we define q = 1 K 2 and convert to Sturm-Liouville form: Multiplying by U and integrating over p, we obtain an equation for ↵ (the Rayleigh Quotient): For K ! 1 we may neglect the terms with K 2 and obtain Provided that r = µ > 0 for all p we define m = inf( ( µ)) and P = sup( ), it follows that ↵ m P > 0.
Finally, we note that the Rayleigh quotient satisfies ↵ min  ↵  ↵ max for all amendable test functions U , where ↵ min , ↵ max are the smallest and largest eigenvalues, respectively. Hence, ↵ max > 0 and so MIE is impossible. Note that the proof did not rely on the whole integral having r > 0. Suppose that we have an almost everywhere di↵erentiable function U that is 0 on [0, 1]\E, with E ⇢ [0, 1] and m, P defined as above. Then (23) becomes: provided that |E| > 0. On a final note, this may seem to imply that we have shown that a finite number of compartments in the small jump case can't have MIE, but this would be false. We must take a closer look at the passage to equilibrium. For simplicity suppose the viable karyotype interval is at the beginning i = 1, . . . , f(K) where r > 0. Defining p = i/K as before, note that the viable interval length is on the order of f (K)/K. For the measure of this interval to remain finite, we require that f (K) ! cK for K ! 1 for some 0 < c < 1 to rule out MIE. The viable interval length must grow at the same order as the number of viable intervals grows, in order to be able to rule out MIE using this method.

Ruling out MIE
Here, we establish su cient conditions for MIE to not occur. These are based on Gershgorin's circle (GC) theorem which states that for matrix A with elements a ij , the eigenvalues satisfy: Put di↵erently, every eigenvalue of A lies within at least one of the discs with center a ii and radius R i . Since MIE can be evaded if the maximum eigenvalue exceeds 0, a su cient condition is that none of Gershgorin's circle contain a part of the negative reals. That is a ii R i > 0 for all i.
Using GC, a su cient condition to avoid MIE is given by: Of course, we know that for large K, no missegregation rate will lead to MIE; this just provides a su cient condition.
If , and µ vary with ploidy i, we can still obtain a su cient condition to avoid MIE given by: If is homogeneous, then this reduces to The general problem (i.e. any arbitrary relation between , and µ) can be handled analogously, but conclusions can only be made for specific forms of q. We require both (11)- (12). As both of these need to be positive, we can find the minimum of these, which will provide su cient condition to escape MIE.
Let us consider a more complicated example to show the power of this method. Suppose that , µ are homogeneous, but that q ij = q(i ! j) obeys the following relationship This q can be thought of as one that favors small jumps but all jumps are permitted. Let us suppose that there is no limit on the number of compartments, then: This implies the relation for  i : We now look at the sum in (11) (note the symmetry q(i ! i + h) = q(i ! i h)), The sum in (12) is trickier since in this case we are looking at the possibilities q(i + h ! i) and it is clear that q(i + h ! i) 6 = q(i h ! i) (because any given ploidy state i is more likely to emerge from missegregations in a higher ploidy cell than in a lower ploidy cell). We look at the first part of the sum first valid when ⌧ 1. The sum from the other direction is nonzero only if h di/2e: Therefore, the sum term in (12) ( P j6 =k j q jk ) becomes: The bigger sum sets the threshold and is maximal for large i. With this choice of q, it is (11) which is the dominant condition. Thus, a su cient condition to escape MIE is given by: The appearance of this relation potentially suggests a fundamental limit on a kernel of this type. Indeed, it is not hard to show that for general q with q ii = 1 i , (11) yields Since both (11) and (12) must be met, the above condition is the upper bound on the i 's required to evade MIE.

Extensions to model whole-genome doubling and micronuclei formation
Though WGDs are technically feasible in the model (through an i parent giving birth to 2i and 0 ploidy o↵spring), this ignores the biological mechanisms for how WGD events may occur. There are two mechanisms for WGD [2,3] -endoreplication (i.e. completely skipping mitosis) and cytokinesis failure which often generates binucleated whole genome doubled cells. Chromosomes in the two nuclei can then merge the subsequent mitosis after nuclear envelope breakdown. We can modify equation (9) to account for WGD by introducing terms where ✓ i/2 = 0 if i/2 is not an integer. Mounting evidence suggests that a cell which mis-segregates many chromosomes has a higher risk to die than a cell mis-segregating only one chromosome [4,5]. Our current model structure however does not distinguish between these scenarios. To account for this intolerance, we can amend the inflow term by introducing a karyotype transition coe cient ij . One possible form = (1+|i j|) ↵ with ↵ < 0, where only relative distance is important. Other forms that take the parental state into account could also be considered. Combining this with the above, leads to The only requirement is that ij must fall o↵ with increasing distance between states. Yet another assumption, which could be eventually relaxed is the ploidy conservation in equation (3). The errors during mitosis could lead to the formation of micronuclei which are not directly involved in further downwind mitosis [6][7][8][9][10]. This amounts to replacing equation k for all k.  [11] and growth rates taken from multiple sources (cited in third column). The death rate is inferred. Many of the birth rates are comparable, but the net growth rates vary over orders of magnitude, which implies very di↵erent death rates. Last column shows percent contribution of respective normal human cell types to a total daily mass turnover of 80 g (taken from [12]) -these are correlated to both birth and death rates reported in tumors (Pearson r 0.923; P < 0.026).