Supporting Information S1 Appendix Mathematical Background Mathematical Modeling of Multiple Pathways in Colorectal Carcinogenesis using Dynamical Systems with Kronecker Structure

1 Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany 2 Data Mining and Uncertainty Quantification (DMQ), Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany 3 Image and Pattern Analysis Group (IPA), Heidelberg University, Heidelberg, Germany 4 Department of Applied Tumor Biology (ATB), Institute of Pathology, University Hospital Heidelberg, Heidelberg, Germany 5 Institute of Pathology, University Hospital Leipzig, Leipzig, Germany

We represent the graph as an adjacency matrix A, that is, a matrix which has as many rows and columns as there are vertices in the graph. The entry a i,j at position (i, j) of the matrix A indicates whether there is no connection (a i,j = 0), a weak relationship (a i,j is small) or a strong bond (a i,j is large) between the vertices i and j.
The value a i,j is often referred to as the weight of the edge ij. Vertices i and j which are connected by an edge are called adjacent and are denoted by i ∼ j. Vertices may also be connected to themselves. In this case, the edge is called a self-loop.
It is also possible to indicate directed edges, by letting a i,j (weight of the edge from i to j) differ from a j,i (weight of the edge from j to i). A directed graph with no directed cycles is called a directed acyclic graph (DAG). The vertices of a DAG can be ordered such that the adjacency matrix is an upper triangular matrix.

Combining Processes and the Cartesian Product of Graphs
Now assume that we have two processes, e.g., the occurrence of mutations in two different genes of the same cell, which we want to combine and represent as one single process. For this combination we have the following key assumptions: Existence of States The states in the combined process should exactly be all possible combinations of states from the underlying processes.
Edge Connectivity We require that in the combined model only one of the processes can evolve to the next state at any point in time.
Independence of the Processes We require that the processes are independent of each other.
In the context of the example above, we interpret from the first assumption that all combinations of mutations in the different genes are possible (i.e., there are no mutations that prevent other mutations) and that there are no additional states. This also implies that the order in which mutations are accumulated is ignored.
The edge connectivity assumption states that no two mutations in different genes can occur at the exactly same point in time.
April 27, 2021 2/10 The third assumption entails that a mutation in one gene does not change the mutation probability in another gene. While the independence is a strong requirement, it makes the mathematical analysis more amenable by reducing the number of parameters and improving the model structure. From a medical point of view, we consider this independence for all mutational processes where there are no data currently available about mutational dependencies or where data suggests independence.
Whenever there are data available, we introduce this dependency in our model (see also Section 3.1.5 in the main text).
This mathematical approach allows for the implementation of further genetic dependencies as soon as novel experimental data are generated. As the matrices are built in a structured way using the Kronecker structure, adding novel dependencies will not lead to a bottleneck with respect to computational costs.
Consider two processes that are represented by the graphs G 1 = (V 1 , E 1 ) and . The combined process, which satisfies our key assumptions, is then represented by a graph G = (V, E), that is given by the Cartesian product [1], denoted by a small square , of the graphs G 1 and G 2 (S1-1) The set of vertices V of the Cartesian graph product is given by the Cartesian This means the vertex set V consists of all possible combinations of vertices from the first graph with vertices from the second graph In other words, the first requirement is satisfied.
The edge set E is made up of edges of the forms where the vertices v 1 , w 1 ∈ V 1 are adjacent in the graph G 1 and similarly for v 2 , w 2 ∈ V 2 . This means that we connect the states in our combined process such that April 27, 2021 3/10 each edge corresponds to a single edge in exactly one of the underlying processes. From this we establish that our second assumption is also fulfilled.
As each edge in E corresponds to exactly one of the edges in E 1 ∪ E 2 , we can transfer the edge weights from the graphs G 1 and G 2 to G. From this, we conclude the satisfaction of the independence assumption.
We can extend this combination to more than two processes by iteratively applying the definition to two of the processes.
To relate the Cartesian product of graphs to their adjacency matrices, we first need to consider the Kronecker product and sum of two matrices.

Kronecker Product and Sum
The Kronecker product A ⊗ B ∈ R mp×nq of two matrices A ∈ R m×n and B ∈ R p×q is defined [2,3] by the block matrix For square matrices A ∈ R n×n and B ∈ R m×m , we further define the Kronecker sum where I m (resp. I n ) denotes the identity matrix of size m × m (resp. n × n).
The Kronecker product and sum possess several favorable properties, among which is the compatibility with the matrix transpose (S1-6) Let A 1 and A 2 be the adjacency matrices of the graphs G 1 and G 2 . The adjacency matrix of the Cartesian Product G 1 G 2 is given by the Kronecker sum A 1 ⊕ A 2 [4].
This connection between the Cartesian product of graphs and the Kronecker sum is  The Kronecker product is not commutative, i.e., in general we have Accordingly, the graph products G 1 G 2 and G 2 G 1 are not equal to each other. However, as they are isomorphic to each other, we are free to choose an ordering of the matrices as long as we use the same ordering in all computations.

Linear Dynamical Systems
An important and highly used class of equations for dynamical systems in mathematical modeling are linear differential equations of the forṁ where A ∈ R n×n is a matrix (later chosen to be the adjacency matrix of a graph), which is constant with respect to time, and x 0 ∈ R n is a vector representing the initial value.
April 27, 2021 5/10 Linear differential equations have a unique solution [5, p. 60], which is given by where expm(A) describes the matrix exponential of the matrix A, which is defined by Computing the matrix exponential [6] can be done in a variety of different ways.
However, as the matrix exponential is multiplied with the vector x 0 , we do not need to compute the matrix exponential as a full matrix, but only the action of the matrix exponential on the vector x 0 . Algorithms for this task are studied in the context of exponential integrators [7,8].
The definition of the matrix exponential directly yields that the matrix exponential of a triangular matrix (as it is in our case) is again a triangular matrix.
The matrix A in the linear dynamical system will be the Kronecker sum of several matrices (Eq (13) in the main text). The matrix exponential of such a matrix simplifies according to [ expm (A i ) , (S1-10) where [n] denotes the set of integers from 1 to n.
Thus, instead of computing the matrix exponential of one large matrix, we can compute the matrix exponentials of several small matrices and connect them with the Kronecker product, which gives an additional performance boost for numerical computations.
The Eq (S1-10) can be seen as a generalization of e a+b = e a e b for real numbers a and b. However, it is important to note that this statement does not hold for the standard matrix addition and product. Thus, in general, we have [9, Theorem 10. In the model components for dependent mutational processes, we consider sums of matrices (see Eq (10) in the main text).
While in this case the matrices cannot be simplified as much as for the independent model component, we can still use the exponential integration mentioned above.
Further mathematical analysis is however possible in the context of operator splitting [10] or the Baker-Campbell-Hausdorff formula [11].
Often, the initial value x 0 itself can be written as a Kronecker product, i.e., x i , (S1-12) where the x i are vectors with sizes corresponding to the sizes of the matrices A i . With this, the solution of the dynamical system (S1-7) can be expressed as where the last equality is due to the mixed product property of Kronecker products [3].
In most cases, we are not only interested in a single entry of the solution vector x(t), but in the sum of several entries. To achieve this, we consider the scalar product v x(t) of the vector x(t) with a vector v of the same size which has a 1 in all states we want to consider and a 0 everywhere else.
Often this vector can, similarly to the initial value x 0 in Eq (S1-12), be written as the Kronecker product of vectors v i with sizes corresponding to the matrices In this case, the accumulation simplifies to where the first equality follows as above from the mixed product property of Kronecker products and the second one is due to the fact that the Kronecker product of real numbers (here: v i expm (tA i ) x i ) is the standard product of real numbers.
April 27, 2021 7/10 The introduction of a state for dead or disappearing crypts is natural from a mathematical point of view. We collect the probabilities that a crypt with a certain genotype dies in a vector d, which has as many entries as there are genotypes.
The state vector of the linear ordinary differential equation (10) in the main text is extended by an additional state x d which models the number of crypts in the death state, i.e., crypts that have disappeared. Similarly, the matrix is extended by one row and one column to include the probability vector d. This leads to the linear ordinary differential equation The 0 in the matrix denotes a row vector containing only zeros. Its interpretation is, that a dead crypt cannot move to a different state, i.e., become alive again, but stays dead, which is also expressed by the 1 in the matrix. The initial condition x d (0) = 0 means, that there are no death crypts at time t = 0.
As above, we write the solution of this linear ordinary differential equation using the matrix exponential Hereby, ϕ denotes the function ϕ(x) = e x −1 x , which is extended to matrix arguments in the standard way [9]. The second equality follows from (a slightly modified version of) Theorem 1 in [12].
This means that -when using the same parameters as in the original model -the April 27, 2021 8/10 number of crypts of a certain genotype is not changed compared to the original model.
The number of crypts in the death state is given by x d (t) = t d ϕ tA x 0 .