Inferring Growth Control Mechanisms in Growing Multi-cellular Spheroids of NSCLC Cells from Spatial-Temporal Image Data

We develop a quantitative single cell-based mathematical model for multi-cellular tumor spheroids (MCTS) of SK-MES-1 cells, a non-small cell lung cancer (NSCLC) cell line, growing under various nutrient conditions: we confront the simulations performed with this model with data on the growth kinetics and spatial labeling patterns for cell proliferation, extracellular matrix (ECM), cell distribution and cell death. We start with a simple model capturing part of the experimental observations. We then show, by performing a sensitivity analysis at each development stage of the model that its complexity needs to be stepwise increased to account for further experimental growth conditions. We thus ultimately arrive at a model that mimics the MCTS growth under multiple conditions to a great extent. Interestingly, the final model, is a minimal model capable of explaining all data simultaneously in the sense, that the number of mechanisms it contains is sufficient to explain the data and missing out any of its mechanisms did not permit fit between all data and the model within physiological parameter ranges. Nevertheless, compared to earlier models it is quite complex i.e., it includes a wide range of mechanisms discussed in biological literature. In this model, the cells lacking oxygen switch from aerobe to anaerobe glycolysis and produce lactate. Too high concentrations of lactate or too low concentrations of ATP promote cell death. Only if the extracellular matrix density overcomes a certain threshold, cells are able to enter the cell cycle. Dying cells produce a diffusive growth inhibitor. Missing out the spatial information would not permit to infer the mechanisms at work. Our findings suggest that this iterative data integration together with intermediate model sensitivity analysis at each model development stage, provide a promising strategy to infer predictive yet minimal (in the above sense) quantitative models of tumor growth, as prospectively of other tissue organization processes. Importantly, calibrating the model with two nutriment-rich growth conditions, the outcome for two nutriment-poor growth conditions could be predicted. As the final model is however quite complex, incorporating many mechanisms, space, time, and stochastic processes, parameter identification is a challenge. This calls for more efficient strategies of imaging and image analysis, as well as of parameter identification in stochastic agent-based simulations.

In this work a three-dimensional (3D) multi-scale model for tumor growth is developed, where each cell is considered as an individual agent ( [1], [2], [3] and [4]) and all molecular key players are represented as continuous function of concentrations in time and space. Parts of the model description are adaptations from [4].

Spatial Discretization
For numerical simulations of the cells a three dimensional Voronoi tessellation is implemented, where each lattice point can host only one cell at any time (see Fig. S1). The construction points of the Voronoi tessellation are uniformly distributed among the cubes of a square lattice with lattice constant a. One point in that tessellation is randomly placed into each cube. Constant a is chosen such that the average volume of a Voronoi cell, a 3 , corresponds to that of a cell, V = π/6d 3 . The cell diameter was found to be d = 16.8µm in this work. I.e. a = 3 √ V . The domain of numerical simulation is divided into 100x100x100 lattice points, which corresponds to a total volume of about 2.4mm 3 . For the numerical simulations of the molecular dynamics the initial square lattice is used as spatial discretization. We assume that the molecular concentrations are piecewise constant among the cubes. Figure S1: Spatial discretization. Schematic representation of the lattice topologies.
Actually, the selected modeling framework permits simulations to be scaled up to cubic centimeter sizes, though at the expense of lower spatial and functional resolution. Alternatively, hybrid models might be used, zooming in at the cell scale in regions of interest. However, we have chosen a CA model representing each cell individually to allow for cell heterogeneity at the cellular scale.

Cell Processes
The cell processes considered in this model are summarized as follows: Cell Division: A Poisson process implies exponentially distributed waiting times. On the other hand, a chain of m consecutive Poisson processes leads to an Erlang-distributed waiting time of the whole chain. Here m determines the sharpness of the distribution around average waiting time. In the model the cell cycle is modeled as such a chain of m d sub-processes. Beginning in cycle step i = 1, a Figure S2: Flowchart of all accessible states, processes and rates. Schematic representation of how the rates of all the biological processes a cell can perform in its actual state/condition are set iteratively for all cells.
cell progresses in the cell cycle from i to i + 1 with rate m d k div with k div being the division rate. When i = m g the cell expands to a random free neighbor site. If other cells already occupy all neighboring lattice sites then one neighboring site will be freed first by pushing adjacent cells along the shortest path toward the closest free lattice. When i = m d then the cell divides into two daughter cells of which one keeps occupying the mothers lattice site and the second a randomly chosen free neighbour site. Then both cells decide with probability p div whether to reenter the cell cycle, setting i = 1, or to become quiescent.
Cell Quiescence: Once quiescent, a cell can reenter the cell cycle with rate p re k re .
Migration: In our model, cells are able to move to a randomly chosen free neighbor site with same (hopping) rate kmig (see Table S1 and Figure S1). This migration rate k mig can be obtained from the relation k mig = D/l 2 , where D is the diffusion coefficient of the cell and l the hopping distance (= cell diameter) [5]. The diffusion coefficient depends on the properties of the cell and the surrounding (liquid) media D = k B T /(6πηR 0 ), where k B (= 1.38 · 10 −23 JK −1 ) is the Boltzmann constant, T (= 310.15K) is the temperature, η (= 700µN sm −2 for H 2 O at 37 • C) and R 0 is the hydrodynamic radius of the cell which can be assumed to be approximately the cell radius, R0 = Rcell. For an average cell diameter of 20µm and a medium viscosity close to that of water k mig will thus be about 1.75 h −1 (approximately 1 cell diameter in 105 minutes). Similar estimates have been independently derived in [6] and references therein.
Apoptosis (programmed cell death): All cells can undergo apoptosis, a programmed cell death (see Figure S2). When this occurs, cells activate the apoptotic pathway, which will lead to cell shrinkage, nuclear fragmentation, chromatin condensation, chromosomal DNA fragmentation and cell fragmentation into apoptotic bodies. The corresponding process is accounted for in the model by changing from a proliferating or quiescent stage to a dead stage at a rate k apt (see Table S1). Within the simulated sensitivity analysis we varied the apoptosis rate and found that only very small apoptosis rates were in agreement with the dead cell profile in the data, so that we finally neglected apoptosis.
Necrosis: If dying cells are not able to initiate the apoptotic pathways due to cellular injury, intoxication or a dysfunctional apoptotic pathway as in the case of many cancer cell lines they will undergo necrosis. In the model cells, which are deprived by nutriments and exposed to lactate, change from a proliferating or quiescent stage to a dead stage at a rate k nec (see Table S1).
Lysis: Disposal of cellular debris resulting from apoptosis or necrosis is carried out by a lysis process [7], for which a lysis rate k lys = 0.035h −1 (about 30h) will be assumed (see Figure S2 and Table S1). This is about 10 times slower than phagocytosis (digestion of cellular debris by macrophages) observed in vivo ( [8]), but within the range reported for in vitro cultures (0.002 h −1 for Fibrobacter succinogenes [9]; 0.07 h −1 for VO 208 Hybridoma cell line [10]). It should be noted in this context that tumor growth within the size limits considered in this work is closer to in vitro cultures than to in vivo situations. Lysis is mimicked in our model by means of a Poisson process, which removes dead cells from the lattice.

Time Evolution of the System
We implement a version of the Gillespie algorithm [14], adapted to the cell population system considered in this work. To this end, some explanations are in order. We shall index by µ a class of a process that cells can perform, say proliferation, migration, apoptosis or necrosis and lysis. The related process rate, denoted by r µ does not need to be constant for all cells. For example, proliferating rates depend on the cell lines considered. Therefore, a process P x,µ is specified by its process class µ, but In the following  [ECM ] min (0...0.1) 0.003 fitted * Those parameters not found in the literature were either estimated or fitted from the experimental data presented in this article or in referenced work. The impact of choices on the resulting effects was subsequently analyzed in sensitivity analysis (within ranges indicated). ** The diffusion coefficients in medium were assumed to be 30 times larger compared to the cellular phase of the spheroids as indicated in this table. also by the cell x where it takes place. For instance if µ represents proliferation, the corresponding replication process can be summarized as follows: where 2x represents the two daughters arising from x. The algorithm describing the temporal evolution of the system is shortly described in the provided pseudo code. Lines within braces at each step refer to the procedure sketched at the end of the description.
Step 0 (lines 1 to 8 -Initialization): The tumor spheroid is initialized in the center of the domain. Within a sphere of radius R init cells, where a fraction of p init qui is quiescent from the beginning, occupies all lattice sites. Then the list of all possible processes at this state of the system P is set accordingly. Set the time variable t to zero and initialize the unit-interval uniform random number generator (URN).
Steps 1 to 4 (lines 9 to 34): A step-by-step time evolution is processed within a loop. The following steps 2 to 5 are repeated until the system reaches the end time t max , when there are no more processes to execute (for instance all cells are dead) or the maximum number of cells considered is reached (N max cells = 10 6 cells).
Step 1 (line 11 -Total Transition Rate): Calculate and store as r Σ the sum of rates r x,ν of all processes P x,ν ∈ P .
Step 3 (lines 15 to 24 -Process Selection): The process P y,µ to perform during this iteration is chosen randomly from the list of all processes P taking into account that the probability of each process P x,ν ∈ P to be chosen is proportional to its rate r x,ν . As proposed in (Gillespie, 1977) this can be done as follows: generate a second random number u2 using the unit-interval uniform random number generator. Then step-by-step sum up as r Σ the rates r x,ν of all processes P x,ν ∈ P until the condition r Σ − r x,ν < u2 · r Σ ≤ r Σ is satisfied. Then store P x,ν as the selected process in P y,µ .
Step 4 (lines 25 and 26 -Update cellular system): Adjust the cell set to account for what has happened during process P y,µ (e.g. remove cells, add cells, move cells or change environment of a cell). Update the list of possible processes according to the new cell set (e.g. add/remove processes after an environmental change).
Step 5 (lines 27 to 31 -Update molecular concentrations): For a fix time step dt the molecular concentrations are updated to the steady state solutions of the respective equations 19, 28, 29 and 30.

Supporting Information -Alternative Model Mechanisms & Parameters Sensitivity
In this section we present some sensitivity analysis as well as alternative mechanisms which are not mentioned, but largely guided our decisions toward the final models presented in the main text. The core part of the propose hybrid model are the transitions between internal cell states (cell cycle, G 0 and death) depicted in figure 5. Here the model outcome largely depends on the respective transition rates and their dependency on the cell arrangement and local molecular concentrations.

Cell Cycle & Quiescence
As described above and shown in figure 5, the decision whether a cell continues to proliferate or becomes quiescent is made at a check-point directly following cell division and depends on probability p div . Once quiescent, the decision to reenter the cell cycle depends on p re . On the other hand, k re determines how fast cells reenter the cell cycle from G 0 under favorable conditions. Several examples are next shown on how the choice of both probabilities influences growth curves and radial profiles.
Hypothese I: Long-range contact inhibition controlling cell cycle progression First, we assume that cell cycle progression and reentrance (from G 0 ) are controlled in a similar manner by extra-cellular matrix (ECM) and contact-inhibition where p max div and p max re are the base probabilities, L the cells distance to the closest free space, ∆L the respective reference distance, and [ECM ] min the minimally required ECM. Figure S4 shows, how varying k re , or p max re (asymmetric scenario), or p max div and p max re simultaneously (symmetric scenario) effect the growth curves and radial distribution of proliferating cells. In the entirely symmetric case, i.e. the effective rates p div k div,m and p re k re are kept equal, the model is in good agreement with the experimental growth curves for low progression and reentrance probabilities. On the other hand, the fraction of proliferating cells is either underestimated at the outer border or underestimated at the center. In the asymmetric case, growth curves and radial profiles are in much better agreement between experiment and model, if either p max re or k re are close to zero.
Hypothese II: Short-range contact inhibition controlling cell cycle reentrance An alternative assumption is, that quiescent cells reenter the cell cycles only if they "sense" free space in their direct vicinity, for example via membrane sensors: Figure S5 shows that reentrance which can only be initiated if free space is accessible in direct vicinity, is more stable to different reentrance rates k re . Here, the critical amount of ECM a cell needs to progress in the cell cycle has a much larger impact on the proliferating cell fraction. Modeling ECM, especially at the outer border, is essential to explain the low fraction of proliferation cells observed experimentally.  Figure S4: The reentrance mechanism is similar to the division: the probability to proceed in the cell cycle depends on available space in the extended neighborhood. Experimental (mean, standard deviation) growth curves (left) and radial profiles of Ki67-positive cells (right) are compared to model simulations (black lines). Notice that the solid black curves are precisely the same as p re = p div and k re = k div . Moreover, by construction the curves in the bottom row become identical to those in the center row, if for the modifier 0 < α ≤ 1 in the bottom row the setting p max re = α is chosen, and in the center row k re = k div,m α is chosen.  Figure S5: The reentrance mechanism is distinct from division: the probability to proceed in cell cycle depends on available space in direct neighborhood. Experimental (mean, standard deviation) growth curves (left) and radial profiles of Ki67-positive cells (right) are compared to model simulations (black lines). The model simulations were done for different reentrance rates k re (top) and the ECM growth threshold [EM C] min . Figure S6 shows simulation results with different combinations of mechanisms from model 4 for condition III and in each case with suppression of contact inhibition. The parameters have been chosen as in Fig. 10 (main text). Suppressing of contact inhibition leads to large deviations from experimental data, particularly for the KI67 profile. As long as quiescence due to waste exposure is not present, the KI67 profile remains stationary, otherwise it drops with time. So for the former one might speculate that a proper calibration of parameters may permit reproduction of the KI67 profile. As explained in the main text of the paper, our and previous findings in models and the data of Sutherland and coworkers (see text) suggest that nutriment -dependent proliferation control is not sufficient to explain the experimental findings: Firstly, between [G]=25mM and [G]=5mM the experimental proliferation pattern does almost not change while a difference in glucose medium concentration would cause a difference in the size of the proliferating rim if proliferation would depend on the glucose concentration [G]. A different size of the proliferating rim is basically not observed between [G]=25mM, 5mM (this was also the case in EMT6/Ro cells, see Freyer, Sutherland, 1986). However, if cell cycle entrance were independent of [G], then [G]=1mM would not lead to a drop in proliferation. Still it would be an interesting future goal to fit all model parameters new when dropping contact inhibition to see how close to the data one could come, which as it is very time consuming, we would leave for the future, when automated fitting procedures for stochastic cell models in high dimensional parameter spaces are available.

Choice of Death-Rate-Lactate-Dependency
Experiments show that cells exposed to larger lactate concentrations also have an increased death rate (e.g. ref. [15,13]). Figure S7 shows measurements done by Ozturk et al. [13]. We tested how the radial distribution of necrotic cells changes if the relation between death rate and local lactate concentration is described by either a linear dependence, a Hill equation, or a Heaviside function.
Very interestingly, we obtained the best agreement between model simulations and experimental findings for parameter values which closely resemble the lactate-dependent death rate found independently by Ozturk et al. [13] (Fig. S7). These authors observed a death rate increasing with lactate. Figure S7: Relation between lactate concentration and death rate. Death rates measured by Ozturk et al. [13] for different lactate concentrations (dots) are compared to three theoretical descriptions: Heaviside, Hill and linear function. Figure S8 shows a comparison of the three modeled kinetics for high glucose medium concentrations (25mM) with the corresponding experimental data. The growth curve is not affected by the choice of kinetics. On the other side, the necrotic profiles directly depend on it. While the Heaviside function (see figure S8(d)) leads to too sharp transition between viable and necrotic tissue regions, a linear dependency (see figure S8(a)) would lead to over-smoothed transitions. Only the Hill equation (see fig. S8(b)) could explain the radial profiles shape. For 25mM glucose and 0.28mM oxygen, oxygen becomes the limiting factor before glucose. Thus, initially the viable rim is much thicker as cells still have enough glucose to survive in anaerobe conditions. But over time, lactate accumulates as a direct side-product of anaerobic metabolism until it reaches toxic concentrations.

Stochasticity & Fluctuations of Model Output
We tested how the model results vary starting from the same initial condition, but with different random seeds. Figure S11 shows the growth curves as well as the radial profiles of 10 single simulations compared to the mean and standard deviation of the corresponding experiments. We note that the fluctuations in the experimental data seems to be much higher than between independent simulations. Possible reasons may be heterogeneities in the experimental initial conditions as well as measurement errors which add up to the true underlying biological fluctuations.

Model Predictions
The following figure shows the temporal evolution of the radius as well as the radial profiles predicted for 4 nutriment conditions where no experimental data was available.  Supporting Information -Image Analysis / Data Extraction Figure S13 shows how the radial information is extracted from the micrographs, in this specify example from a Ki67-stained slice of a tumor spheroid grown in a well-nourished medium ([G] = 25mM, [O 2 ] = 0.28mM ) and sacrificed after 24 days. First all nuclei and those stained positively with Ki67 are counted for different intervals, or bins, of distances to the outer spheroid border. The bin size is 1µm.
Then the values in both bins are divided by each other to get the fraction. Finally from the fractions of all the individual images one can calculate the mean and standard deviation over many micrographs of different spheroids. As figure S13 shows, increasing bin size is not changing the overall shape of the radial fraction profile, but rather smoothes out some noisy spikes toward the spheroid centre which were mainly a result of lower cell counts in the central regions. Here the different colors correspond to different bin sizes used to create the curves. The inset shows the raw cell count (bin size = 1µm) of Ki67-positive (red) and all nuclei (blue) in the region where the fraction of Ki67-positive nuclei is close to 0. All curves are averaged over 6 individual images. The Ki67-positive cell counts were multiplied by 10 for better visibility.