Emergent Stratification in Solid Tumors Selects for Reduced Cohesion of Tumor Cells: A Multi-Cell, Virtual-Tissue Model of Tumor Evolution Using CompuCell3D

Tumor cells and structure both evolve due to heritable variation of cell behaviors and selection over periods of weeks to years (somatic evolution). Micro-environmental factors exert selection pressures on tumor-cell behaviors, which influence both the rate and direction of evolution of specific behaviors, especially the development of tumor-cell aggression and resistance to chemotherapies. In this paper, we present, step-by-step, the development of a multi-cell, virtual-tissue model of tumor somatic evolution, simulated using the open-source CompuCell3D modeling environment. Our model includes essential cell behaviors, microenvironmental components and their interactions. Our model provides a platform for exploring selection pressures leading to the evolution of tumor-cell aggression, showing that emergent stratification into regions with different cell survival rates drives the evolution of less cohesive cells with lower levels of cadherins and higher levels of integrins. Such reduced cohesivity is a key hallmark in the progression of many types of solid tumors.


Introduction
Tumor cells and tumor structure both evolve over periods of weeks to years (somatic evolution). Microenvironmental factors such as levels of nutrients and oxygen, growth factors, host immune response and the structure and composition of the extracellular matrix (ECM), exert selection pressures on tumor cells, which influence both the rate and direction of evolution of specific behaviors (often designated hallmarks), especially the development of tumor-cell aggression and resistance to chemotherapies [1]. A particular problem for cancer treatment is that recurrent tumors are typically more aggressive than primary tumors, and that tumors often become resistant to chemotherapeutic agents. Thus treatment may have the iatrogenic effect of making the cancer more aggressive and less treatable. Cells within a single tumor vary in their morphological and molecular features. The tumor microenvironment is also heterogeneous, with marked spatial and temporal variations in the concentrations of metabolites and therapeutics, confounding efforts to assess how specific selection pressures favor particular cell phenotypes in vivo.
Somatic evolution can also lead to apparently paradoxical responses to treatment. e.g., metronomic chemotherapy, the near continuous administration of cytotoxic drugs at low doses with no extended interruption, induces dormancy in some types of cancers [2], though the effectiveness of metronomic therapy is greater in chemo-naïve than in previously-treated patients. Nutrient starvation (e.g. due to antiangiogenics) can cause tumor cells to shrink and enter a state of reversible dormancy, resuming active growth and proliferation when the microenvironment changes and more nutrients become available [3]. Starvation often (but not always) reduces the effectiveness of chemotherapies. Thus the same treatment may help some patients with a given tumor type and harm others. Despite the growing number of available tests for specific markers in tumors, in many cases, we cannot predict which patients a treatment regime will benefit and which it will harm. Understanding the interacting evolutionary pressures within a tumor will therefore be an essential step in enabling personalized and more effective treatment regimes. Because resources are limited and the number of potential treatment regimes limitless, exhaustive combinatoric patient-based trials with different combinations and regimes of drugs range from impractical to impossible. In addition, such studies can only determine optimal conditions for population-average responses and not for personalized treatment of individuals. Ideally, we would like to be able to predict how a tumor in a specific patient will react to a given treatment regime based on easily measured biomarkers. Virtual-tissue models of tumors may provide a pathway to developing such predictions.
CompuCell3D (CC3D) [15] is a free, fully open-source modeling environment for developing and running GGH-based multi-cell, multi-scale virtual-tissue simulations. CC3D provides many multi-cell virtual-tissue model components, including GGH solvers for cell movement, PDE solvers for chemical fields, reaction-kinetics solvers for biochemical networks, visualization tools, etc. CC3D supports compact model specification using a combination of Compu-Cell3D Markup Language(CC3DML) and Python scripting. CC3D allows specification of arbitrary subcellular signaling, regulatory and metabolic dynamic network models using SBML and executes these models and their control of cell-level and tissue-level behaviors. CC3D also supports large-scale/macroscopic continuum models via Python scripting. CompuCell3D users can easily share models, which stimulates simulation code reuse and facilitates collaborative model development. CC3D's use of CC3DML and Python for model specification greatly reduces the effort to develop biological simulations compared to traditional simulation development using C++ or Fortran. The CompuCell3D website (www.compucell3d.org) provides free downloads of the CompuCell3D environment for the Windows, OSX and Linux operating systems and all model code.
This paper demonstrates step-by-step, the creation and execution of a multi-cell virtual-tissue computational model of tumor growth and evolution, providing a platform for the study of the influence of micro-environmental factors on tumor progression. The Appendix provides Table 1. Reference parameter set. The entire model is available at www.compucell3d.org/Models-listed as Emergent Stratification Model. For an explanation of the effective units of these parameters see text.

.25
PStem Michaelis-Menten coefficient, see Eq (14) 0.00256 QCancer glucose max. uptake rate, see Eq (14) 1.69 QCancer Michaelis-Menten-Coefficient, see Eq (14) 0.00256 PCancer glucose max. uptake rate, see Eq ( listings of all models in this paper. Our model focuses on small, early-stage benign tumors and on early development of metastases rather than on developed primary tumors, to identify the pattern of cell-behavior selection. While the average number of cells in our 2D model is less than 1000 at any time and the total cell turnover is on the order of tens of thousands of cells, much fewer than in even a small real tumor, we can map these simplified model tumors onto real tumors either by considering each model cell to represent an ensemble of hundreds or thousands of real cells or by considering each model tumor to represent a peripheral microportion of a much larger tumor mass.

Methods: Multi-Cell Virtual-Tissue Models of Tumor Evolution
Our model includes multiple cell agents of different types. Each cell agent has a set of defining properties, including its volume, surface area, intrinsic motility and two distinct classes of adhesion molecules on its membrane. Each cell interacts with neighboring cells and surrounding ECM and stromal tissue primarily by adhesion. We model ECM and stromal tissue as a simple uniform medium, rather than representing individual ECM fibers and stromal cells explicitly and model cells' adhesion to ECM fibers and stromal cells using appropriate contact-energy coefficients-see later sections. We also include a diffusive growth-limiting nutrient (glucose). In our model, sufficient glucose availability promotes cell growth and proliferation, while glucose depletion causes accumulating cell damage, which may lead to cell death. We assume that all cells except stem-like cancer cells can undergo a limited number of cell cycles (Senescence). We represent the effects of all heritable genetic and epigenetic mutations as random variations of the parameters controlling cell behaviors, with variation occurring immediately after a cell division. Outline: We first describe briefly model components (i.e., objects and processes) and their biological relevance and background. We then discuss the simplifying assumptions that allow us to abstract solid-tumor biology to construct a biomodel. Appendix 1 provides an implementation of this biomodel as a simulation, specified in CC3DML and Python. Finally, we compare the simulation results to experiments to estimate the model's parameters.

Cells
Biomodel: Our generic solid tumor includes two main classes of generalized cells: tumor cells and stromal tissue. Tumor cells have two subtypes: stem-like cancer cells and somatic cancer cells. All tumor cells assume one of three possible states: Proliferating cancer cells-PCancer (PC), PStem (PS), which grow and divide, quiescent cancer cells-QCancer (QC), QStem (QS), which do not grow, and Necrotic cells (N), which die. An extracellular Medium (M) represents an aggregate of stromal cells and extracellular matrix (ECM).
We define a separate CC3D cell type for each class of cells which has a distinct set of biological behaviors and properties. While all cells of a given type have the same initial list of defining parameters, the properties of each cell of a given type can change during a simulation. We usually limit the number of cell types to no more than 15 to make the model intelligible (For our specific CC3D implementation of cell types, see Table 2). Fields Biomodel: Tumor growth in vivo depends on the levels of multiple diffusing substances, including blood nutrients (e.g. glucose and fatty acids), tissue oxygen, growth factors and pH. In our model, we assume that glucose is the main growth-limiting nutrient and include a diffusing field (G) representing glucose (see "Glucose Transport" section for details).

Biological Processes
Cell Motility, Time and the Glazier-Graner-Hogeweg Model Biomodel: Cells move by making cytoskeletally-driven membrane protrusions and retractions and by making and breaking connections with their neighbors and the ECM. In the absence of external stimuli (e.g., gradients of chemical attractants or substrate properties), cells like fibroblasts migrate in a persistent random-walk pattern, where the cell typically moves at least one cell diameter in a given direction before changing direction. In our model, we assume that tumor cells in a cluster migrate via a random walk without a prespecified persistence length.
To simulate the dynamic motility of cells, we use the GGH model, also known as the Cellular Potts Model (CPM), a multi-cell, lattice-based, stochastic methodology for representing tissues. The GGH model uses spatially-extended domains of voxels on a fixed cell lattice to represent cells. Since such domains may also represent cell subcomponents, clusters of cells or portions of ECM, we call the domains generalized cells. Each voxel in the cell lattice has a positionx and an index σ (x) denoting the index of the generalized cell to which it belongs. For convenience, we also assign a cell type τ (σ) to each generalized cell σ. One or more generalized cells can represent a single biological cell. In the latter case, the biological cell corresponds to a cluster of generalized subcells, which we can use to represent cell compartments or cell polarity, or to assemble cells with complex shapes.
To model the dynamics of generalized cells, we associate an effective energy term with each generalized-cell behavior which involves motion (e.g., chemotaxis), force-mediated interaction (e.g., cell-cell adhesion, an external force), or size or shape constraint (e.g., the cell volume), etc.. . . In some cases, an effective energy term represents a physical interaction energy, e.g., cell-cell adhesion. In other cases, (e.g., chemotaxis) the effective energy is a shorthand to produce the desired behavior of cells and does not correspond to an actual energy.
To model a motile cell that has a defined volume and surface area, we use two effective energy terms in the form of constraints: the cell-volume (first term) and cell-surface constraints (second term): where v σ and s σ denote a generalized-cell's instantaneous volume or instantaneous surface area and V σ and S σ denote a generalized-cell's target volume and target surface area, respectively. The constraints are quadratic and vanish when v σ = V σ and s σ = S σ . l vol s and l sur s are the constraint strengths which correspond to elastic moduli (the higher l vol s or l sur s the more energy a given deviation from the target volume or surface area costs).
The GGH model represents cytoskeletally-driven cell motility as a series of stochastic voxelcopy attempts. For each attempt, we randomly select a target site,ĩ , in the cell lattice, and a neighboring source sitej . If different generalized cells occupy these two sites, we calculate change in the energy (ΔH) that would occur if we were to copy the index in the the source-site voxel onto the target-site voxel. For almost all cell behaviors and interactions, the evaluation of ΔH requires calculations localized to the vicinity of the target voxel only.
The probability of accepting a voxel-copy attempt (i.e., overwriting the value in the targetsite voxel) is: for DH 0 where f (ΔH) is a decreasing function, bounded between 0 and 1. Here as in most GGH models, we set where T m is a parameter describing the amplitude of cell-membrane fluctuations. T m can be a global parameter, cell specific or cell-type specific. The net effect of the GGH voxel-copy algorithm is to lower the effective energy of the generalized-cell configuration in a manner consistent with the biologically-relevant "guidelines" in the effective energy: cells maintain volumes close to their target values, mutually-adhesive cells stick together, mutually repulsive cells separate, etc..... The ability to make voxel copies which temporarily increase the effective energy helps avoid trapping the configuration in local effective-energy minima. The algorithm evolves the cell-lattice configuration to simultaneously satisfy the constraints, to the extent to which they are compatible. Generalized-cell movements are perfectly damped (i.e., average velocities are proportional to applied forces), which makes the updating method numerically stable, making the GGH model a robust tool for building virtual-tissue models [16].
The average value of the ratio DH T m for a given generalized cell determines the amplitude of fluctuations of the generalized-cell's boundaries. High DH T m results in rigid, barely-or non-motile generalized cells and little cell rearrangement. For low DH T m , large fluctuations allow a high degree of generalized-cell motility and rearrangement. For extremely low DH T m , generalized cells may fragment in the absence of a constraint sufficient to maintain the integrity of the borders between them. Because DH T m is a ratio, we can achieve appropriate generalized-cell motility by varying either T m or ΔH. Varying T m allows us to explore the impact of global changes in cytoskeletal activity. Varying ΔH allows us to control the relative motility of the cell types or of individual generalized cells by varying, for example, l vol s , l sur s , V σ , or S σ . The generalized cell representing the stromal material around the tumor has unconstrained volume and surface area. Medium voxels can be both source voxels, e.g., during retraction of the trailing-edge of a generalized cell, and target voxels, e.g. during formation of lamellipodia. Since Medium represents largely passive material, We use the amplitude of cytoskeletal fluctuations of the non-Medium target or source generalized cell to determine the acceptance probability for a voxel-copy involving Medium. Parameter Estimation: In CC3D, the size of the cell-lattice voxel sets the spatial resolution of the simulation. Here a square cell-lattice voxel (2D) represents 16 μm 2 . Our tumor cells have an initial volume of 256 μm 2 . We relate the simulation's MCS time-scale to minutes by comparing cell-migration speeds in simulations to typical cell-migration speeds in experiments. Simulated tumor cells migrate at speeds of about 0.1 voxel/MCS. Experimentally, tumor-cell migration speeds range between 2 μm/hour to 12 μm/hour [17]. Matching the experimental value of 4 μm/hour sets one MCS to 6 min. Both [18] and [19] find that the active cells that lead to metastases move with relatively high speeds.
ECM fibers play a crucial role in enabling such rapid cell-migration. Cancer invasion models based on game theory reach similar conclusions [20], as do other CA models (e.g., [21]) that focus on the relative significance of the rates of cell death, proliferation and migration on tumor growth and spread. However, none of these models captures the role of cell adhesion changes (one of the classical hallmarks of cancer progression) on the initiation of the cancer invasion. Our model concentrates on the evolution of cell adhesion in tumors due to the tumor's changing microenvironment. While, it does not explicitly represent ECM fibers or the intracellular signaling pathways that help control cell migration [19], we hope to include these more detailed submodels in future studies.

Cell Adhesion
Biomodel: Cell adhesion due to molecular binding of transmembrane adhesion receptors on one cell to either ligand transmembrane adhesion receptors on another cell or to the ECM is one of the most important interactions during tissue morphogenesis and maintenance [22]. A wide variety of adhesion molecules can mediate both cell-cell and cell-ECM adhesion. Detailed models of adhesion are complex because of the multiple interactions affecting these molecules, which occur in both the cytoplasmic and extracellular domains [23,24]. Here, we assume that a single cadherin species mediates cell-cell adhesion via homotypic cadherin-cadherin binding and a single integrin species mediates cell-stromal tissue adhesion via heterotypic integrin-fibronectin binding. We simplify the strength of adhesion as the product of the number of bonds between two adhering objects and the strength per bond. Bond formation contributes a negative energy to the total effective energy in the model, and breaking a bond requires an energy at least equal to this binding energy. represent inter-cellular adhesive interactions, we add adhesion term to the effective-energy in Eq (1): The outer sum of the (first) adhesion term iterates over all cell-lattice voxels. We refer to the closest voxels tox as 1 st order (nearest neighbor) voxels, and voxels further away as 2 nd , 3 rd order, etc.... neighbors of voxelx . The inner sum (overỹ) iterates over all voxels in the neighborhood of voxelx up to order N max . J sðxÞ;sðỹ Þ is the adhesion energy per unit contact area between two generalized cells-σ (x) and σ (ỹ). δ is the Kronecker delta function: and the ð1 À d sðx Þ;sðỹ Þ Þ factor ensures that we only count energies between voxels belonging to different cells. Inclusion of the adhesion term in Eq. (4) reduces the amount of cell-cell contact between generalized cells with high values of J sðxÞ;sðỹ Þ and increases the amount of contact between generalized cells with low values of J sðx Þ;sðỹ Þ . In our model, we express J sðx Þ;sðỹ Þ as a function of the adhesion-molecule concentrations on generalized-cells' surfaces. As we noted above, several classes of adhesion molecules can reside on a generalized-cell's surface and we treat the ECM and stromal cells as a generalized cell. The binding energy per unit area between two adhering generalized cells due to integrin-like or cadherin-like adhesion molecules is: where, for cells i and j, m and n are classes of adhesion molecules, k m, n the affinity coefficient for those species, N i m and N j n their densities (the number of adhesion molecules of classes m and n, respectively, per unit area), and F is a function which describes the phenomenological relationship between the densities and the binding energy. In our model, we set which assumes that each molecule of class m on cell i can bind only once to a molecule n on cell j. Including all possible combinations of bonds between different classes of adhesion molecules, the adhesion energy per unit area is: To calculate the net contribution of adhesion to the effective energy, we sum the adhesion energies at every cell-cell and cell-stroma (cell-Medium) interface: where we have replaced indicies i and j with σ (x) and σ (ỹ) respectively, following the notation of Eq (1). Comparing Eqs. (4), (8) and (9), we see that: Using Eq (10) we can express the GGH effective energy as: See Table 3 for the CC3DML implementation of the adhesion energy and for initial parameter estimates. Parameter Estimation: To form an initially cohesive solid tumor, we set (see Table 3) the AdhesionMoleculeDensity and the homotypic and heterotypic BindingParameters to values that produce a positive surface tension between tumor cells and Medium, which we calculate using the surface tension formula: A high positive γ produces a cohesive cluster of tumor cells, while zero (non-cohesive cells) or negative γ (invasive cells) indicates that tumor cells can separate from the tumor cluster and invade the surrounding Medium. Positive surface tension corresponds to cell-lattice configurations where replacing cell-cell interfaces with cell-Medium interfaces requires energy input-hence, in the absence of external stimuli, cells stick together, while negative surface tension causes cells to tend to separate, because creating cell-medium interfaces is energetically favorable.
We assume that heterotypic adhesion is weaker than homotypic adhesion and limit both Int and Cad densities to 16 units. Higher Int and Cad densities can cause generalized cells to fragment and introduce cell-lattice-alignment artifacts (see CC3D manual for details). In our model, the initial cohesive solid tumor has a γ of about 2.4. Later in the simulation, some cells evolve to express lower levels of Cad and higher levels of Int, reducing the surface tension and causing the cells to invade the Medium.
Glucose Transport Biomodel: The blood vessels in biologically-normal stromal tissue supply nutrients at a rate which depends on the metabolic needs of the tissue, keeping nutrient levels in stromal tissue within physiologically-normal ranges. In our model, Medium supplies G at a constant rate α. Since Medium represents the normal cells in the stromal tissue as well as the ECM, it also takes up glucose at a rate proportional to the local glucose concentration G(x): where is the first-order glucose uptake rate. In the absence of tumor cells, G approaches the equilibrium glucose concentration for normal stroma. Tumor cells take up glucose at a rate which is a Michaelis-Menten function of the local glucose concentration G(x), cell-type (τ) and cell-state (s): where u max (τ, s) is the maximum uptake rate and K(τ, s) is a Michaelis constant, both of which depend on the cell-type (τ) and cell-state (s). Glucose diffuses at a constant rate D G , with time-dependent secretion and uptake: CompuCell3D includes a steady-state diffusion-equation solver, which is appropriate for a fast-diffusing species like glucose. Table 4 shows how the model specifies the behavior of glucose using this solver.
Parameter Estimation: We set the Glucose diffusion coefficient to 600 μm 2 /s which corresponds to 13500 voxel 2 /MCS. In the absence of tumor cells, Glucose concentration in the simulated stromal tissue approaches an equilibrium Glucose concentration of 5mM. 3D does not impose specific units for field concentration. We use femto-mole (fmol) per voxel as our unit and impose a scaling factor so that 5mM corresponds to 0.32 fmol/voxel. Experimentally, cell glucose consumption rates are about 0.1 fmol/cell/s, which corresponds to 2.25 fmol/voxel/MCS, so we set the Glucose consumption rate of tumor cells in their queiscent phase to be 0.75 of that of tumor cells in their proliferating phase (1.69 fmol/voxel/ MCS). We assume that stromal cells (represented implicitly by Medium) are the same size as tumor cells, occupy 20% of the stromal volume, and consume Glucose at 1 3 the rate of tumor cells (0.75 fmol/voxel/MCS). For these parameters, Medium must supply Glucose at a rate of 0.145 fmol/voxel/MCS to maintain a stationary and uniform Glucose concentration of 0.32 fmol/voxel over the entire Medium in the absence of a tumor. If the Glucose concentration is stationary and uniform, both the left-hand side of Eq (15) and the diffusion term on the right hand side are zero, so the decay coefficient of Glucose (the first-order consumption rate), = 0.45. We set the Michaelis constant of Glucose (MichaelisMentenCoef) to 0.04 mM (0.00256 fmol/voxel).
Cell-State Transitions Biomodel: Experimentally, microenvironmental factors including mechanical stress, hydrostatic pressure, low pH and starvation can cause temporary or permanent changes in tumor cells [25,26]. In our model, however, we include only cell-state transitions due to nutrient availability. During a period of starvation (in a low-nutrient regime), tumor cells accumulate damage at rates that depend on their local nutrient levels. When accumulated damage in a cell passes a threshold which depends on the cell type and state, the cell dies (an irreversible celltype transition). When a tumor cell divides, it resets its damage level to zero, so daughter cells do not inherit any accumulated damage from their parent.
Real quiescent tumor cells become proliferative only when they experience sufficient levels of nutrients for long enough periods (refractory behavior) [3]. We model this refractory behavior of individual quiescent-state tumor cells by requiring them to accumulate health, which biologically corresponds to repair of damage due to hypoxia or other harsh microenvironmental conditions. Cells that experience higher nutrient concentrations accumulate health at a faster rate, which saturates for high nutrient concentrations. To simulate transitions between normal and damaged cell states, we calculate the accumulated starvation factor, the cumulative effect of a suboptimal level of nutrients on the cell. When the starvation factor reaches a critical threshold, tumor cells (PC, QC, PS, QS) become necrotic cells. Different cell types have different starvation thresholds, x thresh . To calculate the cumulative starvation factor, we periodically check the glucose concentration at the center of mass of each cell. If the concentration is lower than the cell's concentration threshold, the cell accumulates starvation damage at a rate increasing with the difference between the concentration threshold and the glucose concentration, x = x thresh −G according to a Michaelis-Menten function with a saturation limit, m: where m is the maximum uptake rate of glucose in a given cell and k is a Michaelis constant. Table 5 shows the Python code for this function.
When, during a single MCS, the available glucose concentration is lower than x thresh the cell's actual uptake of glucose is less than its uptake at a concentration of x thresh . Because cell starvation occurs when the glucose concentration falls below this threshold value, we define the starvation-factor increment as: In an analogous way, we define a cumulative "health" factor for cells: In our model, at each MCS we check the concentration of glucose, G(x COM ), at each cell's center of mass, then increase either the cell's cumulative starvation factor by ΔS or the cell's cumulative health factor by ΔQ, depending on whether the cell's x thresh is greater or less than x. We then iterate over each generalized cell in the simulation and determine whether the starvation coefficient is above the type-transition threshold for that generalized-cell's type, in which case we change the generalized-cell type to Necrotic. The parameters ip.PNeThr0, ip.QNeThr0, ip.SNeThr0 and ip.QSNeThr0 store the starvation thresholds for PCancer, QCancer, PStem and QStem generalized-cell types respectively. For quiescent cells (generalized-cell types QCancer and QStem) we also check the health factor. If it is above the threshold for that generalized-cell state (ip.QPThr0 and ip.QSSThr0, respectively), we change the generalized-cell state to PCancer and PStem respectively. After a cell-type transition, we reset the health factor to zero. Table 6 presents the Python implementation of the model's cell-type transitions, based on the starvation coefficients calculated in Table 7 in the Appendix.
Parameter Estimation: We assume that cells starve and accumulate damage when the Glucose concentration drops below 0.5 mM (0.032 fmol/voxel) (compare to [25,26]). We calculate PNeThr0 and SNeThr0 based on the assumption that the cells die if they experience a glucose concentration of zero for 24 hours (damage accumulates at its highest rate for 24 hours = 102 MCS) [25,26]. We assume that QCancer and QStem cells are more resistant to starvation than proliferating cells, so that both QNeThr0 and QSNeThr0 are twice as large as PNeThr0. We calculate QPThr0 and QSSThr0 based on the assumption that QCancer and Qstem cells that experience 5mM glucose for 24 hours switch to a proliferative state.
Cell Growth and Cell Death Biomodel: A subpopulation of tumor cells proliferates when it has access to sufficient levels of nutrients. In our model, only PC and PS generalized cells grow and divide (via mitosis) once they reach their doubling volume. PC and PS generalized cells grow by consuming glucose. PC and PS generalized cells grow only when the glucose supply is above a threshold and these generalized cells grow faster for higher glucose levels. Necrotic generalized cells shrink at a constant rate, which is independent of the level of nutrients. In our model, once a generalized cell becomes Necrotic it cannot change back into any other generalized-cell type. Quiescent cancer cells (QC) and quiescent stem-like cells (QS) neither grow nor shrink. A generalizedcell's volume and surface constraint parameters remain unchanged from the time its state changes to QC or QS.
We implement generalized-cell growth and shrinkage by manipulating generalized-cells' V i s, S i s and λs in Eq (1).
Proliferating cancer cells (PC) and proliferating stem (PS) cells grow at a rate: where ΔV i is the increase in the target volume of generalized cell i per MCS.x i COM is the center of mass of generalized cell i and Gx i COM À Á is the glucose concentration at this point. G thresh is the minimum glucose concentration which allows cells to grow. k is the growth speed and θ the Heaviside step function: We update the generalized-cells' target volumes once per MCS. To maintain the generalizedcells' surface-to-volume ratios, we adjust the generalized-cells' target surface area as: where q s, v = 4 is a scaling factor derived on the assumption that cells in the simulation should be nearly circular in shape. In our 2D simulations, the relationship between surface area and volume for a circular generalized cell is: Necrotic generalized cells shrink and disappear by reducing their target volumes at a constant shrinkage rate until their target volumes reach 0: where d V Necrotic ¼ 0:05 is the constant shrinkage rate for necrotic cells and V i is the current target volume of the Necrotic generalized cell. Once a generalized-cell's target volume reaches 0, it will disappear as neighboring generalized cells overwrite its voxels. Table 8 shows the Python implementation for these mechanisms.
Parameter Estimation: We assume that PC and PS generalized cells that experience 5mM glucose take 24 hours to reach their doubling volume and divide. We set g in Eq (21) to 0.34 fmol/voxel and set k(incvol) = 0.2 so it increases the generalized-cells' target volume by 16 voxels in 24 hours. To allow elongated generalized-cell shapes and prevent alignment of generalized-cell boundaries along the cell-lattice's symmetry axes, we set q s, v = 4, slightly higher than ffiffiffiffiffi ffi 4p p in Eq (24).

Mitosis, Senescence and Mutation
Biomodel: When a growing cell reaches its doubling volume, it undergoes mitosis. Depending on the type of cell undergoing mitosis and the number of cell cycles it has completed, the two daughter cells may change their types and cadherin and integrin expression levels.
Mitosis splits cells into two daughter cells of roughly equal volumes. The Python code in Table 9 sweeps through all generalized cells once per MCS to identify generalized cells that should divide.
In our model, we refer to a cell just before mitosis as a parent cell. Mitosis divides the voxels of the parent generalized cell roughly equally between a generalized cell which keeps the identity of the parent generalized cell and a new daughter generalized cell. While this usage differs from the biological terminology, it reflects the actual bookkeeping in the simulation. Mitosis resets the parent and daughter generalized-cells' target volumes and target surface areas to their reference values, and their cumulative damage and health to 0. In the model only two generalized-cell types grow and divide: PC, PS. A senescence counter, n d , counts the number of divisions a PC or QC generalized cell has undergone since its transition from a PS generalized cell. We assume that cancer cells typically undergo a maximum number of divisions m d . On each division of a PC parent generalized cell we pick R from a Gaussian distribution r: where r (x; m d , δ d ) is the Gaussian probability distribution of variable x with mean m d and standard deviation δ d and If n d > R both the parent and daughter generalized cells become Necrotic. For PC generalized cells, m d = 8 and δ d = 2.
When a PS generalized cell divides, the parent generalized cell becomes a QS generalized cell and the daughter generalized cell has a probability of 1ip.probstem of becoming a QC generalized cell and of ip.probstem of becoming a QS generalized cell. We normally set ip.probstem = 0.2. As the "Cell-State Transitions" section describes, high nutrient availability induces quiescent stem cells (QS) to become proliferating stem generalized cells (PS). Our simulation implements the senescence and stem generalized cell ! cancer generalized cell transition using the Python updateAttributes function in Table 10.
CompuCell3D calls the updateAttributes function immediately after a parent generalized cell divides. The function first resets the parent and daughter generalized-cells' V i s, S i s and λs. ip.V0 stores their reference target volume and ip.LBD_V0 and ip.LBD_S0 store their reference λ vol and λ sur (see Eq (1)).
During mitosis, we also simulate random changes in the surface expression of adhesion molecules. Each parent generalized cell has a small probability (p m = 0.1 stored in ip.probmut) of randomly changing its adhesion-molecule expression levels. We scale expression levels of cadherins and integrins to the range 0 to 16. After mitosis, we draw a number R from a gaussian distribution r (x;j am , δ am ), where j am is the current adhesion-molecule expression level for the parent generalized cell and δ am is the standard deviation of the gaussian distribution. In our simulations, we set δ am = 2.0. We store the cadherin expression level in the jcadh variable and the integrin expression level in jint. ip.cadhstdev stores the value of δ am , which we assume is the same for cadherins and integrins.
When R is within the allowed expression-level interval ([0, 16]) we set: otherwise, we reject the mutation. Our model varies only the levels of cadherin and integrin and not the level of fibronectin (FN)-i.e., the adhesion molecule associated with Medium.
Daughter generalized cells inherit their parent generalized-cell's adhesion molecule expression levels. Table 11 shows Python code implementing this mutation mechanism. Parameter Estimation: We limit the AdhesionMoleculeDensity of both Int and Cad to values between 0 and 16 because this range produces both positive (cohesive) and negative (invasive) surface tensions and minimizes cell-lattice artifacts that can occur for large, positive surface tensions. Our algorithm rejects adhesion-molecule expression-level changes if they would cause the adhesion-molecule expression level to fall outside the allowed interval, which eliminates bias towards the end-values of the interval. Setting expression levels that fall outside the allowed interval to end-values causes a bias towards the end-values of the interval. For any initial value, in the absence of selection, random mutation leads the population average to drift towards the middle of the interval. Distinguishing such random-walk drift from evolution due to selection pressure is difficult, so we set the initial adhesion-molecule expression levels to 8, the middle of the allowed interval.
To facilitate simulation of the cell processes and behaviors described in this paper our simulations include modules which the center of mass and set of voxels belonging to each generalized cell. Appendix A. (Table 12) shows CC3DML implementation of these modules.
As an initial condition, we place an 8 2 -voxel quiescent stem (QS) generalized cell in the middle of the cell lattice. Table 13 shows the CC3DML implementation of this initial condition. Table 14 shows the CC3DML to define the cell-lattice dimensions (<Dimensions>), type of boundary conditions (<Boundary_x>, <Boundary_y>), average amplitude of cell membrane fluctuations (<Temperature>), neighbor range used to pick source and target voxels for voxel-copy attempts (<NeighborOrder>), simulation duration in MCS (<Steps>) and the seed for random number generator (<RandomSeed>).

Size and Shape Dynamics
In our simulations, the initial cancer stem (QS) generalized cell grows into a cohesive solid tumor that reaches a maximum diameter of about 500 μm during the first simulated month. During the next simulated 10 days (day 30 to day 40), tumor generalized cells at the center of the tumor die and form a necrotic core that is about 250 μm in diameter (Fig 2A.1 and 2B.1). Necrotic. Row B) Accumulated damage due to starvation. Row C) Surface tension between tumor generalized cells and Medium, see Eq (12). Times: first column: * 40 days, second column: * 1 year, third column: * 3 years. CompuCell3D code that we used to generate this figure can be downloaded from www.compucell3d.org/Models. See also Table 1 for the list of parameters used to generate this figure. At this stage, some PS generalized cells at the periphery also die through senescence after reaching their maximum number of divisions. Since we do not model angiogenesis, the total amount of Glucose the Medium supplies limits the diameter of a compact tumor to under 500 μm. This nutrient-limited regime lasts between four and twelve simulated months. As tumor generalized cells progressively become less cohesive (due to mutations accumulated during mitosis) (Fig 2A.2), the tumor becomes less spherical and elongates, usually leading to splitting of the tumor into two clusters that are nearly equal in size. The less cohesive cluster then usually outgrows the more cohesive cluster, because its elongated shape has more contact area with Medium, which allows higher rates of Glucose transport into the cluster (Fig 2B.2). Any clusters that do not contain stem generalized cells eventually disappear. Tumor generalized cells become potentially invasive if they express low levels of cadherin and high levels of integrin. When such invasive generalized cells come to the edge of their cluster and contact the Medium, they can leave their clusters and invade the Medium (Fig 2A.2). Generalized cells that have invaded the Medium proliferate faster than those remaining in a cluster (see population plots in Fig 3A-3B), since glucose levels outside clusters are significantly higher than levels inside or at the periphery of clusters. Thus the spreading invasive cells eventually outcompete nearby tumor because they have higher probability of survival than generalized cells in a larger cluster.
For the reference set of parameters, predominantly dispersed, invasive generalized cells become dominant after about 3 simulated years (Fig 2A.3)-see section "Parameter Effects on Tumor Growth and Evolution" for detailed studies of the parameter space. After this stage, small clusters (less than 20 cells) continuously form and disintegrate, as stem-like generalized cells leave their clusters or die due to rapid accumulation of damage. Even though we did not vary the fraction of stem-like offspring of stem-like generalized cells or allow this fraction to evolve in this paper, the fraction of stem-like generalized cells in the total population gradually increases with time. We will present our studies of the effects of evolutionary change on the fraction of stem-like generalized cells in a future paper.

Starvation and Evolution of Cell Adhesion
In our model, invasion of Medium is a winning strategy for generalized cells. Generalized cells gain invasive behaviors in two key phases. In the initial phase, as long as generalized cells remain in clusters, differential cell adhesion leads to cell-sorting in which generalized cells expressing lower levels of cadherins and higher levels of integrins (relative to the typical levels in the cluster) move to the surface of the cluster and generalized cells expressing higher levels of cadherins (relative to typical values in the cluster) move towards the center of the cluster (the black arrow in Fig 4C). As generalized cells move towards the center of the cluster, they cease to proliferate, starve and eventually die. Thus generalized cells which remain on the surface of the cluster proliferate more rapidly than those in the interior of the cluster, selecting for the least cohesive generalized cells in the cluster. Since a generalized cell's cadherin level has a greater effect on its radial position in the cluster than its integrin level, initial selection predominantly favors generalized cells with decreased cadherin levels (Fig 4A). Because starvation selects for lower relative adhesion between generalized cells, the typical cadherin level of the generalized cells in the cluster decreases continuously until, in the second phase, the least cohesive generalized cells (those with high levels of integrin and low levels of cadherin) in the cluster are able to migrate out of the cluster. This invasive phenotype corresponds to a negative surface tension between Medium and tumor generalized cells (Fig 2C.2, Fig 4B-4C). As we discussed in section "Size and Shape Dynamics", invasive generalized cells then come to dominate the tumor-cell population, and spread throughout the Medium.

Parameter Effects on Tumor Growth and Evolution
Our model of emergent tumor stratification has about 50 parameters. Some parameters values are available in the cancer literature, others are not, including the probability of a random mutation of a generalized-cell's cadherin expression and the distribution of changes in cadherin expression on mutation. In any case, because we are simulating a much smaller number of generalized cells than the number of cells we would find in a real tumor, we must greatly increase the probability of mutation per generation and the magnitude of the typical phenotypic consequence of a mutation, if we wish to observe a significant evolution of the phenotype. Since the goal of this work is to examine how specific types of selection pressure contribute to the emergence of specific tumor morphologies, we care more about relative rates of mutation than absolute rates and the sequence of evolutionary events rather than their absolute times of occurrence. Despite these caveats, we observe patterns of changes of cell behavior which should be robust if we were to reduce the overall rates of mutation and increase the simulation size to more realistic values. As long as all cells are sufficiently cohesive that they remain attached to the primary cluster, selection favors cells with lower relative cell-cell adhesivity and higher cellstroma adhesivity. Because only cells with lower relative adhesivity benefit, the mean absolute adhesivity decreases until cells are able to separate from their cluster and invade the surrounding tissue. To check the robustness of this evolutionary pattern to the mutation rate and amplitude parameters, we varied the probability of cadherin-level and integrin-level mutation P m and the standard deviation δ am of the Gaussian probability distribution used to select the mutated cadherin and integrin expression levels (see eqs. (26) and (28)), using the same P m and δ am for both cadherin and integrin.
For each value of P m (from the set {0.1, 0.2, 0.3}) and each value of δ am (from the set {0.5, 1.0, 1.5, 2.0}) we ran 10 simulation replicas with different random seeds. We then correlated P m and δ am values with the typical behavior of the replicas. As we would expect, larger P m and δ am decreased the time until the first appearance of generalized cells able to invade the surrounding stromal tissue and increased the average number of clusters at a given time. Despite these quantitative differences, the qualitative series of changes in generalized-cell and cluster properties were independent of the mutation rate and amplitude.

Parameter Scan Methodology
To facilitate parameter scans, we developed a parameter-scan module for CompuCell3D that automatically runs multiple replicas of a simulation over a user-defined parameter space. We analyzed our parameter-scan results using Python scripts employing the 3rd-party tools Pandas (http://pandas.pydata.org/), numpy (http://www.numpy.org) and scikit-learn (http:// scikit-learn.org/).
The key to efficient management of studies which require long simulations is to store sufficient information to support flexible postprocessing. Otherwise, a change to the analysis protocol can require time-consuming rerunning of all replicas in the entire simulation set. We therefore store, every 1000 MCS, a summary file (a data snapshot) including each generalizedcell's index, type, volume, surface area, cadherin and integrin levels, and center-of-mass position. Although a typical simulation replica has only a few hundred generalized cells at any time, a typical run generates between 20,000 and 50,000 tumor generalized cells as cells divide and die. In addition, generalized cells frequently change type (see Fig 1). When we approximate generalized-cell-type-dependent metrics, e.g., the average distance generalized cells of a given type travel over a fixed time, we bin the data for that generalized cell based on the type it had when it first appeared in a snapshot. While a recording interval of 1000 MCS is too slow for some types of measurement, it suffices for evolutionary changes, which happen on an even slower time scale.
To aggregate the behaviors of multiple simulation replicas with identical P m and δ am , for each simulation metric we used envelop plots showing the minimum, maximum, and median metric value as a function of time. In a substantial number of cases, the behavior of a few replicas differs significantly from the behavior of a typical replica (see cluster plots for e.g., P m = 0.1 and δ am = 1.0). However, none of these cases refutes our basic conclusions about the evolution of invasive phenotypes.

Generalized-Cell-Type Populations
The number and fraction of generalized cells of each type changes with time. The number of Necrotic generalized cells increases as the cluster grows, then decreases once generalized cells begin to invade the surrounding tissue, see Fig 5. For P m = 0.1, δ am = 0.5 and for P m = 0.1, δ am = 1.0 the number of Necrotic generalized cells decreases slowly, while for other combinations of P m and δ am the number of Necrotic generalized cells decreases rapidly. A cluster must contain at least one stem-like generalized cell to survive, so we expect the fraction of stem-like generalized cells to increase as the number of clusters increases and the cluster size decreases. The number of stem-like generalized cells increases continuously. For P m = 0.1, δ am = 0.5 and P m = 0.1, δ am = 1.0 the number of stem-like generalized cells increases more slowly than for other combinations of parameters (see Figs. 6 and 7). As more generalized cells develop an invasive phenotype, the initial cluster gives way to many smaller clusters. Because these clusters are small, almost all generalized cells receive sufficient glucose and the number of necrotic generalized cells decreases as the number of proliferating non-stem cancer generalized cells begins to saturate (see Fig 8). We see a clear distinction if we compare the numbers of quiescent (Fig 9) and proliferating (Fig 8) non-stem cancer generalized cells. Both saturate and begin to decrease, the former as soon as a few generalized cells establish new clusters, the latter only when small clusters fill the entire tissue.

Number of Clusters
To count the number of generalized-cell clusters, we used the DBSCAN algorithm [27] in the scikit-learn Python package (http://scikit-learn.org). DBSCAN views clusters as areas of high density, separated by areas of low density. We defined a cluster as a group of at least five generalized cells whose centers of mass are at most 6 voxels from their nearest cluster-mate (Fig 10) and configure DBSCAN algorithm accordingly. The number of clusters increases in time once generalized cells become able to invade the stromal tissue and establish new clusters. If we define the aggressiveness of the tumor as the rate at which new clusters develop, the aggressiveness increases for larger P m and δ am . For many combinations of P m and δ am , the number of clusters eventually peaks and then decreases when medium-sized clusters split into isolated generalized cells and transient associations of fewer than five generalized cells. We can understand this second transition if we remember that the mean motility of the generalized cells increases continuously in time. When this motility is too small, all of the generalized cells remain condensed in a single cluster. For larger motilities, the large cluster breaks up into multiple medium-sized clusters. However, still larger motilities lead to a second phase transition in which all clusters become unstable, reducing the number of clusters (5 or more generalized cells separated a COM-to-COM distance of at most 6 voxels) that DBSCAN algorithm detects. If we also count transient associations of 2, 3 and 4 generalized cells (Fig 11) and also include isolated generalized cells as separate clusters (Fig 12), the number of clusters increases and eventually saturates, when it reaches the maximum the available space permits.
The surface tension between generalized cells determines whether they can invade the surrounding stromal tissue. For P m = 0.1,0.2 and δ am = 0.5, the average surface tension stays positive at all times (see Fig 13). For all other combinations of P m and δ am , the average surface tension eventually decreases below zero, after which the number of clusters rapidly increases. The number of Necrotic generalized cells decreases a short time later. These observations show that, even in the absence of other factors stimulating invasiveness, a high rate of cadherin mutation enables rapid metastasis. Consequently, studies of the regulation of adhesion molecules may give researchers clues to designing therapies to slow the rate of increase of tumor invasiveness. Such therapies would be especially useful to patients undergoing any of the many treatments, which can increase the invasiveness of surviving tumor tissue. Fig 14 shows the lifetimes for different initial generalized-cell types for different values of P m and δ am . As we expect, Necrotic generalized cells have short lifetimes and total travel distances, so the period a generalized cell spends in necrosis has a small effect on its total lifetime and travel distance. Quiescent non-stem generalized cells (QC) clearly have a lower probability of dying by senescence than proliferating non-stem generalized cells, since they do not divide, but, since quiescent generalized cells are closer to glucose-depleted regions than proliferating generalized cells, we might expect them also to have a higher probability of entering a glucosedepleted region and becoming necrotic, shortening their lifetime. However, in all cases, the proliferating non-stem-like generalized cells have a shorter lifetime than the quiescent nonstem-like generalized cells, indicating that necrosis is more important than starvation in determining their lifetime. As a result, initially quiescent non-stem-like generalized cells travel a longer total distance than initially proliferating non-stem-like generalized cells (Fig 15). Since stem-like generalized cells do not die from senescence, but are otherwise identical in properties to non-stem-like generalized cells, we would expect them to live longer than non-stem-like generalized cells, and, indeed, the typical lifetime of stem-like generalized cells is several-fold greater than that of non-stem-like generalized cells. As a result, initially proliferating stem-like      Average total distance traveled by generalized cells, by initial generalized-cell type. In each bar, the black-red boundary is the minimum total distance, the red-blue boundary is the median total distance and the top of the blue portion is the maximum total distance, all averaged over all cells of the specified initial generalized-cell type and over all replicas for a given P m and δ am .

Average Lifetime and Total Travel Distance of Generalized Cells
doi:10.1371/journal.pone.0127972.g015 generalized cells travel a longer total distance than proliferating non-stem-like generalized cells and quiescent stem-like generalized cells travel a longer total distance than quiescent nonstem-like generalized cells (Fig 15). The relationship between proliferating and quiescent stemlike generalized cells is more complex. Since these cells generalized do not experience senescence we would expect that proliferating stem-like generalized cells would have longer lifetimes than quiescent stem-like generalized cells. We see this relationship only for P m = 0.2, δ am = 0.5, P m = 0.3, δ am = 0.5 and P m = 0.3 and δ am = 1.0. In addition, for stem-like generalized cells, the relative lifetimes do not strictly correspond to the total travel distances, with the longer-lived type having a shorter total travel distance for P m = 0.1, δ am = 0.5, P m = 0.2, δ am = 0.5, P m = 0.3, δ am = 0.5.

Conclusion
As in clinical and experimental cancer progression, simulation replicas of our model with the same parameters can produce a range of outcomes. However, for all replicas in the parameter range we studied, the qualitative evolution of adhesivity and initiation of metastasis agree with experimental observations of cell-behavior evolution in tumors, with single generalized cells or clusters of generalized cells developing high motility and detaching from the primary tumor [19], showing that the evolutionary trend and its consequences are robust. Together, these results suggest that weakening cell-cell adhesion and strengthening cell-stromal-tissue adhesion are primary enablers of metastasis.

Future Work
Our simple model provides an easy-to-extend framework for exploring additional determinants of tumor evolution, some of which are difficult or impossible to measure or control in vitro and in vivo. The model enables tracking of cell-scale population heterogeneity and dynamics, cell-lineages, fitness and selection pressures. Most importantly, it also allows prediction of population-level drift (rates of evolution) of cell behaviors as a function of a limited number of experimentally-accessible parameters. These predicted population-level drifts can identify the factors most likely to promote or retard cancer progression and thus optimize cancer therapies. Thus extensions of our model can provide deeper insight into what drives tumor evolution, especially when we couple signaling and metabolic network models with cell-and tissuebehavior models. This coupling bridges the gap between molecular-scale observations and tissue-scale tumor progression. e.g., in real tumors, changes in the adhesive properties of cells result from both heritable changes in the form or number of cell adhesion molecules and from cells' response to hypoxia in tumors [28]. We are conducting simulations to explore the differences in tissue-level phenotype of these two mechanisms, alone and in combination. We are also simulating the effect on tumor phenotypes of variations in the fraction of stem-like progeny of stem-like-cell divisions, which prior computational studies have suggested promote increased agressiveness in surviving tumor tissue after therapy [21]. Adding a model of ECM alignment due to tension forces created by the tumor cells would be another natural extension of our model, since in in vitro experiments, cells follow aligned ECM fibers when they leave the primary tumor [19].

Appendix: Simulation Implementation
This appendix presents the CC3D simulation implementation of the biomodels from the main text. The listings are in CC3DML and Python.

Generalized-Cell Type Specification
The simulation defines six generalized-cell types (Medium, proliferating non-stem (PCancer), quiescent non-stem (QCancer), Necrotic, proliferating stem (PStem) and quiescent stem (QStem) generalized cells, which we abbreviate as M, PC, QS, N, PS, QS in the text) (see subsection "Cells" under "Biological Components"). Table 2 defines these generalized cell types in CC3DML: Table 3 defines the generalized cells' type-dependent initial molecular densities of cadherin and integrin adhesion molecules: Table 2. Generalized-cell type definitions in CC3DML.

Glucose Transport
Since Glucose diffusion is fast compared to the rate of cell reorganization, Glucose reaches its steady-state distribution. Consequently we set the left-hand side of Eq (15) to zero and use the following CC3DML syntax for the CC3D steady-state diffusion solver:

Generalized Cell Type Transition Algorithm
We specify the algorithm to implement generalized-cell type changes as follows:

Generalized-Cell Growth and Death Algorithm
The generalized-cell growth and death algorithm can be conveniently expressed in Python code as follows: After each MCS, we iterate over all generalized cells in the simulation and shrink Necrotic generalized cells by a constant amount (ip.decvol) provided that their volume is greater than their current target volume (cell.targetVolume). PCancer and PStem generalized cells grow according to Eq (21) where ip.PGrThr0 and ip.SGrThr0 store values of G thresh from Eq (21) for PCancer and PStem generalized cells respectively and ip.incvol stores the value of k from Eq (21). ip.ktgs stores the value for q s.v (see Eq (23)).

Mitosis, Aging and Mutation Algorithms
ip.volmaxmit stores the value of the doubling volume for PCancer and QCancer generalized cells whereas ip.Svolmaxmit stores the doubling volume for PStem and QStem generalized cells. The first for loop builds a list of generalized cells to divide and division takes place in the second for loop. In our model quiescent generalized cells do not grow however, a growing proliferating generalized cell with a volume close to doubling volume can become quiescent. In this special situation quiescent generalized cell can divide.

Helper CompuCell3D modules, initial conditions and basic GGH parameters
Here we present the CC3DML modules which track centers of mass and monitor sets of pixels belonging to each individual generalized cell: As an initial condition, we single 3x3 voxel quiescent stem cell located in the middle of the lattice: Table 11. Python implementation of updateAttributes function (continuation of  Table 12. CC3DML instantiation of center of mass and pixel tracker modules.

Simulation parameters
For completeness, Table 1 lists the parameters that generated Fig 2.