A computational modeling approach for predicting multicell spheroid patterns based on signaling-induced differential adhesion

Physiological and pathological processes including embryogenesis and tumorigenesis rely on the ability of individual cells to work collectively to form multicell patterns. In these heterogeneous multicell systems, cell-cell signaling induces differential adhesion between cells that leads to tissue-level patterning. However, the sensitivity of pattern formation to changes in the strengths of signaling or cell adhesion processes is not well understood. Prior work has explored these issues using synthetically engineered heterogeneous multicell spheroid systems, in which cell subpopulations engage in bidirectional intercellular signaling to regulate the expression of different cadherins. While engineered cell systems provide excellent experimental tools to observe pattern formation in cell populations, computational models of these systems may be leveraged to explore more systematically how specific combinations of signaling and adhesion parameters can drive the emergence of unique patterns. We developed and validated two- and three-dimensional agent-based models (ABMs) of spheroid patterning for previously described cells engineered with a bidirectional signaling circuit that regulates N- and P-cadherin expression. Systematic exploration of model predictions, some of which were experimentally validated, revealed how cell seeding parameters, the order of signaling events, probabilities of induced cadherin expression, and homotypic adhesion strengths affect pattern formation. Unsupervised clustering was also used to map combinations of signaling and adhesion parameters to these unique spheroid patterns predicted by the ABM. Finally, we demonstrated how the model may be deployed to design new synthetic cell signaling circuits based on a desired final multicell pattern.


Introduction
Multicell patterning processes in embryonic development and tumorigenesis are regulated by dynamic cell-cell signaling and differential adhesion between cell subpopulations [1,2]. Steinberg and colleagues [3] previously proposed the differential adhesion hypothesis to explain how patterns emerge at the multicell level in heterogenous mixtures of cells. Specifically, they proposed that cell subpopulations self-organize to minimize the adhesive free energy between cells such that more adhesive cell subpopulations are surrounded by less adhesive ones [3][4][5]. The differential adhesion hypothesis has been validated experimentally [3][4][5][6][7] and incorporated into computational models of cell sorting and morphogenesis [8][9][10][11]. However, this conceptual model does not account for how dynamic cell-cell signaling processes, which regulate adhesion protein expression, may yield emergent patterns that differ from those predicted by purely equilibrium considerations (i.e., minimization of adhesive free energy).
The ability of a cell to adhere to another cell may be time-and space-dependent based on cell-cell signaling that alters expression of adhesion proteins. In embryonic development, for example, cell-cell interactions initiate transcriptional processes that regulate expression of specific adhesion proteins, leading to the segregation of cell subpopulations into organ-specific tissue layers [12][13][14][15][16][17][18][19]. Similarly, in many cancers, signaling cascades cause individual cells to downregulate expression of the cell adhesion protein E-cadherin, leading to epithelial-mesenchymal transition (EMT) [20][21][22][23][24]. EMT creates the opportunity for differential adhesion between epithelial cells and the mesenchymal counterparts they can generate, impacting the spatial configuration of these subpopulations throughout the tumor [25][26][27][28]. Emergent cell patterning within tumors can impact tumor growth, migration, and response to treatment [20,[28][29][30][31][32][33][34]. Understanding the mechanisms by which multicell patterning occurs can inform drug targets for various congenital diseases and other pathologies that occur throughout adulthood.
Synthetic biology approaches have been used to impart self-organizing properties to multicellular systems [2,[35][36][37][38][39][40][41]. These systems provide an ideal experimental setting to explore pattern formation resulting from dynamic cell-cell signaling because the cell interactions are clearly defined and programmable. A recent example of interest is the bidirectional signaling system engineered by Toda et al. [2], in which sender cells drive expression of a particular cadherin in receiver cells that then reciprocally drive expression of a different cadherin in the initial sender cells (Fig 1A). The authors observed that the relative homotypic and heterotypic cadherin interaction strengths directly impacted observed spheroid patterns. When the network was engineered with N-cadherin (Ncad) and P-cadherin (Pcad), which both have high Specifically, high-Ecad receiver cells formed a core encapsulated by a shell of low-Ecad sender cells [2]. In both systems, the number of poles and core/shell spheroids formed varied across trials with the same initial conditions [2], but the factors that led to this variable outcome were not systematically investigated. These results open several questions about how combinations of cell-cell signaling and differential adhesion events affect multicell patterning. The multicell patterns that form in these systems may depend on the initial ratio of cell subpopulations, the order of signaling events, and/or the induced adhesion strengths between individual cells, but more work is needed to explore such potential dependencies and how combinations of these perturbations may affect the robustness of pattern formation.
While it would be impossible to explore the effects of perturbing different system parameters exhaustively through experiments, computational models provide an efficient method for predicting unique multicellular patterns for different parameter combinations [42][43][44][45][46]. ABMs are a particularly useful computational approach for modeling heterogeneous cell types and spheroid patterning because ABMs can represent individual cells within a population and predict how cell type-specific behaviors produce spatial patterns at the multicell level [8][9][10]28,[47][48][49][50][51][52][53][54]. Here, we developed two-dimensional (2D) and three-dimensional (3D) ABMs of the Ncad/Pcad bidirectional signaling system engineered by Toda et al. [2]. Cells were modeled as agents expressing different cadherins with a specified time delay in response to heterotypic receptor-ligand interactions and as differentially adhering to like or unlike cells based on the homotypic and heterotypic adhesion properties of the cadherins. Model parameters were tuned to fit the ABM to experimental observations we made for a 1:1 ratio of the seeded cell types, and the fitted model was experimentally validated against experiments with unequal cell seeding densities. Unsupervised machine learning was then employed to map different combinations of perturbed model parameters (e.g., characterizing cadherin induction and adhesion strengths) to specific spheroid patterns predicted by the ABM, including the previously described core/pole and core/shell patterns and others resembling a soccer ball, bull's eye, or stripes. The ABM was also used to design synthetic cell-cell signaling circuits that encode specified multicellular patterns, demonstrating a potential practical application of this modeling approach.

2D and 3D ABMs predict the impacts of dynamically activated differential adhesion on spheroid patterning
2D and 3D ABMs were developed based on the synthetic Pcad/Ncad circuit engineered by Toda et al. [2] (Fig 1A). The models were seeded with "sender" and "receiver" cells. Sender cells express blue fluorescent protein (BFP+), CD19 ligand, and green fluorescent protein (GFP) receptor. Receiver cells express the CD19 receptor (-/-). In response to CD19 receptorligand binding between the two cell types, receiver -/-cells express Ncad and GFP to become GFP/Ncad cells. GFP/Ncad cells then behave as sender cells and reciprocally promote Pcad and mCherry expression in BFP+ cells (that then behave as receiver cells) via GFP ligandreceptor binding, forming mCherry/Pcad cells. Hence, in this synthetically engineered multicell system, both cell types can function as either sender or receiver cells based on receptorligand binding interactions with adjacent cells, and the terms "sender" and "receiver" are used to clarify a cell's role during specific steps in bidirectional signaling. Final spheroid patterns were quantified using metrics such as total fraction of spheroid area occupied by each cell type, average areas of homotypic cell clusters, and average distances between homotypic cell clusters and the spheroid center ( Fig 1B, Table 1). Homotypic cell "clusters" were defined as three or more cells of the same cell type adjoined by cell-cell junctions on adjacent grid spaces in the cardinal directions. The "core" cell type of a spheroid was defined as the cell type of the homotypic cluster with minimum distance to all other homotypic clusters in the spheroid.
Interactions among cells in the ABM were encoded as six rules incorporating nine model parameters that execute on each timestep of the simulation, where one timestep represents 30 min ( Fig 1C, Table 2). Cadherin phenotypes were induced in the model based on induction constants (c i ) that control the probability of cadherin induction in a receiver cell based on the number of neighboring sender cells. Over the time course of the model, neighboring cadherinexpressing cells form junctions with each other based on homotypic (P i ) and heterotypic (P a,b ) adhesion strengths encoded as probabilities of adhesion in the model. To recapitulate the Table 1. Metrics used to quantify multicell patterns in 2D ABMs.

Metric Definition
Total fractional area Number of cells of a specific color (blue, green, or red) as a fraction of the total number of cells of that color in the spheroid.
Average cluster fractional area Average of the areas of all clusters of cells of the same color as a fraction of the total spheroid area. Note that a "cluster" is defined as three or more neighboring cells in the cardinal directions of the same color.
Average fractional distance from cluster centroids to spheroid center Average of the distances between the centroids of individual clusters of a specific color to the center of the spheroid as a fraction of the radius of the spheroid. Ncad/Pcad spheroids observed in Toda et al. [2], we assumed that both cadherin cell types have an equal induction constant (c i ) and equal homotypic adhesion strength (P i ). In later sections, we explore how spheroid patterning is impacted by assigning unequal values of c i and P i to the GFP/Ncad phenotype (c a , P a ) and to the mCherry/Pcad phenotype (c b , P b ). All simulations that were compared to in vitro experiments were run for 100 timesteps, where one timestep represents 30 min, because in vitro spheroids were observed for 50 hr. Additionally, because published data in Toda et al. [2] indicated that cells roughly doubled in number over the course of 50 hr, ABMs were seeded with twice the number of cell agents as the corresponding in vitro experiment (i.e., an in vitro experiment seeded with 200 cells corresponded to an ABM seeded with 400 cell agents). All other simulations were run based on pilot simulations to identify runtimes beyond which no changes were observed in the spheroid pattern, and the specific model runtime for each set of simulations is denoted in corresponding figure captions.

Number of clusters
Published data and an evolutionary algorithm were used to parameterize the ABM Parameters in the 2D ABM were either estimated based on published experimental data in Toda et al. [2] or computationally tuned with an evolutionary algorithm (EA) ( Table 2). The time delay for cadherin phenotypes to be induced (a for GFP/Ncad and b for mCherry/Pcad) and the times to express each cadherin (Δa, Δb) were manually adjusted in the 2D ABM according to published data in Toda et al. [2]. Specifically, these parameters were set such that in simulations of experiments seeded with 100 BFP+ and 100 -/-cells, GFP/Ncad cells formed after 13 hr and mCherry/Pcad cells developed adjacent to GFP/Ncad cells after 21 hr, the respective times reported by Toda et al. [2]. Because the remaining five parameters (c i , P i , P a,b , μ, t) could not be determined from experimental data, these parameters were first manually approximated in the 2D ABM as values yielding outputs qualitatively similar to published images of spheroids from the Ncad/Pcad bidirectional signaling system in Toda et al. [2] (i.e., characterized by a GFP/Ncad central core flanked by mCherry/Pcad clusters and individual BFP+ cells). Then, a univariate parameter sensitivity analysis was implemented for the 2D ABM to determine which of these parameters had the largest relative impact on spheroid patterning. Each of the five parameters was increased or decreased by 10% from its base value, and the effects of these perturbations on the numbers and average areas of green and red clusters were determined (Fig 2A-2D). This analysis identified the cadherin induction constant (c i ) and the homotypic adhesion strength of both cadherin cell types (P i ) as the two parameters whose variation produced the largest changes in all four measured outputs. In this univariate sensitivity analysis, we assumed both cadherin cell types have an equal induction constant (c i ) and equal homotypic adhesion strength (P i ). Values for c i and P i were determined using an evolutionary algorithm (EA) that optimized the 2D ABM prediction of patterns that were experimentally observed in spheroids seeded with 200 cells at a 1:1 ratio of BFP+ and -/-cells. A detailed description of the EA implementation is provided in Materials and Methods. A plot of 30 EA runs over 100 generations demonstrates that the EA accomplished a substantial reduction in error (difference between model prediction and data) to identify values for c i and P i (Fig 2E). The parameter values optimized for the 2D ABM were also used in the 3D ABM.

ABMs predict spheroid patterns across a range of different initial conditions that match in vitro results
2D and 3D ABMs were used to predict spheroid patterns when the BFP+: -/-cell seeding ratio was varied (1:9, 1:4, 1:1, 4:1, 9:1) for a fixed total number of seeded cells (60 or 200 cells) Parameter sensitivity analysis and optimization improves model fit to in vitro data. Parameters that could not be estimated based on experimental data (c i , P i , P a,b , μ, t) were perturbed by a factor 0.9 and 1.1, and the sensitivity coefficient was quantified for the following outputs: (A) number of green clusters, (B) average fractional area of green clusters, (C) number of red clusters, and (D) average fractional area of red clusters. Sensitivity coefficients are reported as the average of 100 model runs, each run for 100 timesteps. The x-axis is ordered from greatest to least absolute-value sensitivity coefficient for each panel. (E) An evolutionary algorithm (EA) was used to tune parameters c i and P i by minimizing the error between model outputs and experimental images of 200-cell spheroids seeded with a 1:1 ratio of BFP+ and -/-cells. The EA was run 30 times for a population of 20 model combinations over 100 generations. Within each generation, the error for each parameter combination was averaged over 10 runs for 100 timesteps. Sample model graphical outputs generated from the parameter combinations with maximum or minimum error are shown.
( Fig 3A). For the 1:1 ratio with 200 cells, the model predicted a GFP/Ncad core surrounded by mCherry/Pcad poles and individual BFP+ cells. For the 1:1 ratio with 60 cells, the model predicted two neighboring poles, one of GFP/Ncad cells and one of mCherry/Pcad cells, with surrounding individual BFP+ cells. In spheroids with a minority of seeded BFP+ cells, fewer mCherry/Pcad cells were generated, leading to fewer and smaller mCherry poles. Conversely, in spheroids with a majority of seeded BFP+ cells, fewer GFP/Ncad cells were generated, leading to smaller GFP/Ncad clusters that were typically located farther from the spheroid center. Model predictions aligned well with observations from in vitro experiments we performed independently using the cell lines engineered by Toda et al. [2]. The average fractional blue, red, and green areas of the in vitro spheroids and the model predictions were statistically similar to each other for the 1:9, 1:1, and 9:1 ratios (Fig 3B-3D). Moreover, 2D and 3D model In vitro cross-sectional images are microscopy images of a single focal plane, without deconvolution, in the approximate center of the spheroid. Comparisons of (B) blue, (C) green, and (D) red fractional areas or compositions between in vitro, 2D ABM, and 3D ABM images for varying initial BFP+:-/-seeding ratios (1:9, 1:1, 9:1) and 200 cells. For 3D ABM outputs, a fractional composition, rather than fractional area, was computed as the number of cell agents of a given color channel divided by the total number of cell agents in the spheroid. ABM-predicted fractional areas or compositions for the 2D and 3D models, respectively, were calculated as an average of 100 model runs, each run for 100 timesteps, for each experimental condition. Fractional areas from in vitro images were calculated from a sample of 9, 10, and 10 images for the 1:9, 1:1, and 9:1 ratios, respectively. A one-way Kruskal-Wallis test followed by multiple-comparisons testing was used to compare fractional areas or compositions between all in vitro, 2D, and 3D ratios; �� indicates p < 0.001. Error bars indicate within-group standard deviation. predictions were statistically similar, suggesting that the 2D model sufficiently represented the synthetic cell signaling system.
Comparison of fractional areas of induced phenotypes among the 1:9, 1:1, and 9:1 ratios indicates that a small number of sender cells is required to induce cadherin phenotypes in the majority of receiver cells. For example, despite there being fewer BFP+ cells in the 1:9 ratio, the total GFP/Ncad fractional area was not significantly different from that of the 1:1 ratio ( Fig  3C). Similarly, despite the total fractional area of GFP/Ncad cells being significantly lower in the 9:1 ratio than in the 1:1 ratio (Fig 3C), the total mCherry/Pcad fractional area was not significantly different from the 1:1 ratio (Fig 3D). These results suggest that sender BFP+ cells in the 1:9 ratio and sender GFP/Ncad cells in the 9:1 ratio disperse sufficiently over the model's time course such that sender BFP+ and sender GFP/Ncad cells experience enough interactions with receiver cells that the total fractional area of phenotypes induced by these sender cells is similar to that in the 1:1 ratio. This hypothesis was tested by quantifying the positions and movements of GFP/Ncad cells over time in the 2D ABM for 1:1, 4:1, and 9:1 BFP+: -/-seeding ratios. For ratios with BFP+ cells in the majority (4:1 and 9:1), GFP/Ncad clusters were significantly farther from the spheroid center than for the 1:1 ratio and, thus, more dispersed throughout the spheroid ( Fig 4A). Additionally, after the GFP/Ncad phenotype had been induced in the majority of -/-cells (~13 hr or 26 timesteps in the model), the fraction of GFP/ Ncad cells that moved per timestep in the 9:1 ratio was larger than that in the 4:1 and 1:1 ratios ( Fig 4B). This result suggests that simulated cell clusters composed of fewer cadherin-expressing cells were more mobile due to fewer movement-restricting intercellular interactions. Increased mobility permits more interactions of GFP/Ncad cells with BFP+ cells, ultimately producing as many mCherry/Pcad cells for the 9:1 condition as for the 1:1 condition.
The model also predicted that the total number of seeded cells affects final spheroid patterns ( Fig 5A). As the total number of cells in the spheroid increased, the number of green clusters formed also increased ( Fig 5B). Simultaneously, the average fractional area of green clusters in the spheroid decreased, and the average fractional distance of green clusters to the center of the spheroid increased (Fig 5C and 5D). These trends suggest that as the overall cell number increases, the central GFP/Ncad core partitions into smaller pieces that become interspersed throughout the spheroid. This result is further explored in

Dynamic cell-cell signaling and homotypic adhesion strength impact pattern formation
To test the hypothesis that the dynamic expression of different cadherin isoforms impacts spheroid patterning, three different signaling circuits were explored ( Fig 6A). In circuit 1, the cadherins were constitutively expressed, as opposed to expression via induction by juxtracrine signaling. In circuit 2, only unidirectional signaling was present such that seeded mCherry/ Dynamic cell-cell signaling has differential effects on core/pole and core/shell spheroids. (A) A schematic of three different ABM signaling circuits and three adhesion profiles is shown. Circuit 1 contains constitutive cadherin expression and no juxtracrine signaling. Circuit 2 includes unidirectional signaling. Circuit 3 includes bidirectional signaling (as described in Fig 1A). In Circuit 3, induction constants for GFP/Ncad and mCherry/Pcad phenotypes are parametrized separately (c a and c b , respectively). In each adhesion profile, the homotypic adhesion strengths of GFP/Ncad and mCherry/Pcad phenotypes are also parametrized separately (P a and P b , respectively). All depicted adhesion profiles have a homotypic adhesion strength of 1.0. Adhesion profile A gives rise to core/pole spheroids, whereas adhesion profiles B and C give rise to core/shell spheroids. The combination of circuit type and adhesion profile leads to nine rulesets (1A, 1B, 1C, 2A, 2B, 2C, 3A, 3B, and 3C). (B) The model was run 100 times, each for 100 timesteps, for each of the nine rulesets. Each ruleset was seeded with 200 cells at a 1:1 ratio of the seeding cell types. The fraction of trials with either GFP/Ncad or mCherry/Pcad spheroid core was compared between rulesets using a Chi-Squared test with df = 8; significant differences between observed and expected fraction (0.50) of core cell types were found across all nine rulesets with p < 0.001. The expected fraction was set to 0.5 because the null hypothesis states that signaling has no effect on which cell type forms the spheroid core. � indicates groups with residuals > 1. Pcad sender cells induced the GFP/Ncad phenotype in seeded -/-receiver cells. In circuit 3, the full bidirectional signaling circuit was modeled, with seeded BFP+ sender cells inducing GFP/Ncad expression in seeded -/-receiver cells and newly formed GFP/Ncad cells inducing mCherry/Pcad expression in BFP+ cells. In this circuit, the induction constants for GFP/Ncad and mCherry/Pcad phenotypes were parametrized separately (c a and c b , respectively, instead of c i ). In the simulations discussed for circuit 3 in this section, c a = c b = c i , but c a and c b are varied in later Results sections. For each circuit, three homotypic adhesion strength profiles were tested, in which the homotypic adhesion strengths of GFP/Ncad and mCherry/Pcad phenotypes were parametrized separately (P a and P b , respectively, instead of P i ). In profile A, GFP/Ncad and mCherry/Pcad cell types exhibited maximal homotypic adhesion and minimal heterotypic adhesion (P a = P b = 1.0, P a,b = 0.0), giving rise to core/pole spheroids, in which one adhesive cell type assembled in a core and the other assembled in smaller peripheral clusters. In profile B, GFP/Ncad cells exhibited maximal homotypic adhesion, mCherry/Pcad cells exhibited minimal homotypic adhesion, and heterotypic adhesion was maximal (P a = 1.0, P b = 0.0, P a,b = 1.0). In profile C, GFP/Ncad cells exhibited minimal homotypic adhesion, mCherry/Pcad cells exhibited maximal homotypic adhesion, and heterotypic adhesion was maximal (P a = 0.0, P b = 1.0, P a,b = 1.0). Both profiles B and C gave rise to core/shell spheroids, in which the more adhesive cell type formed a core surrounded by a complete shell of the less adhesive cells (Fig 6A). Combinations of each signaling circuit and adhesion profile yielded 9 rulesets. For all rulesets, simulations were seeded with a 1:1 ratio of initial cell types for a fixed total of 200 cells.
The effects of uni-and bi-directional signaling between cells were explored by comparing the core cell types in spheroids for circuits 1-3 ( Fig 6B). Adhesion profile A tended to produce core/pole spheroids for which the core cell type varied based on the signaling circuit. Ruleset 1A gave rise to either GFP/Ncad or mCherry/Pcad cores, whereas in Rulesets 2A and 3A the adhesive phenotype that was first present in the circuit formed the spheroid core. Thus, the adhesive cell type that is present longer in the spheroid exhibited a higher likelihood of condensing into a central core. The same trends for core cell type were not observed for adhesion profiles B and C, which tended to produce core/shell patterns. In core/shell spheroids, the more adhesive phenotype formed the spheroid core in > 90% of model simulations regardless of the signaling circuit ( Fig 6B). Thus, for profiles B and C, the signaling dynamics produced outcomes that were consistent with the differential adhesion hypothesis in 200-cell spheroids seeded with a 1:1 ratio.

Varying cell seeding in Ruleset 3B leads to patterns that violate the differential adhesion hypothesis
To test whether it is possible to generate spheroids that violate the differential adhesion hypothesis, the 2D ABM was used to predict patterns of cell subpopulations when the seeding ratios and total cell numbers in Ruleset 3B were varied. Ruleset 3B was chosen because it is based on the fundamental premise of the differential adhesion hypothesis, in which two phenotypes with differing homotypic adhesion strengths self-organize. As the relative proportion of seeded -/cells and spheroid radius increased, the likelihood that adhesive cells would enclose non-adhesive cells increased (Fig 7A). In the 1:1 ratio group, regardless of spheroid radius, a small number of simulated spheroids (~12-20%) contained one group of non-adhesive red cells enveloped by adhesive green cells, while virtually no spheroids contained non-adhesive -/-, or gray, cells enveloped by adhesive green cells. For the 1:4 and 1:9 ratios, the fraction of spheroids with groups of non-adhesive red cells enclosed by adhesive green cells decreased significantly, and the overall fraction of spheroids with groups of non-adhesive gray cells enclosed by adhesive green cells increased significantly, with the latter effect becoming more apparent as the spheroid radius increased. This trend is partially explained by the presence of fewer seeded BFP+ cells, which leads to fewer -/-cells transitioning to the GFP/Ncad phenotype and mCherry/Pcad phenotypes. Due to the increased tendency of smaller green clusters to spontaneously arise at the periphery of the spheroid, non-adhesive cells surrounding each individual green cluster appeared to be trapped in the center of the spheroid, giving rise to situations where non-adhesive cells appeared enclosed by adhesive cells. This interpretation is reinforced by the fact that when the induction constant of the GFP/Ncad phenotype (c a ) was increased, a single, larger green core formed, and the spheroid patterns resembled those observed in the 1:1 group (Fig 7B  and 7C). When the induction constant of the GFP/Ncad phenotype (c a ) was larger, fewer BFP + sender cells were required to induce the GFP/Ncad phenotype in -/-receiver cells, and a larger central GFP/Ncad core formed more quickly in the spheroid, instead of the slower formation of GFP/Ncad clusters and their subsequent enclosure of non-adhesive cells.

Homotypic adhesion strengths impact heterogeneity in spheroid patterns
Systematically covarying the homotypic adhesion strengths of GFP/Ncad and mCherry/Pcad cells (P a and P b , respectively) and setting P a,b = 0 in the 2D ABM for signaling circuit 3 (bidirectional signaling) for spheroids seeded with~350 cells produced a range of different spheroid patterns. We chose to increase the spheroid size for these models because distinct patterns emerged more clearly in larger spheroids. To display the results of these simulations, heatmaps were created to represent the numbers and average areas of green or red clusters in the spheroids (Fig 8A). Recall that the extremes of these maps describe core/shell (P a = 1.0 and P b = 0.0 or P a = 0.0 and P b = 1.0) and core/pole patterns (P a = P b = 1.0). With P a > P b , there were typically fewer and larger green clusters than red clusters, but the average number and area of green clusters were not as low as those observed in the core/pole pattern with a red core. Conversely, with P b > P a , there were fewer and larger red clusters than green clusters, but the average number and area of red clusters were not as low as those observed in the core/pole pattern with a green core. Deviations from core/pole and core/shell patterns could occur when P a and P b have non-extreme values.
To identify the different spheroid patterns formed, we performed k-means clustering, an unsupervised machine learning technique, to identify six groups within the data (Fig 8B). Each of the six groups corresponded to a distinct spheroid pattern: a core/pole pattern with a green core and three red poles, a core/shell pattern with green core and red shell, a core/shell pattern with red core and green shell, a striped pattern of alternating green and red linear clusters, a soccer ball pattern of circular green and red clusters, and a bull's eye pattern with radially alternating green and red shells. As an alternative way to visualize the k-means clustering results, the data were projected in two dimensions using t-distributed stochastic neighbor embedding (tSNE), with pattern identities retained from k-means clustering (Fig 8C). These results also visually support the binning of spheroids into more than just two categories.
Each combination of homotypic adhesion strengths led to simulation outputs that exhibited varying frequencies of the six patterns identified by the clustering algorithm described above. When P a = P b = 0.0, a soccer ball pattern was generated more than 90% of the time. In alignment with the differential adhesion hypothesis, when P b > P a and P a = 0.0, patterns with a red core were generated almost 100% of the time. Conversely, when P a > P b and P a = 0.0, patterns with a green core were generated almost 100% of the time. The distribution of core/shell, core/ pole, bull's eye, and striped patterns generated was relatively similar for all conditions where the homotypic adhesion strengths of both cell types are equal (e.g., , the most frequent patterns involved the more adhesive cells being surrounded by the less adhesive cells. With P a = 0.75 and P b = 0.25, the most frequent patterns were the core/pole and core/shell spheroids with a green core and bull's eye spheroids with a green core. With P b = 0.75 and P a = 0.25, the most frequent patterns were the core/pole spheroids with a red core and the bull's eye spheroids with a red core. This trend suggests that when differential adhesion is present, but the adhesive strength of the second induced phenotype is not maximal, hybrids of the core/shell pattern arise.

Cadherin induction constants impact core/pole and core/shell patterns
Varying the induction constants of GFP/Ncad and mCherry/Pcad cells (c a and c b , respectively) was also predicted to impact the final numbers of green and red clusters and their average areas (Fig 9A). For Ruleset 3A (bidirectional signaling core/pole ruleset) and when c a = 0.05 and c b = 10, one green cluster with a relatively small area was formed, and one red cluster with a larger area was formed. When c b was only slightly larger than c a , multiple small green clusters were formed, and one or two red clusters were formed, suggesting the possibility that different patterns may occur for different parameter combinations. Seven groups were identified by kmeans clustering (Fig 9B). The first two patterns included a green core and three or four red poles. The third pattern comprised spheroids with a green core and red shell. The fourth pattern exhibited an inversion of the first two, with a red core and two or more green poles. The fifth pattern included one red core and one green pole. The sixth included a red core and single green and gray peripheral cells. Finally, the seventh pattern included striped or soccer ball patterns. As in Fig 9A, when c a << c b , a red core formed with one or multiple green poles. However, when c a was sufficiently high, core/pole structures with a green core and multiple red poles formed more frequently. To produce an alternative visualization, tSNE was used to reduce the dimensionality of the system (Fig 9C). tSNE results further supported the existence of multiple subtypes of spheroids. Modifying the cadherin induction constants affects the overall spheroid pattern because various combinations of c a and c b directly impact the final ratio of BFP+, -/-, GFP/Ncad, and mCherry/Pcad cells in the spheroid (S1A Fig). The parameters c a and c b were also varied for Ruleset 3B (bidirectional signaling core/shell ruleset). The ratio of the shell's thickness to the core's radius progressively increased as c b became larger than c a (S1B and S1C Fig), which is consistent with more mCherry/Pcad cells at the final time point. Because this parameter describes sensitivity of a cell to its neighbors, and the number of neighbors in 3D is larger than in 2D, results from these 2D model calculations may not translate directly to a 3D model.

Adjusting ABM parameters enables customization of multicell patterns in spheroids
To demonstrate the utility of the model for developing design principles for spheroid engineering, we used the 2D ABM to design multicellular spheroids with desired patterns. To control the overall spheroid configuration, the homotypic adhesion strengths of induced cell types should be controlled (Fig 10A). The ABM predicted that, to produce a core/pole pattern, both adhesion molecules should have homotypic adhesion strengths > 0. 75. In order to produce a The average from 100 model runs of the following metrics were plotted for each parameter combination: number of green clusters, number of red clusters, average fractional area of green clusters, and average area of red clusters. (B) Results from this parameter space exploration were clustered using a k-means algorithm into seven groups, with representative ABM outputs shown for each group and color coded as the key for this graph. (C) t-distributed stochastic neighbor embedding (tSNE) algorithm was used to project the data into two dimensions with colors for spheroid cluster types retained from the k-means clustering analysis in panel (B). The tSNE algorithm was run with perplexity = 100.
The schematics in Fig 10B demonstrate how bivariate parameter explorations of the 2D ABM can guide the forward design of a user-specified spheroid with an mCherry/Pcad core, one GFP/Ncad pole, and surrounding -/-cells. Based on the results of the model, the user should set P a and P b to be maximal and equal to achieve a core/pole configuration and engineer the system such that c a << c b to achieve an mCherry core and one GFP/Ncad pole.

Discussion
We created an ABM of bidirectional cell-cell interactions that drive patterning of multicellular spheroids inspired by the signaling circuits engineered by Toda et al. [2]. The ABM provides ABMs may be used to design synthetic signaling circuits encoding customizable core/pole, core/shell, soccer ball, striped, and bull's eye spheroid patterns. (A) For a desired spheroid pattern (e.g., core/pole, core/shell, soccer ball, striped, or bull's eye), homotypic and heterotypic adhesion strengths of cell types can be modulated. Features of the core/pole and core/shell configurations, such as the number of poles and ratio of shell thickness to core radius, may be further controlled by modifying the induction constants of cadherin phenotypes in the system. (B) An outline of the process to create a structure with mCherry/Pcad core, GFP/Ncad pole, and surrounding single cells is shown.
https://doi.org/10.1371/journal.pcbi.1010701.g010 an alternative to time-consuming cell culture experimentation by predicting patterns that can arise from a multicell signaling system as a result of adjusting initial conditions, signaling interactions, and adhesion strengths alone or in combination. We used the model to explore the impact of dynamic cell-cell signaling on adhesion-driven pattern formation, and we observed the emergence of new patterns that were inconsistent with the classical differential adhesion hypothesis. Moreover, we used machine learning to map combinations of homotypic adhesion strengths to unique spheroid patterns. Additionally, we found that modulating the induction constants for the GFP/Ncad and mCherry/Pcad phenotypes controls the number of poles in core/pole patterns and the ratio of shell thickness to core radius in core/shell patterns. Our study supports the use of computational models to identify parameters that will yield customized spheroid patterns. Because these parameters have physical meaning, they may be used as design constraints for multicell, synthetic circuits. For example, homotypic and heterotypic adhesion strengths may be controlled by selecting specific cadherin isoforms [6]. Similarly, promoter regions for synthetic transgenes may be selected according to the intended phenotype induction constants [55].
Previous studies have leveraged computational models to explore cell patterning in heterogeneous systems [8][9][10][43][44][45][47][48][49]56,57]. For example, a prior Cellular Potts computational model of the differential adhesion hypothesis [8,9] demonstrated how cell subpopulations with varying levels of cohesiveness self-assemble and how environmental variables such as temperature, shear force, and pressure affect these self-assembly processes. These types of models have been expanded to develop more complex systems such as CompuCell3D, a multi-scale modeling framework that recapitulates morphogenesis in a wide array of tissues and species and incorporates cell differentiation in response to morphogens [10]. Taylor et al. [11] developed an ABM to explore boundary formation and border sharpening between cell subtypes in the Eph-ephrin signaling system, which is critical in embryogenesis. Their model predicts a range of configurations that arise between Eph-and ephrin-expressing cells, and as a function of varying the homotypic and heterotypic adhesion strengths between cell types. Although these published models have thoroughly explored pattern formation in heterogeneous systems with static cell phenotypes, they have not considered how patterns may be affected by the modulation of differential adhesion via known, programmable cell-cell signaling interactions. Our model differs from prior efforts because it simulates phenotypic transitions controlled by a synthetic gene circuit, which allowed us to explore how the phenotype induction probabilities, timing, and sequencing of dynamic cell-cell differential adhesion drives pattern formation.
Our study also contributes to the literature on robustness of pattern formation in multicell systems. Here, we define robustness as lack of substantial change in spheroid patterns for specified changes in model parameters, or for no change in parameters at all, consistent with prior definitions [58]. Our model identified regions in parameter space where patterns (or distributions of patterns) were predicted to be robustly formed and other regions where spheroid patterns were sensitive to changes in parameters. Prior work has modeled multicell pattern emergence using Turing models [58][59][60], which make predictions based on considerations of sharp spatial gradients of chemical morphogens that arise due to competition between reaction and diffusion [61]. Model Turing systems are sensitive to small changes in parameters, and prior studies have noted that integrating biological processes such as cell-cell signaling and differential adhesion may increase robustness of model predictions [58][59][60]. It is possible that incorporating the morphogen reaction-diffusion processes considered by Turing models in our ABM, and into the design of synthetic circuits for cell culture experiments, could increase robustness of pattern formation.
This work also contributes to recent efforts to elucidate design principles for multicell patterns using a combination of computational and synthetic approaches [62][63][64]. Depending on the desired pattern (e.g., core/shell, core/pole), different combinations of adhesion molecules or methods to induce their expression may be selected. For example, quantitative studies of homotypic and heterotypic adhesion strengths of cadherins, such as those by Duguay et al. [6] and Foty & Steinberg [7], may guide the selection of adhesion molecules for synthetic circuits.
To control details such as the number of poles or ratio of shell-to-core radius, induction constants for cadherin expression may be adjusted. Similar parameters can be found in other types of models. For example, Matsuda et al. [55] designed a synthetic experimental system to study signal propagation in heterogeneous cell populations. Their ODE-based model determined conditions that optimize signal propagation and included a "promoter coefficient" similar to our c i parameter. The synthetic feature related to this model parameter accounts for the copy number of the promoter and likelihood of epigenetic modification based on insertion region.
Aspects of the basic modeling approach utilized here will also be useful for the study of endogenous cell signaling circuits that regulate cell-cell adhesion. An application of relevance in cancer biology is epithelial-mesenchymal transition (EMT), a cell process that aberrantly occurs in numerous carcinomas (e.g. breast cancer, pancreatic ductal adenocarcinoma), occurs early in metastasis [65], and promotes resistance to chemotherapy and targeted therapeutics [23,24,66]. During EMT, epithelial-derived cancer cells lose polarity and expression of cell adhesion molecules to become more mesenchymal, motile, and invasive. This process is highly regulated by signaling and transcriptional networks that can be initiated by growth factors, cell-matrix interactions, and tumor microenvironmental factors (e.g., low oxygen tension). ABMs capable of incorporating signaling dynamics that lead to EMT and the resultant effects on differential cell-cell adhesion could be used to understand the morphogenesis of primary tumors and the processes that lead tumors of different compositions to shed metastases [28,54]. Thus, elements of our modeling framework could help expand upon prior computational models of how differential adhesion impacts tumor metastasis and cell migration [67].
There are limitations of both our computational model and the in vitro system. The computational model does not explicitly simulate the effects of cell proliferation nor the fact that induced cadherin expression could decrease after loss of contact between sender and receiver cells. The ABM also assumes that cadherin expression is uniform among cells, but realistic heterogeneity of expression within a population of even clonal cells could impact pattern formation. These issues could be addressed in future model implementations. For example, distributions of cadherin expression among cells could be quantified by flow cytometry or single-cell RNA sequencing, and this data could inform individual simulated cell behaviors in the ABM. Experimental limitations, including an inability to count the individual cells within spheroids, also limited the degree to which we could quantitatively compare model results with experimental data. Furthermore, neither our model nor the experimental spheroids contain diffusible morphogen cues, which are instructive in the patterning of natural (non-synthetic) multicell biological systems [41]. Future extensions of the ABM could incorporate diffusible morphogen cues, such as those needed for inducing cell differentiation [44,45]. An additional model limitation is that signaling dynamics were encoded as simple time delays for cadherin expression after cell contacts were made. In the future, it will be worthwhile to incorporate more realistic models of intracellular signaling, as has been done in recently published multiscale ABMs that integrate information across intracellular and intercellular scales [68][69][70][71]. The attendant need to solve differential equations describing signaling reaction kinetics in each cell within the ABM would greatly increase the number of model parameters and the computational resources needed to simulate the system. These are common issues encountered in multiscale, mechanistic biological models [42,67,72], and tradeoffs must be carefully weighed in designing such models. Encouragingly, recent studies have presented novel methods for reducing the computational burden of complex mechanistic models by using neural networks and hybrid continuum-based modeling approaches [73][74][75][76].
An emerging trend in multiscale computational modeling of biological systems is ABM integration with machine learning and other data science approaches to exploit high throughput experimental and simulated data. We demonstrated the utility of interpreting ABM results with machine learning through our use of k-means clustering, which mapped cell-specific ABM parameters (homotypic adhesion strengths and cadherin induction constants) to tissuelevel spheroid patterns. Previous studies have similarly used ABMs with machine learning methods to study and predict multicellular interactions [46,77]. Future iterations of our ABM could combine data-driven modeling and machine learning to further inform rule development [73,74], model calibration [78], and parameter exploration [77] in order to create an even more comprehensive and predictive model.

Cell lines and propagation
L929 murine adipocytes engineered with a bidirectional cell -cell signaling circuit based on the synthetic notch (synNotch) receptor system [2] were generously provided by Dr. Wendell Lim (University of California, San Francisco). Cells were maintained in DMEM (ThermoFisher) supplemented with 10% fetal bovine serum (VWR), 1 mM L-glutamine, 100 units/mL penicillin, and 100 μg/mL streptomycin. Cell lines were used within 25 passages and maintained in a 37˚C, 5% CO 2 ThermoForma i160 incubator. Spheroids were imaged 50 hr after plating using a Zeiss AxioObserver Z1 widefield microscope at 10× magnification. Six to seven representative spheroids were selected for imaging for each condition, by selecting wells without residual plastic particles (from plate manufacture) and wells in which all three cell colors were easily detected due to spheroid orientation. For each spheroid, a phase contrast image and three fluorescence images were taken: BFP (DAPI filter set), GFP (AF488 filter set), and mCherry (AF546 filter set). Zen Zeiss software was used to merge phase, BFP, GFP, and mCherry channels.

ABM design and implementation
2D and 3D on-lattice ABMs were developed to simulate collections of cells containing the synthetic signaling circuit represented in Fig 1. An ABM platform was chosen because this modeling technique allows for prediction of spheroid-level patterns based on individual cell-cell interactions. The 2D and 3D models were developed using NetLogo software [79].
The ABMs treat cells as individual agents in a spheroid. The 2D ABM simulates cells interacting in a plane through the central axis of a spheroid, whereas the 3D ABM simulates all cells in a spheroid. Each grid space in the on-lattice model represents a 10 μm × 10 μm area capable of accommodating one cell. The same cell size and one-cell-per-site assumptions are necessary limitations of on-lattice models. Each model timestep represents 30 min, such that 100 timesteps represent the 50-hr period over which spheroids were studied in vitro. To determine the impact of different timesteps, we tested a range including 15, 30, and 60 min, and we updated the total runtime and time-dependent parameters (a, b, Δa, Δb, t) accordingly (but without refitting the model). All timestep values led to fractional red and green areas being within the standard deviation of the in vitro measurements (S2A-S2C Fig). The deviation for the blue area for the 15 min timestep seen in S2A Fig is primarily due to the constraint that fractional areas must sum to one and the bump in red area seen for 15 min, which is within the error of the in vitro measurements. Based on those observations, and noting that the 30-min timestep yielded spheroids that were most qualitatively similar to in vitro results (S2D Fig), we concluded that our results were not extremely sensitive to the timestep value and selected 30 min. Unless otherwise noted, simulations were run for 100 timesteps because this value corresponds to the physical time over which spheroids were observed in vitro. Additionally, simulations compared with in vitro data were seeded with the twice the number of cell agents as the number of cells seeded experimentally. This was done to account for the effects of cell proliferation over this time period. Specifically, based on published data in Toda et al. [2] and our own observations of their cells, we assumed that cells doubled once in 50 hr. Seeded model agents represent BFP+ sender cells and -/-receiver cells, which can become mCherry/Pcad and GFP/ Ncad cells, respectively. State transitions were encoded by setting Boolean values of private variables. Intercellular interactions were encoded as six rules, characterized by ten parameters, that execute on each timestep ( Fig 1C, Table 2). Fig 1C describes the rules of the ABM. At the initial timestep, the "setup" method randomly places BFP+ and -/-cells within a user-specified radius and according to a user-specified ratio and percent saturation (percent of cell-occupied grid spaces). On every subsequent timestep of the model, cells move to the open grid space (of eight neighboring spaces in 2D and 26 neighboring spaces in 3D) closest to the center of the simulation space to represent the gravitational and adhesive forces that tend to pull them together at the bottom of a low-attachment well. If there are no open spaces available on any of the neighboring eight grid spaces surrounding a cell, that cell will not move.
Clusters of cells containing fewer than a threshold number of cells (μ) identify the cell closest to the spheroid center as a cell that moves to a randomly selected space on one of its neighboring eight grid spaces (or 26 grid spaces in 3D) that is unoccupied. If there are no open spaces surrounding the leader cell, the cluster will not move. Like cells in a contiguous cluster follow the leader as a connected unit, displacing cells in previously occupied grid spaces as needed to avoid two cells occupying the same grid-space simultaneously. Displaced cells move randomly to the positions vacated by the cluster that moved in response to a leader cell move. As the model is on-lattice, movement is discrete (i.e., between grid spaces), and each grid space is occupied by only one cell at a time. where c i is a tunable parameter that controls the probability of cadherin expression induction, and n is the number of neighbors with the activating phenotype. If the cadherin phenotype is induced, cells change colors and express Ncad or Pcad after specified times (Δa and Δb, respectively). In this baseline ruleset, both cadherin phenotypes have the same induction constant, c i .
Cadherin-expressing cells attach to other cadherin-expressing cells based on homotypic and heterotypic adhesion probabilities (P i and P a,b , respectively). These parameters dictate the probability that a cadherin-expressing cell will form a link with a neighboring like or unlike cadherin-expressing cell within a single simulation timestep, or tick. Additionally, when a cell expressing one type of cadherin is wedged between two cells expressing another type of cadherin, the cells switch places based on the homotypic adhesion probability such that the cells with the two like cadherins occupy adjacent grid spaces. Similarly, if a cell expressing one type of cadherin is separated from a cell expressing a different cadherin by a non-cadherin expressing cell, cells switch places based on the probability P a,b such that cells expressing the two different cadherins occupy adjacent spaces. This type of position switching behavior has been experimentally observed [4]. In the baseline ruleset, both cadherin phenotypes have the same homotypic adhesion strength, P i . Additionally, to provide the opportunity for cells to leave one cluster and join other clusters, 50% of the cadherin interactions across a grouping of connected cells were randomly broken at each timestep, and cadherin interactions between groups of cells could only be reformed after a specific number of ticks specified by the parameter t.

In vitro and in silico quantification of multicell patterns
A variety of characteristics that describe the spatial patterns of cells in the in vitro spheroids and in silico spheroids were quantified (Fig 1B). A MATLAB pipeline was developed to quantify the following characteristics for each color/channel from images of in vitro spheroids: fractional area, total area, and distance between cluster centroid and spheroid center. In order to do this, thresholds were applied to images to isolate purely green, red, or blue pixels. Then, pixel areas and centroids of contiguous cluster for each color were computed based on the MATLAB method 'RegionProps,' which computes these metrics using a connected components algorithm.
The same features were quantified in the 2D ABM simulations. Similarly, the MATLAB connected components 'RegionProps' algorithm was used to quantify the total fractional area, average area of homotypic cell clusters, and average distance between the centroid of homotypic clusters and the spheroid center for each color channel. A cluster was defined as a group of adjacent cells containing more than two cells. Additionally, the number of non-clustered cells in each color channel and total cell count of each color channel was quantified. Finally, a contiguous area ratio, which represents the fraction of the border of a spheroid core that is surrounded by cells expressing the non-core forming cadherin, was also computed. In the 3D ABM, the total fractional volume of each color channel was quantified. This metric was computed as the cell count of a certain color channel divided by the total number of cells in the spheroid. Table 2 summarizes the ABM parameter values and how they were determined. Parameters a, b, Δa, and Δb were estimated according to data published in Toda et al. [2], which indicated that for 200-cell spheroids seeded with an initial 1:1 BFP+:-/-ratio, GFP/Ncad cells appeared after 13 hr and mCherry/Pcad cells appeared adjacent to GFP/Ncad cells after 21 hr [2]. Values for a, Δa, b, and Δb were determined such that the model matched these results at the corresponding time points of 26 ticks and 42 ticks, since one model tick (or timestep) represents 30 min. The remaining five model parameters (c i , P i , P a,b , μ, and t) could not be determined from experimental data. As a first approximation, these parameters were manually adjusted to baseline values so that the 2D ABM yielded spheroids that were qualitatively similar to images of the Ncad/Pcad signaling circuit published in Toda et al. [2]. Specifically, values for μ and t were manually set to encourage continual movement of homotypic clusters in the spheroid over the 100-tick period and prevent the model from reaching a static state composed of smaller homotypic clusters at early timepoints.

Univariate parameter sensitivity analysis
A univariate sensitivity analysis was performed to identify which of the five parameters that could not be experimentally determined (c i , P i , P a,b , μ, and t) most control model outputs. Each model parameter was perturbed, one at a time, by a factor of 0.9 or 1.1, and a sensitivity coefficient (S) was calculated as where x b and x p represent baseline and perturbed parameter values, respectively, and y b and y p represent the baseline and perturbed values of the model output, respectively. Sensitivity coefficients were computed for the following model outputs: number of green clusters, average fractional area of green clusters, number of red clusters, and average fractional area of red clusters. The mean sensitivity for each model output was computed from 100 model runs and reported for the 0.9 and 1.1 perturbation of each of the 10 parameters.

Computational tuning with an evolutionary algorithm
An evolutionary algorithm (EA) was used to tune parameters that could not be determined from published data. Due to the high complexity and computational expense of parameter tuning, only the top two sensitive parameters from the univariate sensitivity analysis (c i and P i ) were selected for tuning. The EA finds a parameter combination that minimizes an error function through a natural selection process wherein optimal parameter combinations in parent generations are selected for cross-over to create offspring parameter combinations in subsequent generations. The error, E, that the EA minimized was calculated as E ¼ ðe S j ðjy j À Y j jÞ À 1Þ 2 ; where y j represents the value of a calculated model output, and Y j represents the experimental value for that output. The following model outputs (j) were used in the error equation: total fractional green area, total fractional red area, total fractional blue area, average fractional green cluster area, average fractional red cluster area, number of green clusters, and number of red regions. The exponential form of the error equation was used to encourage EA convergence. The EA was run for a population size of 20 (i.e., 20 randomly generated sets of initial guesses for c i and P i ), with 10 parallel ABM runs to compute an average error and 100 generations for each set of initial guesses. At each generation, a cross-over rate of 50% and mutation rate of 80% was used. This process (20 sets of initial guesses × 10 ABM runs × 100 generations) was repeated 30 times, and the combination of c i and P i producing the overall lowest error was chosen to model the Ncad/Pcad signaling circuit developed by Toda et al. [2]. This approach was used to account for stochasticity of the ABM and EA and to reduce the likelihood of the EA becoming stuck in local error minima. We developed our own Python script for the EA and used the NL4Py Python library [80] to enable communication between Python and NetLogo. To improve run time, our EA implementation ran the ABM in parallel across multiple computing units.

Comparison of ABM predictions with experimental data
The 2D and 3D ABMs were validated against experimental data by simulating different BFP+: -/-seeding ratios (1:9, 1:4, 1:1, 4:1, 9:1) for 60 and 200 total cells, replicating the conditions used for in vitro experiments. Simulations for each set of conditions were run 100 times, and the average results from these runs were analyzed. The locations of each simulated cell were saved at the end of each model run, and metrics were quantified as described above. Fractional areas/volumes of each color were compared between in vitro results and ABM predictions.

Rule set implementations
Different rulesets were explored using the 2D ABM by varying the order of signaling events, the activated homotypic and heterotypic adhesion strengths, and the cadherin induction constants. In these rulesets (described in Figs 6-10), the GFP/Ncad phenotype and mCherry/Pcad phenotype can take on different homotypic adhesion strengths and induction constants (e.g., [c a , P a ] were assigned to GFP/Ncad cells, and [c b , P b ] were assigned to mCherry/Pcad cells). Each ruleset and its associated observations are explained in detail in Results.

k-means clustering and tSNE dimensionality reduction
k-means clustering of the parameter space exploration results was conducted using the scikitlearn library in Python [81]. The optimal number of clusters for each clustering problem was determined by plotting the within-cluster standard deviation for k-means clustering using two to fifteen clusters. This analysis was used to determine the minimal number of clusters that reduced within-group standard deviation. tSNE was implemented using the same Python library.

Statistics
Statistical comparisons for continuous data were made by a one-way Kruskal-Wallis test followed by Dunn's post-hoc analysis. This test was specifically applied when comparing fractional areas for in vitro spheroids and in predicted ABM spheroids (Fig 2). Statistical comparisons for qualitative metrics (e.g., core cell type) and discrete metrics (e.g., number of clusters) were made using a Chi-Squared test with p < 0.05 considered significant for all tests.