Cell Sorting and Noise-Induced Cell Plasticity Coordinate to Sharpen Boundaries between Gene Expression Domains

A fundamental question in biology is how sharp boundaries of gene expression form precisely in spite of biological variation/noise. Numerous mechanisms position gene expression domains across fields of cells (e.g. morphogens), but how these domains are refined remains unclear. In some cases, domain boundaries sharpen through differential adhesion-mediated cell sorting. However, boundaries can also sharpen through cellular plasticity, with cell fate changes driven by up- or down-regulation of gene expression. In this context, we have argued that noise in gene expression can help cells transition to the correct fate. Here we investigate the efficacy of cell sorting, gene expression plasticity, and their combination in boundary sharpening using multi-scale, stochastic models. We focus on the formation of hindbrain segments (rhombomeres) in the developing zebrafish as an example, but the mechanisms investigated apply broadly to many tissues. Our results indicate that neither sorting nor plasticity is sufficient on its own to sharpen transition regions between different rhombomeres. Rather the two have complementary strengths and weaknesses, which synergize when combined to sharpen gene expression boundaries.


S1 Model S and sub-cellular element method
Here we outline the details of the models discussed in the main text. Model S is designed to model cell movement resulting from cell-cell adhesion and repulsion. To model individual cells, we use the sub-cellular element method (SCEM) [1] (Fig Aa), where each cell is represented by N node nodes. Nodes within the same cell are subjected to intra-cellular biomechanical forces determined by a Morse potential Φ intra : where r is the distance between the two nodes. Inter-cellular forces are determined by another Morse potential Φ inter acting between nodes of different cells. In model S, we assume that each cell can express either of the two identities (red/blue). Φ inter applies to nodes from two cells of the same type (red -red, blue -blue) imparting an attractive (adhesion) force with a short range repulsion (to avoid overlap). Nodes from cells of different type (red -blue) are subjected to a repulsive force: Φ inter x n,i − x m,j (S1. 3) where η n,i is the noise term, N cell is the total number of cells and N node is the total number of nodes in each cell, and x n,i is the position vector of the i-th node of the n-th cell. Both intra-and inter-cellular Morse potentials are truncated after r = 2, where r = 1 is the typical diameter of a cell. This truncation ensures forces are cell-cell contact based and do not extend to longer spatial ranges. The canonical SCEM based only on Morse potentials often behaves poorly for systems involving large-scale cell movement, during which cells might be squashed or even "explode". To avoid such technical problems, we add another pair-wise potential to nodes within the same cell to endow the cell with more structure. We first divide the nodes into two layers, each with the same number of nodes (N node /2). For each pair of neighboring nodes in the same layer (x n,i ∼ x n,i±1 ), and each pair of corresponding nodes in the two layers (x n,i ∼ x n,i±N node /2 ), we apply the following potential Ψ ( Fig  Ab): Hence for the i-th node in the n-th cell, the velocity resulting from the potential Ψ should be: With such an enhanced structure, the motion of a node is determined by two parts: where v Φ n,i and v Ψ n,i are determined by equations (S1.3, S1.5), respectively.

S2 Model P
Model P, originally developed in [2], describes a plasticity based process that is hypothesized to aid rhombomere boundary sharpening in the zebrafish hindbrain. Morphogen M diffuses across the entire domain. The dynamics of the extra-cellular M concentration is described by the following equation: is the M production rate at position x at time t: and ω out is a white noise. When entering a cell, M activates either gene A or gene B, and the following system of ODE's is evaluated for each cell: In the simulations of slowed down plasticity induced transitioning (Fig 6b, Movie S7), a common factor 0.1 is multiplied to all other terms in the right hand side of equations (S2.1 -S2.5), with all noise terms multiplied by √ 0.1.

S3 Model SP
Model SP is a hybrid of models S and P, where cells are mobile and the intra-cellular gene concentrations can change due to motions of cells as well as stochastic effects, which in turn feedback to affect inter-cellular forces (which depends on gene expression). As in model S, cell movements are the result of inter-cellular adhesion and repulsion, which are determined by the cell's gene expression (levels of A and B, which depend on morphogen levels M ). As a result of cell motions, concentrations of A and B and adhesion / repulsion properties of each cell are continuously changing, forming a feedback between gene expression and motility. To reflect this, unlike in model S where Φ inter is a two-state Morse potential (equation (S1.2)), in model SP, Φ inter is a continuous family of Morse potentials ( Fig C): where F(g m , g n ) ∈ [0, 1] is a symmetric function of g m , g n measuring the similarity of the gene expression in the two cells, which is constructed in the following way. For the n-th cell, we linearly scale the concentration difference [A n ] − [B n ] into a range of [−1, 1] and denote this scaled result as δ n , so that δ n = 1 when the cell expresses maximum A and minimum B, and δ n = −1 when the cell expresses maximum B and minimum A. Then we define

S4 Representation of cells
In SCEM, each cell is represented by a collection of nodes ( Fig Da). For visualization purposes however, it is helpful to depict cells as a confined area with a boundary. The two-layer structure of cells in this framework makes it simple to draw these cellular boundaries based on node positions. To visualize each cell, we assign the region enclosed by the outer layer of nodes as being part of the cell (Fig Db). This leaves a significant portion of the computational domain unassociated with each cell. However, since the zone of influence of each node extends a certain distance, we will assign part of the region between the outer node layers of adjacent cells as belonging to one of those two cells. To do so, we create a triangular mesh of the portion of computational mesh outside of the outer node layer of all cells using the outer layer nodes as vertices of the triangles ( Fig  Dc). Each resulting triangle can take one of two forms and its area can be apportioned to the adjacent cells accordingly.
1. One vertex belongs to one cell, and the other two vertices belong to another cell ( Fig Dd). In this case, the triangle is associated with only two cells and we divide the triangle into one triangle and one quadrilateral separated at the midline of the original triangle. The resulting two areas are subsequently assigned to the two associated cells and colored accordingly.
2. The three vertices belong to three different cells ( Fig De). In this case we divide the triangle into four smaller triangles by connecting the midpoints of each of the three sides of the triangle. The central triangle is unassigned (and uncolored) while the remaining three triangles are assigned to the respective cells and colored accordingly.
We treat all triangles from the mesh in this way, which gives the final representation of cells ( Fig Df).

S5 Sharpness Index and Mixture Index
For each model simulation, we assign two measures of sharpness to quantify how well defined the resulting boundary between domains is. The first measurement is the Sharpness Index (SI), which quantifies both the formation of a clear boundary and the straightness of the boundary. We adopt this index from [2]: SI is the standard deviation of the distance (measured from cell center) to the midline of the transition region of all mis-located cells. SI is evaluated over time for a simulation.
In addition to SI, to categorize the final state of a simulation, we introduce in another measurement: the Mixture Index (MI), which is evaluated at the endpoint of a simulation to quantify the degree of mixture of the transition region. MI is defined in the following way: 1. If a clear boundary forms dividing the two colonies, MI=0 (Fig Eab). That is, this index is intended to indicate the coherence of a boundary and does not quantify its straitness. Thus a boundary can be sharp but not strait.
2. If a cell is mislocated and at either the top or bottom of the computational domain, it contributes to MI a value equal to the shortest distance between itself and the presumptive boundary times 0.5 (Fig Ecd, mislocated blue cells at the boundaries).
3. If a cell mislocates so that it is separated from the appropriate grouping (e.g. a red cell surrounded by blue), it contributes to MI a quantity equal to its shortest distance to the presumptive boundary ( Fig Ed, mislocated red cell).
We down-weight mislocated cells at the top/bottom since their location near the domain border leads to weaker sorting forces and thus in larger tissues, this effect would be expected to have less relative influence. In this way, each mislocated cell contributes to the calculation of the MI, with more mislocated cells leading to a larger value of MI. In short, this defines a weighted accounting of the number of mislocated cells to describe how sharp a boundary is.
Using the MI measure, we categorize sharpening results into three categories: boundary formed (MI=0), where a clear boundary forms dividing the two cellular zones; boundary nearly formed (MI≤2), where mostly a boundary forms with one or two cells mislocated; boundary failed to form (MI>2). See Tables 1-3 for simulation results. While we distinguished between "boundary formed" and "boundary nearly formed" here, in vivo there is likely little distinction since subsequent formation of actin cables (or similar structures) could correct these final minor errors [3], or apoptosis of one or two cells could complete the process at relatively little cost. In Tables A -C we present the average end MI corresponding to simulation groups presented in Tables 1-3. For future reference, the MI measures the degree of mixing of cells at the boundary whereas SI accounts for both the degree of mixing and how straight the border is. As a comparison example, we give both MI and SI of the end states of four simulations from model S in Fig E. S6 Sensitivity analysis on model P In this section we analyze how sensitive model P is to several of its parameters.
First, we find that model P is stable to parameters in the morphogen equations (S2.1, S2.3). Changes in the parameters D M (the diffusion coefficient for extracellular M ), β (which represents the relative strength of extracellular M degradation compared to cell uptake), and k M (the rate of morphogen exchange between the cell and the extracellular medium), within a certain region will not affect the bistable switching property of the system, although they will cause some quantitative changes in the sharpening, as shown by On the other hand, parameters in the two gene equations (S2.4, S2.5) might change the system bistable switching property greatly. For example, we change the value of C A in equation (S2.4), which equals 0.9 in all other simulations to create the A → B (i.e., red → blue) only transitions. With C A = 1, we no longer see A → B transitions in the transition region, instead, there are reversed transitions (i.e., B → A) happening in the more posterior region (Fig Fe); while with C A = 0.8, lots of A → B happening in the whole observed region, resulting in all cells expressing gene B soon (Fig Ff). These effects are better illustrated by calculating the B cell distribution: with C A = 1 (Fig  Fg), the observed region suffers a loss of B cells while with C A = 0.8, B Cells take over the whole region (Fig Fh). The average B cell distributions in the observed region (out of 16 simulations each) with C A = 0.9, 0.8 and 1 are shown in Fig Fi, illustrating the differences of the bistable switching of the systems.

S7 The transition region location is precise after boundary
sharpening by any of the three models In addition to being sharp, the boundary location between segments must also be precise for robust and reproducible development. In this system, three important factors influencing precision: morphogen noise, the type of cell sorting, and gene expression noise.
Morphogen noise generates the salt and pepper transition region between zones. Thus, in a sense, the morphogen can be thought of as generating a noisy initial condition which subsequent dynamics serve to sharpen. In general, our simulations can be divided into three stages. First, an initial stage in which morphogen noise usually leads to an extremely wide initial transition region (Fig Ga). After this initial phase, we turn off the morphogen noise and let the system relax to reach a steady state, during which the early wide transition region naturally narrows from the initial right (posterior) side down to a narrow one typically of 3-6 cells wide (Fig Gb). This produces the initial condition from which simulations proceed. In general, this initial generation of zones leads to variability in the location of the left edge of the transition zone (Std = 0.96, Fig Gf). Interestingly, there is no variability in the location of the right edge (Std ∼ 0, Fig Gg).
We next assessed how the precision of the final location of zone boundaries evolve with time in models P, S, and SP (Figs Gc-e). While the mean of the locations have been discussed in the main text, here we focus on the variability, i.e., the precision. Results indicate that Model SP leads to the greatest reduction in the variability in the location of the left edge of the boundary, while the three models perform roughly the same at the right edge of the boundary. Thus, a model that combines both cell sorting and plasticity leads to the greatest precision in the location of domain boundaries.

S8 Noise induces gene expression switching may cause the loss of cells expressing one identity.
Plasticity induced red → blue transitions result in the loss of cells of one identity. Since the switching is one-way only due to the structure of the gene expression system, A → B in our model, this will certainly cause a loss in cells expressing gene A. When the domains are large, such a loss might not be a serious problem, yet for zebrafish rhombomeres which are initially low in cell numbers, it must be considered. We investigate two noise strength: mild (which is used in Fig 7) and strong, with noise strength η A = η B = 0.06 and 0.09, respectively. In our 16 simulations, initially (Time=10.7 hpf) the average number of A cells in the whole domain is ∼ 39, with mild noise at the end (Time=12.7 hpf) the average A cell number is ∼ 23 while with strong noise it is ∼ 9 (Fig Hab). That is to say, with mild and strong noise, gene A expression cells suffer a loss of 39.1% and 76.6%, respectively. When cell sorting is added to plasticity with mild noise, the cell loss phenomenon is slightly improved -at the end (Time=12.7 hpf) the average A cell number is ∼ 25.5 (Fig Hc).

S9 Parameter values
Parameter values used in our simulations are listed in Tables D -F [2] where the plasticity model was originally proposed. "cd" stands for "cell diameter".

Parameters Values
Units  For the intracellular structure, in addition to the Morse potential Φ intra , we add another potential Ψ. The nodes are evenly distributed in two layers, and Ψ applies to any pair of neighboring nodes within the same layer, and any pair of corresponding nodes from the two layers. This additional potential endows the cell with more stable structure.  This leads to an initially very wide transition region. b) After the initial fate determination stage, we turn off the morphogen noise and let the system relax to reach its stable state, during which the early wide transition region is narrowed from the right side down to a narrow one typically of 3-6 cell wide. c-e) The evolution of the mean and Std of the locations of the left and right edges of the transition region for the three models (P, S and SP). f, g) The standard deviation of the left and right edge locations of the transition region for the three models.