Graph topological transformations in space-filling cell aggregates

Cell rearrangements are fundamental mechanisms driving large-scale deformations of living tissues. In three-dimensional (3D) space-filling cell aggregates, cells rearrange through local topological transitions of the network of cell-cell interfaces, which is most conveniently described by the vertex model. Since these transitions are not yet mathematically properly formulated, the 3D vertex model is generally difficult to implement. The few existing implementations rely on highly customized and complex software-engineering solutions, which cannot be transparently delineated and are thus mostly non-reproducible. To solve this outstanding problem, we propose a reformulation of the vertex model. Our approach, called Graph Vertex Model (GVM), is based on storing the topology of the cell network into a knowledge graph with a particular data structure that allows performing cell-rearrangement events by simple graph transformations. Importantly, when these same transformations are applied to a two-dimensional (2D) polygonal cell aggregate, they reduce to a well-known T1 transition, thereby generalizing cell-rearrangements in 2D and 3D space-filling packings. This result suggests that the GVM’s graph data structure may be the most natural representation of cell aggregates and tissues. We also develop a Python package that implements GVM, relying on a graph-database-management framework Neo4j. We use this package to characterize an order-disorder transition in 3D cell aggregates, driven by active noise and we find aggregates undergoing efficient ordering close to the transition point. In all, our work showcases knowledge graphs as particularly suitable data models for structured storage, analysis, and manipulation of tissue data.

Tissue-scale mechanics have been addressed by computational models that represent cells as discrete entities with certain physical properties that phenomenologically describe the mechanics at smaller length scales [30][31][32][33][34][35][36][37][38][39].Recent studies compare various widely used computational models for simulating cell assemblies including cellular automaton, cellular Potts model, phase-field model, particle-based model, deformable-particle model, Voronoi model, and vertex model [40][41][42].While many of these approaches naturally incorporate cell rearrangements-processes that are crucial for capturing a realistic rheological tissue response-handling these processes computationally in the vertex model is quite challenging.In particular, the vertex model describes individual cells as polygons (2D) or polyhedra (3D) and parametrizes their shapes by vertex positions [43][44][45][46].Cell-cell contacts are represented by edges in 2D and polygons in 3D and comprise an intricate network whose connectivity needs to be computationally altered upon every cell-rearrangement event.Current state-of-the-art computational tools for vertex-model simulations offer certain solutions to simplify the implementation of these network-reconnection events, however, these methods are not easily generalizable to 3D [25,39].
Generally, vertex models store the topology and geometry of the tissue in a tabular form within arrays and perform topological transformations of the cell-interface network by updating these arrays according to specified rules [47][48][49][50].Although programming the routines that perform these dynamic array updates is still relatively manageable for planar polygonal cell packings and even for 3D surface packings involving polyhedral prism-like cells [51][52][53][54][55], developing computer codes for simulations of 3D bulk cell aggregates poses a significantly greater challenge.Indeed, since the pioneering work by Honda et al. [43], who first introduced a vertex model of 3D cell aggregates, there have only been a few recent works reporting successful attempts of coding a full 3D vertex model with dynamic cell rearrangements [56][57][58].The difficulty of implementing these rearrangements with the conventional data model raises questions about its suitability and challenges our basic understanding of rearrangements in space-filling packings.
Nevertheless, the vertex model often offers a profound mechanistic understanding of tissue behaviors, which is to a large extent facilitated precisely by the explicit presence of cell boundaries and their ability to remodel.Therefore, despite the challenges associated with cell-boundary tracking during cell rearrangements, the vertex model may still be in many respects advantageous over other models, in which cell boundaries need not be explicitly managed.For instance, the vertex model excels in realistically modeling large-scale three-dimensional tissue deformations and cellular flows, that may be driven by local active forces [43,46,59,60] while, at the same time, capturing certain small-scale structural features, including single-cell shapes and their sidedeness within the aggregate.
To address the challenges associated with the implementation of cell rearrangements in the vertex model, we introduce a reformulation of the vertex model called Graph May 3, 2024 2/25 Vertex Model (GVM).We discover a particular graph-data model, which allows formulating topological transformations of cell networks as simple graph transformations.The blueprint outlining the relationships among the components is specified through a metagraph which is designed in a manner that topological transformations are themselves represented by graphs.This design not only enhances their intuitive and visual understanding but also simplifies their implementation, making it accessible even to researchers with limited programming expertise.Moreover, we demonstrate that within the GVM's data representation, a T1 transition-the basic rearrangement event in 2D cellular tilings-emerges as a subgraph of graph transformations representing cell rearrangements in 3D packings.This allows developing generalized computational codes that are applicable to both 2D and 3D, suggessting that GVM's data model may be the most natural representation of these systems.As a proof of concept, we develop an open-source Python package neoVM, which implements GVM over a graph database, managed in Neo4j [61,62].We use our new approach to study order-disorder transition in 3D cell aggregates.We characterize the transition and find aggregates undergoing most efficient ordering in the vicinity of the transition point.

Vertex model
The vertex model represents a cell aggregate by a three-dimensional packing of space-filling polyhedral cells (Fig 1A).Cell shapes are parametrized by positions of vertices in the (x, y, z) space: Here i = 1, ..., N v , where N v is the total number of vertices.Pairs of vertices are connected by oriented edges, defined by the indices of the constituent vertices: Here i = ±1 denote head and tail vertices of the edge.While at this point these signs seem redundant, since the order in which the indices appear in e j itself indicates the head/tail role of the corresponding vertices, they will become important later on.
Polygonal cell sides are defined by oriented lists of indices of their constituent edges as where k = 1, ..., N p , with N p being the total number of polygons.The n where l = 1, ..., N c , with N c being the total number of cells within the aggregate.Polygon orientations Σ Exemplary structures of cell aggregates with N c = 128 cells are given in the online repository of neoVM [61] and their generation is described in Methods.
The dynamics of cell-shape changes are described by simulating movements of the vertices, driven by mechanical forces.In a model that neglects inertial effects and only considers friction of vertices with a static background, vertices follow first-order dynamics described by May 3, 2024 4/25 Here, η is the friction coefficient, W is the potential energy of the aggregate, and F (a) i is a system-specific active-force contribution.The potential energy is typically calculated from geometric properties of cells such as surface areas of cell-cell contacts A, cell volumes V , etc.These quantities are calculated from the vertex positions as sums over geometric elements, i.e., vertices, edges, polygons (Methods).

Topological transitions
In addition to changing their shapes, cells also change their relative organization within the tissue by exchanging their neighbors [43].These reorganization events alter the topology of the network of cell-cell contacts, thereby also affecting the interaction between the vertices.In confluent cell aggregates, cells exchange their neighbors by (i) merging vertex pairs of vanishingly short edges and (ii) resolving these vertices into new edges [63][64][65].The topology of the edge network after these transformations locally differs from the initial one.In particular, cells that were initially separated might become neighbors, whereas pairs of initially neigboring cells may separate.
To model cell rearrangements in 3D cell aggregates, the following elementary local topological trasformations need to be considered (Figs 1B-D): (i) Edge-to-vertex (EV) transition merges vertices of a vanishingly short edge into a single 6-fold vertex, (ii) vertex-to-triangle (VT) transition resolves a 6-fold vertex into a new triangle, (iii) triangle-to-vertex (TV) transition merges all three vertices of a triangular polygon into a single 6-fold vertex, and (iv) vertex-to-edge (VE) transition resolves a 6-fold vertex into a new edge.
An EV followed by a VT completes an ET transition, which transforms a vanishing edge into a triangle, formed in the perpendicular direction to the shrinking edge (Fig 1B and 1C).After an ET transition, initially separated cells become neighbors by sharing the new triangle.Similarly, a TV followed by a VE, completes a TE transition, which transforms a vanishing triangle into an edge, formed in the perpendicular direction to the shrinking triangle (Fig 1B and 1C).After a TE transition, the neighboring cells initially sharing the triangle become separated.We note that in a special case of an epithelial monolayer, where cells only attach to their neighbors laterally but not apically and basally, EV transitions either on basal or apical edges give rise to scutoidal cells [66,67].
Importantly, as illustrated in Fig 1D, the basic topological transformations in fact resemble multiple edge-to-vertex and vertex-to-edge transitions, known in 2D polygonal networks as T1 transitions.In a T1 transition, a pair of initially neigboring polygons becomes separated, whereas polygons from the remaining (initially separated) polygon pair become neighbors.This similarity between 2D and 3D transformations suggests that it may be possible to unify topological transformations in 2D and 3D space-filling packings, provided that the vertex model's core architecture be properly reformulated.
The main issue with the conventional implementation of the vertex model is that vertices, edges, polygons, and cells are stored in separate arrays (r i , e j , p k , and c l , respectively), which do not directly encode any interconnections or relationships among their respective elements.In particular, elements that might be spatially and topologically related are generally not stored together in the database and accessing any high-level topology data (e.g., finding cells that share a common vertex) requires inefficient searches over all the elements of the cell network.To avoid these inefficient searches, the conventional vertex models store data in a highly redundant form, where higher-level information about the topology of the cell network are stored in addition to e j , p k , and c l (even though these higher-level information may be calculable from e j , p k , and c l ).For instance, to efficiently search for cells that share a certain vertex, lists of cells sharing a common vertex need to be stored for all vertices.Indeed, retreiving this information from e j , p k , and c l on the fly would require highly inefficient looping May 3, 2024 5/25 over all the cells.Due to this data redundancy, algorithms that manipulate the data arrays upon topological transformations in a self-consistent manner are difficult to program.

Knowledge graph
To overcome challenges related with implementing topological transformations in the vertex model, we propose a new approach, based on storing the topology of the cell network into a knowledge graph, which uses a graph-rather than a tabular data model.By construction, the elements that are topologically related are also connected in the knowledge graph and therefore, any high-level information about the topology of the cell network is readily retrievable by querying over the relevant part of the database with no need of storing any redundant data.Knowledge graph is a graph data structure, which represents a network of real-world entities and relationships between them [68].These data are stored in a graph database where entities and relationships are represented by nodes and links, respectively, and can, additionally, carry multiple properties.For example, a movie database can be stored as a knowledge graph, in which the data about actors and directors for a given movie are represented by nodes labeled Person and Movie and relationships labeled ACTED IN and DIRECTED.In such a knowledge graph, the information that Cillian Murphy acted in the movie Oppenheimer, directed by Christopher Nolan, can be stored as The notation used here and in the following sections follows the syntax of the Cypher graph query language [69].It is important to note that GVM relies on general principles of discrete mathematics and does not depend neither on the choice of the query language nor on the choice of the database-management framework.Computationally implementing GVM, however, does require some basic knowledge of graph databases and query languages.

Metagraph
Real-world entities in GVM are vertices, edges, polygons, and cells and they are stored in a knowledge graph in a hierarchical manner as nodes labeled Vertex, Edge, Polygon, and Cell, respectively.The topology of the cell network is encoded through relationships labeled IS PART OF.These relationships are directed and relate source-target node pairs, where the entity represented by the source node is always hierarchically one level below the entity represented by the target node.
For instance, if a specific polygon p contains a specific edge e, the nodes representing these two entities, i.e., (p) and (e), are connected as (e)-[:IS PART OF]->(p).Connecting equally labeled nodes (e.g., a pair of edges) is not allowed and neither is connecting nodes carrying labels that do not follow one another hierarchically (e.g., a vertex-polygon pair).For example, say that one of the polygon p's vertices is vertex v. Rather than encoding the information that v is part of p directly by a (v)-[:IS PART OF]->(p) connection, this information is retrieved hierarchically from the connectivity of v and p through edges.In particular, if v is part of p, it is also necessarily shared by two edges that are both also part of p (say e 1 and e 2 ): The above rules for the construction of the GVM's knowledge graph can be conveniently represented by a graph, called metagraph.Much like metalanguage is a language that describes another language, metagraph is a graph that describes another graph and can be viewed as a blueprint for generating actual (valid) manifestations of that graph.From the above definitions, it is obvious that the metagraph of GVM is   Additionally, both nodes and relationships carry properties that encode additional information about nodes and relationships.While nodes carry a property, id, which represents the identification numbers i, j, k, and l of vertices, edges, polygons, and cells, respectively, the relationships are prescribed a property sign, whose value is either −1 or +1.This is a contextual property, that puts the relationship's source node into the context of the target node.In particular, in the subgraphs of type (v:Vertex)-[r:IS PART OF]->(e:Edge), the value of r.sign denotes whether vertex v is a head vertex (r.sign=+1) or a tail vertex (r.sign=-1) of edge e [i.e., parameter s in Eq. ( 2

Pattern matching
Transforming the graph database of GVM upon cell rearrangements requires only two steps: (i) Data retrieval, accomplished through pattern matching and (ii) Graph transformation.
In step (i), a suitable (meta)graph pattern is utilized to query the database and identify the nodes relevant to the transformation at hand.After the graph is traversed, instances of the specified graph pattern are returned.These instances are further filtered, using various conditional statements.
The goal of step (i) is to retrieve from the whole graph of GVM a small subgraph comprising solely of the nodes representing the objects (vertices, edges, polygons, and cells) that take part in the specific topological transformation being performed (Fig 2B).Given the unambiguous definition of the graph data structure by the GVM's metagraph (Fig 2A ), the routines that perform this step are easily reproducible.We implement these routines in Cypher and find that each topological transition requires ∼ 10 distinct short queries, similar to the query shown in Eq. ( 18) to retrieve the relevant data [61].

Graph transformations
In step (ii), the subgraph matched during the pattern-matching step undergoes a transformation based on the rules of the specific cell-rearrangement event being performed.
Like the matched subgraph that is being transformed, the graph transformation itself is represented by a graph.This graph contains exactly the same nodes as the matched GVM subgraph, however with much fewer relationships.In particular, the relationships in the transformation graph are of two types: (i) Green and (ii) red, indicating creations and deletions of :IS PART OF relationships in the actual GVM subgraph, respectively (Fig 2C).
Compared to the convoluted codes that perform topological transformations in the conventional vertex model, the task of programming the routines that perform graph topological transformations in GVM is much less challenging.Indeed, our implementation of graph-transformation routines in Cypher comprises of successive calls of ∼ 5 distinct short queries, similar to examples shown in Eqs.(19) and (20), which merely delete and create relationships.
Transformation of contextual properties.Values of contextual properties s, σ, and Σ to be assigned to the newly created relationships are specified in the transformation graph through the relationship property of green relationships (s ′ , σ ′ , and Σ ′ for vertex→edge, edge→polygon, and polygon→cell connections, respectively).Unlike the green relationships, the red relationships do not carry any additional properties (Fig 2C).
Transformed values of certain contextual properties can be arbitrarily chosen, however, in most cases they need to be figured out from the known values of other contextual properties.Conveniently, the rules for prescribing these new values can be summarized in 2 compact formulas.In particular, for a new vertex→edge relationship between nodes representing vertex v i and edge e j , s ′ i,j is calculated as where k denotes vertex v k , which was part of edge e j prior to the transformation (i.e., s k,j denotes contextual property of the relationship between nodes representing vertex v k and edge e j prior to the transformation).
May 3, 2024 8/25 For a new edge→polygon relationship between nodes representing edge e i and polygon p j , σ ′ i,j is calculated as Here, vertex v k is a vertex shared by edges e i and e l , which are both part of polygon p j .Additionally, among the two vertices of edge e i , vertex v k is the one that was not part of edge e l prior to the transformation (i.e., Eq. ( 7) contains s ′ k,l and not s k,l ).An analogous equation to Eq. ( 7) also holds for assigning Σ ′ ij to a newly created polygon → cell relationship.
Equations ( 6) and ( 7) are demonstrated in the subsequent section for the case of a T1 transformation and their meaning is described in the corresponding figure caption.

T1 graph transformation
To demonstrate all steps required to perform a topological transformation in a space-filling cell aggregate, we first turn to a 2D polygonal cellular tiling, where cells rearrange through T1 transitions.Specifically, during a T1 transition, a vanishingly short edge merges into a single vertex, i.e., a four-way junction, which subsequently resolves into a new edge oriented roughly in the perpendicular direction compared to the orientation of the initial edge.As any other topological transformation, GVM performs a T1 transition in two steps as follows (Fig 3).
(i) Pattern matching performs a series of graph-database queries so as to find nodes representing elements (i.e., vertices, edges, polygons and cells) that participate in the transformation (the initial state in Fig 3A).These queries first identify the edge that undergoes a T1 and label it e 5 .Subsequently, they identify two vertices connected to e 5 , randomly labeling one v 1 and the other v 2 .Following this, they locate two polygons that share e 5 , labeling them p 3 and p 4 .Continuing, edges e 1 and e 2 are identified as edges connected to v 1 (excluding e 5 ) and are also part of polygon nodes p 3 and p 4 , respectively.Similarly, e 3 and e 4 are determined using the same procedure, where the role of v 1 from the previous step is now adapted by v 2 .Subsequent steps involve finding polygons p 1 , containing both e 1 and e 2 , followed by identifying p 2 , containing e 3 and e 4 .
Note that nodes representing cells (polyhedra) are missing in graphs describing a T1 transition (Fig 3).This is because polyhedra do not exist in the 2D representation of the vertex model, where polygons themselves are interpreted as cells.
(ii) After pattern matching, graph transformations convert the initial sub-graph to the final sub-graph, illustrated in the upper-right area of Fig 3A , employing  Finally, the newly created relationships need to be prescribed contextual properties using equations ( 6) and ( 7), (Fig 3B and 3C).For instance, s ′ 1,3 , i.e., the sign of vertex v 1 in the context of the edge e 3 , is assigned a value identical to s 2,3 , i.e., the sign of vertex v 2 in the context of e 3 before transformation.In turn, determining the sign of a new edge in the context of a polygon, e.g., e 5 in the context of p 1 (σ ′ 5,1 ), relies on contextual properties s ′ 2,5 , s ′ 2,2 , and σ 2,1 [Eq.(7)].For a more comprehensive understanding of the origin of equations ( 6) and ( 7 ).After the transformation, vertex v 1 assumes the same role in the context of edge e 3 as the role of v 2 in the context of e 3 before the transformation (s 2,3 ).This occurs because the edge e 3 merely replaces v 2 (blue) with v 1 (red).C Schematic of determining the value of contextual property for a new edge→polygon relationship generated between edge e 5 and polygon p 1 (σ ′ 5,1 ).The calculation of σ ′ 5,1 (red) relies on one of the vertices of e 5 , i.e., v 2 (green), and the edge linked to both v 2 and p 1 , i.e., e 2 (also depicted in green).The assignment of σ ′ 5,1 is determined based on the contextual properties of: (i) edge e 2 in the context of polygon p 1 , σ 2,1 , (blue) and (ii) vertex v 2 in contexts of edges e 2 , s ′ 2,2 , and e 5 , s ′ 2,5 , (green).In short, the contextual property σ ′ 5,1 aligns or opposes that of σ 2,1 depending on the similarity or dissimilarity of s ′ 2,2 and s ′ 2,5 .
these elements or repositioning the corresponding nodes would affect the visual representation of the graphs, but the graphs themselves (i.e., their connectivities) would remain the same.

ET and TE graph transformations
Graph transformations describing ET and TE transitions as well as EV, VT, TV, and VE transitions are obtained following the exact same procedure as in the case of T1 transition (previous section).

Generalization of topological transformations
As shown in Fig 1D, the transformation patterns during ET and TE transformations are both geometrically as well as topologically quite similar to the more simple T1 transition.This raises a question whether topological transformations in 2D and 3D can be generalized and described using the same, generalized, graph transformation.
Surprisingly, as depicted in suggesting that it should be possible to develop a generalized computational implementation of GVM, capable of simulating both 2D and 3D space-filling packings.Our implementation of GVM within the neoVM package confirms this hypothesis.

Order-disorder transition in active tissues
As a proof of concept, we develop a custom Python package called neoVM, which manages the GVM's knowledge graph and its transformations in a graph database May 3, 2024 12/25 management framework Neo4j (Methods).
We use neoVM, to study an order-disorder transition of cell aggregates, driven by active tension fluctuations.In particular, we consider an aggregate of N c = 128 cells with identical (normalized) volumes (V 0 = 1 for all cells), enclosed within a simulation box with periodic boundary conditions.The vertex dynamics are described by Eq. ( 5), assuming the potential energy given by Eq. (10).
In addition to the conservative and friction forces, we also include active force dipoles acting along cell edges to induce active cell rearrangements.The total active force on vertex i is a sum of forces acting along edges (i.e., tricellular junctions) sharing that same vertex: Here the indices lmn denote a tricellular junction (i.e., edge), shared by cells l, m, and n; L lmn is the edge length.
The magnitudes of active force dipoles are dynamic quantities that fluctuate with time.In particular γ lmn (t) obeys Ornstein-Uhlenbeck dynamics described by where τ m is the relaxation time scale associated with turnover dynamics of molecular motor Myosin, γ 0 is a baseline tension, whereas ξ lmn (t) is Gaussian white noise with properties ⟨ξ lmn (t)⟩ = 0 and ⟨ξ lmn (t)ξ opr (t ′ )⟩ = 2σ 2 /τ m δ lo δ mp δ nr δ(t − t ′ ); σ 2 is the long-time variance of the tension fluctuations.We simulate the above active dynamics at different magnitudes of active noise σ, starting with a Kelvin structure-a crystalline cell arrangement made up of truncated octahedra with 14 facets (8 regular hexagons and 6 squares).The active noise distorts the geometry of the aggregates, which are no longer perfect crystals.In particular, for σ > 0 the average cell shape, quantified by the shape factor q = ⟨S l /V 2/3 l ⟩ l∈cells (S l and V l are cell surface area and volume, respectively) deviates from the Kelvin's truncated octahedron (Fig 6A).The distorted geometry is also seen in the width of the distribution of edge lengths, which increases with an increasing σ-to a point where vanishingly short edges appear (Fig 6B).These edges undergo ET and TE topological transformations, which in turn triggers cell rearrangements-a signature of a transition from a solid-like to a fluid-like behavior.This transition manifests in disordering of the aggregate, seen in the dependence of an order parameter 1 − f 14 , describing the fraction of non-14-sided polyhedra (f 14 = N 14 /N c ), on the control parameter σ (Fig 6C).From these results, we obtain an estimate for the transition point σ * ≈ 0.17.Figs 6D-H show cell configurations at σ = 0, 0.2, 0.3, 0.4, and 0.5.
Next, we are interested in whether the active noise can drive the opposite effect, i.e., ordering.To study this, we start with a disordered cell packing, prepared in advance by packing spheres in the simulation box, using random sequential addition and then constructing Voronoi partitions around sphere centers.This procedure yields a sample with the initial fraction of 14-sided polyhedra f 14 = 0.234.Again, we simulate the active dynamics at different magnitudes of the active noise σ.We find that high-σ values keep the aggregate disordered due to frequent cell-neighbor exchanges.In contrast, a sufficiently small level of active noise drives active tissue ordering, which is seen in decreasing 1 − f 14 and narrowing of the edge-length distribution over time (Fig 6I and 6J, respectively).At moderate σ ≈ σ * values, the ordering is more efficient compared to small σ values, where the active-noise level is not sufficiently high to allow overcoming local energy barriers for cell rearrangements.Despite a higher degree of disorder, the low-σ states consist of cells whose shapes are closer to the May 3, 2024 13/25 Order-disorder transition in 3D cell aggregates.A Shape parameter q for "thermalized" aggregates versus active noise σ.B Distribution of edge lengths in thermalized aggregates for σ = 0.025, 0.2, and 0.5 (red, green, and blue curves, respectively).C Order parameter 1 − f 14 and normalized number of cell-rearrangement events n = (N ET + N TE )/Nmax versus active noise σ.D-H Thermalized aggreages at σ = 0, 0.2, 0.3, 0.4, and 0.5.I Order parameter 1 − f 14 versus time for σ = 0.01, 0.17 and 0.4 (red, green, and blue curves, respectively) during aggregate ordering.J The initial (t = 0.001; orange curve) and final (t = 1000; magenta curve) distributions of edge lengths at σ = 0.17.K Shape parameter q versus time for σ = 0.01, 0.17 and 0.4 (red, green, and blue curves, respectively) during aggregate ordering.
Kelvin's regular truncated octahedron compared to cells from the σ ≈ σ * case, where cell shapes are perturbed due to active noise (Fig 6K).The rate of topological-transition events in the simulations of active cell aggregates reaches as high as ∼ 200/time unit, which in total amounts to ∼ 10 5 events per simulation (Fig 6C ); note that many of these events are reversible transitions that occur multiple times while the manipulated vertices are still located very close to one another and have not yet properly resolved in the geometric sense.Despite this large number of reconnections, the aggregate does not develop any nonphysical topology, e.g., an edge connecting more than two vertices or a polygon with one missing edge, etc.This demonstrates that topological transformations in space-filling 3D packings can indeed be implemented as graph transformations defined in Fig 4 .Importantly, due to GVM's unambiguous data structure, these transformations are relatively straight-forward to implement and are therefore readily reproducible, which is clearly demonstrated by the implementation of GVM within our neoVM package [61].

Discussion
We reformulated the vertex model of cell aggregates.The new formulation, called Graph Vertex Model (GVM), is based on storing the topology of the cell network into a knowledge graph.We discovered a particular graph data model, uniquely defined by a metagraph, which allows formulating topological transformations of the cell network as simple and mathematically properly defined graph transformations (Fig 2).These transformations are themselves represented by graphs and consist of only the most elementary graph operations, i.e., relationship deletions and creations.We designed graph transformations for all topological transitions which are required to describe cell rearrangements-including edge-to-triangle (ET) and triangle-to-edge (TE) transformations (Fig 4).Importantly, ET and TE transformations can be both applied to a 2D system, where they reduce to the well-known T1 transition.We showed that this happens because the transformation graph describing a T1 transition is in fact a subgraph of both ET and TE transformation graphs (Figs 5 and 3).Thus, when ET or TE transformations are applied onto a 2D polygonal tiling, only the T1 subgraph is matched and the corresponding transformation executed.This result generalizes topological transformations in 2D and 3D space-filling packings, suggesting that our proposed graph-data structure may be the most natural representation of the topology of space-filling packings.We used GVM to study active cell aggregates, whose cell junctions are subject to fluctuating active tensions (Fig 6).In particular, we characterized the order-disorder transition and found initially disordered aggregates undergoing ordering, which is most efficient for active noise close to the transition point.
Even though the basic GVM's data model presented here only encodes information on the topology of the network of cell-cell interfaces, GVM already represents an important technological and conceptual step forward in computational models of tissue mechanics.This advancement lays the foundation for the development of knowledge graphs capable of structurally storing live-imaging data, such as that obtained from developing embryos, integrating data on geometry, topology, mechanics, and biochemistry.With this aim, our ongoing work uses GVM as a starting point to develop a comprehensive knowledge-graph database of the early fly development.Making this database interactive and accessible online will allow collaborative research with the aim to progressively expand our collective knowledge base about the mechanics of the embryonic development.Additionally, its graph data structure may even be readily complemented with graph-compatible methods of artificial intelligence (e.g., Graph Neural Networks).
Graph vertex model can be readily extended to describe other cell-scale events that change the local topology of the network of cell-cell interfaces.Indeed, integrating cell division and apoptosis into the model would allow detailed mechanistic studies of spheroid-like cell aggregates as models of tumors or even embryos during the early stages of development, where cells may still be packed within a three-dimensional aggregate [70,71].In the context of tumors, for instance, investigating how the stability of the overall tumor shape depends on smaller-scale biomechanical processes such as the cells' effective surface tension, inhomogeneous cell proliferation, and active noise at cell-cell interfaces, would allow better understanding the mechanical basis of tumors' transition to malignancy.
For more efficient simulations, capable of dealing with hundreds or even thousands of cells, a considerable effort will also have to be devoted to developing technologies that will improve computational efficiency of the vertex-model simulations over a graph database.While our current implementation of GVM, neoVM [61], primarily serves as a proof of concept, it falls short on the efficiency.The reason for this is mostly twofold: (i) The time integration of the dynamical system is performed in Python, which is generally slower than some low-level programming languages, and (ii) Performing May 3, 2024 15/25 operations on the graph database managed by Neo4j necessitates reading from and writing to the local hard drive during cell rearrangements, where the database is stored.That this indeed limits the performance of neoVM, is also seen in S4 Fig, which shows the simulation time depending on the system size.While the dependence itself is linear, tissue activity shifts this linear dependence towards a higher offset value following a larger number of cell-rearrangement events.This shows that cell rearrangements consume most of the simulation time of neoVM, suggesting that the efficiency of neoVM could be significanlty improved by relying on memory storage instead of accessing the local hard drive.

Materials and methods
Initial configurations.Initial configurations (.vt3d files) used in simulations (Figs 6 and S4) are available in the git repository of neoVM.Among these initial files, the Kelvin initial configuration is generated by taking a unit cell from Surface evolver [72] and replicating them in all directions until the number of cells in the aggregate reaches 128.On the other hand, the disordered initial conditions are generated by putting spheres in the simulation box using random sequential addition and, subsequently, constructing Voronoi cells around these spheres using the Voro++ package [73].Calculation of forces.We neglect the inertial effects so that the friction force F Here, only friction with a static background is considered, η being the associated friction coefficient.The active force F (a) i can describe different system-specific active mechanisms, e.g., active contractions of the cell membrane due to the activity of the underlying cell cortex or traction forces [19,20,36,74].This model yields a system of first-order dynamic equations for vertex positions given by Eq. (5).
We consider a model, in which cell-cell interfaces are prescribed by effective surface tensions Γ lm , which include contributions of the cell cortical tension and cell-cell adhesion [75][76][77]; the notation lm denotes index of a polygon shared by cells l and m.In this model, the total potential energy of the cell aggregate reads where the first sum goes over all pairs of neighboring cells l and m and the second sum goes over all the cells, κ V and V 0 being the cell-incompressibility constant and the preferred cell volume, respectively.Note that V 0 need not be equal for all cells.In heterogeneous cell aggregates such as tumors, a similar same energy functional could be used, but V 0 's would need to be considered distributed.By definition, the conservative force acting on vertex i is calculated as which further requires calculating gradients of interfacial surface areas and cell volumes as described below.Surface area of polygonal side k is calculated as a sum of surface areas of triangular surface elements, a lm,µ , defined by pairs of consecutive polygon vertices r µ and r µ+1 , and the polygon's center of mass May 3, 2024 16/25 Like in the previous section, Greek indices here do not denote the real vertex identification numbers, but their sequential indices within individual polygons.The surface area of polygon lm reads and its gradient where δ ij is the Kronecker delta.
Cell volume l is calculated as a sum of volumes of tetrahedra, defined by triangular surface elements (r µ , r µ+1 , c lm ) and with the fourth vertex at the origin (0, 0, 0), as Its gradient is calculated as Topological transformations.None of the topological transitions is allowed if the resulting cell configuration breaks any of the following topological rules [51,57]: (i) Edge pairs may not share more than one vertex, (ii) polygon pairs may not share more than one edge and (iii) cell pairs may not share more than one polygon.In our implementation of GVM within the neoVM packing, these conditions seem to be rigorously imposed, since we never observe them being violated despite conducting numerous simulations involving a large number of rearrangement events (Fig 6).If such violation were to happen, the transformation causing it would need to be immediately reversed.
Cypher queries for pattern matching and graph transformations.
The following Cypher query creates a new relationship IS PART OF between nodes (v1) and (e3) and assigns property value s23.

CREATE (v1)-[:IS PART OF {sign:
The following Cypher query deletes relationship r between nodes (e5) and (p4 Implementation in Python and Neo4j.As a proof of concept, we set up the GVM's knowledge-graph database in a graph database management framework Neo4j [62].The core program of the vertex model is implemented in Python and communicates with Neo4j using Py2neo client library.The time integration of the dynamical system is performed in Python, whereas all topological transformations are performed in Neo4j through pattern matching and graph transformations implemented as Cypher queries [69].Our implementation of GVM is available as an open-source Python package, called neoVM, and is available online [61]. S3 Fig shows the schematic of neoVM's architecture.The program is initialized by reading the initial geometry and topology of the cell network from an input .vt3dfile and storing them into an object t of class tissue.In particular, this object stores lists of vertex, edge, polygon, and cell objects, which encode r i , e j , p k , and c l , respectively, in a tabular form.This is followed by generating an object db of class database, which connects to an empty Neo4j database and fills it with the tissue data according to the rules of the GVM's metagraph, using function setup DB().The initialization is followed by a time loop, which propagates the system forward in time by time steps ∆t.At each time step, the dynamical system [Eq.(5)] is integrated between t and t + ∆t [function solver()] and the program checks whether any of the edges in the cell network meets contitions for topological transitions [function topological transitions()].In particular, edge j undergoes an ET transition if it is shorter than a threshold length l th and the rate of change of its length is negative (dl j /dt < 0), i.e., the edge is contracting.If the edge happens to be part of a triangle, a TE transition is performed on the triangle if, additionally, all edges of the triangle are shorter than l th and the area of the triangle is decreasing (dA k /dt < 0).
For every edge/triangle, subject to a topological transition, the core program sends a sequence of Cypher queries to the Neo4j graph database.These queries (i) perform pattern matching to isolate a subgraph relevant for the particular transformation being May 3, 2024 18/25 performed, and (ii) perform graph transformation on that subgraph.After graph transformations, vertex positions are displaced such that the lengths of the newly created edges (edges e 7 , e 8 , and e 9 in Fig 1 ) are on the order of l new ≪ 1.Finally, the local structure of the arrays, encoding r i , e j , p k , and c l , (stored in object t) are updated according to the applied transformations.This is done by converting the altered part of the knowledge graph back into the array format using function update().

j
denote indices of head and tail vertices of edge j, respectively (Fig 1A), and j = 1, ..., N e , with N e being the total number of edges.The signs of the vertex indices, s (h/t) j

k
edges in the list p k are listed sequentially and σ (µ) k = ±1 denotes the orientation of the µ-th edge (with index j (µ) k ) within the polygon relative to a chosen positive direction (Fig 1A).Similarly, cells are defined by oriented lists of indices of the n

=Fig 1 .
Fig 1. Geometry and topology of cell aggregates.A Cell aggregate is modeled by a space-filling packing of polyhedral cells.Cell shapes are parametrized by vertex positions r i , which move according to mechanical forces F i .Topology of the cell network is specified by lists e j , p k , and c l [Eqs.(2)-(4)], which store head and tail vertices of edges, oriented edges within polygons, and oriented polygons within cells, respectively.B EV transition merges vertices of an edge into a single vertex, whereas VT transition resolves a vertex into a triangle.VE and TV transitions are inverse transitions of EV and VT transitions, respectively.C Polygons involved in topological transitions from panel B. D Decomposed schematic highlighting from 3 different perspectives of polygons, edges, and vertices, in topological transitions from panel B.

Fig 2 .
Fig 2. Graph Vertex Model.A The hierarchical structure of GVM vertex→edge→polygon→cell is defined by a metagraph.Relationships between these nodes are labeled IS PART OF and carry a sign property denoted by s, σ, and Σ for vertex→edge, edge→polygon, and polygon→cell relationships, respectively.B A subgraph representing a particular local cell state is obtained by pattern matching.This subgraph is then transformed by a graph transformation.C Metagraph of graph-transformation graph connects Vertex, Edge, Polygon, and Cell nodes with green and red relationships, indicating creation and deletion of IS PART OF relationships in the GVM's knowledge graph, respectively.
)].In the subgraphs of types (e:Edge)-[r:IS PART OF]->(p:Polygon) and (p:Polygon)-[r:IS PART OF]->(c:Cell) the same property specifies the orientation of edge e in the context of polygon p and polygon p in context of cell c, respectively [i.e., parameters σ and Σ in Eqs.(3) and (4), respectively].May 3, 2024 7/25 a few transformative operations.The detailed depiction of these graph transformations in the middle panel of Fig 3A reveals several deletions and creations of new relationships.Specifically, at the vertex and edge level, v 1 → e 2 and v 2 → e 3 relationships are eliminated, while v 1 → e 3 and v 2 → e 2 relationships are established.Furthermore, at edges and polygons level, only e 5 → p 3 and e 5 → p 4 relationships are removed, while new e 5 → p 1 and e 5 → p 2 relationships are created.What relationships need to be deleted and created is visually represented in the graph-transformations graph by red and green arrows, respectively.

σ′ 5 , 1 = − s′ 2 ,5 s′ 2 ,2 σ 2, 1 σ′ 5 , 2 = − s′ 1 ,5 s′ 1 ,3 σ 3 Fig 3 .
Fig 3. Graph transformation for a T1 transition.A Graphs in the left and right columns correspond to the initial and final cell configurations, respectively.Gray arrows represent relationships labeled IS PART OF.The graph in the middle column shows the graph transformation, which includes green and red relationships, indicating relationship creations and deletions, respectively.Additionally, the graph transformation specifies property values of the newly created relationships.B Schematic of determining the value of contextual property for a new vertex→edge relationship generated between vertex v 1 and edge e 3 (s ′ 1,3).After the transformation, vertex v 1 assumes the same role in the context of edge e 3 as the role of v 2 in the context of e 3 before the transformation (s 2,3 ).This occurs because the edge e 3 merely replaces v 2 (blue) with v 1 (red).C Schematic of determining the value of contextual property for a new edge→polygon relationship generated between edge e 5 and polygon p 1 (σ ′ 5,1 ).The calculation of σ ′ 5,1 (red) relies on one of the vertices of e 5 , i.e., v 2 (green), and the edge linked to both v 2 and p 1 , i.e., e 2 (also depicted in green).The assignment of σ ′ 5,1 is determined based on the contextual properties of: (i) edge e 2 in the context of polygon p 1 , σ 2,1 , (blue) and (ii) vertex v 2 in contexts of edges e 2 , s ′ 2,2 , and e 5 , s ′ 2,5 , (green).In short, the contextual property σ ′ 5,1 aligns or opposes that of σ 2,1 depending on the similarity or dissimilarity of s ′ 2,2 and s ′ 2,5 .

Fig 4 .
Fig 4. Graph transformations in a 3D vertex model of polyhedral packings.A Graph transformation of an ET transition.B Graph transformation of an TE transition.In both panels, graphs in the left and the right column correspond to the initial and the final cell configuration, respectively.Gray arrows represent relationships labeled IS PART OF.The graphs in the middle column show graph transformations, which include green and red relationships, indicating relationship creations and deletions, respectively.Additionally, the graph transformation specifies property values of the newly created relationships.In each graph, the node indices increase from left to right in unit steps.