Canalization and Control in Automata Networks: Body Segmentation in Drosophila melanogaster

We present schema redescription as a methodology to characterize canalization in automata networks used to model biochemical regulation and signalling. In our formulation, canalization becomes synonymous with redundancy present in the logic of automata. This results in straightforward measures to quantify canalization in an automaton (micro-level), which is in turn integrated into a highly scalable framework to characterize the collective dynamics of large-scale automata networks (macro-level). In this way, our approach provides a method to link micro- to macro-level dynamics – a crux of complexity. Several new results ensue from this methodology: uncovering of dynamical modularity (modules in the dynamics rather than in the structure of networks), identification of minimal conditions and critical nodes to control the convergence to attractors, simulation of dynamical behaviour from incomplete information about initial conditions, and measures of macro-level canalization and robustness to perturbations. We exemplify our methodology with a well-known model of the intra- and inter-cellular genetic regulation of body segmentation in Drosophila melanogaster. We use this model to show that our analysis does not contradict any previous findings. We also obtain new knowledge about its behaviour: a better understanding of the size of its wild-type attractor basin (larger than previously thought), the identification of novel minimal conditions and critical nodes that control wild-type behaviour, and the resilience of these to stochastic interventions. Our methodology is applicable to any complex network that can be modelled using automata, but we focus on biochemical regulation and signalling, towards a better understanding of the (decentralized) control that orchestrates cellular activity – with the ultimate goal of explaining how cells and tissues ‘compute’.

The function checkCounts(s) takes a list of variable counts, where the position of each count corresponds to a column of A, and (1) removes the counts that correspond to columns in which states are fixed (all wildcards, or a possible literal enput); (2) gathers, for each of the remaining (non-fixed) counts, which may correspond to variables in a group-invariant enput, its position (column index) together with the count itself into a set. This set is partitioned into subsets, each containing the variables (column indexes) with identical variable counts. If all of these subsets contain at least two columns (indexes), then the set of these subsets is returned as the variable sv; otherwise sv is returned with the value 1, meaning that A cannot be a symmetric group. The function checkValidSpec(A, sv), which returns True or False, checks that the input variables in every subset of columns of A with identical, non-fixed variable counts (specified in sv) have the same schema counts. This is another condition for A to be a possible symmetric group. The function canSwap(A, i, j) simply permutes the i-th and j-th columns of A and rearranges the rows of the resulting matrix. If the resulting matrix is identical to A the function returns True; otherwise it returns False. The pseudocode for fgi and g is provided below, as well as a worked example in the following subsection.
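The two simplest helpers described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that A is a list of equal-length strings over the symbols 0, 1 and #, that a column is "fixed" when every row holds the same symbol, and that failure is signalled with None rather than the value 1. The names column_counts, check_counts and can_swap are illustrative stand-ins for the column counting step, checkCounts and canSwap.

```python
from collections import defaultdict

def column_counts(A):
    """Return, per column of A, the counts of '0', '1' and '#' (in that order)."""
    return [tuple(col.count(sym) for sym in "01#") for col in zip(*A)]

def check_counts(counts):
    """Group non-fixed columns by identical counts (sketch of checkCounts).

    Returns a list of column-index groups, or None when some count is
    unique to a single column. Assumption: None replaces the value 1 used
    in the text, and an input with no non-fixed columns is also rejected.
    """
    n_rows = sum(counts[0]) if counts else 0
    groups = defaultdict(list)
    for idx, c in enumerate(counts):
        if max(c) == n_rows:      # fixed column: all 0s, all 1s or all #
            continue
        groups[c].append(idx)
    if not groups or any(len(g) < 2 for g in groups.values()):
        return None
    return sorted(groups.values())

def can_swap(A, i, j):
    """True if exchanging columns i and j leaves A unchanged up to row order."""
    swapped = []
    for row in A:
        r = list(row)
        r[i], r[j] = r[j], r[i]
        swapped.append("".join(r))
    return sorted(swapped) == sorted(A)

# Columns 0 and 1 share the count (1, 1, 0); column 2 is all wildcards.
A = ["10#", "01#"]
groups = check_counts(column_counts(A))   # [[0, 1]]
swappable = can_swap(A, 0, 1)             # True
```

Sorting the rows before comparing plays the role of "rearranging the rows" in the description of canSwap, so that row order does not affect the comparison.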
For partitions $H' \in P$ with $|H'| > 20$, the search space for symmetric groups becomes very large. For example, if such a partition $H'$ contains only symmetric groups of, e.g., two wildcard schemata, the algorithm has to search approximately $2^{|H'|}$ subsets of $H'$. If $|H'| = 20$ and there are no symmetric groups in it, the algorithm will evaluate approximately $10^7$ of its subsets before it fails. If $|H'| = 30$ and it does not contain symmetric groups, the algorithm will evaluate approximately $1.6 \times 10^{10}$ subsets, a search space that is too large for feasible computation. However, it is important to emphasize that $|H'| \approx 20$ is not a strict limit: if a very large set $H'$ contains large symmetric groups, then the algorithm can identify them after relatively few expansions of the search space. Therefore, when a given set P contains large partitions $H'$, it is possible to limit the expansion of the search space. By default, the algorithm ends when all partitions $H' \in P$ have cardinality $|H'| = 1$, which means the entire search space has been expanded.
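A single expansion of the search space can be sketched in Python as follows. The sketch assumes, following the worked example, that "splitting" a partition of m > 2 schemata means replacing it with all of its distinct subsets of m − 1 schemata; the function name split and the list-of-strings representation are illustrative choices, not part of the original pseudocode.

```python
from itertools import combinations

def split(A):
    """Return all distinct subsets of A with len(A) - 1 schemata.

    Assumption: partitions with two or fewer schemata are not split,
    mirroring the m > 2 guard mentioned in the worked example.
    """
    m = len(A)
    if m <= 2:
        return []
    return [list(sub) for sub in combinations(A, m - 1)]

# A partition with five schemata yields five subsets of four schemata each.
partition = ["s1", "s2", "s3", "s4", "s5"]
expanded = split(partition)
```

Each subset produced this way can itself be split again, which is why the number of candidate subsets grows so quickly with the size of the partition.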

Algorithm 1 Function fgi(P)
1: $F'' = \emptyset$ {the output two-symbol schemata will be stored here}
2: map the transition for every partition in P onto a list S
3: map the condition parts of each partition in P onto P
4: while $P \neq \emptyset$ do

Worked Example
Assume the algorithm fgi is called with a set of partitions P containing the following single partition:
In step (2) the output variable $F''$ is initialized as an empty set. After steps (3) and (4) we obtain a new P that contains only the condition parts of each original partition in P; the common transition for each partition in P is stored in S:
The while loop is entered on step (5) after checking that $P \neq \emptyset$. In step (6) A is assigned the first partition in P (the head of P):
In step (7) P is assigned the rest of the list P, which means $P = \emptyset$. In steps (8-10) the variable colcnts is assigned the counts of 0s, 1s and # in the columns of A:
With this variable the function checkCounts is called. The function removes the count {0, 0, 5} because it corresponds to a fixed column of A, and records how many of the remaining columns share each count. Since two counts, {2, 1, 2} and {3, 0, 2}, are each unique to a single column, the function returns 1, which means that A is not a candidate for a symmetric group under permutation. The algorithm then checks this value at step (12) and assigns the Boolean value False to sw in step (17). Since sw = False at step (18), the algorithm jumps to step (23), in which A is 'split' and the search space for symmetric groups is expanded. The split is guarded at step (22): if A contains $m > 2$ schemata, all the distinct subsets of A that contain $m - 1$ schemata are found (step 23) and appended to P (step 24). For the current example, five subsets of A are now in P. With the new P the algorithm returns to the start of the main while loop. The head of P is now:
Shifting control to function g, with inputs A and H
This function iterates over each of the groups of columns that have the same variable counts, stored in H. For every list of columns in H the algorithm iteratively checks which pairs of columns can be exchanged while leaving A unchanged.
It is possible that the column indexes of a single element of H, even though they have the same counts, cannot all be exchanged with each other. For example, given one such list, e.g. {2, 3, 4, 5}, it is possible that columns 2 and 4 can be exchanged, and columns 3 and 5 can be exchanged, but no other pair can. In such a case, the element of H is broken down into two sublists that will be marked with different position-free symbols, provided A as a whole is a symmetric group. It is also possible that, continuing with the same example, three columns can exchange places with each other, but a fourth one cannot exchange places with any other. In this case g would return the empty set, since A is not a symmetric group. In our core example, this function identifies that columns 1 and 2, as well as columns 4 and 5, can be exchanged, returning the value of H unchanged, which is assigned to the variable r. In summary, function g checks that, for every list of columns of H, every element can exchange places with at least one other column.
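The behaviour of g described above can be sketched in Python. This is a hedged illustration rather than the original pseudocode: it assumes A is a list of equal-length strings over 0, 1 and #, that H is a list of lists of column indexes, and that the empty list stands for the empty set. The helper can_swap is re-declared here so that the block is self-contained.

```python
def can_swap(A, i, j):
    """True if exchanging columns i and j leaves A unchanged up to row order."""
    swapped = []
    for row in A:
        r = list(row)
        r[i], r[j] = r[j], r[i]
        swapped.append("".join(r))
    return sorted(swapped) == sorted(A)

def g(A, H):
    """Break each group of H into sublists of mutually exchangeable columns.

    Returns [] as soon as some column cannot be exchanged with any other
    column of its group (A is then not a symmetric group).
    """
    result = []
    for group in H:
        remaining = list(group)
        while remaining:
            first = remaining.pop(0)
            block = [first] + [c for c in remaining if can_swap(A, first, c)]
            if len(block) < 2:
                return []
            remaining = [c for c in remaining if c not in block]
            result.append(block)
    return result

# Columns 0 and 2 are exchangeable, as are 1 and 3, but no other pair:
A1 = ["1100", "0110", "1001", "0011"]
r1 = g(A1, [[0, 1, 2, 3]])     # [[0, 2], [1, 3]]

# Columns 0, 1 and 2 are exchangeable, but column 3 has no partner:
A2 = ["1000", "0100", "0010", "1101", "1011", "0111"]
r2 = g(A2, [[0, 1, 2, 3]])     # []
```

Checking each column only against the first column of its block is sufficient in this sketch because exchangeability is transitive: if columns a and b can be exchanged, and b and c can be exchanged, then a and c can be exchanged as well.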
Shifting control back to fgi
At step (20), since r is not the empty set, the output variable $F''$ is updated and now contains a two-symbol schema represented as the tuple (A, r). After this the algorithm goes back to the start of the while loop to continue checking the remaining four partitions in P. All of these fail to satisfy the checkCounts criterion, and P is expanded each time to contain further subsets to search. No other partition satisfies all the required criteria, and eventually P is the empty set, at which point the algorithm stops, returning the single two-symbol schema identified, with its transition in S (to 0), and including the wildcard schema {#, #, #, 1, 0, :, 0} in the original P that was not redescribed into any two-symbol schema. For various subsets of size $m = 2$, the criterion that fails is the one specified in step (18) of fgi, where the algorithm checks whether the schemata in the current matrix A are all contained in at least one element of $F''$.