Supplement S1

Preparation of Proteins Preparation of Nanobodies Enhancer and Modified Enhancer Both nanobody constructs were cloned into a pHEN6 vector and harbor a pelB leader sequence for periplasmic export and a C-terminal Hexa-His-Tag for purification, followed by a terminal Cysteine for covalent, site directed coupling of the protein. For expression, a 5l E. coli JM109 culture was induced with 0.5mM isopropyl β-D-1-thiogalactopyranoside and incubated for 5 hours at 30°C. Cells were lysed by sonification in buffer containing 1xPBS pH 8.0, 0.5M NaCl, 20mM imidazole, 1mM PMSF and 10 g/l lysozyme. After centrifugation, the nanobody constructs in the soluble fraction were purified by immobilized metal affinity chromatography (IMAC) on prepacked 1ml HisTrap HP columns with an Äkta Explorer HPLC system (GE Healthcare, Freiburg, Germany) according to manufacturer’s instructions. The elution fractions were analyzed by SDS-PAGE. Purified nanobody fractions were pooled and dialysed overnight into 1xPBS, flash-frozen and stored at -80°C at concentrations of 21μM (Modified Enhancer) and 35μM (Enhancer).

With the following definition of the Hill function, applied multiplicatively for each repressor.
Assume an input M 1 mediating, under the quasi-static assumption, the output M 2 through the Hill function Let M 1 be an decaying exponentially gradient of the form is the traveling wave shown in figure S1. The necessary condition αC 1 T , corresponds to a system initially past the saturation point of the Hill function and ensures that cells initially start with the on-phase of the traveling wave. The constant C 4 sets the minimum concentration of the traveling wave and allowing for non-zero steady-states in the case of activating morphogens. The parameters of figure S1 are given in table S3 .5 C 4 0 Table 3: Parameters used to generate the traveling wave of figure S1  | v + −˙ x| < 0.02 and the gray points correspond to an exponentially decaying gene G i−1 = e −t+t 0 and the two consecutive genes slaved to the decay by the equations (with Hill replaced by the shorter notation H)

and the valley starts
Because G i−1 (t) is a known function and we replace the full time dependence of G i on G i+1 and G i+1 on G i by their values at time t 1 , we obtain two first order ODE in one variable, the solution of which is an hypergeometric function.
Because the valley lives on a 2D plane between G i and G i+1 , not all slow points lie on the valley. As the system approaches the valley, G i−1 is strongly repressed by G i and therefore decays exponentially following e −t+t 0 where t 0 is a constant which shifts the curve in time according to which peak is fit. This exponential decay happens perpendicularly to the plane of the valley. As G i−1 reaches 0, the points agree with the valley condition. This exponential decay is modeled by the gray lines and it is the transition between two valleys. On the valley, G i slowly decreases while G i+1 slowly increases. Once G i+1 reaches θ i,i+1 = 0.4 it strongly represses G i which in turn exponentially decays following e −t+t 0 . The cycle thus continues.
The cell environment is intrinsically noisy because only a finite number of molecules, η, are interacting. To have a useful model, it needs to be robust to noise, or at least we must offer an argument for how it can be made more robust. There is an obvious problem when considering a system that evolves in time stochastically: time compounds noise, especially in system with sharp transitions such as those we present. The fact that there is a bifurcation with a time scale finely controlled by the proximity to a critical value also means that there is a trade off between noise robustness and the ability to produce domains of variable length. Using the theory of the chemical Langevin equation which we use to numerically simulate noise using the τ Leaping algorithm [2], we show how the introduction of a cell coupling rescues patterning even in very noisy environments (η ∼ 500). Figures S2 and S3 show ensemble averages of 100 independent trials at a given noise level, controlled by η, the number of molecules in a given volume. The average time course resembles the deterministic time course and in the η → ∞ we retrieve the deterministic solution. However, for smaller η, the peaks' amplitudes of gene expression are decaying as a function of time, as the variance increase. As time goes on, the loss of predictive power makes it impossible to predict which state will be expressed at some later time t. Due to the trade off between the critical behaviour at the bifurcation, this worsens as T iM is adjusted to lengthen the domains. To remedy this problem, we therefore propose a model by which cells interact with each other to reduce noise.

Cell averaging
We simulate cell development with a two dimensional cell array. Each cell C mn is exposed to a concentration of morphogen M mn which varies in time and space. We model M mn as a traveling wave propagating along the x axis (n index) so that each cell along the y axis (m index) is exposed to the same concentration of M at a given time.
We postulate cell-to-cell interactions. To model cell interactions we will add interaction rates R to our set of coupled ODEs. The resulting system is not cell autonomous and requires keeping track of every cell individually. To motivate the form of our interaction term, we note that it is known that cells interact with each other through many means such as simple diffusion, endocytosis, the use of transport proteins such proteoglycans or signaling pathways (e.g. Notch). Rather than modeling a particular form of cell-to-cell interaction, we take a phenomenological effective term. We model interactions between two cells using the rate function are the concentration of the gene G i in the two different cells indexed by (m, n), (p, q) in the cell array, ξ int is the length of interaction sets a concentration scale 1 past which the cells do not interact and σ int is the strength of interaction which plays the role of the rate at which the interaction happens.
The purpose of this interaction is to average the difference between two interacting cells. When C mn [G i ] − C pq [G i ] is zero, the function returns zero and there is no interaction. When |C mn [G i ] − C pq [G i ]| ξ int the cells are deemed dissimilar and they do not interact ( Figure 4C of main text).
ξ int , the cells interact in such a way as to return to their mean. If C mn [G i ] > C pq [G i ], R will be an effective degradation rate and if C mn [G i ] < C pq [G i ], R will be an effective production rate. The function R is plotted in figure 4D of the main text. By construction, for reasonable values of ξ int , σ int , R cannot change the deterministic steady state because it vanishes at zero and for values much greater than ξ int .
To obtain a local interaction at the level of the embryo, we assume all cells interact with each other with an interaction strength that falls off exponentially with their distance. Define to be the cartesian distance between the two cells at position (m, n) and (p, q) respectively, in the array. Then they interact with an interaction Finally, the deterministic coupled ODEs that represent these interactions is given by Eq. 6 where T mn is the exposure time which depends on cell position (m, n) and Hill(G, θ, h) is the Hill function as defined in eq. 1. The chemical Langevin equation is implemented according to the τ -leaping treating the production and degradation terms as noisy reactions. The interaction term is assumed to be exact.

In Silico Evolution selects for ξ int , σ int
The interaction parameters ξ int , σ int have to be carefully chosen to counteract two opposing forces. If the parameters are too small, the averaging effect will be too weak and there will be little improvement in noise robustness. If the parameters are too big, then cells which have been exposed for shorter period of time and have therefore achieved steady state will bias the expression of cells which are still deciding their fate.
To find suitable values for the interaction parameters, we run the an evolution algorithm. To do so, we create a population of embryos (i.e. cell arrays equipped with the GRN) each with different parameters ξ int , σ int . We integrate the the embryo once deterministically without interaction terms to getC mn [G i ](t) and multiple times stochastically with the interaction term to get each time C mn [G i ](t). Embryos are then ranked according to their fitness given by Eq. 7 which measures the RMS deviation from the deterministic timecourse. Survival of the fittest dictates that embryos with bad fitness are replaced by embryos with better fitness. The population is then mutated by slightly changing ξ int and σ int and keeping all other parameters constant.
For the simulation, we used η = 500, N rows = 100, N cols = 30, N genes = 5, N t = 5000, N tries = 5. We set θ iS = 1.59 which gives a long timescale. The best parameters were found to be ξ int = 0.0588 and σ int = 0.0300. Figure 4 of the main text shows the ensemble average of the 30 cells per row for different columns for both the uncoupled and the coupled system. We conclude that cell averaging preserves the timescale and the predictive power which would otherwise be lost to noise.