
Noncommutative Biology: Sequential Regulation of Complex Networks

  • William Letsou,

    Affiliation Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America

  • Long Cai

    lcai@caltech.edu

    Affiliation Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America


Abstract

Single-cell variability in gene expression is important for generating distinct cell types, but it is unclear how cells use the same set of regulatory molecules to specifically control similarly regulated genes. While combinatorial binding of transcription factors at promoters has been proposed as a solution for cell-type specific gene expression, we found that such models resulted in substantial information bottlenecks. We sought to understand the consequences of adopting sequential logic wherein the time-ordering of factors informs the final outcome. We showed that with noncommutative control, it is possible to independently control targets that would otherwise be activated simultaneously using combinatorial logic. Consequently, sequential logic overcomes the information bottleneck inherent in complex networks. We derived scaling laws for two noncommutative models of regulation, motivated by phosphorylation/neural networks and chromosome folding, respectively, and showed that they scale super-exponentially in the number of regulators. We also showed that specificity in control is robust to the loss of a regulator. Lastly, we connected these theoretical results to real biological networks that demonstrate specificity in the context of promiscuity. These results show that achieving a desired outcome often necessitates roundabout steps.

Author Summary

DNA is the blueprint of life. Yet the order in which a cell follows these instructions makes it capable of generating thousands of different fates. How this information is extracted from underlying gene regulatory networks is unclear, especially given that biological networks are highly interconnected, and that the number of signaling pathways is relatively small (approximately 5–10). The conventional approach for increasing the information capacity of a limited set of regulators is to use them in combination. Surprisingly, combinatorial logic does not increase the diversity of target configurations or cell fates, but instead causes information bottlenecks. A different approach, called sequential logic, uses noncommutative sequences of a small set of regulators to drive networks to a large number of novel configurations. If certain targets are first protected, then even promiscuous regulators can activate specific subsets of lineage-specific targets. In this paper we show how sequential logic outperforms combinatorial logic, and argue that noncommutative sequences underlie a number of cases of biological regulation, e.g. how a small number of signaling pathways generates a large diversity of cell types in development. In addition to explaining biological networks, sequential logic may be a general experimental design strategy in synthetic and single-cell biology.

Introduction

A fundamental question in systems biology is how a small number of signaling inputs specifies a large number of cell fates through the coordinated expression of thousands of genes. This problem is especially challenging given that gene regulatory and other types of networks in biology tend to be highly interconnected and their regulators promiscuous, with regulators affecting multiple targets and targets being affected by multiple regulators. Examples of this architecture include: transcription factor binding networks in bacteria [1], yeast [2, 3], plants [4], and animals [5, 6]; cellular signaling pathways involved in growth and differentiation [7–9]; the interactome of protein kinases and phosphatases [10, 11]; and synaptic connections between different layers of the brain [12]. Furthermore, because the targets and regulators are often well-mixed and mutually accessible in the cell, most actions are likely to have nonspecific and undesired effects.

At the same time, regulatory molecules drive networks to a large number of highly specific outcomes or cell fates. Although there are approximately four hundred canonical cell types in the adult human [13], recent single-cell RNA expression profiling experiments in the developing embryo [14, 15], brain [16], hematopoietic system [17, 18], and other organs [19, 20], have indicated that there may be thousands more.

Given there are only a few signaling pathways used in metazoan development [21, 22], understanding how cells reach their final outcomes when there are fewer regulators than fates and/or targets is an unsolved problem. One extensively studied solution for the control of promiscuous gene networks is combinatorial binding of DNA-binding transcription factors (TFs) at the promoter [23–31]. At the level of individual promoters, combinatorial binding ensures that individual genes are ON only when specific combinations of TFs are present (Fig 1A). However, on the genome level, combinatorial regulation restricts which sets of genes may be ON at the same time. For example, using AND logic, gene H in Fig 1A is only ON in the case that the three TFs K1, K2, and K3 are concurrent at the H promoter; but these stringent requirements mean that H can never be transcribed independently of the less highly regulated genes A-G. (A similar conclusion holds for OR logic.)

Fig 1. Combinatorial logic bottlenecks information flow in networks.

(A) The number of ways that three TFs (K1, K2, K3) can be ON or OFF (tabulated at right) is the same as the number of ways they can bind at promoters (left). An equal number of gene expression states are observed whether the TFs use AND logic (requiring all factors be present) or OR logic (requiring at least one of the factors). (B) Signal-to-target information flow is bottlenecked by regulators if (i) the regulators respond to multiple targets, or (ii) the signals activate multiple regulators. The allowed target states are tabulated for signals using AND logic and regulators using AND/OR logic. (C) A feedback loop causes constitutive activation of a regulator (K1) and leads to fewer accessible configurations (tabulated at right).

https://doi.org/10.1371/journal.pcbi.1005089.g001

In fact, using combinatorial control, there is a one-to-one correspondence between configurations of the targets and configurations of the regulators. As shown in Fig 1A, the ON/OFF states of 3 TFs uniquely define the binding combinations at $2^3 = 8$ promoters. A similar conclusion holds when the regulators are expressed in a graded fashion.
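The one-to-one correspondence can be checked by direct enumeration. The sketch below is illustrative only: it assumes a hypothetical layout with one promoter per subset of the three TFs (eight promoters in all), which may differ in detail from the layout drawn in Fig 1A, and the names `PROMOTERS`, `expression`, and `reachable` are introduced here for the example.

```python
from itertools import combinations, product

TFS = ("K1", "K2", "K3")

# Assumed layout: one promoter per subset of TFs (including the empty subset).
PROMOTERS = [frozenset(c) for r in range(len(TFS) + 1) for c in combinations(TFS, r)]

def expression(active, logic):
    """ON/OFF state of every promoter for a given set of active TFs."""
    if logic == "AND":
        # A gene is ON only if every TF bound at its promoter is present.
        return tuple(int(sites <= active) for sites in PROMOTERS)
    # OR logic: a gene is ON if at least one of its TFs is present.
    return tuple(int(bool(sites & active)) for sites in PROMOTERS)

def reachable(logic):
    """All distinct expression states obtained by switching the TFs ON or OFF."""
    states = set()
    for bits in product((0, 1), repeat=len(TFS)):
        active = frozenset(tf for tf, b in zip(TFS, bits) if b)
        states.add(expression(active, logic))
    return states

n_and = len(reachable("AND"))
n_or = len(reachable("OR"))
# Reachable states under AND, under OR, and the size of the full ON/OFF space.
print(n_and, n_or, 2 ** len(PROMOTERS))
```

Under this layout, both logics reach only as many expression states as there are TF states, far fewer than the number of ON/OFF states of eight genes.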

This one-to-one correspondence is the fundamental limitation of combinatorial regulation: it requires an equal number of regulators and independently controlled targets and/or cell fates. Applied to embryonic development, combinatorial control requires that hundreds or thousands of cell-type specific TF combinations be generated in a spatially precise manner at the start. However, the combinatorial scheme does not explain how the TF states are regulated in the first place, and thus it offers no new insight into how cell fate is specified.

The limitations of combinatorial logic can also be understood from an information theoretic point of view. In particular, it is impossible to specify arbitrary cell fates if the regulatory layer bottlenecks the capacity of the targets to receive messages from extracellular signals. It is known that some ten to twenty types of signals [21, 22] converge onto membrane-bound regulators in many different combinations, permitting messages to be passed to the downstream targets. Much of this information stands to be lost, however, if the network relies on combinatorial logic alone: the regulatory layer simply cannot transmit messages in their entirety if there are more signals than regulators. Thus, combinatorial logic strongly circumscribes what fates are ultimately reachable. Cell fate information is lost not only if the signals are more numerous than the regulators, but also if the connections between signals and regulators are promiscuous (Fig 1B). When different signals activate the same regulators (Fig 1Bi), certain signaling inputs become redundant. On the other hand, when the same signal activates different regulators (Fig 1Bii), some of the regulators become redundant. One may determine by direct enumeration exactly how redundancy decreases the number of configurations available to the targets (Materials and Methods Sections 1 and 2). These preliminary conclusions are at odds with the observation that signaling molecules are deployed over time in a complex code [32]. How do these messages in the signal space reach the targets if the regulatory layer imposes a bottleneck on information flow?

In addition, feedback regulation—a common feature of regulatory networks—exacerbates information bottlenecks when coupled with combinatorial logic. Stated another way, feedback merely widens the basin of attraction of certain promoter configurations at the expense of the number of distinct configurations. In Fig 1C, constitutive expression of K1 by C means that C is never ON independently of the targets regulated by K1. Thus, the number of accessible configurations decreases from 8 to 6 without allowing new target configurations to be explored.

We need an alternative to combinatorial logic in cell fate specification that overcomes information bottlenecks. Here, we considered time-ordered control schemes, which we refer to as sequential logic. In this scheme, regulators can be applied in a stepwise manner; the entire sequence matters, so the final configurations can differ if the same regulators are permuted in time. In order for different temporal sequences to carry distinct information, the actions of the regulators must be noncommutative. This is the case, for example, when a regulator protects its targets from the action of another regulator, as when loci recruited to repressive chromatin compartments are protected from further modification [33, 34].

While it is not surprising that noncommutative sequences like this result in different outcomes at the single promoter level, these simple mechanisms may have nontrivial implications for regulation at the genome level. In particular, noncommutativity permits the same regulators to be used at different times with distinct effects. This is seen in development when ubiquitous signaling molecules like FGF family members exert different effects depending on the time and context of their expression [35–38]. Reuse of factors could greatly expand the information capacity of the major signaling pathways.

A number of examples show that noncommutativity may be a general strategy in other areas of biology. In hematopoietic stem cells, activation of GATA2 and C/EBPα in different orders results in different cell fates [39]. In neurobiology, different temporal orderings of the same inputs lead to distinct firing patterns [40–42]. In the field of synthetic biology, a DNA switch was developed that could detect the order in which invertase enzymes were applied [43]. And in evolutionary biology, the order in which mutations arise was recently implicated in determining a genotype’s fitness [44–47]. There is also accumulating evidence for sequential logic in transcriptional control: signaling molecules and TFs in mammalian cells, including ERK [48], NF-κB [49, 50], p53 [51], as well as in yeast [52–54] have been observed to pulse, suggesting that TF timing may be used to control the transcriptional state of the cell.

By applying sequential logic, we show that, even in complex and promiscuously regulated networks, specific target configurations can be reached using a temporal sequence of regulators. In particular, we consider two models inspired by (i) kinase/neural networks and (ii) chromosome folding and show analytically that both scale super-exponentially. We further show that noncommutative networks are robust to the loss of regulators, suggesting a mechanism for regulator evolution. We also show that regulators induce different orbits in expression space, which is related to the number of networks that can be controlled in parallel. We conclude by discussing how these models apply to interconnected networks in and outside biology and by providing possible experimental tests of the theoretical concepts. Theorems and proofs are given in the Materials and Methods.

Results

A time-sequence ratchet model generates more diversity than combinatorial logic in multiply-connected networks

To consider how time-ordered sequences of regulators can specifically control groups of targets, we begin by analyzing a generic two-layer network that is an extension of combinatorial logic (Fig 2, Materials and Methods Sections 1 and 2). In this model, each regulator controls multiple targets, and each target is accessible to any of its regulators. The model is meant to be analogous to the cellular environment wherein regulators and targets are well-mixed. For example, targets could be substrate proteins capable of multi-site phosphorylation [55, 56], and regulators the kinases and phosphatases. Targets could also be neurons and regulators their upstream excitatory and inhibitory inputs [12]. We denote by K the set of activators (i.e. kinases) and P the set of deactivators (i.e. phosphatases). Each target has a ladder of (integer-valued) states, and together the states of the targets are a configuration of the network. (This distinction is in contrast to the common usage of “state” as a gene expression vector.) An additional parameter, the threshold T, determines the number of rungs on the ladder. Regulators ratchet the targets through their states, and only targets that have reached threshold will be ON at the end of a sequence of regulators. If each target in the group can be controlled by a unique combination of K’s and P’s, what ON/OFF configurations are possible?

Fig 2. The ratchet model attains configurations not reachable by combinatorial logic.

(A) The ratchet model for n = m = 2 and ln = lm = 1. Activators (K’s) and deactivators (P’s) turn targets ON and OFF, respectively. (B) An example temporal sequence for the network with a threshold equal to 1. Black targets are in the 1 state. (C) An example sequence for the same network with threshold equal to 2. Gray targets are in the 1 state, and black targets are in the 2 state. (D) Scaling laws for the threshold T = 1 (red) and T = 2 (yellow) are shown for symmetric networks (n = m). A comparison to combinatorial logic with an equivalent number of regulators (n + m) is shown in blue.

https://doi.org/10.1371/journal.pcbi.1005089.g002

In this model, termed the ratchet network (Fig 2A), each of n K’s and m P’s controls unique targets, with the connectivity parameters ln and lm specifying the number of regulators to which each target connects. Consider the sequence K1 K2 P1 acting on the targets A, B, C, and D (Fig 2B). In the final configuration, B and D are ON together even though no single K connects to both, and A and C are OFF, even though both share an activator with B and D. Therefore, this simple model illustrates the important point that similarly regulated targets can be independently controlled using sequential logic.

With threshold T = 1, not all configurations are reachable. Observe that there is no way to specifically activate A and D while leaving B and C OFF. This result is surprising given that A and D share no regulators: specificity depends on the network as a whole, not just individual targets. By going to T = 2, the forbidden configuration becomes accessible (Fig 2C), along with all ON/OFF states (below).
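The T = 1 behaviour described above can be reproduced with a small breadth-first search. This is a sketch under stated assumptions: the network starts with all targets OFF, K_i sets row i of a 2 × 2 state matrix to 1, P_j clears column j, and the targets are laid out as A, B in the first row and C, D in the second. That labeling is a guess chosen to agree with the K1 K2 P1 example, not something read off Fig 2. Indices are 0-based in the code.

```python
from collections import deque

N_K, N_P, T = 2, 2, 1  # two activators, two deactivators, threshold 1

def apply(config, op):
    """Apply one regulator, e.g. ("K", 0) or ("P", 1), to a tuple-of-tuples matrix."""
    kind, idx = op
    rows = [list(r) for r in config]
    for i in range(N_K):
        for j in range(N_P):
            if kind == "K" and i == idx:
                rows[i][j] = min(rows[i][j] + 1, T)   # raise row idx, saturating at T
            elif kind == "P" and j == idx:
                rows[i][j] = max(rows[i][j] - 1, 0)   # lower column idx, floor at 0
    return tuple(tuple(r) for r in rows)

OPS = [("K", i) for i in range(N_K)] + [("P", j) for j in range(N_P)]
START = tuple(tuple(0 for _ in range(N_P)) for _ in range(N_K))

def reachable_configs():
    """All configurations reachable from START by any sequence of regulators."""
    seen, queue = {START}, deque([START])
    while queue:
        cur = queue.popleft()
        for op in OPS:
            nxt = apply(cur, op)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# The sequence K1 K2 P1 from the text.
final = START
for op in [("K", 0), ("K", 1), ("P", 0)]:
    final = apply(final, op)

configs = reachable_configs()
print(final)                              # assumed layout: ((A, B), (C, D))
print(len(configs))                       # number of reachable configurations
print(((1, 0), (0, 1)) in configs)        # is "A and D ON, B and C OFF" reachable?
```

In this sketch every regulator that changes the matrix leaves behind either a full row or an empty column, which is why the diagonal configuration cannot be produced.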

A combinatorial formulation of the model as a connectivity matrix

The model described above can be formalized as a combinatorial object that we refer to as the connectivity matrix A. This formulation is useful because it is amenable to studying scaling, and it permits a direct comparison between noncommutative ratchet networks and standard combinatorial logic. For the interested reader, the models considered in this paper have a universal formulation as noncommutative matrix operators on the vector space of configurations (Materials and Methods Section 9).

Typically, the state of N targets is represented as an N-dimensional vector. If each target is controlled by a unique (Ki, Pj) pair (i.e. ln = lm = 1), the N = nm-dimensional vector can be re-formulated as an n × m matrix

$$A = \begin{pmatrix} A_{1,1} & \cdots & A_{1,m} \\ \vdots & \ddots & \vdots \\ A_{n,1} & \cdots & A_{n,m} \end{pmatrix} \tag{1}$$

where each entry Ai,j ∈ {0, 1, …, T} is the state of the target regulated by Ki and Pj. For example, the connectivity matrix for the network in Fig 2 is given by Eq (2).

In general, a regulator may connect to multiple targets (i.e. ln, lm > 1, see below), so that each entry of A may be thought of as an M-dimensional vector (M determined in Materials and Methods Section 1). It turns out that this is an unnecessary complication; we instead let each Ai,j = 1 if at least one of the M targets regulated by Ki and Pj is ON, and Ai, j = 0 only if all M targets are OFF.

In this formulation Ki and Pj are raising and lowering operators that map n × m matrices to n × m matrices via the rules

$$K_i:\ A_{i,j} \mapsto \min(A_{i,j} + 1,\ T)\ \text{ for all } j, \qquad P_j:\ A_{i,j} \mapsto \max(A_{i,j} - 1,\ 0)\ \text{ for all } i. \tag{3}$$

From Eq (3), any sequence $K_{i_1} K_{i_2} \cdots K_{i_k}$ consisting only of K’s is commutative, because any target controlled by t ≤ k of the K’s will be in state min(t, T) at the end of the sequence, regardless of the order. A similar argument holds for the P’s. However, sequences consisting of both K’s and P’s are in general noncommutative. This is due to edge effects when Ai,j = 0 or T. If Ai,j = T, for example, then Ki Pj results in Ai,j = T − 1, whereas Pj Ki gives Ai,j = T. Therefore, A gives insight into both the configuration of the targets and the noncommutativity of the regulators.
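The edge effect can be checked on a single target. The sketch below assumes that each K raises a state by one rung up to T and each P lowers it by one rung down to 0, which is one reading of the rules consistent with the example in the text. The function names `K` and `P` are shorthand introduced for this illustration, and sequences are read left to right.

```python
def K(a, T):
    """Activator acting on one target state: raise by one rung, saturating at T."""
    return min(a + 1, T)

def P(a, T):
    """Deactivator acting on one target state: lower by one rung, with a floor at 0."""
    return max(a - 1, 0)

T = 2
for a in range(T + 1):
    kp = P(K(a, T), T)   # sequence "K P": K acts first, then P
    pk = K(P(a, T), T)   # sequence "P K": P acts first, then K
    print(a, kp, pk, "commute" if kp == pk else "differ")
```

Only the interior state commutes; at the two ends of the ladder the order of K and P changes the outcome.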

The problem of determining the number of accessible configurations in a network is reduced to finding the number of matrices satisfying certain patterns. For example, combinatorial logic with T = 1 corresponds to the special case in which the only sequences are the $2^n$ combinations of the n K’s. In an n × 1 connectivity matrix, activating Ki corresponds to turning all 0’s in row i into 1’s. There are $2^n$ matrices generated by this procedure. More complicated cases of combinatorial logic can be studied this way (Materials and Methods Section 2), but it turns out that the total number of network configurations is always less than $2^{n+m}$, with n + m the total number of regulators. This is important because noncommutative models can bypass the exponential limit.

The ratchet model scales as the poly-Bernoulli numbers

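We used the connectivity matrix representation of the ratchet network to determine the scaling as a function of the number of regulators n and m, with each target connected to a unique (K, P) pair (i.e. ln = lm = 1) and the threshold T = 1. Ki turns 0’s to 1’s in row i and Pj turns 1’s to 0’s in column j. The rules are consistent with the one-pot reaction model in which all substrates receptive to Ki are promoted when Ki is active. For example, starting from all targets OFF, the sequence K1 K2 P1 in Fig 2B can be recast as

$$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \xrightarrow{K_1} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \xrightarrow{K_2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \xrightarrow{P_1} \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} \tag{4}$$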

The main result is that A must avoid the patterns $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ in any 2 × 2 sub-block (Materials and Methods Section 3). Brewbaker [57] enumerated the n × m binary matrices avoiding these patterns and showed that they scale as the poly-Bernoulli numbers [58]

$$B_n^{(-m)} = \sum_{j=0}^{\min(n,m)} (j!)^2\, S(n+1,\, j+1)\, S(m+1,\, j+1) \tag{5}$$

where S(a, b) is a Stirling number of the second kind, defined combinatorially as the number of ways to put a labelled balls into b unlabelled boxes such that no box is empty [59]. These numbers scale not quite as fast as $2^N = 2^{nm}$, but much faster than $2^{n+m}$, the maximum number of states in the equivalent combinatorial network (Fig 2D). Thus, a simple time-sequence model is able to generate super-exponential scaling.
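The scaling comparison can be made concrete by computing the poly-Bernoulli numbers from Stirling numbers of the second kind and cross-checking them against a brute-force count for small matrices. This is a sketch rather than the authors' calculation: it uses the standard closed form for poly-Bernoulli numbers with negative upper index, and the brute-force routine treats a "sub-block" as any choice of two rows and two columns, which is an assumption about the intended meaning.

```python
from functools import lru_cache
from itertools import combinations, product
from math import factorial

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind: partitions of n labelled items into k non-empty blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def poly_bernoulli(n, m):
    """B_n^(-m) as the sum over j of (j!)^2 * S(n+1, j+1) * S(m+1, j+1)."""
    return sum(factorial(j) ** 2 * stirling2(n + 1, j + 1) * stirling2(m + 1, j + 1)
               for j in range(min(n, m) + 1))

FORBIDDEN = {((1, 0), (0, 1)), ((0, 1), (1, 0))}

def count_pattern_avoiding(n, m):
    """Brute-force count of n x m binary matrices with no forbidden 2 x 2 submatrix."""
    total = 0
    for bits in product((0, 1), repeat=n * m):
        rows = [bits[i * m:(i + 1) * m] for i in range(n)]
        ok = all(((rows[r1][c1], rows[r1][c2]), (rows[r2][c1], rows[r2][c2])) not in FORBIDDEN
                 for r1, r2 in combinations(range(n), 2)
                 for c1, c2 in combinations(range(m), 2))
        total += ok
    return total

# Columns: n (= m), combinatorial bound 2^(n+m), poly-Bernoulli count, full space 2^(nm).
for n in range(1, 6):
    print(n, 2 ** (2 * n), poly_bernoulli(n, n), 2 ** (n * n))
```

For symmetric networks the poly-Bernoulli count overtakes the combinatorial bound as n grows while staying below the size of the full ON/OFF space.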

All binary ON/OFF states are reachable for an increased threshold

Are more configurations accessible if multiple activation events are needed before reaching threshold? For example, neurons require the summation of multiple excitatory inputs to reach action potential, and proteins need to be phosphorylated at multiple sites before they are activated [55, 56]. We found that by increasing the threshold to T = 2, all $2^N$ ON/OFF configurations of the N targets become reachable. In the connectivity matrix formulation, the patterns $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ are no longer forbidden, which we show with an inductive proof (Materials and Methods Section 4). This scaling law (Fig 2D) achieves the maximum of reachability and specificity; it far exceeds the scaling $2^{n+m}$ of the combinatorial model.
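The effect of raising the threshold can be explored with the same kind of search used for T = 1. The sketch below assumes that a target counts as ON only when its state equals T, that K_i raises every entry of row i by one rung, and that P_j lowers every entry of column j by one rung. The function name `reachable_on_patterns` is introduced here for illustration.

```python
from collections import deque

def reachable_on_patterns(n, m, T):
    """ON/OFF patterns reachable from the all-zero matrix; a target is ON iff its state equals T."""
    start = tuple([0] * (n * m))          # row-major flattening of the n x m matrix
    seen, queue = {start}, deque([start])
    while queue:
        cur = queue.popleft()
        successors = []
        for i in range(n):                # K_i raises every entry in row i, saturating at T
            successors.append(tuple(min(v + 1, T) if idx // m == i else v
                                    for idx, v in enumerate(cur)))
        for j in range(m):                # P_j lowers every entry in column j, floor at 0
            successors.append(tuple(max(v - 1, 0) if idx % m == j else v
                                    for idx, v in enumerate(cur)))
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {tuple(int(v == T) for v in state) for state in seen}

for T in (1, 2):
    patterns = reachable_on_patterns(2, 2, T)
    print(T, len(patterns), "of", 2 ** 4)
```

For the 2 × 2 network, one sequence that reaches the previously forbidden diagonal pattern under these assumptions is K1 K1 P2 P2 K2 K2 P1 K1.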

Being able to reach the entire ON/OFF space of N targets is overkill for most biological networks, which only display a relatively small number of stable configurations. The major implication of this result is that multiple levels of activity permit more targets to be controlled independently.

Increased regulatory connectivity generates robustness

As sequential logic allows a large number of configurations to be reached in a complex network, we asked whether increasing the connectivity of the network (ln and lm) can maintain the specificity of the network while making it robust to the loss of a regulator. This is potentially relevant to evolution of biological networks, because redundant connections allow the network to repurpose regulators for new functions without severely impairing existing ones [60].

In the ratchet model, an increase in the connectivity parameters to ln = 2 K’s and lm = 2 P’s permits multiple targets to share a common (K, P) pair (Fig 3A). The connectivity matrix incorporating the extra links in the network in Fig 3A is given by Eq (6). Now that each entry of A is a group of M > 1 targets, it makes sense to track the state of the group as a whole with a single number Ai,j. Even though a target appears in multiple entries of A, the rules prevent a regulator from altering the state of groups at remote locations (e.g. K1 cannot change the state of the group at A2,2).

Fig 3. Multiple connections in the ratchet network decrease the number of configurations.

(A) An example network where each target has ln = 2 connections to the K’s (red) and lm = 2 connections to the P’s (blue). (B) A list of the minimal length sequences generating unique configurations in the network when ln = lm = 1. Red bars are K actions and blue bars are P actions. (C) The list of minimal length sequences when ln = lm = 2. Some sequences now map to the same configuration. (D) Analytical solution for the number of sequences as a function of n = m for different ln = lm families.

https://doi.org/10.1371/journal.pcbi.1005089.g003

We prove in the Materials and Methods that all sequences using at least nln + 1 K’s and mlm P’s are redundant with shorter sequences (Fig 3B and 3C, Materials and Methods Section 5). For example, the sequence K1 K2 K3 is required to turn ON all targets in the case ln = lm = 1, but if ln = lm = 2, the shorter sequences K1 K2, K1 K3, and K2 K3 have the same effect. We derived a recursive formula that eliminates the redundant sequences in each (n, m, ln, lm) instance to derive the number of sequences in (n, m, ln + 1, lm) and (n, m, ln, lm + 1) (Fig 3D and S2 Fig). The formula agreed exactly with an algorithm designed to find all minimal length sequences (Materials and Methods Section 5). Notably, increasing ln, lm reduced the number of configurations. We observed a similar effect in combinatorial logic (S1 Fig).

To investigate the robustness of sequential logic networks, we studied the effect of deleting regulators in increasingly connected networks on the number of reachable configurations (Fig 4A). We hypothesized that sequences that activate similar subsets of targets should be able to recoup permanently lost configurations. To test this, we computed the normalized correlation coefficient between configurations in the network using all K’s (the full network) and configurations in the network without K1 (the impaired network), subject to the constraint that those configurations were reached using longer sequences (Fig 4B). To focus on the recoverable fraction, we deleted all configurations that had an exact match. Highly similar configurations (yellow) clustered to the right of the plot, indicating that longer sequences can be used to recover lost configurations.
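The comparison between full and impaired networks can be sketched as follows. This is not the authors' code: it assumes that the "normalized correlation coefficient" is a Pearson correlation between binary configuration vectors, and it uses small made-up configuration lists purely to exercise the functions. The names `pearson` and `recoverable_fraction` are hypothetical.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation of two equal-length vectors; defined as 0.0 if either is constant."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0

def recoverable_fraction(full, impaired, cutoff=0.8):
    """Fraction of lost configurations whose best match in the impaired network exceeds cutoff."""
    impaired_set = set(impaired)
    lost = [c for c in full if c not in impaired_set]      # drop exact matches
    if not lost:
        return 0.0
    best = [max(pearson(c, d) for d in impaired) for c in lost]
    return sum(b > cutoff for b in best) / len(lost)

# Toy data (made up for illustration): ten targets per configuration.
d1 = (1, 1, 1, 1, 1, 1, 0, 0, 0, 0)
d2 = (0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
c1 = (1, 1, 1, 1, 1, 0, 0, 0, 0, 0)   # lost, but close to d1
c2 = (1, 0, 1, 0, 1, 0, 1, 0, 1, 0)   # lost, with no close match
full_configs = [c1, c2, d1]
impaired_configs = [d1, d2]

print(round(pearson(c1, d1), 4))
print(recoverable_fraction(full_configs, impaired_configs))
```

The cutoff of 0.8 mirrors the similarity cutoff quoted in the Fig 4 caption; any other choice of similarity measure would change the numbers but not the structure of the comparison.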

Fig 4. The ratchet network is robust to loss of a regulator.

(A) A schematic illustration of the experiment. The regulator K1 was deleted from networks with m = 2 P’s and variable n for different values of the connectivity ln. The resulting number of configurations was computed by simulation. (B) Correlation coefficient between configurations in the full network (all K’s; rows) and the impaired network (without K1; columns). All rows with exact matches were deleted. (C) Cumulative distribution F(x) of the maximum correlation coefficient x for each row in B for different values of ln. The dashed line is the similarity cutoff 0.8. (D) Tradeoff between reachability and robustness. The number of reachable configurations as a function of (n, ln) is plotted vs. the fraction of states above the similarity cutoff 0.8 (i.e. 1 − F(0.8)) for different values of n.

https://doi.org/10.1371/journal.pcbi.1005089.g004

How similar are the recouped configurations? As connectivity increased, the maximum similarity became increasingly concentrated above 0.8 (Fig 4C). There is generally a tradeoff between reachability and the size of the fraction above 0.8 (Fig 4D). The tradeoff is nonlinear, however: using ln = 2 gave the greatest increase in recoverability for the smallest loss of configurations, showing that an intermediate level of redundancy can buffer the network against loss of regulators. The above analyses demonstrate that specificity of control is not compromised when regulators are lost or repurposed in heavily interconnected networks.

Sequestration networks generate diversity through protected states

In the ratchet model, all targets are accessible to their regulators at all times. However, in some cases targets may be shielded from regulators: for example, genes can be silenced by sequestration in various nuclear compartments [61, 62]. This was seen in a landmark study by Filion et al [63], who used a DNAse accessibility assay to show that genes associate with different regulators depending on their chromatin “color” or accessibility status.

To study the effect of accessibility and silencing on activating specific subsets of genes, we constructed the following sequestration model. In addition to the OFF state 0 and the ON state 1, each target/gene is endowed with additional orthogonal states 2 to n (allowing for a total of $2^{n-1} - 1$ genes). If RNA polymerase (RNAP) is associated with K1, what genes can be independently activated? In this model (Fig 5) a regulator Ki promotes targets in the 0 state to state i, and Pi returns targets in state i to 0. Any target in state i is protected from regulators other than Pi. As an example of gene regulation on a three-dimensional chromosome (Fig 5A), the sequence K3 K4 K1 P3 P4 first clusters all genes having a 3 in a repressive compartment, and then the remaining genes having a 4 in another repressive compartment. The net effect is that RNAP can only act on the gene represented by {1, 2}. We represent this abstractly as a configuration vector of k-armed targets (Fig 5B), where each entry corresponds to the state {0, 1, …, n} of a gene able to access k ≤ n of the states (see below for a mathematical description of the model). Therefore, protected states in the sequestration model allow genes to be transcribed specifically in a well-mixed environment.
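The sequence K3 K4 K1 P3 P4 can be traced step by step. The sketch below encodes the rules as stated in the paragraph above (Ki moves unprotected targets that carry arm i into state i; Pi releases targets held in state i) and assumes that the network contains every target that has arm 1 plus at least one other arm. The helper names `reduced_targets` and `apply` are introduced for this example.

```python
from itertools import combinations

def reduced_targets(n):
    """Targets with arm 1 plus at least one other arm chosen from 2..n."""
    others = range(2, n + 1)
    return [frozenset((1,) + extra)
            for r in range(1, n)
            for extra in combinations(others, r)]

def apply(state, op):
    """Apply ('K', i) or ('P', i) to a dict mapping target -> current state."""
    kind, i = op
    new = dict(state)
    for target, s in state.items():
        if kind == "K" and s == 0 and i in target:
            new[target] = i          # promote an unprotected target to state i
        elif kind == "P" and s == i:
            new[target] = 0          # release targets held in state i
    return new

targets = reduced_targets(4)
state = {t: 0 for t in targets}
for op in [("K", 3), ("K", 4), ("K", 1), ("P", 3), ("P", 4)]:
    state = apply(state, op)

# Targets left in the ON state (state 1) at the end of the sequence.
on = sorted(sorted(t) for t, s in state.items() if s == 1)
print(len(targets), on)
```

Every target carrying a 3 or a 4 is sequestered before K1 acts, so only the target with arms 1 and 2 is still unprotected when RNAP arrives.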

Fig 5. The sequestration network is a noncommutative model of gene regulation by chromosome folding.

(A) A sequence of moves K3 K4 K1 P3 P4 on a hypothetical chromosome with K and P actions represented as DNA-binding factors and K1 playing the role of RNAP. Red circles correspond to genes and numbers correspond to allowed binding partners. (B) The same sequence in A represented as a collection of targets with up to n = 4 arms. For example, the target {0, 1, 2} corresponds to the gene locus with states 1 and 2 in A. The filled circle represents the current state.

https://doi.org/10.1371/journal.pcbi.1005089.g005

We derived (see below) that the number of reachable configurations scales with the number of regulator pairs n according to Eq (7). For n = 1, 2, 3, 4, 5, 6, this formula gives f(n) = 1, 2, 7, 89, 16897, 780304385 (Fig 6). We also relaxed the constraint that all genes have a 1 state (allowing for a total of $2^n - 1$ genes) and found that the number of configurations scales as cn = 2, 7, 94, 37701 with n = 1, 2, 3, 4. The full model does not have an analytical solution, but it does have upper and lower bounds related to Eq (7) (Materials and Methods Section 7, S3 Fig).

Fig 6. Scaling in the sequestration model is super-exponential.

(A) A plot of all the allowed configurations of a set of targets of n = 4 regulator pairs in the sequestration model. Yellow represents targets that are ON, and blue those that are OFF. (B) A list of the sequences generating the corresponding states in A. K actions are shown in the red spectrum, and P in the blue. (C) A logarithmic plot of the scaling in the sequestration model. The total space is $2^{2^{n-1}-1}$, the reachable space is calculated from Eq (7), and the combinatorial model is $2^{2n}$.

https://doi.org/10.1371/journal.pcbi.1005089.g006

Combinatorial scaling laws of this sort are not uncommon [44, 64, 65]. Edwards and Glass [64] saw an explosion in the number of states when studying trajectories on n-cubes, and Green and Rees [65] saw a super-exponential jump when enumerating certain types of nonrepeating sequences on n letters. Furthermore, a similar small number (four) of factors are necessary and sufficient to reprogram fibroblasts to stem cells [66]. Together, these results indicate that sequences can far exceed the $2^n$ limit set by combinatorial regulation, and that only a few regulators are necessary to make large changes in the configuration of a cell.

Regulators act on the configuration vector in the sequestration model

The sequestration network with n regulator pairs (referred to as the n-network) is described using the $1 \times (2^n - 1)$ configuration vector x. This is a simpler description than the connectivity matrix because a target affected by Ki is necessarily affected by Pi. The entries of x are the states of each target g able to be controlled by k ≤ n of the regulator pairs. Each target g is a list {0, i1, …, ik} of the k regulators to which it responds. Because of their radial appearance, such targets are said to have k arms (see Fig 5B).

The regulators act on x according to the rules

$$K_i:\ x_g \mapsto \begin{cases} i & \text{if } x_g = 0 \text{ and } i \in g \\ x_g & \text{otherwise} \end{cases} \qquad P_i:\ x_g \mapsto \begin{cases} 0 & \text{if } x_g = i \\ x_g & \text{otherwise} \end{cases} \tag{8}$$

Eq (8) guarantees that the regulators are orthogonal in the sense that a target in state j is protected from Ki and Pi if i ≠ j; and also idempotent in that Ki Ki = Ki and Pi Pi = Pi. Furthermore, sequences of regulators are noncommutative unless the only actions are P’s. This is a consequence of the fact that P’s put all affected targets into the 0 state. Although these rules are different from the ratchet model, a formulation exists that generalizes the K’s and P’s to matrix operators consistent with both models (Materials and Methods Section 9).

If x is restricted to the $2^{n-1} - 1$ targets all able to be regulated by K1 and at least one other K, the network is said to be reduced; otherwise we say x is full. This distinction was used in Fig 5.

A one-coloring is a configuration of x that uses only one of the states and 0. For example, the configuration x = (1, 0, 0, 1, 1, 0, 0) in the full n = 3-network is a one-coloring of 1; so is the reduced network formed by (x4, x5, x7) = (1, 1, 0). This concept is easily extended to k > 1-colorings. One-colorings are particularly important because they resemble the ON/OFF configurations of genes in an RNA-seq experiment, and we would like to know how many such configurations can be reached.
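For small n, the number of reachable one-colorings of state 1 can be found by exhaustive search over sequences. The sketch below is an independent check rather than the authors' algorithm: it assumes the reduced network, an all-zero starting configuration, and the rule that Ki moves unprotected targets with arm i into state i while Pi returns targets in state i to 0. The function names are introduced here for illustration.

```python
from collections import deque
from itertools import combinations

def reduced_targets(n):
    """Targets of the reduced n-network, each written as a tuple of its arms (always including 1)."""
    others = range(2, n + 1)
    return [tuple((1,) + extra) for r in range(1, n) for extra in combinations(others, r)]

def step(x, targets, kind, i):
    """Apply K_i or P_i to the configuration vector x."""
    if kind == "K":
        return tuple(i if (s == 0 and i in t) else s for s, t in zip(x, targets))
    return tuple(0 if s == i else s for s in x)

def reachable_one_colorings(n):
    """Reachable configurations in which every target is in state 0 or state 1."""
    targets = reduced_targets(n)
    start = (0,) * len(targets)
    seen, queue = {start}, deque([start])
    while queue:
        x = queue.popleft()
        for kind in ("K", "P"):
            for i in range(1, n + 1):
                y = step(x, targets, kind, i)
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
    return {x for x in seen if all(s in (0, 1) for s in x)}

for n in range(1, 5):
    print(n, len(reachable_one_colorings(n)))
```

The counts printed for small n can be compared with the values f(n) = 1, 2, 7, 89 quoted in the text for the reduced network.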

A simple counting argument for the connected one-colorings illustrates super-exponential scaling in the sequestration model

As in the ratchet model, finding the accessible states of the sequestration network amounts to finding restricted patterns in x. We determined that the restricted one-colorings are those that violate a property referred to as connectivity (Materials and Methods Section 7). A configuration of x is said to be connected if every k-arm target with $k \ge 3$ matches the state of at least one of the $\binom{k}{2}$ 2-arm targets $\{0, i_1, i_2\}, \ldots, \{0, i_{k-1}, i_k\}$ sharing its indices. If the network is reduced, no k-arm target may be in the 1 state when all of the 2-arm targets with which it overlaps (i.e. shares an index other than 1) are in the 0 state. This restricts the one-colorings and suggests a method to determine the scaling law for the model in Fig 5.

As an example, in the n = 4 network on the reduced set of $2^3 - 1 = 7$ targets illustrated in Fig 5, {0, 1, 3} and {0, 1, 4} both being 0 constrains {0, 1, 3, 4} to be 0 as well. Furthermore, even though {0, 1, 2} is in the 1 state, {0, 1, 2, 4} and {0, 1, 2, 3, 4} may be 0. It is only the two-arm targets that constrain the possible configurations: for example, the longer sequence K2 K4 P2 K3 K2 P4 K1 P3 K4 P1 K3 P4 K1 P2 P3 obtains the state x = (0, 0, 1, 0, 0, 0, 1) in which only the targets {0, 1, 4} and {0, 1, 2, 3, 4} are ON, showing that {0, 1, 2, 3, 4} need not be in the same state as {0, 1, 2, 3}, {0, 1, 2, 4}, or {0, 1, 3, 4}. In Fig 6A and 6B we show the allowed states and the sequences that generate them for n = 4; there are 90 out of a possible $2^{2^{4-1} - 1} = 2^7 = 128$ configurations.
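The n = 4 example can be explored by brute force. The sketch below is an illustration under assumed rules (Ki moves a responsive target from 0 to i, Pi returns targets in state i to 0), with hypothetical helper names. It runs a breadth-first search from the all-0 state of the reduced network and keeps the reachable configurations that use only the states 0 and 1.

```python
from collections import deque
from itertools import combinations

def reduced_targets(n):
    """Targets containing regulator 1 and at least one other regulator."""
    others = range(2, n + 1)
    return [frozenset((1,) + combo)
            for k in range(1, n)
            for combo in combinations(others, k)]

def neighbors(x, tg, n):
    """All states one regulator action away from x (assumed rules)."""
    for i in range(1, n + 1):
        # K_i: responsive targets in state 0 move to state i.
        yield tuple(i if (s == 0 and i in g) else s for s, g in zip(x, tg))
        # P_i: targets in state i return to 0.
        yield tuple(0 if s == i else s for s in x)

def reachable(n):
    """Breadth-first search over the reduced n-network, starting from all 0."""
    tg = reduced_targets(n)
    start = (0,) * len(tg)
    seen, queue = {start}, deque([start])
    while queue:
        x = queue.popleft()
        for y in neighbors(x, tg, n):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

def one_colorings(n):
    """Reachable configurations that use only the states 0 and 1."""
    return {x for x in reachable(n) if set(x) <= {0, 1}}
```

Under these assumptions, `(0, 0, 1, 0, 0, 0, 1) in one_colorings(4)` checks that the state produced by the longer sequence is reachable, `(0, 0, 0, 0, 0, 1, 0) in one_colorings(4)` checks the constraint on {0, 1, 3, 4}, and `len(one_colorings(4))` can be set against the count quoted above.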

There are $2^{2^{n-1} - 1}$ one-colorings on $2^{n-1} - 1$ targets. How many of these violate the connectivity rule? Suppose there are m 0’s among the 2-arm targets. If m = 1, then none of the $k \ge 3$-arm targets are constrained to be 0, as there is always another 2-arm target (in the 1 state) that each k-arm target can match. If m > 1 and m − 1 < k, however, then , so k-arm targets whose states $\{i_1, \ldots, i_{k-1}\}$ are completely contained within the set of 2-arm targets $\{0, 1, j_1\}, \ldots, \{0, 1, j_m\}$ must be 0. Hence in any violation of the connectivity rules at least one of k-arm targets will be in the 1 state and the remaining k-arm targets will be 0 or 1. Furthermore, there are $\binom{n-1}{m}$ ways of specifying m 0’s among the n − 1 2-arm targets, so the total number of violations is given by Eq (9). Subtraction from the total of $2^{2^{n-1} - 1}$ one-colorings gives Eq (7).

The ratchet and sequestration networks divide the configuration space into orbits

Until now we have considered the reachable space of a single group of targets each starting in 0. An ensemble of networks could each start with their targets in some arbitrary state, and when a sequence is applied to the ensemble the different networks will in general span different configurations. Determining the number of orbits (defined precisely in Materials and Methods Section 8) within the set of possible configurations tells us how many networks can be controlled in parallel.

Enumerating the reachable space for both the ratchet and sequestration networks involved finding configurations that violated at least one rule. If two configurations have distinct violations, then there is no way they can communicate using the regulators. Therefore, the different orbits are the groups of configurations having the same forbidden patterns. It is possible that a violation could be alleviated by an action that changes the state of an offending target, so we require that each orbit be immune to a subset of the regulators. This could be achieved in biological networks by locking targets in protective chromatin states or by shutting down certain cellular receptors.

We determined a recursive formula for the number of orbits in the ratchet network for an arbitrary n, m (Materials and Methods Section 8). In Fig 7A we plot the orbits for the n = 4, m = 2 case. There is one large component of size and several smaller orbits of size with $i \le n$ and $j \le m$. There are only a handful of singleton orbits in Fig 7A, but the number of isolated states dominates the space as n, m increase.

Fig 7. Noncommutative models induce orbits in the configuration space.

Graphical representation of the orbits in (A) the n = 4, m = 2 ratchet network and (B) the full n = 3 sequestration network. Configurations are indicated by red circles, and those accessible to each other are connected with blue lines. Arrows in (A) indicate whether a path is irreversible.

https://doi.org/10.1371/journal.pcbi.1005089.g007

We were unable to find a similar solution for the sequestration network because we lack a general solution for the number of states in the main orbit. However, Fig 7B shows the computationally discovered orbits for the full network on $2^n - 1$ targets. A nontrivial feature is that there are orbits which use all pairs of regulators, but which do not communicate with the main orbit. For example, the sequence K2 K3 from x = (1, 0, 0, 0, 0, 0, 1) reaches the same configuration as the sequence K1 starting from x = (0, 2, 3, 2, 2, 3, 0); these configurations are part of the same orbit because both violate the connectivity rule between x7 = {0, 1, 2, 3} and the 2-arm targets x4, x5, and x6.

Another observation is that some pathways cannot be reversed by a legal action in the ratchet network orbits (indicated by a directed arrow in Fig 7), whereas there always exists a reversible path between configurations in the sequestration network orbits (no arrowheads). It can be proved that this is true in general for the sequestration network (Materials and Methods Section 8). This feature permits orbits to be found computationally by looking for reversible one-step paths in the entire configuration space.

The orbits are one explanation for the phenomenon that the same signal can cause cells to behave differently [38]. More generally, the orbits demonstrate an intriguing symmetry between the targets responding to a restricted subset of the regulators on one hand, and the orbits restricted to the same subset on the other.

Discussion

In this paper we first show how noncommutative, sequential logic can relieve information bottlenecks in multilayer networks. Bottlenecks in combinatorial logic may occur whenever a downstream layer has fewer elements than the layer upstream, which poses the problem of how networks process complex signals without loss of information. Noncommutative solutions such as the ratchet and sequestration models, in which the number of configurations scales super-exponentially in the number of regulators (Eqs (5) and (7)), permit longer, more complex messages to reach the targets via information “pulses.” These pulses encode a large diversity of signals into configurations of the targets that would otherwise be lost using combinatorial logic.

Noncommutativity has long been recognized as a central concept in control theory, because it allows systems with few controllers to explore a broader configuration space. For example, one generates z rotations in 3D by Rx Ry Rx, so control over z is generated by a pulse sequence of rotations in x and y, as in airplane control where roll and pitch generate yaw [67]. Infinitesimal motions in the form of generating matrices are translated into flows in a vector space by exponentiation. Because matrix multiplication is noncommutative, composition of flows is not simply the addition of generators, but rather a higher order polynomial of commutators of the generators given by the Baker-Campbell-Hausdorff formula [68]. Noncommutativity also appears in experimental physical chemistry where pulse sequences can prepare spin systems in nontrivial population configurations [69]. A formal description of these phenomena is based on the Heisenberg picture of quantum mechanics, wherein evolution of a system of many variables is given by a differential equation involving the commutator of a Hamiltonian operator.
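The rotation example can be checked numerically. The sketch below is an illustration with hypothetical helper names, and it assumes that the pulse sequence is read as a conjugation in which the final x rotation undoes the first. Rotations about x and y do not commute, and sandwiching a y rotation between a quarter turn about x and its inverse yields a pure z rotation.

```python
import math

def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def close(a, b, tol=1e-12):
    """True if two 3 x 3 matrices agree entrywise within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

quarter = math.pi / 2
theta = 0.3

# Order matters: R_x R_y differs from R_y R_x.
xy = matmul(rot_x(quarter), rot_y(quarter))
yx = matmul(rot_y(quarter), rot_x(quarter))

# Conjugating a y rotation by a quarter turn about x gives a z rotation.
pulse = matmul(matmul(rot_x(quarter), rot_y(theta)), rot_x(-quarter))
```

Here `pulse` agrees with `rot_z(theta)` to numerical precision, so control over z is obtained using only x and y rotations.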

The significance of noncommutative control for systems biology is that it becomes possible to independently control targets that would otherwise be activated by the same promiscuous regulator. In this paper, we argue that noncommutative sequences permit control over new directions in gene expression space, allowing more specific sets of targets to be controlled. Several studies have shown that TFs that can bind genes in one tissue type are in fact precluded from binding the same genes in another [70, 71]. The C. elegans TF LIN35 fails to bind targets in the germline that it binds in the intestine [71], and the SMARCA4 complex in mouse binds enhancer elements in heart, limb, and brain tissue in a tissue-specific manner [70]. One hypothetical explanation for these observations, based on the sequestration model, is that cell-type specific gene expression is the result of noncommutative sequences like K1 K2 and K2 K1 that silence certain promoters. The three-dimensional structure of the genome is a likely setting for this type of regulation.

Gene regulation is known to take place in three dimensions, as observations of DNA looping [72], nonrandom chromosome packing [73], and clustered transcription factories [74] have shown. However, the factors that affect chromosome structure are non-specific. One such factor is the ubiquitous zinc finger protein CCCTC binding factor (CTCF) [75], which functions as both an activator of transcription by bringing enhancers and promoters together [76, 77] and as a repressor by insulating genes [78, 79]. Epigenetic modifications, such as histone methylation and acetylation [80–82], also affect three-dimensional structure. In addition, DNA looping was observed in the context of allelic exclusion during B- and T-cell lineage specification where individual alleles were recruited to heterochromatic regions while the other underwent recombination [33, 34]. Consequently, the sequestration model predicts that temporal permutations of a small set of chromatin modifying factors could specify a large number of potential chromosomal conformations and lead to different expression states and corresponding cell fate decisions.

New technologies such as RNA-seq and ChIP-seq can be used to test the predictions of the noncommutativity hypothesis at the genome level. Epigenetic drugs such as azacytidine and trichostatin A inhibit DNA methylation [83] and histone deacetylation [84], respectively, and have been shown to cause global changes in gene expression alone and in combination [83, 85]. The sequestration hypothesis predicts that perturbations to the three-dimensional structure of the chromosome are noncommutative, so distinct gene expression states may be reached by permuting the order in which epigenetic drugs are applied. While the sequestration model may underlie chromosome folding, the ratchet model could form the basis of phosphorylation networks. For example, mass spectrometry studies have revealed complex phosphorylation patterns [86, 87], though the number of kinases and phosphatases is comparatively small and the networks are highly interconnected [10, 11]. As phosphoproteins are the mediators of extracellular signals, ordered disruption of signaling pathways could also lead to distinct gene expression configurations.

Analogously, the ratchet model may aid in the specification of distinct neural activity patterns, owing to the fact that connections between the different hippocampal layers overlap [12, 88]. While superficial neurons can be activated in response to spatial cues, deeper layers can be selectively activated by time sequences of inputs [40, 41, 89]. These results suggest the hypothesis that neural networks may be noncommutative. In particular, experimental support exists for the role of the dentate gyrus in pattern separation and orthogonalization by way of ensuring that even quite similar memory representations use distinct subsets of neurons [90, 91]. The ratchet model, by ordering inputs in time, is one way of reaching these specific subsets if the number of input neurons is smaller than the number of target neurons. Memories share many common elements, including shape, color, smell, and sound, which poses problems for recall. We hypothesize that older, “fuzzier” memories could be those relegated to very long ratchet sequences. According to this hypothesis, memories are not forgotten, but are instead increasingly difficult to access, and memories that are not consolidated are those that never formed a unique ratchet sequence.

Beyond resolving bottlenecks and generating specificity, noncommutative actions offer a new interpretation of how cell fate decisions and other stepwise processes occur on abstract regulatory landscapes. The classical Waddington landscape view of development holds that cells decay to attractor configurations representing terminal outcomes [92]; this is consistent with a Boolean network with many variables X converging to a fixed point [93]. In a static landscape, the final outcome is determined a priori by the nearest energy minimum. What then determines the initial configuration? In organisms such as Drosophila, maternal patterning of the embryo may account for this initial bias [94]; but in other organisms that employ mechanisms like multilineage priming [82, 95], it is not clear that every cell fate decision is made at the beginning.

Sequential logic allows cells to reach their final fate on a dynamic landscape. In the system of Fig 8A (top), for example, it is not possible for cells in the blue configuration to transition to the red fate by increasing X2, because this involves an uphill climb. However, the regulators of genetic networks may also affect the landscape directly. This is seen in Fig 8A (bottom) where the sequence K1 K2 P1 changes the landscape in such a way that the overall cost of reaching the same endpoint is much lower than the direct path (Fig 8A, top). This can be understood as the effect of regulators acting on additional variables V, which modulate the landscape in X space. For example, TFs can recruit chromatin regulators that modify global three-dimensional chromosome structure and future TF accessibility [74, 76, 96, 97], or kinases can sequester substrates in the nucleus to prevent their subsequent activation [53, 54]. Because sequential logic acts on the V’s as well as the X’s, changes that appear to be small in one dimension (Fig 8B, left) actually involve large excursions in the full space (Fig 8B, right). As a consequence, in noncommutative regulation, the landscape changes and cells can take on fates that were not accessible at the beginning.

Fig 8. Sequential logic on regulatory landscapes.

(A) The regulatory landscape for the 2-mRNA system X1, X2 for two hypothetical paths with configurations represented by balls. It is difficult to directly increase X2 because of a potential barrier (top). In the roundabout path (bottom), visiting two intermediate configurations via K1 K2 P1 results in an altered regulatory landscape. (B) The initial and final configurations in (A) projected onto (X1, X2) space (left) and (X1, X2, V1) space (right). The regulators affect not only X1 and X2, but also an additional variable, denoted V1, that alters the landscape of X1 and X2. The arrows indicate the instantaneous direction of the trajectory.

https://doi.org/10.1371/journal.pcbi.1005089.g008

Previous theoretical models have explored dynamic regulatory landscapes in the form of bifurcations [98, 99]. In these models, a set of kinetic parameters determines the positions of minima and maxima in the landscape. However, the noncommutative model advanced here is fundamentally different, in that using the regulators to move through X changes the landscape directly. This could happen, for example, if acting on X1 with K1 hides it from the effect of K2. Uncoupling of targets in this way may underlie the distinct effects of signals like FGF at different stages of development [35–38]. It will be interesting to explore time series data for hints that some genes pulse ON and OFF in order to protect their promoters from the actions of promiscuous regulators.

Multistep processes other than development can benefit from the type of noncommutative regulation highlighted in Fig 8. What seems like an intractable problem at the start becomes much more feasible if one realizes that the effects of actions change with time and context. This intuition is why thinking in terms of commutators [A, B] = AB − BA can make complex problems more soluble: the desired effect is often what is left over after performing and undoing a sequence of actions. Several examples illustrate this concept.

With its increased capacity for generating diversity, sequential logic is likely to be used in evolution. A recent theoretical example in social bacteria demonstrated that in evolving a new quorum sensing receptor-ligand pair, adding new receptors prior to ligands is preferred over the opposite path [45]. An analysis of the stability and catalytic activity of a family of bacterial β-lactamase mutants showed that the ability to evolve new substrate specificity is contingent on mutations that first stabilize the protein active site [46, 100]. Finally, biological networks evolve the same functions in different orders, but the order in which these functions arise dictates which other genotypes can be reached by neutral mutations [44]. These results suggest that permuted sequences of mutation events may have different fitness costs. With extensive artificial evolution experiments underway in protein engineering [100] and bacterial mutation accumulation [47], coupled with progress in sequencing technologies, it will be possible to test this hypothesis by permuting the conditions that promote mutation.

Sequential logic can also be applied in synthetic biology to build circuits with memory [43, 101–103]. In general, the toolkit that permits up- and downregulation of genes is small, with a few staples like Lac, Tet, and Ara [104]. Significant effort has been put into generating logic gate (AND/OR) promoters [30]. To further expand the toolkit, it has been proposed that more orthogonal regulators be developed [105]. We suggest that sequential logic may be a more promising strategy to scale up the number of targets that can be independently controlled by permuting in time a small number of controllers.

More broadly, sequential logic can be used to accomplish experimental goals not possible in single-step approaches. For example, in multiplexing mRNA detection in single cells, we previously used a sequential hybridization scheme that permits the number of barcodes to scale exponentially [106], whereas combinatorial schemes can only specify approximately 30 barcodes. We expect many single-cell experiments to benefit from a sequential strategy in which detours facilitate achievement of the main goal with high efficiency.

Finally, our results connect outside of biology to strategic planning in social, political, and economic arenas. Anyone familiar with negotiating knows about the limitations inherent in trying to make interconnected groups of people move in specific directions, especially when the actions affect all participants at once. Multiparty negotiations and tournaments may benefit from time-ordered strategies in which enemies temporarily team up, or fringe interest groups are transiently pacified. Indeed, a conclusion from the sequestration model is that the most highly regulated targets need to be protected prior to satisfying the ones with fewer connections. Determining whether this prediction is borne out in congressional and international negotiations, for example, is an interesting question for political science. Evidence for noncommutative effects in games exists in that the initial seeding in a tournament can bias its outcome [107], and that long-term goals change players’ strategies in the repeated prisoner’s dilemma [108]. In conclusion, the direct path to an outcome in a network with many interacting parts may have many unintended and prohibitively expensive consequences. A multi-step strategy may achieve the same outcome with minimal cost and side effects.

Materials and Methods

1. The connectivity matrix with multiple targets

In this section we determine how many targets are controlled by the same regulators in the connectivity matrix A. Then we extend A to more than 2 dimensions.

If $l_n = l_m = 1$ it is clear that each $A_{i,j}$ corresponds to a single target and that each target appears only once. In general, however, a target can appear in multiple entries of A (cf. Eq (6)). To see this, consider the bipartite graph formed by all the targets and all the K’s, but none of the P’s. The handshaking lemma from graph theory [59] says that the total number of edges is one half the sum of the degrees of each vertex, which is either $l_n$ for a target or some number $p_n$ for a K regulator. There are $N l_n$ total edges, so we find $2 N l_n = N l_n + n p_n$, or $p_n = N l_n / n$, for the number of links coming from each K. Similarly, the number of links emanating from each P is $p_m = N l_m / m$. In terms of the connectivity matrix, $p_n$ and $p_m$ correspond to the number of unique targets in each row and column, respectively.

Because K1 connects to a fraction $l_n / n$ of the targets, it follows that K1 and P1 together connect to a fraction $l_n l_m / (nm)$ of the targets. Therefore, the total number of targets connecting to K1 and P1 is $M = N l_n l_m / (nm)$. Another way to see this is to consider one target in the intersection of K1 and P1. This one target uses up one of each of the regulators and one unit of connectivity, leaving a total of $\binom{n-1}{l_n-1}\binom{m-1}{l_m-1}$ ways to connect other targets to the same pair of regulators. It is easily verified that these two formulations for the number of targets per matrix entry M are equivalent. This illustrates that there is not simply a one-to-one correspondence between the entries of A and the targets.
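The two counts in this section can be checked by explicit enumeration. The sketch below is an illustration that assumes there is exactly one target for every choice of $l_n$ of the n K’s and $l_m$ of the m P’s, so that $N = \binom{n}{l_n}\binom{m}{l_m}$. This assumption and the helper names are not taken from the text above.

```python
from itertools import combinations

def build_targets(n, m, ln, lm):
    """Assumed network: one target per (ln-subset of K's, lm-subset of P's)."""
    return [(set(k), set(p))
            for k in combinations(range(1, n + 1), ln)
            for p in combinations(range(1, m + 1), lm)]

def counts(n, m, ln, lm):
    """Return (N, p_n, p_m, M) by direct counting."""
    tg = build_targets(n, m, ln, lm)
    N = len(tg)                                          # total number of targets
    p_n = sum(1 for k, p in tg if 1 in k)                # targets linked to K1
    p_m = sum(1 for k, p in tg if 1 in p)                # targets linked to P1
    M = sum(1 for k, p in tg if 1 in k and 1 in p)       # targets in entry A[1][1]
    return N, p_n, p_m, M

N, p_n, p_m, M = counts(3, 3, 2, 2)   # (9, 6, 6, 4)
```

For n = m = 3 and $l_n = l_m = 2$ this gives N = 9, six targets per row or column, and M = 4 targets per entry, in agreement with $N l_n / n$ and $N l_n l_m / (nm)$.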

There was nothing special about the labels K and P in the above paragraphs. Thus, the connectivity matrix can easily be extended to a u-dimensional connectivity tensor where u is the number of pools of regulators. Each pool has $n_i$ regulators connecting to $l_{n_i}$ targets, and each target connects to regulators of pool i, ∀i ∈ {1, …, u}. The total number of targets and the total number of targets per entry are extensions of the u = 2 case, giving Eq (10) for the number of distinct targets and Eq (11) for the number of targets controlled by one factor from each of the u pools. S1A Fig shows an example network with u = 3 pools.

2. Counting configurations in combinatorial networks using the connectivity matrix

The number of configurations in combinatorial logic is the number of ways that N targets can each be bound by exactly u regulators, where each regulator comes from a different pool. In the main text we analyzed the case u = 1 and ln = 1. Here we extend those results to arbitrary u and ln.

First consider the case u = 2, corresponding to a pool of K’s and a pool of P’s. Whereas in the ratchet model, Ki and Pj acted separately on the entries of A, in combinatorial logic the pair (Ki, Pj) is needed to switch Ai, j from 0 to 1. Many such pairs may be active at any one time. We write this formally as (12) where {K} denotes a subset of the K’s. The notation (⋅, ⋅) means that a combination of factors acts on the target, instead of just a single factor.

If $l_n = l_m = 1$ there are $(2^n - 1)(2^m - 1) + 1$ configurations: $(2^n - 1)(2^m - 1)$ ways to pick at least one of n K’s and one of m P’s, plus one way to pick nothing. If $l_m = 1$ and $l_n > 1$, then for a certain number $\alpha \le n$ of the K’s, any subset containing α or more K’s has the same effect as activating all n K’s at once. For example, in Eq (6), the action of ({K1, K2}, {P1, P2}) is sufficient to activate all targets in the n = m = 3, $l_n = l_m = 2$ network. To determine α, recall that there are M targets in each entry of the connectivity matrix A. Choosing i K’s means that the total number of targets is M × i, but a single column of A only contains $p_m$ unique targets. Each target is connected to $l_n$ K’s, so for a target in the intersection of i K’s and a single P, there are $l_n - i$ spots left over to choose $n - i$ K’s and $l_m - 1$ spots left over to choose m − 1 P’s, or $\binom{n-i}{l_n-i}\binom{m-1}{l_m-1}$ ways total. Using the principle of inclusion-exclusion [59] this means that α is the smallest i satisfying Eq (13). By choosing α K’s, the number of unique targets in a column of A that can be turned ON is exactly the number represented in that column. Because all subsets with α, α + 1, …, n − 1 K’s are redundant, there are only subsets of K’s that contribute to unique configurations, leaving a total of unique configurations.
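The count for $l_n = l_m = 1$ and the redundancy that appears at higher connectivity can be illustrated by brute force. The sketch below assumes that a target is ON whenever at least one of its K’s and at least one of its P’s is active, and that there is one target per choice of $l_n$ K’s and $l_m$ P’s. Both assumptions and all function names are illustrative.

```python
from itertools import combinations

def all_subsets(items):
    """Yield every subset of items, including the empty set."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

def make_targets(n, m, ln=1, lm=1):
    """Assumed network: one target per (ln-subset of K's, lm-subset of P's)."""
    return [(set(k), set(p))
            for k in combinations(range(1, n + 1), ln)
            for p in combinations(range(1, m + 1), lm)]

def activated(active_K, active_P, targets):
    """ON/OFF pattern: a target needs one active K and one active P."""
    return tuple(bool(tk & active_K) and bool(tp & active_P)
                 for tk, tp in targets)

def count_configurations(n, m, ln=1, lm=1):
    """Number of distinct ON/OFF patterns over all choices of active K's and P's."""
    targets = make_targets(n, m, ln, lm)
    seen = {activated(ks, ps, targets)
            for ks in all_subsets(range(1, n + 1))
            for ps in all_subsets(range(1, m + 1))}
    return len(seen)
```

With these assumptions, `count_configurations(2, 2)` returns 10, which equals $(2^2 - 1)(2^2 - 1) + 1$, and `activated({1, 2}, {1, 2}, make_targets(3, 3, 2, 2))` turns every target ON, so adding K3 or P3 cannot produce a new configuration.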

If the P’s also have redundant connections, the result generalizes to

Theorem 1 The number of configurations in combinatorial logic with parameters n, m, ln, lm, and u = 2 is (14) where α (resp. β) is the smallest number of K’s (resp. P’s) having the same effect as all K’s (resp. P’s) at once.

This result is obtained by counting all pairings of K’s and P’s, then subtracting those pairings that have a redundant effect. For example, any combination using K3 is redundant in the connectivity matrix of Eq (6). Finally, those pairings that were excluded twice are added back in.

This result generalizes to all u with slight modifications. Because one factor from each of u pools is now required, the combinatorial equation determining the state of a target is Eq (15). Here the double subscript in $K_{i_k}$ indicates the kth factor in the ith pool. Determining $\alpha_i$ for each pool i of regulators requires finding the pool $j_i$ which maximizes the number $N_i$ of targets controlled in two dimensions. If we choose $\alpha_i$ or more regulators in the ith pool, then there is a redundancy in the jth dimension, whereas any choice of fewer than $\alpha_i$ regulators activates fewer than $N_i$ targets. Write the total number of targets and the number of targets in any column of the equivalent $n_i \times n_j$ connectivity matrix regulated by pools i and j. It is easy to see that these parameters reduce to their previous definitions for u = 2. Now define as the number of targets in each entry of the equivalent $n_i \times n_j$ connectivity matrix. As above, $\alpha_i$ is now the smallest r satisfying Eq (16).

Once αi is determined for each pool i, the inclusion-exclusion sum can be extended using standard arguments [59]. Define by (17) where σ denotes all k-subsets of {1, …, u}. Then we have the final result

Theorem 2 The total number of configurations in combinatorial logic with u pools and parameters ni, lni, i ∈ {1, …, u} is (18)

This result reduces to Theorem 1 when there are only u = 2 pools. At most there are ways to specify at least one target, corresponding to the 0th-order term in Eq (18). Increasing the connectivity through the $l_{n_i}$ can only reduce the number of configurations. This behavior is shown in S1B Fig for the symmetric case that all the $n_i$ and $l_{n_i}$ are equal. As u is increased the number of configurations increases dramatically, but the scaling is actually subexponential, i.e. less than $2^N$. Increasing connectivity through $l_{n_i}$ shifts the curves to the right.

3. Using the connectivity matrix to establish a one-to-one correspondence between the ratchet network and the lonesum matrices

To establish the correspondence between the reachable configurations of the ratchet network ($l_n = l_m = 1$, T = 1) and the lonesum matrices, we must show (i) that A avoids the patterns $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ in any 2 × 2 sub-block, and (ii) that any lonesum matrix can be constructed from K and P actions. First observe that the value 1 in $A_{i,j}$ indicates the last K affecting that index must have followed a P, whereas 0 indicates the last P must have followed a K. For the first restriction, $A_{1,1} = A_{2,2} = 1$ and $A_{1,2} = A_{2,1} = 0$ together imply that P2 follows K1 follows P1 follows K2 follows P2, which is a contradiction, showing that this 2 × 2 block is unreachable. The second pattern is excluded by the same argument with the two rows exchanged. The other five unique 2 × 2 blocks are all reachable with elementary sequences. This establishes point (i) that the reachable configurations are a subset of the lonesum matrices.

To establish point (ii) that the lonesum matrices are a subset of the reachable configurations, we use an equivalent formulation of the lonesum matrices as staircase matrices composed of the rows $a_j = (1, \ldots, 1, 0, \ldots, 0)$ with the last 1 appearing at position $i_j$, subject to the constraint that $i_j \le i_{j-1}$ for all j ∈ {2, …, n} [109]. It is easy to see that the pattern of ones resembles an inverted staircase. We show via induction that any staircase matrix can be constructed from K and P actions. The nth row is obtained by the sequence $K_n P_{i_n+1} \cdots P_m$, which leaves 1’s at the first $i_n$ indices and 0’s at the remainder. Now assume that the kth row is obtained by the sequence $K_k P_{i_k+1} \cdots P_m$ without affecting any of the rows n, n − 1, …, k + 1. Then the sequence $K_{k-1} P_{i_{k-1}+1} \cdots P_m$ puts 1’s at the first $i_{k-1}$ indices of row k − 1. Because $i_{k-1} \ge i_k \ge \cdots \ge i_n$, none of the $P_{i_{k-1}+1}, \ldots, P_m$ turn a 1 to a 0 in rows n, n − 1, …, k + 1, k. This proves the induction hypothesis and shows that the staircase matrices are a subset of the reachable configurations.

Together with the fact that the reachable configurations are a subset of the staircase matrices, this implies that the reachable configurations and the lonesum matrices are in fact the same set, and we have

Theorem 3 The number of reachable configurations in the (n, m) ratchet network with $l_n = l_m = 1$ and threshold 1 scales as the poly-Bernoulli numbers $B_n^{(-m)}$.
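Theorem 3 can be checked numerically for small networks. The sketch below is an illustration with hypothetical function names. It assumes that in the threshold-1 network Ki sets every entry of row i to 1 and Pj sets every entry of column j to 0, searches all configurations reachable from the zero matrix, and compares the count with the standard closed form for poly-Bernoulli numbers of negative index, $B_n^{(-m)} = \sum_{j \ge 0} (j!)^2 S(n+1, j+1) S(m+1, j+1)$, where S denotes Stirling numbers of the second kind.

```python
from collections import deque
from math import factorial

def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]

def poly_bernoulli(n, m):
    """Closed form for the poly-Bernoulli number B_n^(-m)."""
    return sum(factorial(j) ** 2 * stirling2(n + 1, j + 1) * stirling2(m + 1, j + 1)
               for j in range(min(n, m) + 1))

def ratchet_reachable(n, m):
    """Configurations reachable from the zero matrix (assumed threshold-1 rules)."""
    start = tuple((0,) * m for _ in range(n))
    seen, queue = {start}, deque([start])
    while queue:
        a = queue.popleft()
        # K_i: turn row i fully ON.
        moves = [tuple((1,) * m if r == i else row for r, row in enumerate(a))
                 for i in range(n)]
        # P_j: turn column j fully OFF.
        moves += [tuple(tuple(0 if c == j else v for c, v in enumerate(row))
                        for row in a)
                  for j in range(m)]
        for b in moves:
            if b not in seen:
                seen.add(b)
                queue.append(b)
    return seen
```

For example, `len(ratchet_reachable(2, 2))` and `poly_bernoulli(2, 2)` both give 14, which is the 16 binary 2 × 2 matrices minus the two forbidden patterns.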

4. Inductive proof that all binary ON/OFF configurations are reachable in the ratchet network with threshold greater than 1

With T = 2, only targets in state 2 are ON. Once a 0-1 configuration of A is obtained, however, it is a simple matter to convert it into an ON/OFF configuration by applying all the K’s. Here we use the fact that 1’s can be reached from above and below to prove the

Theorem 4 In the ratchet network represented by the matrix A with ln = lm = 1 and threshold T = 2, all binary 0-1 configurations are reachable.

Proof. We use an induction argument analogous to the proof of Theorem 3. Suppose that in row n a set of $r \le m$ indices $\{n_j\} = \{n_{j_1}, \ldots, n_{j_r}\}$ should be ON. First prepare every target in row n in the 1 state using $K_n$, then use the sequence $K_n P_{j_{r+1}} \cdots P_{j_m}$ to obtain 2’s at $\{n_{j_1}, \ldots, n_{j_r}\}$ and 1’s at $\{n_{j_{r+1}}, \ldots, n_{j_m}\}$. Now assume that we can prepare rows n, n − 1, …, k + 1 in a similar 1-2 configuration with the rest of the matrix 0. We want to show that we can add row k to this set without affecting any of the previous rows. Assuming that a set of $s \le m$ indices $\{k_{j_1}, \ldots, k_{j_s}\}$ should be ON, apply the sequence to obtain 2’s at $\{k_{j_1}, \ldots, k_{j_s}\}$ and 1’s at $\{k_{j_{s+1}}, \ldots, k_{j_m}\}$. Now, because $\{P_{j_1}, \ldots, P_{j_s}\} \cup \{P_{j_{s+1}}, \ldots, P_{j_m}\} = \{P_1, \ldots, P_m\}$, all 2’s and 1’s in rows n, n − 1, …, k + 1 are now 1’s and 0’s, respectively. Applying the sequence $K_n K_{n-1} \cdots K_{k+1}$ reestablishes the 1-2 configuration we had prior to fixing row k and leaves 0’s at rows 1, …, k − 1. Now that row k is also in the proper 1-2 configuration, we have proved the induction hypothesis. Once all rows are in the proper 1-2 configuration, the sequence $P_1 \cdots P_m$ obtains the matrix in the 0-1 configuration. Since this procedure can be repeated for any collection of indices $\{\{1_j\}, \ldots, \{n_j\}\}$, it follows that all binary 0-1 matrices are reachable.
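The statement of Theorem 4 can be tested exhaustively for small matrices. The sketch below is an illustration under assumed rules: Ki raises every entry of row i by one level up to a cap of 2, and Pj lowers every entry of column j by one level down to 0. The cap and the function names are assumptions and are not taken from the proof.

```python
from collections import deque
from itertools import product

def reachable_T2(n, m, top=2):
    """States reachable from the zero matrix with levels 0..top (assumed rules)."""
    start = tuple((0,) * m for _ in range(n))
    seen, queue = {start}, deque([start])
    while queue:
        a = queue.popleft()
        moves = []
        for i in range(n):
            # K_i raises row i by one level, capped at `top`.
            moves.append(tuple(tuple(min(v + 1, top) for v in row) if r == i else row
                               for r, row in enumerate(a)))
        for j in range(m):
            # P_j lowers column j by one level, floored at 0.
            moves.append(tuple(tuple(max(v - 1, 0) if c == j else v
                                     for c, v in enumerate(row))
                               for row in a))
        for b in moves:
            if b not in seen:
                seen.add(b)
                queue.append(b)
    return seen

def all_binary_reachable(n, m):
    """True if every 0-1 matrix of size n x m appears among the reachable states."""
    states = reachable_T2(n, m)
    return all(tuple(bits[r * m:(r + 1) * m] for r in range(n)) in states
               for bits in product((0, 1), repeat=n * m))
```

Under these assumptions, `all_binary_reachable(2, 2)` is true, and the reachable set includes the pattern with 1’s on the diagonal only, which is forbidden at threshold 1.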

5. A recursive formula for the number of non-redundant sequences in the ratchet network

When the connectivity parameters $l_n$ and $l_m$ exceed 1, certain sequences in the threshold 1 ratchet network become redundant. Our goals in this section are (i) to characterize the redundant sequences by the number of K’s and P’s, and (ii) to count the non-redundant sequences. This will obtain an upper bound on the number of configurations.

We want the shortest sequences that can activate (or deactivate) all targets; any sequences longer than this are redundant. To see why this is so, we need the concept of a cycle. We say that a target has gone through a cycle if it has traversed the states 0, 1, 0 at some subsequent time points. We have the following lemma.

Lemma 5 Any sequence that takes all targets through a cycle is redundant.

Proof. The final configuration of any sequence is represented by the positions of the 1’s and 0’s of the connectivity matrix. Recall that $A_{i,j} = 0$ if and only if all targets represented by $A_{i,j}$ are OFF in the final configuration. Permute the rows and columns of A until it is in staircase form with r ≤ min(n, m) steps, where a step is a group of adjacent rows or columns having the same number of 1’s and 0’s. The steps partition the rows and columns of A into subsets of indices $\{i_1, i_2, \ldots, i_r\}$ and $\{j_1, j_2, \ldots, j_r\}$ where the kth step is defined by 1’s at rows $i_k$ to $i_{k+1} - 1$ and 0’s at columns $j_k$ to $j_{k+1} - 1$. Then the sequence obtains the desired configuration of 1’s and 0’s. Being able to write a staircase matrix for the final configuration means that every target ON in the final configuration occurs only where there are 1’s in the matrix. These targets are never affected by a P in this procedure; they do not go through a cycle. Because any allowed configuration can be reached from this procedure, it follows that any sequence that uses a cycle is redundant.

Knowing that the non-redundant sequences must avoid cycles, it suffices to find the longest sequences that can be written before cycles appear.
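A related check can be run in the simplest case. The sketch below assumes $l_n = l_m = 1$ on an n × m grid of targets with threshold 1, where $K_i$ switches row i ON and $P_j$ switches column j OFF (assumptions about the model, not statements from this section). It compares the configurations reached by sequences of any length with those reached by sequences that use each regulator at most once. The function names are hypothetical.

```python
from itertools import permutations


def letters(n, m):
    """The alphabet of regulators: n K's followed by m P's."""
    return [("K", i) for i in range(n)] + [("P", j) for j in range(m)]


def step(x, letter):
    """Apply one regulator to a configuration stored as a tuple of row tuples."""
    kind, idx = letter
    if kind == "K":   # K_idx switches every target in row idx ON
        return tuple((1,) * len(row) if r == idx else row for r, row in enumerate(x))
    # P_idx switches every target in column idx OFF
    return tuple(tuple(0 if c == idx else v for c, v in enumerate(row)) for row in x)


def reachable(n, m):
    """Every configuration reachable from all-OFF by sequences of any length."""
    start = tuple((0,) * m for _ in range(n))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for letter in letters(n, m):
            y = step(x, letter)
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def reachable_without_repeats(n, m):
    """Configurations reached when each regulator is used at most once."""
    start = tuple((0,) * m for _ in range(n))
    alphabet = letters(n, m)
    found = set()
    for length in range(len(alphabet) + 1):
        for seq in permutations(alphabet, length):
            x = start
            for letter in seq:
                x = step(x, letter)
            found.add(x)
    return found


full = reachable(2, 2)
short = reachable_without_repeats(2, 2)
print(len(full), len(short), full == short)
```

Under these assumptions the state of each target depends only on whether its K was last used after its P, so repeating a regulator cannot produce a new configuration. This is weaker than Lemma 5 but points in the same direction.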

Lemma 6 For each value of ln (lm), the maximum number of K’s (P’s) that can be used before all targets are activated (deactivated) is n − ln + 1 (m − lm).

Proof. A sequence that activates all targets has no intervening P’s. Recall that a single K activates at most targets. Then, prior to the last K being used, the number of activated targets is . This means there are at most $n - l_n$ groups of targets controlled by different K’s. Thus, at most $n - l_n$ K’s are used before the last K is used, and $n - l_n + 1$ K’s must be sufficient to activate the complete set. The maximum number of P’s that can be used is only $m - l_m$ because we can think of every sequence starting in the zero configuration as having been preceded by a single P; this modification puts the P’s on equal footing with the K’s.

With this characterization of the non-redundant sequences our goal is to recursively eliminate sequences that use $n - l_n + 1$ K’s and $m - l_m$ P’s. We first find the number of sequences that use up to $m - l_m$ P’s, which forms the top row in each (n, m) block in S2 Fig. Then we use these values to recursively find the number of sequences using up to $n - l_n + 1$ K’s. The strategy is to subtract from the total number of sequences at a given $(l_n, l_m)$ all those sequences using the forbidden number of regulators in order to get the new total.

Denote by the number of sequences using m P’s when the total number of K’s is n. If m = 1, then all sequences (except for the empty sequence) use a K and none use a P. If m = 2, the maximum number of P’s that can be used is $m - l_m = 1$. Discarding the 2n sequences with no P, the number of sequences using a single P is given by Eq (19). Division by m = 2 is required to account for the fact that there are different ways of starting each sequence with a P, and we consider both of these equivalent. Having determined , it is straightforward to determine . Because there are m + 1 P’s to choose from, there are ways to write sequences with m P’s, ways to write sequences with m − 1 P’s, …, ways to write sequences with 0 P’s, the only remaining sequences are those with m + 1 P’s. Knowing that the total number of sequences is , this leaves the number given in Eq (20) of total sequences using m + 1 P’s when the total number of K’s is n. Having determined this number, we can sum up all the sequences using $m - l_m$ P’s to get the first row of the (n, m) block in S2 Fig. Denote by the $l_m$th column and $l_n$th row of the (n, m) block. The column headers are given by Eq (21).

We can determine the row entries for $l_n > 1$ in the same way that we determined the column headers, the only difference being that the total number of sequences is , not unless $l_m = 1$. Denote by the number of sequences using n K’s when the total number of P’s is m and the P connectivity is $l_m$. For fixed m, $l_m$ and n = 1, the number of sequences is given by Eq (22), as all but the empty sequence use a single K. In complete analogy to Eq (20), the number of sequences using n + 1 K’s when the total number of P’s is m is given by Eq (23). Unlike in the equation for , there is no division by n + 1 because all sequences starting with a different K are different. Finally, we can sum up all the sequences using $n - l_n + 1$ K’s to get the

Theorem 7 The number of minimal length sequences in the $(n, m, l_n, l_m)$ ratchet network with threshold T = 1 using no more than $n - l_n + 1$ K’s and $m - l_m$ P’s is given by Eq (24).

We used this formula to compute each entry in S2 Fig. Because of the complexity of this procedure, we checked it against a computer algorithm operating with the following steps. In step 1 find all sequences in the ln = lm = 1 case. In step 2 increase the connectivity (ln or lm) and find all sequences of a given length; group them by the configuration they generate. Some of these sequences will not appear in the list generated by step 1: for example, both K1 K2 and K2 K1 will be found in step 2. We are interested in index permutation e.g. 1 → 3, not letter permutation, so in step 3 delete all sequences in each length group not appearing in step 1. Repeat steps 1–3 with this new list of sequences until ln = n − 1. This code, implemented in Matlab Version 2015b, gave exact agreement with Theorem 7.

6. Proof that the reachable configurations are equivalent to the connected one-colorings

We now show that the rules restrict the reachable configurations of the sequestration model in the main text to the connected one-colorings of the reduced n-network.

Theorem 8 There is a one-to-one correspondence between the reachable configurations of the reduced n-network and the connected one-colorings.

Proof. The converse direction, reachable implies connected, is easier to prove and will be discussed first. Assume that all configurations in the reduced n-network so far reached are connected. The next configuration will be reached by turning all 0’s to i’s or all j’s to 0’s by application of Ki or Pj, respectively. The k-arm targets sharing state i with the 2-arm target {0, 1, i} are either in the same state as some other 2-arm target {0, 1, i′} or are in the 0 state. So application of Ki cannot change the connectivity of the configuration. Furthermore, a k-arm target can be in the j state only if the target {0, 1, j} is in the j state, so these targets will still be matched after application of Pj. Thus, any configurations reachable from a reachable configuration must be connected.

The forward direction, connected implies reachable, is less trivial. In order to prove that all connected one-colorings in the n-network are reachable, we will use the strong form of mathematical induction. Assume the theorem holds for all networks up to n − 1. Embedded within the full n-network of $2^n - 1$ targets is the reduced n-network on $2^{n-1}$ targets. Within the reduced n-network is a set of $2^{n-2}$ targets able to access {0, 1, 2} and all subsets (including Ø) of the integers {3, …, n}. Thus, we can substitute 2 → 1 as the ON state in this embedded network and all connected one-colorings (of 2) will be reachable. The same holds in general for all $2^{n-k}$ targets able to access {0, 1, k} and all subsets of the integers {k + 1, …, n}. In each of these embedded networks the substitution k → 1 as the ON state will enable us to create any connected one-coloring.

Pick any connected one-coloring (of 1) in the n-network. Its opposite configuration is formed by the transformation at each target g of 1 → 0 and 0 → $k_{\min}$, where $k_{\min} = \min\{k \in g \mid x_{\mathrm{pos}(\{0, j, k\})} = 0\}$ is the smallest index that g shares with a corresponding 2-arm target at position pos({0, j, k}) of x (possibly in the full network) currently in the 0 state. The opposite of a connected one-coloring is clearly connected, because all the connected 1’s are now 0, and all the 0’s are in the same state as the 2-arm target $\{0, j, k_{\min}\}$. If it is possible to reach the opposite configuration, then application of the sequence $K_1 P_2 \cdots P_n$ yields the desired one-coloring of the n-network.

To show that the opposite configuration of the chosen one-coloring is indeed reachable, isolate the embedded networks one-by-one by application of the sequence $K_k K_1 P_k$ for k = 2, …, n, so that the targets in the (n − k + 1)-network are the only targets in the 0 state. By hypothesis, the connected one-colorings are reachable in all embedded networks which have at most n − k states besides 0, 1, and k. The opposite configuration in the n-network is composed of connected one-colorings (of k) in each embedded network; these are reachable. Therefore, the one-coloring of the n-network is reachable via $K_1 P_2 \cdots P_n$. This procedure holds for any one-coloring.

7. Lower and upper bounds for the full n-network

How many configurations are reachable in the full n-network? Let this number be cn. The following theorems derive lower and upper bounds for cn in terms of the number of one-colorings.
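For small n the number $c_n$ can be found by direct search. The sketch below is a minimal illustration under the update rules quoted in Section 6 ($K_i$ turns 0’s into i’s for targets that can access state i, and $P_j$ turns j’s into 0’s). It assumes, in addition, that the full n-network has one target for every nonempty subset of {1, …, n}. The names `full_network`, `apply_K`, `apply_P` and `count_reachable` are hypothetical.

```python
from itertools import combinations


def full_network(n):
    """One target per nonempty subset of {1, ..., n}: the states it can access."""
    return [frozenset(c)
            for r in range(1, n + 1)
            for c in combinations(range(1, n + 1), r)]


def apply_K(x, targets, i):
    """K_i: every target in state 0 that can access state i moves to state i."""
    return tuple(i if (s == 0 and i in t) else s for s, t in zip(x, targets))


def apply_P(x, j):
    """P_j: every target currently in state j returns to state 0."""
    return tuple(0 if s == j else s for s in x)


def count_reachable(n):
    """Number of configurations reachable from the all-zero origin."""
    targets = full_network(n)
    start = (0,) * len(targets)
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for i in range(1, n + 1):
            for y in (apply_K(x, targets, i), apply_P(x, i)):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
    return len(seen)


for n in (1, 2, 3):
    print(n, count_reachable(n))
```

The search is exhaustive, so it is practical only for small n; the bounds derived below are what make larger n tractable.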

Theorem 9 The formula f(n + 1) for the number of connected one-colorings in the reduced n + 1-network is a lower bound for cn.

Proof. The full (n + 1)-network can be partitioned into a set of $2^n$ targets having a 1 and all subsets of {2, …, n + 1}, and $2^n - 1$ targets that lack 1 but have all nonempty subsets of {2, …, n + 1}. The latter set of targets is an embedded full n-network, while the former is the reduced (n + 1)-network. All 2(n + 1) letters are needed to form the one-colorings in the reduced (n + 1)-network. Every one-coloring is finally obtained by applying some permutation of $K_1, P_2, \dots, P_{n+1}$ to a configuration that uses (at most) the states 2, …, n + 1 and 0, i.e. the full n-network. Because $K_1$ and $P_1$ do not affect the targets of the embedded full n-network, there must be (at least) one sequence using only $\{K_2, \dots, K_{n+1}\}$ and $\{P_2, \dots, P_{n+1}\}$ that prepares the embedded full n-network in the aforementioned configuration, which means we may associate a one-coloring with (at least) one of the $c_n$ sequences in the embedded full n-network. Therefore, multiple configurations in the full n-network may map to the same one-coloring in the reduced (n + 1)-network. Conversely, if two one-colorings are different, they are distinguishable by their configurations immediately preceding the final $K_1, P_2, \dots, P_{n+1}$ sequence, and must therefore map to different configurations in the full n-network. Together, these statements imply that the map from configurations in the full n-network to one-colorings in the reduced (n + 1)-network is many-to-one, but the map from one-colorings to configurations in the full n-network is one-to-one. Therefore, $f(n + 1) \le c_n$.

Theorem 10 An upper bound on $c_n$ is given by Eq (25), where $(n)_k = n(n - 1) \cdots (n - k + 1)$ is the falling factorial.

Proof. There are nf(n) one-colorings in the full n-network, plus one origin. Each one of the one-colorings can be thought of as the origin of an (n − 1)-network, which in turn generates (n − 1)f(n − 1) one-colorings in an embedded (n − 1)-network, for a total of n(n − 1)f(n)f(n − 1) configurations using 1, 2, and perhaps 0, hence termed two-colorings. However, one of the f(n) one-colorings is the 0 state of the n-network, so it does not generate any two-colorings. Thus, there are at most 1 + nf(n) + n(n − 1)(f(n) − 1)f(n − 1) zero-, one-, and two-colorings. Now assume that the number of k-colorings is Of these, are origins of an (n − k)-network, meaning they are actually (k − 1)-colorings; they cannot generate any (k + 1)-colorings. The remaining are genuine k-colorings which can generate f(n − k) one-colorings in the (n − k)-network, or equivalently, (k + 1)-colorings. Thus, the total number of zero-, one-, two-, …, (k + 1)-colorings is no more than This induction argument proves the statement.

8. Properties of the orbits in the ratchet and sequestration network

First we define what it means to be an origin and an orbit in the threshold-1 ratchet network and determine the number of orbits as a function of n and m. Then we prove that the configurations in the sequestration network are defined by reversible paths.

A forbidden configuration in the ratchet network contains some row or column permutation of the pattern on any 2 × 2 sub-block of the connectivity matrix A. This is the minimum violation, but larger blocks may violate this pattern as well, for example has 2 violations. Furthermore, application of any of the K’s or P’s in this sub-block will relieve at least one of these violations. Therefore, we define an i, j-orbit in the ratchet network as the locus of configurations having a forbidden configuration on an i × j sub-block that does not use the corresponding set of i K’s and j P’s. The origin of any i, j-orbit is the configuration having all remaining nm − ij entries of A equal to 0 (or all 1 to make the case of having only P actions symmetric with having only K’s). A matrix X having the same forbidden i × j sub-block as an origin Y is not considered to be in the orbit of Y if (i) there is no sequence of actions that transforms Y to X, or (ii) if the sequence involves one of the forbidden K’s or P’s. With these restrictions, the number of origins is equal to the number of orbits.
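The role of the forbidden 2 × 2 pattern can be illustrated by brute force. The sketch below assumes that the pattern is the block [[1, 0], [0, 1]] together with its row or column swap [[0, 1], [1, 0]], and that $l_n = l_m = 1$ with $K_i$ switching row i ON and $P_j$ switching column j OFF. Both are assumptions, since the pattern itself is not reproduced in this section. The function names are hypothetical.

```python
from itertools import combinations, product


def violations(A):
    """Count 2 x 2 sub-blocks of A equal to [[1, 0], [0, 1]] or [[0, 1], [1, 0]]."""
    n, m = len(A), len(A[0])
    count = 0
    for r1, r2 in combinations(range(n), 2):
        for c1, c2 in combinations(range(m), 2):
            block = (A[r1][c1], A[r1][c2], A[r2][c1], A[r2][c2])
            if block in ((1, 0, 0, 1), (0, 1, 1, 0)):
                count += 1
    return count


def reachable(n, m):
    """Search from the all-OFF matrix; K_i sets row i ON, P_j sets column j OFF."""
    start = tuple((0,) * m for _ in range(n))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        moves = [tuple((1,) * m if r == i else row for r, row in enumerate(x))
                 for i in range(n)]
        moves += [tuple(tuple(0 if c == j else v for c, v in enumerate(row)) for row in x)
                  for j in range(m)]
        for y in moves:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def all_matrices(n, m):
    """Generate every n x m binary matrix as a tuple of row tuples."""
    for bits in product((0, 1), repeat=n * m):
        yield tuple(tuple(bits[r * m:(r + 1) * m]) for r in range(n))


allowed = {A for A in all_matrices(3, 3) if violations(A) == 0}
print(len(allowed), allowed == reachable(3, 3))
```

Under these assumptions the matrices with zero violations coincide with the configurations found by search, which is the sense in which a single violating 2 × 2 block makes a configuration forbidden.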

Denote by the number of orbits in a ratchet network of size n × m with violations involving i ≤ n K’s and j ≤ m P’s. If i = j = 2 there are forbidden configurations that turn into origins for the remaining n − i K’s and n − j P’s. There are more orbits in these smaller networks. For every i′, j′ ≥ 2 there are configurations reached by orbits using i K’s and j P’s. Only configurations not reached by these orbits are available as new origins when the number of K’s and P’s not to be used is i and j, respectively. Finally, there are ways to specify i ≤ n K’s and j ≤ m P’s. Then we have the

Theorem 11 For a given set of i ≤ n K’s and j ≤ m P’s, the number of i, j-orbits is given by Eq (26), and the total number of i, j-orbits in the n × m ratchet network is given by Eq (27), where the remaining quantity is defined in Eq (28).

The modification B′ ensures that an orbit lacking allowable P’s (K’s) can still use K’s (P’s). A table of values of Eq (27) is given in S4 Fig.

We noted in the main text that configurations in the sequestration network can be joined by reversible paths. A path $K_i P_j$ or $P_j K_i$ is reversible if a configuration reached by the sequence of actions w is also reached by either the sequence $wK_i P_j$ or $wP_j K_i$, but not $wK_i$ or $wP_j$, respectively. Thus we can also prove the

Theorem 12 There always exists a reversible path between any two configurations in an orbit of the sequestration network.

Proof. Let x be a configuration in an orbit using m ≤ n of the actions, and let P denote the locus of configurations reached from x. We now need to show that P must be reversibly reached from the origin. Denote by the complement of P, so that any is reversibly reached from the origin. In order for there to be no reversible path between x ∈ P and , there must always be a state i such that $K_i$ increases the number of targets {⋅, i} in the i state and $P_i$ increases the number of targets {⋅, i} in the zero state. Now assume there is a configuration z ∈ P using all m allowed states. z must have at least one target in the 0 state, but this is not allowed, because then z would violate the connection rule. Therefore, there is a maximum number m′ < m of states used by any x ∈ P. Now assume there is a configuration z′ ∈ P using all m′ allowed states. But this implies that there is a single-arm target {0, j} that must be in the zero state. Then the action $K_j$ takes z′ to a configuration and $P_j$ takes y to z. This path must be reversible, and z′ is reached reversibly from the origin. By induction we conclude that m′ = 0 and that P = Ø. Finally, because any two configurations are reached reversibly from the origin, there is a reversible path between them.

Theorem 12 defines the orbits of the sequestration network as those configurations connected by reversible paths.

9. A universal formulation of the actions as matrix operators

In this section we show how to write the K and P regulators as matrix operators in a manner consistent with both models considered in the paper. First we define the vector space of configurations of the N targets, then we derive the operators that transform these configurations.

Let x be the configuration vector. For a network with N targets we require that $\sum_i x_i = N$. This means that x has at least N entries, and in general dim x ≥ N. Therefore, we cannot use the standard state space of N-dimensional vectors, because the operators will not conserve the number of targets. Each target has a 0 state. The number D of independent directions accessible from 0 is called the dimension of the network, and the number T of steps one can move along each dimension is called the threshold. In the ratchet model, each target has a single ladder of states with variable threshold, so D = 1 and T is allowed to vary; in the sequestration model D = n but the threshold is T = 1.

Denote by $A_{di}$ the fraction of the targets of type A in state i ∈ {0, 1, …, T} along dimension d. For a subset of the targets a K-type action causes population transfer between states (d, j) and (d, i) with i = j + 1, and a P-type action the reverse. If a K regulator acts for a short time we can write the “reaction rate” equation as Eq (29), where $g_A > 0$ is a proportionality constant. This defines the matrix differential equation Eq (30), with the vector of populations of the DT + 1 states of the N targets and the block diagonal matrix of rate constants between the j and j + 1 population states along dimension d. Eq (29) can be rewritten as Eq (31). Because $G_{dj}$ is block diagonal, Eq (30) can be solved by exponentiation on each block, giving Eq (32).

The restriction of the model from a continuous range of population states $x_{Ai} \in [0, 1]$ to the boolean values {0, 1} formally emerges by considering the “reaction” K catalyzes on its targets to have gone to completion. We do this by taking the limit t → ∞ in Eq (32) to get Eq (33), so that the matrix $K_{dj}$ defined by Eq (34) is the block diagonal matrix having 1’s at (row, column) positions (1 + (d − 1)T + i, 1 + (d − 1)T + j) of each block that responds to K in dimension d and admits population transfer from j to i.
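The passage to the limit can be made concrete for a single pair of states. The sketch below assumes first-order kinetics, $dA_j/dt = -g A_j$ and $dA_i/dt = g A_j$, so that the generator on the pair $(A_j, A_i)$ is $G = g\,[[-1, 0], [1, 0]]$. This specific form is an assumption for illustration and may differ from Eqs (29)–(34). The function names are hypothetical.

```python
import math


def propagator(g, t):
    """Closed form of exp(G t) for G = g * [[-1, 0], [1, 0]] on the pair (A_j, A_i)."""
    decay = math.exp(-g * t)
    return [[decay, 0.0],
            [1.0 - decay, 1.0]]


def series_propagator(g, t, terms=60):
    """Truncated power series of exp(G t); used only to cross-check the closed form."""
    G = [[-g * t, 0.0], [g * t, 0.0]]
    total = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at the identity
    term = [[1.0, 0.0], [0.0, 1.0]]    # current term (G t)^k / k!
    for k in range(1, terms):
        term = [[sum(term[r][s] * G[s][c] for s in range(2)) / k for c in range(2)]
                for r in range(2)]
        total = [[total[r][c] + term[r][c] for c in range(2)] for r in range(2)]
    return total


finite = propagator(1.0, 2.0)
limit = propagator(1.0, 50.0)          # a long exposure stands in for t -> infinity
print(finite)
print(limit)
```

Each column of the propagator sums to 1 at every t, so population is conserved, and for long times the matrix approaches [[0, 0], [1, 1]], which sends all population to state i regardless of where it started.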

Because K acts on all targets at once, it is insensitive to the initial state j. Thus the matrix corresponding to the action of K is given by Eq (35), which is the block diagonal matrix having 1’s at (row, column) positions of each block that responds to K in dimension d.

This derivation can be repeated in the case that population goes in the opposite direction, from a state j to a state i < j, using a different set of rate matrices $H_{dj}$ corresponding to the reverse of Eq (31). We obtain the block diagonal matrix $P_d$ corresponding to the action of P in dimension d having 1’s at (row, column) positions of each block that responds to P in dimension d. Whereas $K_d$ is sub-diagonal, $P_d$ is super-diagonal.

The Baker-Campbell-Hausdorff expansion shows that $K_d$ in Eq (35) and in general any product of matrices $K_d$ and $P_d$ are generated by matrix exponentiation of commutators of the generators $G_{dj}$, $H_{dj}$. This is the origin of noncommutativity in both the ratchet and sequestration models.
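Noncommutativity is already visible for a single target with two states. The sketch below assumes D = 1 and T = 1 and takes the completed-reaction matrices to be K = [[0, 0], [1, 1]] and P = [[1, 1], [0, 0]], acting on the column vector (population in state 0, population in state 1). These explicit forms are assumptions for illustration; `matmul` is a hypothetical helper.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[r][s] * B[s][c] for s in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]


K = [[0, 0],
     [1, 1]]   # sends all population to state 1
P = [[1, 1],
     [0, 0]]   # sends all population to state 0

KP = matmul(K, P)   # acting on a column vector: P first, then K
PK = matmul(P, K)   # acting on a column vector: K first, then P
commutator = [[KP[r][c] - PK[r][c] for c in range(2)] for r in range(2)]

print(KP)
print(PK)
print(commutator)
```

In this small case the product equals whichever operator acts last, so KP = K and PK = P, and the commutator is nonzero. Both matrices are also idempotent, which matches the observation that repeating a regulator adds nothing new.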

An example in the sequestration network illustrates population transfer between states. In the n = 2 network on the targets A, B, and C the initial configuration of the network is represented by . Only targets A and C can access dimension 1, and only targets B and C can access dimension 2. Therefore the t → ∞ action of $K_1$ on the network is given by Eq (36). Only A and C advance to state 1 and the number of targets (3) is conserved.
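This example can be written out explicitly. The sketch below assumes that each target carries three population entries (state 0, state 1 along dimension 1, state 2 along dimension 2), that all three targets start in state 0, and that the completed action of $K_1$ moves the state-0 population of every responsive target into state 1. The layout of the vector and the helper names `k_block`, `block_diag` and `matvec` are assumptions for illustration and need not match Eq (36) entry for entry.

```python
def k_block(dim, accessible, n_dims=2):
    """Completed-reaction block for one target with states (0, 1, ..., n_dims)."""
    size = n_dims + 1
    block = [[1 if r == c else 0 for c in range(size)] for r in range(size)]
    if dim in accessible:
        block[0][0] = 0      # nothing stays in state 0 ...
        block[dim][0] = 1    # ... it is all transferred to state `dim`
    return block


def block_diag(blocks):
    """Assemble square blocks into one block diagonal matrix."""
    size = sum(len(b) for b in blocks)
    M = [[0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for r, row in enumerate(b):
            for c, v in enumerate(row):
                M[offset + r][offset + c] = v
        offset += len(b)
    return M


def matvec(M, x):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


access = {"A": {1}, "B": {2}, "C": {1, 2}}           # dimensions each target can reach
K1 = block_diag([k_block(1, access[t]) for t in ("A", "B", "C")])
x0 = [1, 0, 0,  1, 0, 0,  1, 0, 0]                   # A, B, C all in state 0 (assumed)
x1 = matvec(K1, x0)

print(x1)
print(sum(x1))
```

The entries of the result sum to 3, the number of targets, and only the blocks for A and C change, in line with the description above.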

Supporting Information

S1 Fig. Scaling in combinatorial networks is sub-exponential.

(A) An example network with u = 3 pools of n = 2 regulators each. A target is only ON if all u of its regulators bind. (B) Plots of Eq (18) vs. n for an increasing number of pools u and increasing redundancy ln.

https://doi.org/10.1371/journal.pcbi.1005089.s001

(TIF)

S2 Fig. Number of unique words in the threshold 1 ratchet network as a function of n, m, ln, and lm found using Eq (24).

n and m increase across the rows and up the columns. ln and lm increase down the columns and across the rows of the sub-blocks.

https://doi.org/10.1371/journal.pcbi.1005089.s002

(TIF)

S3 Fig. The full n-network model has upper and lower bounds.

(A) A plot of all the allowed configurations of a set of targets controlled by n = 3 regulator pairs in the full n-network. Blue, cyan, yellow, and red correspond to states 0, 1, 2, and 3, respectively. (B) A list of the words generating the corresponding states in A. K actions are shown in the red spectrum, and P in the blue. (C) A logarithmic plot of the bounds on the full model. The total space is , the upper and lower bounds are calculated from Eqs (24) and (7), respectively, and the combinatorial model is $2^{2n}$.

https://doi.org/10.1371/journal.pcbi.1005089.s003

(TIF)

S4 Fig. Number of orbits restricted from using i of the K’s and j of the P’s in the threshold 1 ratchet network as a function of n and m calculated using Eq (27).

n and m increase across the rows and up the columns. i and j increase down the columns and across the rows of the sub-blocks.

https://doi.org/10.1371/journal.pcbi.1005089.s004

(TIF)

Author Contributions

  1. Conceived and designed the experiments: LC WL.
  2. Performed the experiments: WL.
  3. Analyzed the data: WL LC.
  4. Contributed reagents/materials/analysis tools: WL.
  5. Wrote the paper: WL LC.

References

  1. 1. Ma HW, Kumar B, Ditges U, Gunzer F, Buer J, Zeng AP. An extended transcriptional regulatory network of Escherichia coli and analysis of its hierarchical structure and network motifs. Nucleic Acids Res. 2004;32:6643–6649. pmid:15604458
  2. 2. Balaji S, Babu MM, Iyer LM, Luscombe NM, Aravind L. Comprehensive analysis of combinatorial regulation using the transcriptional regulatory network of yeast. J Mol Biol. 2006;360:213–227. pmid:16762362
  3. 3. Guelzim N, Bottani S, Bourgine P, Képès F. Topological and causal structure of the yeast transcriptional regulatory network. Nat Genet. 2002;31:60–63. pmid:11967534
  4. 4. Pfeiffer A, Shi H, Tepperman JM, Zhang Y, Quail PH. Combinatorial complexity in a transcriptionally centered signaling hub in Arabidopsis. Mol Plant. 2014;7:1598–1618. pmid:25122696
  5. 5. Rhee DY, Cho DY, Zhai B, Slattery M, Ma L, Mintseris J, et al. Transcription factor networks in Drosophila melanogaster. Cell Rep. 2014;8:2031–2043. pmid:25242320
  6. 6. Neph S, Stergachis AB, Reynolds A, Sandstrom R, Borenstein E, Stamatoyannopoulos J. Circuitry and dynamics of human transcription factor regulatory networks. Cell. 2012;150:1274–1286. pmid:22959076
  7. 7. Hua S, Kittler R, White KP. Genomic antagonism between retinoic acid and estrogen signaling in breast cancer. Cell. 2009;137:1259–1271. pmid:19563758
  8. 8. Lu H, Ward MG, Adeola O, Ajuwon KM. Regulation of adipocyte differentiation and gene expression-crosstalk between TGFβ and wnt signaling pathways. Mol Bio Rep. 2013;40:5237–5245.
  9. 9. Iwanaszko M, Kimmel M. NF-κB and IRF pathways: cross-regulation on target genes promoter level. BMC Genomics. 2015;16:307. pmid:25888367
  10. 10. Varjosalo M, Keskitalo S, Drogen AV, Nurkkala H, Vichalkovski A, Aebersold R, et al. The protein interaction landscape of the human CMGC kinase group. Cell Rep. 2013;3:1306–1320. pmid:23602568
  11. 11. Sacco F, Perfetto L, Castagnoli L, Cesareni G. The human phosphatase interactome: An intricate family portrait. FEBS Lett. 2012;586:2732–2739. pmid:22626554
  12. 12. Sasaki T, Minamisawa G, Takahashi N, Matsuki N, Ikegaya Y. Reverse optical trawling for synaptic connections in situ. J Neurophysiol. 209;102:636–643.
  13. 13. Vickaryous MK, Hall BK. Human cell type diversity, evolution, development, and classification with special reference to cells derived from the neural crest. Biol Rev Camb Philos Soc. 2006;81:425–455. pmid:16790079
  14. 14. Ohnishi Y, Huber W, Tsumura A, Kang M, Xenopoulos P, Kurimoto K, et al. Cell-to-cell expression variability followed by signal reinforcement progressively segregates early mouse lineages. Nat Cell Biol. 2014;16:27–37. pmid:24292013
  15. 15. Yan L, Yang M, Guo H, Yang L, Wu J, Li R, et al. Single-cell RNA-Seq profiling of human preimplantation embryos and embryonic stem cells. Nat Struct Mo Biol. 2013;20:1131–1139.
  16. 16. Zeisel A, Muñoz-Manchado AB, Codeluppi S, Lönnerberg P, La Manno G, Juréus A, et al. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science. 2015;347:1138–1142. pmid:25700174
  17. 17. Jaitin DA, Kenigsberg E, Keren-Shaul H, Elefant N, Paul F, Zaretsky I, et al. Massively parallel Single-Cell RNA-seq for marker-free decomposition of tissues into cell types. Science. 2014;343:776–779. pmid:24531970
  18. 18. Moignard V, Woodhouse S, Haghverdi L, Lilly AJ, Tanaka Y, Wilkinson AC, et al. Decoding the regulatory network of early blood development from single-cell gene expression measurements. Nat Biotechnol. 2015;33:269–276. pmid:25664528
  19. 19. Treutlein B, Brownfield DG, Wu AR, Neff NF, Mantalas GL, Espinoza FH, et al. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature. 2014;509:371–375. pmid:24739965
  20. 20. Brunskill EW, Park JS, Chung E, Chen F, Magella B, Potter SS. Single cell dissection of early kidney development: multilineage priming. Development. 2014;141:3093–3101. pmid:25053437
  21. 21. Vogelstein B, Kinzler KW. Cancer genes and the pathways they control. Nat Med. 2004;10:789–799. pmid:15286780
  22. 22. Gerhart J, Kerschner M. Cells, Embryos, and Evolution. Blackwell Science; 1997.
  23. 23. Wilson NK, Foster SD, Wang X, Knezevic K, Schütte J, Kaimakis P, et al. Combinatorial transcriptional control in blood stem/progenitor cells: genome-wide analysis of ten major transcriptional regulators. Cell Stem Cell. 2010;7:532–542. pmid:20887958
  24. 24. Kaplan T, Li XY, Sabo PJ, Thomas S, Stamatoyannopoulos JA, Biggin MD, et al. Quantitative models of the mechanisms that control genome-wide patterns of transcription factor binding during early Drosophila development. PLoS Genet. 2011;7:e1001290. pmid:21304941
  25. 25. He X, Samee MA, Blatti C, Sinha S. Thermodynamics-based models of transcriptional regulation by enhancers: the roles of synergistic activation, cooperative binding and short-range repression. PLoS Comput Biol. 2010;6:e1000935. pmid:20862354
  26. 26. Thomas R. Boolean formalization of genetic control circuits. J Theor Biol. 1973;42:563–585. pmid:4588055
  27. 27. Kauffman SA. Metabolic stability and epigenesis in randomly constructed genetic nets. J Theor Biol. 1969;22:437–467. pmid:5803332
  28. 28. Davidich MI, Bornholdt S. Boolean network model predicts cell cycle sequence of fission yeast. PLoS ONE. 2008;3:e1672. pmid:18301750
  29. 29. Fauré A, Naldi A, Chaouiya C, Thieffry D. Dynamical analysis of a generic Boolean model for the control of the mammalian cell cycle. Bioinformatics. 2006;22:e124–e131. pmid:16873462
  30. 30. Buchler NE, Gerland U, Hwa T. On schemes of combinatorial transcription logic. Proc Natl Acad Sci U S A. 2003;100:5136–5141. pmid:12702751
  31. 31. Peter IS, Faure E, Davidson EH. Predictive computation of genomic logic processing functions in embryonic development. Proc Natl Acad Sci U S A. 2012;109:16434–16442. pmid:22927416
  32. 32. Isshiki T, Pearson B, Holbrook S, Doe CQ. Drosophila neuroblasts sequentially express transcription factors which specify the temporal identity of their neuronal progeny. Cell. 2001;106:511–521. pmid:11525736
  33. 33. Chan EAW, Teng G, Corbett E, Choudhury KR, Bassing CH, Krangel DGSMS. Peripheral subnuclear positioning suppresses Tcrb recombination and segregates Tcrb alleles from RAG2. Proc Natl Acad Sci U S A. 2013;110:E4628–E4637. pmid:24218622
  34. 34. Sayegh CE, Jhunjhunwala S, Riblet R, Murre C. Visualization of looping involving the immunoglobulin heavy-chain locus in developing B cells. Genes Dev. 2005;19:322–327. pmid:15687256
  35. 35. Min H, Danilenko DM, Scully SA, Bolon B, Ring BD, Tarpley JE, et al. Fgf-10 is required for both limb and lung development and exhibits striking functional similarity to Drosophila branchless. Genes Dev. 1998;12:3156–3161. pmid:9784490
  36. 36. Shifley ET, Kenny AP, Rankin SA, Zorn AM. Prolonged FGF signaling is necessary for lung and liver induction in Xenopus. BMC Dev Biol. 2012;6:27.
  37. 37. Iyengar L, Wang Q, Rasko JE, McAvoy JW, Lovicu FJ. Duration of ERK1/2 phosphorylation induced by FGF or ocular media determines lens cell fate. Differentiation. 2007;75:662–668. pmid:17381542
  38. 38. Wada N, Nohno T. Differential Response of Shh Expression Between Chick Forelimb and Hindlimb Buds by FGF-4. Dev Dyn. 2001;221:402–411. pmid:11500977
  39. 39. Iwasaki H, Mizuno S, Arinobu Y, Ozawa H, Mori Y, Shigematsu H, et al. The order of expression of transcription factors directs hierarchical specification of hematopoietic lineages. Genes Dev. 2006;20:3010–3021. pmid:17079688
  40. 40. Ginther MR, Ramus DFWSJ. Hippocampal neurons encode different episodes in an overlapping sequence of odors task. J Neurosci. 2011;31:2706–2711. pmid:21325539
  41. 41. Agster KL, Fortin NJ, Eichenbaum H. The hippocampus and disambiguation of overlapping sequences. J Neurosci. 2002;22:5760–5768. pmid:12097529
  42. 42. Moro SI, Tolboom M, Khayat PS, Roelfsema PR. Neuronal activity in the visual cortex reveals the temporal order of cognitive operations. J Neurosci. 2010;30:16293–16303. pmid:21123575
  43. 43. Ham TS, Lee SK, Keasling JD, Arkin AP. Design and construction of a double inversion recombination switch for heritable sequential genetic memory. PLoS One. 2008;3:e2815. pmid:18665232
  44. 44. Payne JL, Wagner A. Constraint and contingency in multifunctional gene regulatory circuits. PLoS Comput Biol. 2013;9:e1003071. pmid:23762020
  45. 45. Eldar A. Social conflict drives the evolutionary divergence of quorum sensing. Proc Natl Acad Sci U S A. 2011;108:13635–13640. pmid:21807995
  46. 46. Wang X, Minasov G, Shoichet BK. Evolution of an Antibiotic Resistance Enzyme Constrained by Stability and Activity Trade-offs. J Mol Bio. 2002;320:85–95.
  47. 47. Blount ZD, Borland CZ, Lenski RE. Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc Natl Acad Sci U S A. 2008;105:7899–7906. pmid:18524956
  48. 48. Albeck JG, Mills GB, Brugge JS. Frequency-modulated pulses of ERK activity transmit quantitative proliferation signals. Mol Cell. 2013;49:249–261. pmid:23219535
  49. 49. Nelson DE, Ihekwaba AE, Elliott M, Johnson JR, Gibney CA, Foreman BE, et al. Oscillations in NF-κB signaling control the dynamics of gene expression. Science. 2004;306:704–708. pmid:15499023
  50. 50. Kellogg RA, Tian C, Lipniacki T, Quake SR, Tay S. Digital signaling decouples activation probability and population heterogeneity. elife. 2015;4:e08931. pmid:26488364
  51. 51. Lahav G, Rosenfeld N, Sigal A, Geva-Zatorsky N, Levine AJ, Elowitz MB, et al. Dynamics of the p53-Mdm2 feedback loop in individual cells. Nat Genet. 2004;36:147–150. pmid:14730303
  52. 52. Lin Y, Sohn CH, Dalal CK, Cai L, Elowitz MB. Combinatorial gene regulation by modulation of relative pulse timing. Nature. 2015;527:54–58. pmid:26466562
  53. 53. Cai L, Dalal CK, Elowitz MB. Frequency-modulated nuclear localization bursts coordinate gene regulation. Nature. 2008;455:485–490. pmid:18818649
  54. 54. Dalal CK, Cai L, Lin Y, Rahbar K, Elowitz MB. Pulsatile dynamics in the yeast proteome. Curr Biol. 2014;24:2189–2194. pmid:25220054
  55. 55. Roach PJ. Multisite and hierarchal protein phosphorylation. J Biol Chem. 1991;266:14139–14141. pmid:1650349
  56. 56. Thomson M, Gunawardena J. Unlimited multistability in multisite phosphorylation systems. Nature. 2009;460:274–277. pmid:19536158
  57. 57. Brewbaker C. A combinatorial interpretation of the poly-Bernoulli numbers and two Fermat analogues. Integers. 2008;8:A02. Available from: http://www.eudml.org/doc/130297.
  58. 58. Kaneko M. Poly-Bernoulli numbers. Journal de théorie des nombres de Bordeaux. 1997;9:221–228.
  59. 59. Martin GE. Counting: The Art of Enumerative Combinatorics. Springer; 2001.
  60. 60. Tanay A, Regev A, Shamir R. Conservation and evolvability in regulatory networks: the evolution of ribosomal regulation in yeast. Proc Natl Acad Sci U S A. 2005;102:7203–7206. pmid:15883364
  61. 61. Peric-Hupkes D, Meuleman W, Pagie L, Bruggeman SW, Solovei I, Brugman W, et al. Molecular maps of the reorganization of genome-nuclear lamina interactions during differentiation. Mol Cell. 2010;38:603–613. pmid:20513434
  62. 62. Zullo JM, Demarco IA, Piqué-Regi R, Gaffney DJ, Epstein CB, Spooner CJ, et al. DNA sequence-dependent compartmentalization and silencing of chromatin at the nuclear lamina. Cell. 2012;149:1474–1487. pmid:22726435
  63. 63. Filion GJ, van Bemmel JG, Braunschweig U, Talhout W, Kind J, Ward LD, et al. Systematic protein location mapping reveals five principal chromatin types in Drosophila cells. Cell. 2010;143:212–224. pmid:20888037
  64. 64. Edwards R, Glass L. Combinatorial explosion in model gene networks. Chaos. 2000;10:691–704. pmid:12779419
  65. 65. Green JA, Rees D. On semi-groups in which xr = x. Mathematical Proceedings of the Cambridge Philosophical Society. 1952;48:35–40.
  66. 66. Takahashi K, Yamanaka S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell. 2006;126:663–676. pmid:16904174
  67. 67. Murray RM, Sastry SS. Nonholonomic motion planning: steering using sinusoids. IEEE Transactions on Automatic Control. 1993;38:700–716.
  68. 68. Sternberg S. 1. In: Lie Algebras. Book online; 2004. p. 7–32. Available from: http://www.math.harvard.edu/~shlomo.
  69. 69. Carravetta M, Levitt MH. Long-lived nuclear spin states in high-field solution NMR. J Am Chem Soc. 2004;126:6228–6229. pmid:15149209
  70. 70. Attanasio C, Nord AS, Zhu Y, Blow MJ, Biddie SC, Mendenhall EM, et al. Tissue-specific SMARCA4 binding at active and repressed regulatory elements during embryogenesis. Genome Res. 2014;24:920–929. pmid:24752179
  71. 71. Kudron M, Niu W, Lu Z, Wang G, Gerstein M, Snyder M, et al. Tissue-specific direct targets of Caenorhabditis elegans Rb/E2F dictate distinct somatic and germline programs. Genome Biol. 2013;14:R5. pmid:23347407
  72. 72. Noordermeer D, de Wit E, Klous P, van de Werken H, Simonis M, Lopez-Jones M, et al. Variegated gene expression caused by cell-specific long-range DNA interactions. Nat Cell Biol. 2011;13:944–951. pmid:21706023
  73. 73. Nagano T, Lubling Y, Stevens TJ, Schoenfelder S, Yaffe E, Dean W, et al. Single-cell Hi-C reveals cell-to-cell variability in chromosome structure. Nature. 2013;502:59–64. pmid:24067610
  74. 74. Schoenfelder S, Sexton T, Chakalova L, Cope NF, Horton A, Andrews S, et al. Preferential associations between co-regulated genes reveal a transcriptional interactome in erythroid cells. Nat Genet. 2010;42:53–61. pmid:20010836
  75. 75. Marsman J, Horsfield JA. Long distance relationships: Enhancer-promoter communication and dynamic gene transcription. Biochim Biophys Acta. 2012;18–19:1217–1227.
  76. 76. Handoko L, Xu H, Li G, Ngan CY, Chew E, Schnapp M, et al. CTCF-mediated functional chromatin interactome in pluripotent cells. Nat Genet. 2011;43:630–638. pmid:21685913
  77. 77. Splinter E, Heath H, Kooren J, Palstra RJ, Klous P, Grosveld F, et al. CTCF mediates long-range chromatin looping and local histone modification in the β-globin locus. Genes Dev. 2006;20:2349–2354. pmid:16951251
  78. 78. Bell AC, West AG, Felsenfeld G. The protein CTCF is required for the enhancer blocking activity of vertebrate insulators. Cell. 1999;98:387–396. pmid:10458613
  79. 79. Hou C, Zhao H, Tanimoto K, Dean A. CTCF-dependent enhancer-blocking by alternative chromatin loop formation. Proc Natl Acad Sci U S A. 2008;105:20398–20403. pmid:19074263
  80. 80. Guertin MJ, Lis JT. Mechanisms by which transcription factors gain access to target sequence elements in chromatin. Curr Opin Genet Dev. 2013;23:116–123. pmid:23266217
  81. 81. Zentner GE, Tesar PJ, Scacheri PC. Epigenetic signatures distinguish multiple classes of enhancers with distinct cellular functions. Genome Res. 2011;21:1273–1283. pmid:21632746
  82. 82. Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007;448:553–560. pmid:17603471
  83. 83. Mossman D, Kim KT, Scott RJ. Demethylation by 5-aza-2’-deoxycytidine in colorectal cancer cells targets genomic DNA whilst promoter CpG island methylation persists. BMC Cancer. 2010;10:366. pmid:20618997
  84. 84. Yoshida M, Kijima M, Akita M, Beppu T. Potent and specific inhibition of mammalian histone deacetylase both in vivo and in vitro by trichostatin A. J Biol Chem. 1990;265:17174–17179. pmid:2211619
  85. 85. Dannengerg LO, Edenberg HJ. Epigenetics of gene expression in human hepatoma cells: expression profiling the response to inhibition of DNA methylation and histone deacetylation. BMC Genomics. 2006;7:181.
  86. 86. Qi Y, Fan P, Hao Y, Han B, Fang Y, Feng M, et al. Phosphoproteomic analysis of protein phosphorylation networks in the hypopharyngeal gland of honeybee workers (Apis mellifera ligustica). J Proteome Res. 2015;14:4647–4661. pmid:26384081
  87. 87. Huttlin EL, Jedrychowski MP, Elias JE, Goswami T, Rad R, Beausoleil SA, et al. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell. 2010;143:1174–1189. pmid:21183079
  88. 88. Schmidt B, Marrone DF, Markus EJ. Genomic antagonism between retinoic acid and estrogen signaling in breast cancer. Behav Brain Res. 2012;226:56–65.
  89. 89. Brown TI, Ross RS, Keller JB, Hasselmo ME, Stern CE. Which way was I going? Contextual retrieval supports the disambiguation of well learned overlapping navigational routes. J Neurosci. 2010;30:7414–7422. pmid:20505108
  90. 90. Leutgeb JK, Leutgeb S, Moser MB, Moser MI. Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science. 2007;315:961–966. pmid:17303747
  91. 91. Chadwick MJ, Maguire DHMA. Decoding overlapping memories in the medial temporal lobes using high-resolution fMRI. Learn Mem. 2011;18:742–746. pmid:22086391
  92. 92. Waddington CH. The cybernetics of development. In: The Strategy of the Genes. George Allen & Unwin Ltd.; 1957. p. 11–58.
  93. 93. Huang S. Systems biology of stem cells: three useful perspectives to help overcome the paradigm of linear pathways. Philos Trans R Soc Lond B Biol Sci. 2011;366:2247–2259. pmid:21727130
  94. 94. St Johnston D, Nüsslein-Volhard C. The origin of pattern and polarity in the Drosophila embryo. Cell. 1992;68:201–219. pmid:1733499
  95. 95. Akashi K, He X, Chen J, Iwasaki H, Niu C, Steenhard B, et al. Transcriptional accessibility for genes of multiple tissues and hematopoietic lineages is hierarchically controlled during early hematopoiesis. Blood. 2003;101:383–389. pmid:12393558
  96. 96. Biddie SC, John S, Sabo PJ, Thurman RE, Johnson TA, Schiltz RL, et al. Transcription Factor AP1 Potentiates Chromatin Accessibility and Glucocorticoid Receptor Binding. Mol Cell. 2011;43:145–155. pmid:21726817
  97. 97. Voss TC, Schiltz RL, Sung MH, Yen PM, Stamatoyannopoulos JA, Biddie SC, et al. Dynamic Exchange at Regulatory Elements during Chromatin Remodeling Underlies Assisted Loading Mechanism. Cell. 2011;146:544–554. pmid:21835447
  98. 98. Huang S, Guo YP, May G, Enver T. Bifurcation dynamics in lineage-commitment in bipotent progenitor cells. Dev Biol. 2007;305:695–713. pmid:17412320
  99. 99. Ferrell J. Bistability, bifurcations, and Waddington’s epigenetic landscape. Curr Biol. 2012;22:R458–456. pmid:22677291
  100. 100. Romero PA, Arnold FH. Exploring protein fitness landscapes by directed evolution. Nat Rev Mol Cell Biol. 2008;10:866–876.
  101. 101. Fritz G, Buchler NE, Hwa T, Gerland U. Designing sequential transcription logic: a simple genetic circuit for conditional memory. Syst Synth Biol. 2007;1:89–98. pmid:19003438
  102. 102. Margulies D, Felder CE, Melman G, Shanzer A. A molecular keypad lock: a photochemical device capable of authorizing password entries. J Am Chem Soc. 2007;129:347–354. pmid:17212414
  103. 103. Lou C, Liu X, Ni M, Huang Y, Huang Q, Huang L, et al. Synthesizing a novel genetic sequential logic circuit: a push-on push-off switch. Mol Syst Biol. 2010;6:350. pmid:20212522
  104. 104. Voight CA. Genetic parts to program bacteria. Curr Opin Biotechnol. 2006;23:548–557.
  105. 105. Rao CV. Expanding the synthetic biology toolbox: engineering orthogonal regulators of gene expression. Curr Opin Biotechnol. 2012;23:689–694. pmid:22237017
  106. 106. Lubeck E, Coskun AF, Zhiyentayev T, Ahmad M, Cai L. Single-cell in situ RNA profiling by sequential hybridization. Nat Methods. 2014;11:360–361. pmid:24681720
  107. 107. Groh C, Moldovanu B, Sela A, Sunde U. Optimal seedings in elimination tournaments. Economic Theory. 2008;49:59–80.
  108. 108. Fudenberg D, Maskin E. Evolution and cooperation in noisy repeated games. In: The American Economic Review. vol. 80. American Economic Association; 1990. p. 274–279. Available from: http://www.jstor.org/stable/2006583.
  109. 109. Kim HK, Krotov DS, Lee JY. Poly-Bernoulli numbers and lonesum matrices. arXiv. 2011;1103.4884.