Skip to main content
  • Loading metrics

The Impact of Different Sources of Fluctuations on Mutual Information in Biochemical Networks


Stochastic fluctuations in signaling and gene expression limit the ability of cells to sense the state of their environment, transfer this information along cellular pathways, and respond to it with high precision. Mutual information is now often used to quantify the fidelity with which information is transmitted along a cellular pathway. Mutual information calculations from experimental data have mostly generated low values, suggesting that cells might have relatively low signal transmission fidelity. In this work, we demonstrate that mutual information calculations might be artificially lowered by cell-to-cell variability in both initial conditions and slowly fluctuating global factors across the population. We carry out our analysis computationally using a simple signaling pathway and demonstrate that in the presence of slow global fluctuations, every cell might have its own high information transmission capacity but that population averaging underestimates this value. We also construct a simple synthetic transcriptional network and demonstrate using experimental measurements coupled to computational modeling that its operation is dominated by slow global variability, and hence that its mutual information is underestimated by a population averaged calculation.

Author Summary

This work demonstrates how different sources of variability within biochemical networks impact the interpretation of information transmission. These sources are the intrinsic noise generated within the pathway of a single cell, variability due to initial conditions and/or global parameters across the population. A theoretical analysis of a simple signaling pathway and experimental exploration of a synthetic circuit are used to discuss the contributions of these sources of variability to information transmission using mutual information as a metric.


To survive in challenging conditions, cells need to detect, transduce, and process signals from their environment. A cell’s ability to precisely process environmental signals is limited by intrinsic fluctuations and variability of its cellular processes. This variability takes root in the stochastic nature of biochemical reactions. For a given pathway, this includes the stochastic steps involved in transcription and translation [14] as well as diffusion-reactions, dissociations, allosteric changes, and degradation of biological molecules. A signal propagates across cellular networks through molecules undergoing these various reactions, and gets distorted and altered by their probabilistic nature. Therefore, metrics for quantifying the limits of faithful information propagation (signaling fidelity) in biological pathways are crucial for understanding their information processing and transduction capabilities.

Mutual information [5] (MI) is a natural metric for characterizing information transmission between the inputs of a stochastic network and its nodes. MI quantifies the level of precision with which a given node(s) in a network estimates and responds to an input(s) by accounting for both the mean and variability in the response. Recent studies have used MI to characterize information transmission between environmental inputs and transcription factors in a number of genetic circuits [610]. In these studies, steady-state MI was computed for a variety of in silico networks to assess their stationary response as a function of input dose. More recently, these ideas were extended to optimize time-dependent MI in delay circuits with binary inputs, and MI was used to discuss maximally informative network topologies in these contexts [11]. In addition, time dependent MI calculations were used to obtain fundamental limits on the suppression of molecular fluctuations for different network topologies [12].

Several experimental studies have also used MI to assess signaling fidelity. MI was used as a metric to argue that negative feedback enables dose-response alignment and enhances information transmission in the pheromone pathway in yeast [13]. Similarly, MI was used to estimate time-dependent information transfer in tumor necrosis factor (TNF) signaling, and to assess transmission bottlenecks in this system [14]. Recently, robustness and compensation of information transmission in different pathways and pharmacological perturbations were attempted in PC12 cells using similar measurements [15]. These experimental studies relied on driving isogenic cell populations with various inputs, and then calculating the mutual information based on the overall variability in the population response. Such calculations mostly found low MI values, suggesting that cellular pathways might have on average low information transmission capacity. In this work, we argue that these calculations often under-estimate MI of a pathway in a single cell, since they do not account for 1) variability in initial conditions and 2) variability that is extrinsic to the pathway. The overall effect of these two sources of variability is that any single cell has a quantitatively distinct input-output relationship [3, 4, 16] and that calculations that take this into account are needed for more accurate estimation of MI from experimental data. By assuming that extrinsic variability manifests as cell-to-cell differences in a global parameter, such as translation capacity, we demonstrate in a simple in silico circuit that mixing cells with different parameters sets (and/or different initial conditions) reduces the value of the computed MI. We also argue this point experimentally by building a simple synthetic circuit that exhibits strong extrinsic variability, and then demonstrating with the help of computational modeling that single cells within the population have a larger mutual information than that exhibited by the averaged population. These results indicate that cells might possess higher capacity for information transmission than previously appreciated.


To compute mutual information in a given biological network, we apply simple step functions [14] of the appropriate environmental input to N populations of the same isogenic cells. The step function is mathematically defined as x(t) = 0 for t < 0 and x(t) = X+ for t ≥ 0, where X+ is a constant within a given population. For each of the N populations, X+ is sampled from a discrete uniform distribution, pu(x+), over the range of interest (0 to Xmax). The uniform distribution represents an unbiased distribution (other than the choice of Xmax) that has been routinely applied to steady-state mutual information calculations [6]. Experimentally, one can implement this scheme by growing replicas of the same culture in an N-well plate and stimulating each well with a different step function as defined above (Fig 1a). For a given population n and sampled input amplitude X+(n), the stochastic time-dependent response of measurable proteins y = [y1ym] of the population at t will be p(y, tX+(n)). For a general x+ between 0 and Xmax, we interpolate the central moments of adjacent sampled distributions to construct p(y, tx+). The time-dependent mutual information is then given by (1) The value of N (that is the number of experiments) can be chosen based on well established methods [14] to approximate the MI in Eq (1) (see Materials and Methods for further details).

Fig 1. Time-dependent mutual information transmission in a simple biochemical circuit.

a) Every well contains the same type of isogenic cell population where each well receives a different step input sampled from a uniform distribution. The resulting time series data is used to compute time-dependent mutual information. b) Schematic of pathway. c) Initial distribution of before any input is given, reflecting cell-cell variability in initial conditions. In this case, a single parameter set is used and the distribution is the result of intrinsic variability of the circuit. S1 denotes the mean of the distribution. d) I(x+, y1, tS1), I(x+, My2, tS1), and I(x+, y2, tS1). e) Time-dependent dose-response relationships at t = 0, 75 and t = 750 minutes between y2 and the input. The vertical black dashed line is where x+ = 150 nM. f) Mutual information between y2 and the input computed for heterogenous initial conditions (full distribution in panel c, black) and for homogeneous initial conditions (S1, red). Inset: same data but with time plotted to 2750 minutes and the MI plotted between 2.5 to 3 bits.

We first illustrate MI calculations using an in silico model of a simple signaling cascade (Fig 1b). Here the input Xn(t) causes the transformation of the inactive molecule to its active form Y1. Y1 could be a receptor or transcription factor responsive to the given input. Y1 in turn activates transcription of Y2, whose mRNA is denoted as My2 (chemical equations are detailed in Materials and Methods with the parameter values listed in Table 1). The uniform input distribution is between 0 nM and Xmax = 250 nM. We found that for this system and its corresponding parameters, N ≥ 20 was a conservatively large number of experiments to approximate MI.

Contribution of initial condition variability to time-dependent mutual information

We first assumed that this circuit is isolated from the rest of the cell, and that any stochasticity it exhibits is only the result of its chemical reactions (intrinsic variability). When this system is unstimulated (t ≤ 0), its molecular species assume a joint steady-state distribution, . As an example, we show the marginal distribution of in Fig 1c. This distribution represents the range of initial conditions in that a population containing this network would exhibit before any input is applied.

We will first compute the MI of the network while ignoring this initial distribution of states, assuming that all cells in the population start from the same initial condition (for state S1, this is the mean of the initial joint distribution, see Fig 1c) which we refer to as a homogeneous initial condition. This could be thought about as the mutual information of one cell in that population. We plot the time-dependent mutual information between the input and the different species of the circuit: I(x+; y1, tS1), I(x+; My2, tS1), I(x+; y2, tS1) (Fig 1d). The MI from the input to y1, I(x+; y1, tS1), has rapid dynamics, peaking initially and decaying with time to a steady-state. The initial peak in this MI is due solely to the activation and inactivation of y1, while the subsequent decrease to steady-state is due to the fluctuations in the synthesis and degradation of y1. By contrast, the MI from the input to My2 (I(x+; My2, tS1)) is slower and on the order of tens of minutes, while that of the protein y2 (I(x+; y2, tS1)) is on the order of hours. This is not unexpected, as the mutual information signals for each species follow the causality of the circuit where y2 shows the largest delay.

The increase of I(x+; y2, tS1) as a function of time has an intuitive explanation in terms of y2 dynamics. To visualize this, we plot , the mean of y2 as well as versus X+(n) for t = 0, 75, and 750 minutes (Fig 1e, red lines), where σy2(n, t) is the standard deviation in y2. We will refer to these plots as the time-dependent dose response relationships. For t = 750, more values of x+ are resolvable from measurement of y2 than at time t = 75. For example, for x+ > 150, steps in the dose response curves constrained between the standard deviations (Fig 1e, black lines for t = 75 and 750 minutes) approximate how well a measurement in y2 can infer the value of x+. At time 750, about 2 steps are resolvable allowing for two distinct ranges of x+ to be inferred. While for t = 75 minutes, only one distinct range of x+ is inferrable. The larger the number of resolved states, the higher the value of the mutual information.

When mutual information is calculated between the input and a given node over the entire time duration of the signals, the mutual information between the input and each successive node has an upper bound equal to that of the prior node. This is known as the data processing inequality [5]. However, since we are evaluating the time-dependent MI at a given time t, the instantaneous value y2 can have more information about the input than y1. Indeed, at t approximately greater than 150 minutes, we find that the MI I(x+; y2, tS1) is greater than I(x+; y1, tS1) or I(x+; My2, tS1) (Fig 1d).This is because for the particular parameter set used in this example, the noise propagated from y1 onto y2 is averaged out, and the only variability in y2 stems from its own production and degradation. As a result, I(x+; y2, tS1) can be modulated to be higher or lower than I(x+; y1, tS1) by changing the rates of y2 production and degradation [17]. On the other hand, increasing the number of y1 molecules would increase its mutual information as this would reduce the noise in the y1 signal. Therefore, the mutual information at each node of this pathway can be modulated through choice of kinetic parameters. Similar observations that filtering can improve time-dependent MI between success nodes have been discussed in the context of other types of pathways [18].

Next, we examined mutual information while accounting for the fact that cells assume a distribution of initial states across the population upon receiving the input stimulus. We do so by incorporating the pre-stimulus steady state initial joint distribution into the MI calculations. This variability in initial conditions transiently reduces the MI (Fig 1f). At steady-state, the mutual information curves computed for a single or a distribution of initial states eventually converge onto each other at approximately t = 2750 minutes (Fig 1f, inset). This convergence at longer times occurs because a population in which every cell assumes the same exact initial conditions will eventually produce a heterogeneous distribution of states due to the intrinsic stochasticity of the biochemical reactions. For the values of parameters used in this example, the convergence of the two MI curves proceeds very slowly. Therefore, even when only intrinsic fluctuations are present, with no extrinsic contributions to variability, and for a given distribution of initial conditions, a single cell still transiently assumes, on average, a higher time-dependent mutual information than the whole population. In our case, this difference is very modest.

Time-dependent mutual information transmission with global parameter variation

Thus far, in our MI calculations, we have only accounted for variability in initial conditions given a single parameter set for the pathway. More realistically, any given pathway in a cell is subjected to variability through coupling to other cellular activities. This is known as extrinsic noise to distinguish it from intrinsic noise generated by the pathway itself. There are many extrinsic sources of variability that cellular pathways experience. For example, different cells may contain different numbers of polymerases or ribosomes, and hence have different capacities for transcription and translation [3]. This extrinsic variability can be accounted for in many ways, the simplest is to assume that the transcription or translation rate constants themselves can assume different values in different cells across the population.

To demonstrate the contribution of extrinsic variability to MI calculations, we consider a simple case where cells in the population have different translation rates. To do so, we add a stochastic global variable, G, which affects the protein creation rates such that and , where and βy2 are the nominal values for the parameters used above. In this way, the protein creation rates keep their mean value, but fluctuate because of their coupling to G. For this example, G follows a memoryless birth/death process such that the mean of G is (βg, γg are the birth and death rates). It follows that G has a coefficient of variation given by .

First, setting , we chose βg = 1.5 × 10−6 mol-s−1 and γg = 3 × 10−8 s−1. These values establish a stationary distribution of states, which we use as an initial distribution for the MI calculations. Fluctuations in the translation rate induce extra variability in the pathway components (compare the initial distributions of in Fig 2a to Fig 1c). As a result, mutual information calculations with this added extrinsic variability (and using the population distribution of initial conditions) show that I(x+; y2, t) is now drastically reduced compared to the case when a single parameter set is used to represent the lack of global variability (compare black line in Fig 2b with value in Fig 1d). Here also, as expected, MI calculations from a single initial state corresponding to parameters (state S1), (state S2) and (state S3), generate high transient values (red (S1), blue (S2) and green (S3) curves in Fig 2b). This discrepancy between single cell and population MI is further highlighted by examining the time-dependent dose response relationship between y2 and x+ at t = 750 (Fig 2d (full population) and Fig 2c (S1, S2, and S3). Again, the sub-populations generated from S1, S2, S3 each have little variability (high mutual information) relative to the full population.

Fig 2. Global variability has large impact on mutual information.

a) Initial distribution of before any input is given, reflecting cell-cell variability in initial conditions. In this case, the protein synthesis parameter is stochastic, reflecting a globally varying source of noise. S1 denotes the mean of the distribution, while S2 and S3 denote cells that are one standard deviation away from the mean. b) Time-dependent mutual information computed for heterogeneous initial conditions (the whole distribution in panel a, black), or homogeneous initial conditions (S1: red, S2: blue, and S3: green) c) Time-dependent dose-response relationships between y2 and input at t = 750 minutes for a population starting from homogeneous initial conditions corresponding to S1, S2, and S3. d) Time-dependent dose-response relationship between y2 and input for heterogeneous initial conditions corresponding to full distribution in panel a. Calculations for panels a-d correspond to a slowly fluctuating global variable with γg = 3 × 10−8. e) Mutual information for different timescales of the fluctuations in the global variable.

While constraining , we investigated the time-dependent MI for different values of βg and γg. Our original choice of γg = 3 × 10−8 forces G, and hence the translation rates and , to fluctuate very slowly. Therefore, the convergence of the MI values computed from a single initial condition versus the full distribution also proceeds slowly. As γg increases, this convergence proceeds faster (Fig 2e). Therefore, these results indicate that the mutual information of a pathway can be severely underestimated by population-based measurements if the pathway is subjected to global fluctuations that proceed on a slower timescale than the pathway itself.

Probing the mutual information of a simple synthetic circuit

Next, we sought to probe the major determinants of mutual information for a simple synthetic transcriptional circuit (Fig 3a). In this circuit, a constitutively expressed transcription factor interacts with a small molecule X+, leading to the activation of the transcription factor. The active transcription factor Y1 translocates into the nucleus and activates expression of a gene Y2.

Fig 3. Variability in a transcriptional synthetic circuit is dominated by slowly fluctuating global variable.

a) Schematic of the simple synthetic circuit. b-g) Dose-response relationships for t = 65 (blue), 165 (green), 330 (red), and 580 (black) minutes. Insets in d and g represent the ratio of the noise at time t to that at time t = 580 where for green: t = 165, red: t = 330, and black: t = 580. Noise is defined as standard deviation over the mean. b) Experimental y2 data. c) Experimental y1r data. d) Normalized experimental y2 data. Each curve is normalized by its maximum mean value in panel b. e) y2 data generated by a computational model of the circuit with slow fluctuations in protein synthesis rates. f) y1r data generated by a computational model of the circuit with slow fluctuations in protein synthesis rates. g) Normalized y2 data generated by a computational model of the circuit with slow fluctuations in protein synthesis rates. Each curve is normalized by its maximum mean value in panel e.

In our implementation, is an estradiol (input X+) responsive chimeric transcription regulator (TR) consisting of three fused elements: an activation domain (from MSN2), a lipid-binding domain (from the human estradiol receptor, hER-LBD), and a DNA binding domain (from GAL4). When estradiol binds to the LBD, the activated TR Y1 translocates to the nucleus and controls the expression of promoters containing Gal4-binding-sites. Therefore, the protein Y2 (in this case a fluorescent protein) is produced from a Gal4-responsive GAL10 promoter (See Materials and Methods for more details). At the same time, is produced from an altered version of the promoter of the alcohol dehydrogenase 1 gene (ADH1). We constructed two strains for measurement purposes. Strain 1 contains the circuit in addition to two copies of the GAL10 promoter, one driving YFP and the other driving mCherry. Strain 2 contains the circuit, but this time with two copies of the ADH1 promoter, one driving the production of and the other driving the production of YFP (which we will refer to as Y1r). The same strain also contains a GAL10 promoter driving the production of the mCherry (Y2) protein. These strains were useful for two reasons. First, we wanted to establish how mutual information computations depend on the ability to simultaneously measure different quantities in a circuit (e.g. Y1 and Y2 versus Y2 alone). Given that this necessitates the use of two fluorescent proteins, in this case YFP and mCherry, we wanted to ascertain that the results we obtain are qualitatively independent of the choice of fluorophores, given that mCherry has lower dynamic range than YFP with higher background fluorescence and hence increased noise at low concentrations.

For each strain, we subjected 12 exponentially growing populations (wells) of cells cultured in non-repressive media to input concentrations of estradiol (x+) log-sampled between 0 and 100 nM. The 12 measurement points sufficiently sampled the dose response relationships. The number of cells measured from each well was greater than 3000, ensuring good statistics for approximating the MI [14]. All cultures were started from zero estradiol concentrations. Samples were taken at t = 0, 65, 165, 330 and 580 minutes. Fig 3b shows the time-dependent dose-response relationships of estradiol versus y2 (in this case YFP, strain 1) for these timepoints, where fluorescence values were normalized with respect to side scatter in order to minimize the effects of cellular volume and shape dependent differences.

The dose-response relationships of y2 normalized by their respective maximum mean values (Fig 3d) exhibit an interesting trend: for the last 3 time points, the traces for the mean and variability are very similar to each other. The only outlier to this trend is the time point at t = 65 minutes after stimulation (Fig 3d). For this timepoint, fluorescence is weak and strongly overlaps with autofluorescence and folding delays, and therefore the true signal cannot be accurately estimated. Autofluorescence and folding delay also contributes, albeit less dramatically, to the measurement at the t = 165 minutes timepoint (Fig 3d). The mCherry measurements (strain 1 or strain 2) generated the same trend (S1a and S1b Fig) albeit with a noisier outcome than YFP due to the limited dynamic range of mCherry. As a consequence, the y2 (YFP) and y1r (YFP) experimental measurements from the two strains can be used in combination for comparison of modeling with data. The fact that variability in the y2 data irrespective of the fluorescent protein does not decrease with increasing mean values suggests that dominant fluctuations are unlikely to be intrinsic to the pathway.

Fig 3c plots measurements of y1r (YFP, strain 2). Unexpectedly, despite the common assumption that the ADH1 promoter has constitutive and constant expression, we found that it exhibits a modest dependence on estradiol. We do not know the root of this dependence, but it is likely to reflect the influence of the circuit itself on the metabolic state of the cell, hence affecting ADH1 promoter activity. Overall, the growth rate of these strains is independent of estradiol for concentrations under 100 nM over the duration of the experiment (S1c and S1d Fig), and therefore this effect can be compensated for in the mutual information calculations. It is worth noting here that we are making the assumption that despite the fact that YFP (Y1r) and are different proteins sharing only the same transcription rate (since both are driven by the ADH1 promoter), they share the same dominant noise characteristics. This would be the case if their intrinsic noise, which can be different, is insignificant compared to a dominant source of extrinsic noise affecting both. Next, we present data and modeling demonstrating that, indeed, noise in both Y1 and Y2 is most likely dominated by the same extrinsic global component.

Since the measured distributions are approximately gaussian for the majority of estradiol concentrations (S2 Fig) and the synthetic circuit (Fig 3a) follows the same basic chemical equations as the simple pathway we have studied in Fig 1b, we used this already established model to computationally explore different noise scenarios (see Materials and Methods for a more technical justification of the model). Specifically, we simulated the model (parameter values listed in Table 2) with both intrinsic variability and added global extrinsic parameter variability as sources of stochasticity. The data we collected are in fluorescent units, therefore we set our model to arbitrarily yield maximum y2 protein expression levels of about 2500 molecules, likely an underestimation of the actual system. However, this choice constitutes a scaling factor and does not affect any of our results. We also accounted for the estradiol dependence of ADH1 (Fig 3c) by adding to the model a term depicting the modest estradiol dependent repression of this promoter.

Table 2. Parameters for model of synthetic circuit (global model).

For global parameter variability, we again chose to focus on the parameters affecting protein expression. We potentially could model the global parameter variability with cell-to-cell heterogeneity in the protein degradation rates. However, given that our experimental data does not measure the expression of genes involved in either of these processes, i.e. no way to experimentally distinguish the source(s) of global parameter variability, we chose to model global variability in the protein creation rates. Following the same procedure as in the previous section, we added a stochastic global parameter, G, which affects the protein creation rates for Y1* and Y2 such that and . The noise in the experimental Y1r data is approximately .155, therefore modeling Y1 and Y1r using Poissonian statistics sets the mean of the global noise variable to 42. We first assumed that global parameter fluctuations are slow relative to the circuit timescales (γg = 3 × 10−6). Simulating the model with this slow global source of fluctuations (SGF model, (Fig 3e–3g)) generated profiles for normalized y2 (Fig 3g) that recapitulated the highly similar variance envelopes of the experimental time-dependent dose responses (Fig 3d). This behavior was a characteristic feature of the model for any γg < = 3 × 10−6. By contrast, as the global fluctuating variable assumes a faster timescale (γg > 3 × 10−6), the variability envelopes in the normalized time-dependent dose response of y2 started to diverge from each other (S3a and S3b Fig, γg = 3 × 10−4). As expected, the system modeled with intrinsic variability only (, parameter values listed in Table 3) shows a normalized time-dependent dose response in which variability decreases as a function of time as the protein levels increase (S3c and S3d Fig). Given the data in Fig 3b, if the fluctuations were purely intrinsic, the ratio of the standard deviation to the mean between times 165 and 580 minutes would decrease by a factor of approximately 1.7. This is a change we should be able to detect in our data since for the number of cells sampled, the error in estimating the means and standard deviations in the dose response relationships are .5 percent and 2 percent, respectively. However, as previously discussed, the experimental data shows that this ratio is relatively invariant for the last 3 time points (Fig 3d, inset) while increasing for the both the fast global fluctuations model (S3b Fig, inset) and the intrinsic variability model (S3d Fig, inset). Our argument is further strengthened by the fact that in order to capture the noise observed in y1r with the intrinsic variability model for the first timepoint, we had to set the y1r mean copy number in the model to an unrealistically low value for a strong promoter such as ADH1 (approximately 40 proteins), further indicating that variability is unlikely to be intrinsic. The results for the SGF model for γg < = 3 × 10−6 are not an artifact of the estradiol dependence of ADH1 since an SGF model without this effect yields indistinguishable results (S3e and S3f Fig). We therefore conclude that the dominant source of variability in this synthetic circuit is likely to be due to a globally slow fluctuating variable. This is consistent with previous results, which also indicated that global parameters play a dominant role in cell to cell variability and that these parameters exhibit fluctuations at a slower timescale than fluctuations of processes involved in gene expression [4].

Table 3. Parameters for model of synthetic circuit (intrinsic model).

In terms of mutual information, the fact that the normalized time-dependent dose responses coincide in terms of their variability (Fig 3d) implies that the experimentally computed mutual information I(x+, y2, t) at t = 165, 330 and 580 minutes should be similar. This is indeed the case (Fig 4a, solid black). Importantly, I(x+, y2, t) peaks and reaches a plateau at approximately 1 bit, at an earlier time than when y2 reaches its steady-sate. This further lends credence to the idea that the variability in the population is dominated by global parameter variability. Gratifyingly, the model with ‘slow’ global parameter fluctuations (with γg = 3 × 10−6) also captures the time-dependent mutual information seen in the data without any further parameter tuning (Fig 4a, solid blue).

Fig 4. Mutual information modeling predictions and experimental measurements of the transcriptional synthetic circuit.

a) Mutual information I(x+; y2, t) from experimental data (black solid, YFP in Strain 1), I(x+; y2, t) from the SGF model (blue solid), and I(x+; y2, tSi) from the SGF model conditioned on S1 (red dashed), S2 (blue dashed), S3 (green dashed), and Sg (black dashed). b) Dose response of y2 as a function of estradiol at time 580 minutes for the SGF model with initial condition S1, S2, and S3. c) Mutual information I(x+;[y1r y2], t) based on joint measurement of y1r and y2 computed for the SGF model (black dashed), same calculations for equalized mean of y1r (magenta dashed). Also shown for comparison are the mutual informations I(x+; y2, t) (SGF, blue solid) and I(x+; y2, tS1) (SGF, red dashed). d) Measured joint mutual information I(x+;[y1r y2], t) (black dashed) including the mean equalized case (magenta dashed). The measured mutual information I(x+; y2, t) (mCherry, Strain 2) is also shown for comparison (blue solid).

Since slow global fluctuations seem to dominate in this circuit, our analysis above indicates that the population mutual information might be under-estimating the fidelity of a single cell. To illustrate this point, we used the model to computationally isolate and compute the mutual information I(x+; y2, tSi) for single cells S1, S2 and S3 as defined in the computational example above. These calculations yield a substantially higher MI value than the population MI for the time span simulated (Fig 4a). Evidently, and as explained above, the MI for S1, S2, or S3 will eventually converge back to the whole population MI, but here it will do so on a much slower timescale than that of the system. For example, the dose response and distribution of y2 at time 580 minutes when the system is started from S1, S2 and S3 (Fig 4b) still shows tighter variability than that of the full population. Allowing for intrinsic variability in the initial conditions, i.e. starting with cells with at time zero (state Sg) yields a similar MI value to that of I(x+; y2, tS1) for γg = 3 × 10−6 (Fig 4a).

Finally, we explored how simultaneous measurement of y1r and y2 affects mutual information calculations. Calculations using the model indicate that as expected, knowledge of y1r improves the estimate of mutual information. For a slow globally fluctuating variable (γg = 3 × 10−6), the joint mutual information I(x+; [y1r y2], t) is larger than I(x+; y2, t). It can be shown that I(x+; [y1r y2], t) = I(x+; y1r, t) + E[I(x+; y2, ty1r)] where E[I(x+; y2, ty1r)] is the expected value of I(x+; y2, ty1r). Since the influence of estradiol on y1r adds (albeit very slightly (Fig 3c (data) and Fig 3f (model)) to the mutual information, i.e. I(x+; y1r, t) > 0, we normalized for this effect. To do so, at a given time, we set the y1r mean at each estradiol value to the value of the y1r mean at zero estradiol while adjusting the variance to preserve the noise in y1r at each estradiol value. Importantly, this operation does not affect correlation between y2 and y1r at each estradiol value, but enforces I(x+; y1r, t) = 0. We confirm that this does not change our conclusions that knowledge of y1r improves the estimate of mutual information (compare Fig 4c blue, dashed black and dashed magenta).

For comparison, we can carry out mutual information from the data obtained using Strain 2 in which both y1r and y2 are measured. In this strain, y2 is the fluorescent protein mCherry which has a limited dynamic range. Importantly, at the highest estradiol values and peak mCherry signal (time 580 minutes), the measured correlation between y1r and y2 (greater than .79) is less than ten percent below the model predictions. Even at these signal levels the noise in the mCherry signal still deteriorates the correlation. For decreasing values of estradiol the correlations become increasingly inaccurate. Therefore, the values of the MI cannot be quantitatively compared to the model which was fitted to YFP data. However, the qualitative trend of increased MI due to measurement of y1r relative to computing the MI with no knowledge of y1r should also hold. This is indeed seen to be the case (Fig 4d). This insight is in agreement with recent work [19] that studied mutual information in the RAS/ERK pathway. Nuclear ERK (erknuc) was used as a readout of pathway information transmission. The MI at time t between this readout and the input x was conditioned for single cell ERK levels, using measurement of total ERK (erktot). It was also shown that I(x; [erktot erknuc], t) is greater than I(x; erknuc, t). Therefore, simultaneous measurements of different cellular variables improve estimates of mutual information capabilities of single cells.


In this work, we illustrated how variability in initial conditions across a population, as well as slow-fluctuating extrinsic (global) variables can generate low values for the population mutual information in response to an input. We also demonstrated that when subpopulations of cells that have similar parameters or initial states are isolated, their mutual information values are transiently much higher than those of the whole population. These findings are important in light of the fact that many previous studies have found that extrinsic variability is a substantial contributor to pathway fluctuations. Indeed, our own experimental data using a synthetic circuit also implicated extrinsic fluctuations as a major source of variability. As a result, cells in a population cannot be considered to be the same noisy channel for mutual information calculations. Rather, each cell is a different noisy channel possessing its own parameters. Recent work [20] using light-inducible input signals [21, 22] to a mammalian RAS/MAPK pathway observed that different isogenic single cells have quantitatively different dose-response relationships. Interestingly, for the RAS/MAPK mammalian system, the dose-repsonse relationships were repeatable for hours within a given single cell [20], suggesting slow global parameters that affect that pathway for that duration.

A direct assessment of mutual information requires repeated time-resolved measurements in single cells. Another strategy to better approximate mutual information is to simultaneously measure a large number of interconnected variables, including global states. This might be increasingly feasible with breakthrough technologies such as mass-cytometry (a.k.a. CyTOF) [23] as well as improvements in fluorescent reporter technologies. In the mean time, however, we have demonstrated that computational modeling, especially with respect to the patterns of time-dependent variability, can generate valuable insights into whether intrinsic or extrinsic fluctuations dominate variability in a circuit. These results produce a more accurate quantification of mutual information, and therefore promise to generate a more realistic assessment of signaling fidelity in cellular circuits.

Our results and those from [20] support a view in which individual cells have distinct transfer functions over relevant signaling timescales, and have superior signaling fidelity (> 1.5 bits) than estimated from pooled measurements of a population. From this perspective, it could be the case that a diversity of high fidelity but different single cell signaling transfer functions across the population is a beneficial trait. However, some situations might arise where variability in population signal transmission capacity is not desirable. In this case, cells might use strategies such as negative feedback to constrain this variability. In either case, cells might also capitalize on the integration of signals from many pathways that respond to a given input(s) in order to generate a desired population response. In this view, each such pathway will add to the mutual information of the desired cellular output (e.g. level or activity of a transcription factor), allowing the population to further circumvent in this way any information fidelity bottleneck. Researchers of the subject are likely to encounter both situations, and perhaps a revised form of population mutual information might be needed to quantify these effects, along with the formulation of new information theoretic metrics. As an example, for any given input x(t), the mutual information gives us a sense of the diversity (or spread) in responses in y2 given the cell-to-cell variability encoded in . S4 Fig shows the results of this metric applied to the simple signaling cascade (Fig 1b) for different input step function amplitudes x+ and for different times. We envision these kind of metrics to reflect the different subpopulations with similar parameters within a given population and to serve as a potential tool to quantify how cell-to-cell variability across a population might change in structure due to various time-dependent inputs.

Finally, most studies to date have focused on variability in populations of non-communicating cells. Information fidelity in cells that communicate, for example through quorum sensing for bacterial communities or cell-to-cell mechanical coupling for tissues, is still largely unstudied. How cell-to-cell communication modulates global variability and variability in initial conditions across a population, and hence mutual information of cellular pathways, is a topic that should be explored in order to determine whether and when multicellularity offers a beneficial strategy in terms of signaling fidelity.

Materials and Methods

Approximating mutual information with N experiments

Because we are using a finite number of experiments, the input distribution pu(x+) is sampled with N discrete points. In practice, these points are spaced to accurately sample the input-output transfer function p(y, tx+) for x+ ranging from 0 to Xmax. The time-dependent mutual information is then calculated with this data. For values of x+ between the sampled values, p(y, tx+) is approximated by linearly interpolating the moments of the adjacent sampled distributions. Since the distributions generated by systems in this paper are approximately gaussian (and approximately negative binomial at very low x+ for the synthetic circuit data), only the means and covariances are required. A larger number of experiments (N) generates a more accurate approximation of mutual information. However, we observed that convergence to accurate MI values does not increase monotonically with N for the logarithmic sampling of the doses response that we have adopted. Rather, convergence proceeds exponentially, followed by marginal gains in accuracy as N increases. Therefore, for every N, we examine the last three sample number values, N, N−1 and N−2. Given their measured convergence rates, we can extrapolate an upper bound on the MI at an infinite number samples. We choose N whose calculated MI at N is within 1 percent of the extrapolated upper bound.

Chemical equations for the simple in silico network

The chemical equations for the circuit in Fig 1b are (2a) (2b) (2c) (2d) (2e) where . The propensities of the reactions appear above the reaction arrows. The system is a simple cascade of reactions where the input X activates Y1, and subsequently the Y1-dependent transcription of Y2. The parameter values are tabulated in Table 1.

Here the mean total number of Y1 molecules, active and inactive, is . The max mean numbers for Y2 mRNA and Y2 protein are βm2/γm2 = 200 and , respectively. This system has only a single stationary solution. This allows us to approximate and efficiently calculate the master equation with a local affine assumption using the first two moments Eqs (6) and (7) taken from [24].

Inclusion of global parameter variability within chemical equations.

In this work, we have assumed that the stochastic global parameter, G, manifests itself in variation in translation rates. To incorporate G we define our new protein creations rates and . Here, and βy2 are the nominal values in Eq (2). For simplicity we define G to follow a memoryless birth/death process through the reaction equations (3) where and the noise in G, ηg, is given as the standard deviation over the mean: .

Chemical equations for the synthetic circuit.

The chemical equations for the synthetic circuit in Fig 3a are the same as the simple circuit except that the production of now involves an mRNA step, which does not directly affect any of our results. Also, we have added a YFP reporter of that has a half-life of 6 hours which we set the transcription factor itself to be the same. We define the estradiol dependence in mRNA as The parameters for the circuit are given in Table 2 (global fluctuation model) and Table 3 (intrinsic fluctuation model).

Computation of first two moments using affine assumption

The formulation that we assume in our model and data consists of a system of well-stirred chemical reactions with N molecular species. For some environmental input X(t), we define the pathway state Y(t) to denote the vector whose integer elements Yi(t) are the number of molecules of the ith species at time t. If there are M elementary chemical reactions that can occur among these N species, then we associate with each reaction rj (j = 1, …, M) a non-negative propensity function defined such that aj(Y(t)) τ+o(τ) is the probability that reaction rj will happen in the next small time interval [t, t+τ], as τ → 0. The polynomial form of the propensities aj(y) may be derived from fundamental principles under certain assumptions [25]. The occurrence of a reaction rj leads to a change of νjZN (the set of nonnegative integers) for the state Y. νj is therefore a stoichiometric vector that reflects the integer change in reactant species due to a reaction rj.

This set of well-stirred chemical reactions can be represented by the joint probability density function P(y, tX(t)) which describes the probability of the system being in state y at time t, given the environmental signal X(t). The evolution of P(y, tX(t)) is given by (4) Eq (4) is the so-called chemical master equation (CME)[26, 27].

To approximate the CME with moment equations, we approximate the propensity function aj(y) with a locally affine Taylor series expansion [24] about the mean of the distribution, z(t), to get (5) From the time dependent mean equation for the kth species is (6) and the time dependent covariance equation for the kth and k′th species is (7) The calculation of the mutual information requires probability distributions. Given that we solve the first two moments, we constrain our distributions to be either a negative binomial distribution or a normal distribution. For cases when , we apply the negative binomial distribution since it only requires the first two moments and is non-negative. The negative binomial is very close to a normal distribution for and we therefore apply the normal distribution in these regions. The value of 3 used is heuristic, but the tail of the normal distribution at negative values is negligible at this point. For linear transcriptional systems, the negative binomial is a natural steady state solution [28] which was our motivation for applying it. Importantly, our data never violated any constraints required by the negative binomial distribution, for example, μkCkk. Note that the negative binomial distribution is only required for our modeling of the synthetic circuit data. Our theoretical example in the first half of the paper has large enough basal levels at zero input which always keeps it in the normal distribution regime. As a demonstration of the validity of the moment approach, S5 Fig shows very good agreement in the distributions derived from stochastic simulations (SSA) and the moment equations for the synthetic circuit model (γg = 3 × 10−6, initial condition S1).

Synthetic circuit constructs


All plasmids used in this study were derived from a set of yeast single integration vectors containing selectable markers and targeting sequences for the LEU2, HIS3, TRP1 and URA3 loci. These vectors were linearized by digestion with PmeI and transformed using standard yeast transformation techniques. All strains were derived from haploid W303a and were deleted for GAL4 to eliminate competition between the endogenous Gal4p and the previously described estradiol-inducible Gal4 chimera (Gal4DBD-ER) for binding to the GAL10 promoter [29]. The sequences for the ADH1 and GAL10 promoters were 658 and 646 bp upstream from the start codons for these genes, respectively. The genotypes for these strains are listed in Table 4.

Growth conditions and flow cytometry.

Cells were grown in YPD at 30°C. Prior to the experiment, cells were grown and maintained in exponential phase (optical density <.15) for approximately twenty-four hours in the absence of estradiol and then diluted to an optical density (OD600) of 0.05 at the beginning of the experiment and periodically diluted to stay under 0.15. Using a deep-well 96-well plate (Thermo), estradiol concentrations decreased from columns one to eleven in logarithmically spaced points according to the following equation: 100(2/3)c−1 nM where c denotes the column number. Well 12 did not receive estradiol. Estradiol did not change the growth rate of the cell population. Two replicates were performed. Measurements were taken at t = 0, 65, 165, 330, and 580 minutes.

Fluorescence measurements were performed on a LSRII analyzer (BD Biosciences). A blue (488 nm) laser was used to excite YFP (Venus) and a green (561 nm) laser was used to excite RFP (mCherry). Emission was detected using a 530/30-nm bandpass filter for Venus (Chroma) and a 610/20 bandpass filter for mCherry (Chroma). Greater than three thousand cells were collected for each measurement. Flow cytometry data was analyzed in MATLAB (Mathworks).

Mutual information of multivariate measurements

Here we discuss how multi-variate MI measurements relates to MI measurements from particular initial conditions: We start with the distribution p(ym, ys, tx(t)) where ym are the dynamic cellular pathway/network signals, ys are the slowly fluctuating pathway component quantities relative to the timescale of a given experiment, and x(t) is the input signal(s). The time dependent mutual information is (8) where the second line is simply a chain-rule representation. In addition to the assumption that the quantities of ys are fluctuating extremely slowly, we will also impose that the quantities in ys are independent of x(t). This results in p(ys, tx(t)) = p(ys, t) ≈ p(ys). The time dependent MI is approximated as (9) Finally, we can examine the mutual information between ys and ym for a given input signal(s) x(t) using the formula (10)

Supporting Information

S1 Fig. Comparison of Y2 mCherry measurements show that Strains 1 and 2 are equivalent in Y2 output.

Strains 1 and 2 have similar growth rates that are independent of estradiol concentrations less than 100 nM. a-b) Dose-response relationships for t = 65 (blue), 165 (green), 330 (red), and 580 (black) minutes. The top panel is the normalized experimental y2 data. Each curve is normalized by its maximum mean value in the corresponding lower panel. The lower panel is the un-normalized experimental y2 data. a) Experimental y2 (mCherry) data for Strain 1. b) Experimental y2 (mCherry) data for Strain 2. c)-d) Single well growth curves accouting for dilution of 1/2 at t = 165 and 1/2 at t = 330 (single wells (red), 0 nM estradiol well (blue), 100 nM estradiol well (green), average (black)). The flow cytometer sample volume changed at t = 165 minutes, hence the jump in cell count. There was also pipetting error from the deep well plate to the sample plate at t = 0 and 65 minutes resulting in error in cell count across wells. For t = 165, 330 and 580 the pipetting error was minimized. c) Growth curves for Strain 1. d) Growth curves for Strain 2.


S2 Fig. The majority of peak normalized dose-dependent distributions of Y2 (YFP) for estradiol doses from 2.6 to 44.4 nM are close to gaussian.

The distributions presented in ascending mean values are for ascending estradiol nM values of 2.60, 5.85, 8.78, 13.2, 19.8, 29.6, and 44.4, respectively. a) Peak normalized distributions of Y2 (YFP) at 580 minutes. b) Close-up of peak normalized distributions at 580 minutes for low doses of estradiol. c) Peak normalized distributions of Y2 (YFP) at 165 minutes. d) Close-up of peak normalized distributions at 165 minutes for low doses of estradiol.


S3 Fig. Comparison of alternative computational models for the transcriptional synthetic circuit demonstrate that only the slow global fluctuations model can recapitulate the experimental data.

a)-f) Dose-response relationships for t = 65 (blue), 165 (green), 330 (red), and 580 (black) minutes for alternative computational models. Insets in b, d and f represent the ratio of the noise at time t to that at time t = 580 where for green: t = 165, red: t = 330, and black: t = 580. Noise is defined as standard deviation over the mean. a)-b) Fast global fluctuations model. a) y1r data. b) Normalized y2 data. Each curve is normalized by its maximum mean value. c)-d) Intrinsic fluctuations model. c) y1r data. d) Normalized y2 data. Each curve is normalized by its maximum mean value. e)-f) Slow global fluctuations model with no estradiol dependent effects on y1r production. e) y1r data. f) Normalized y2 data. Each curve is normalized by its maximum mean value.


S4 Fig. Time snapshots of mutual information, .

The mutual information is applied to the simple pathway model with slow global model fluctuations for γg = 3 × 10−8 and . Here x(t) is a simple step function with amplitude x+. Therefore, the range of x+ represents a range of different x(t). Times are t = 65 (blue), 165 (green), 330 (red), and 750 (black) minutes.


S5 Fig. Comparison of SSA and moment equations for the synthetic circuit model (γg = 3 × 10−6 and initial condition S1).

a) Plot of dose response from moment equations at time 580 minutes. b) Plots of the distribution p(y2, t = 580∣x(t))/max(p(y2, t = 580∣x(t))) derived from the SSA and moment equations for select estradiol values up to 8.5 nM. c) Plots of the distribution p(g, t = 580∣x(t))/max(p(g, t = 580∣x(t))) derived from the SSA and moment equations. Analytical steady-state distribution shows that the moment equations and SSA results have evolved the global variable distribution p(g, t = 580∣x(t)) about halfway to steady state.



We would like to thank members of the El-Samad lab, especially Rajat Bhatnagar, Kieran Mace, and Ben Heineike for critical reading of the manuscript. We would also like to thank Lucien Bogar for help in strain construction and flow cytometry measurements.

Author Contributions

Conceived and designed the experiments: MC OV. Performed the experiments: MC OV. Analyzed the data: MC OV HE. Contributed reagents/materials/analysis tools: MC OV. Wrote the paper: MC OV HE.


  1. 1. McAdams HH, Arkin A. Stochastic mechanisms in gene expression. Proceedings of the National Academy of Sciences. 1997;94(3):814–819.
  2. 2. Kepler TB, Elston TC. Stochasticity in Transcriptional Regulation: Origins, Consequences, and Mathematical Representations. Biophysical Journal. 2001;81(6):3116–3136. pmid:11720979
  3. 3. Swain PS, Elowitz MB, Siggia ED. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proceedings of the National Academy of Sciences. 2002;99(20):12795–12800.
  4. 4. Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic Gene Expression in a Single Cell. Science. 2002;297(5584):1183–1186. pmid:12183631
  5. 5. Shannon CE, Weaver W. The Mathematical Theory of Communication. Urbana, Illinois: University of Illinois Press; 1949.
  6. 6. Ziv E, Nemenman I, Wiggins C. Optimal Signal Processing in Small Stochastic Biochemical Networks. PLos ONE. 2007;2(10):e1077. pmid:17957259
  7. 7. Tkačik G, Callan CG, Bialek W. Information capacity of genetic regulatory elements. Phys Rev E. 2008 Jul;78:011910.
  8. 8. Tkačik G, Callan CG, Bialek W. Information flow and optimization in transcriptional regulation. Proceedings of the National Academy of Sciences. 2008;105(34):12265–12270.
  9. 9. Tkačik G, Walczak AM, Bialek W. Optimizing information flow in small genetic networks. Phys Rev E. 2009 Sep;80:031920.
  10. 10. Walczak AM, Tkačik G, Bialek W. Optimizing information flow in small genetic networks. II. Feed-forward interactions. Phys Rev E. 2010 Apr;81:041905.
  11. 11. Mancini F, Wiggins CH, Marsili M, Walczak AM. Time-dependent information transmission in a model regulatory circuit. Phys Rev E. 2013 Aug;88:022708.
  12. 12. Lestas I, Vinnicombe G, Paulsson J. Fundamental limits on the suppression of molecular fluctuations. Nature. 2010;467:174–178. pmid:20829788
  13. 13. Yu RC, Pesce CG, Colman-Lerner A, Lok L, Pincus D, Serra E, et al. Negative feedback that improves information transmission in yeast signalling. Nature. 2008;456:755–761. pmid:19079053
  14. 14. Cheong R, Rhee A, Wang CJ, Nemenman I, Levchenko A. Information Transduction Capacity of Noisy Biochemical Signaling Networks. Science. 2011;334(6054):354–358. pmid:21921160
  15. 15. Uda S, Saito TH, Kudo T, Kokaji T, Tsuchiya T, Kubota H, et al. Robustness and Compensation of Information Transmission of Signaling Pathways. Science. 2013;341(6145):558–561. pmid:23908238
  16. 16. Colman-Lerner A, Gordon A, Serra E, Chin T, Resnekov O, Endy D, et al. Regulated cell-to-cell variation in a cell-fate decision system. Nature. 2005;437:699–706. pmid:16170311
  17. 17. Paulsson J. Models of stochastic gene expression. Physics of Life Reviews. 2005;2(2):157–175.
  18. 18. Mugler A, Walczak AM, Wiggins CH. Information-Optimal Transcriptional Response to Oscillatory Driving. Phys Rev Lett. 2010 Jul;105:058101. pmid:20867954
  19. 19. Voliotis M, Perrett RM, McWilliams C, McArdle CA, Bowsher CG. Information transfer by leaky, heterogeneous, protein kinase signaling systems. Proceedings of the National Academy of Sciences. 2014;111(3):E326–E333.
  20. 20. Toettcher J, Weiner O, Lim W. Using Optogenetics to Interrogate the Dynamic Control of Signal Transmission by the Ras/Erk Module. Cell. 2013;155(6):1422–1434. pmid:24315106
  21. 21. Toettcher JE, Gong D, Lim WA, Weiner OD. Light-based feedback for controlling intracellular signaling dynamics. Nat Meth. 2011;8:837–839.
  22. 22. Toettcher JE, Gong D, Lim WA, Weiner OD. Chapter seventeen—Light Control of Plasma Membrane Recruitment Using the Phy-PIF System. In: Voigt C, editor. Synthetic Biology, Part A. vol. 497 of Methods in Enzymology. Academic Press; 2011. p. 409–423. Available from:
  23. 23. Bodenmiller B, Zunder ER, Finck R, Chen TJ, Savig ES, Bruggner RV, et al. Multiplexed mass cytometry profiling of cellular states perturbed by small-molecule regulators. Nat Biotech. 2012;30:858–867.
  24. 24. Engblom S. Computing the moments of high dimensional solutions of the master equation. Applied Mathematics and Computation. 2006;180(2):498–515.
  25. 25. Gillespie DT. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of Computational Physics. 1976;22(4):403–434.
  26. 26. McQuarrie D. Stochastic approach to chemical kinetics. Journal of Applied Probability. 1967;4(4):413–478.
  27. 27. Gillespie DT. A rigorous derivation of the chemical master equation. Physica A: Statistical Mechanics and its Applications. 1992;188(1-3):404–425.
  28. 28. Shahrezaei V, Swain PS. Analytical distributions for stochastic gene expression. Proceedings of the National Academy of Sciences. 2008;105(45):17256–17261.
  29. 29. Stewart-Ornstein J, Weissman JS, El-Samad H. Cellular Noise Regulons Underlie Fluctuations in Saccharomyces cerevisiae. Molecular Cell. 2012;45(4):483–493. pmid:22365828