
Conceived and designed the experiments: ST-N PRtW. Performed the experiments: ST-N PRtW. Analyzed the data: ST-N PRtW. Contributed reagents/materials/analysis tools: ST-N PRtW. Wrote the paper: ST-N PRtW.

The authors have declared that no competing interests exist.

Experiments in recent years have vividly demonstrated that gene expression can be highly stochastic. How protein concentration fluctuations affect the growth rate of a population of cells is, however, a wide-open question. We present a mathematical model that makes it possible to quantify the effect of protein concentration fluctuations on the growth rate of a population of genetically identical cells. The model predicts that the population's growth rate depends on how the growth rate of a single cell varies with protein concentration, the variance of the protein concentration fluctuations, and the correlation time of these fluctuations. The model also predicts that when the average concentration of a protein is close to the value that maximizes the growth rate, fluctuations in its concentration always reduce the growth rate. However, when the average protein concentration deviates sufficiently from the optimal level, fluctuations can enhance the growth rate of the population, even when the growth rate of a cell depends linearly on the protein concentration. The model also shows that the ensemble or population average of a quantity, such as the average protein expression level or its variance, is in general not equal to its time average as obtained from tracing a single cell and its descendants. We apply our model to perform a cost-benefit analysis of gene regulatory control. Our analysis predicts that the optimal expression level of a gene regulatory protein is determined by the trade-off between the cost of synthesizing the regulatory protein and the benefit of minimizing the fluctuations in the expression of its target gene. We discuss possible experiments that could test our predictions.

Biochemical networks, consisting of biomolecules such as proteins and DNA that chemically and physically interact with one another, are the processing devices of life. Metabolic networks allow living cells to process food, while signal transduction pathways and gene regulatory networks allow living cells to process information. Experiments in recent years have demonstrated that these networks are often very “noisy”: the protein concentrations often fluctuate strongly. However, how this “biochemical noise” affects the growth rate or fitness of an organism is poorly understood. We present here a mathematical model that makes it possible to predict quantitatively how protein concentration fluctuations affect the growth rate of a cell population. The model predicts that fluctuations reduce the growth rate when evolution has tuned the average protein concentration to the level that maximizes the growth rate; however, when the average concentration deviates sufficiently from the optimal one, fluctuations can actually enhance the growth rate. Our analysis also predicts that the optimal design of a regulatory network is determined by the trade-off between the cost of synthesizing the proteins that constitute the regulatory network and the benefit of reducing the fluctuations in the network that it controls. Our predictions can be tested in wild-type and synthetic networks.

Cells continually have to respond and adapt to a changing environment. One important strategy to cope with a fluctuating environment is to sense the changes in the environment and respond appropriately, for example by switching phenotype or behavior. Arguably the most studied and best characterized example is the

It has long been recognized that organisms in a clonal population can exhibit a large variation of phenotypes. Within highly inbred lines, for instance, phenotypic variation can still be detected

Our model integrates a description of how the internal dynamics of the composition of a cell affects the growth rate of that cell with a description of how the growth rates of the individual cells collectively determine the growth rate of the population. This allows us to address a number of fundamental questions: (a) How does the growth rate of the population depend upon the growth rate of a single cell as a function of its protein expression levels? (b) How does the population's growth rate depend upon the variance and the correlation time of the protein concentration fluctuations? Our model predicts that an important parameter that controls the effect of biochemical noise is the correlation time of the fluctuations: only when the correlation time is long compared to the cell cycle time does biochemical noise affect the growth rate of the population. Interestingly, recent experiments on

Our analysis highlights the difference between ensemble averages and time averages

The model also allows us to perform a cost-benefit analysis of regulatory control. Recently, Dekel and Alon performed a series of experiments that strongly suggest that protein expression is the result of a cost-benefit optimization problem

While the cost function of synthesizing a gene regulatory protein is probably similar to that of producing a metabolic enzyme, their benefit functions are fundamentally different. The benefit of producing a metabolic enzyme is that it allows the uptake of the sugar by the metabolic network. In contrast, the benefit of synthesizing a regulatory protein is indirect and is derived from that of the metabolic enzyme; synthesizing a regulatory protein can be beneficial because it allows the cell to adjust the expression level of the metabolic enzyme to its optimum in response to a changing sugar concentration. However, a given optimal expression level of the metabolic enzyme as a function of the sugar concentration does not uniquely determine the optimal expression level of the regulatory protein. A given optimal response function of the enzyme expression level as a function of the sugar concentration can be obtained by different combinations of parameters such as the binding affinity of the inducer to the regulatory protein, the binding strength of the regulatory protein to the DNA, the degree to which these molecules bind cooperatively with each other, as well as the total concentration of the regulatory protein. What determines the optimal combination of these parameters, given that many combinations can yield the same response curve of the enzyme expression level as a function of sugar concentration?

We conjecture that the benefit function of the regulatory protein is determined by the fluctuations in the expression level of its target, the metabolic enzyme, although other factors such as the response time could play a role as well. As we will show, when the average expression level of the metabolic enzyme is close to its optimum, fluctuations will tend to reduce the population's growth rate. Different gene regulatory networks can yield the same average response function, but can have markedly different noise properties. In particular, our analysis predicts that the inducer, e.g., sugar, should bind the gene regulatory protein strongly in order to reduce the fluctuations in the enzyme concentration. Moreover, it predicts that higher expression levels of the regulatory protein lower the noise in the expression level of the metabolic enzyme. We therefore predict that the optimal expression level of a regulatory protein is determined by the interplay between the cost of making the regulatory protein and the benefit of reducing the fluctuations in the target gene. Recently, a similar idea has independently been proposed by Kalisky, Dekel, and Alon

In order to describe the effects of biochemical noise on the growth rate of a population of cells, we have to develop a model that describes (a) how the internal dynamics of a cell affects the growth rate of that cell and (b) how the latter affects the growth rate of the population of cells. We first discuss the latter.

In order to quantify the growth rate of a cell, we have to define a parameter that monitors the progress along the cell cycle. This parameter, _{i} at the beginning of the cell cycle and a value _{f} at the end of the cell cycle. The value of the ‘cell cycle coordinate’

The growth rate λ depends upon the composition of the cell. This is determined by the expression level of ribosomal proteins, which are needed to make new proteins, and the expression levels of metabolic enzymes and other non-ribosomal proteins, which are required to produce the building blocks for protein synthesis and cell growth _{1},_{2},…,_{n}_{−1},_{n}

To determine the growth rate of a population of cells, a key quantity is the probability density

The first term on the right-hand side describes the evolution of _{X} is the Fokker-Planck operator encoding the evolution of _{f} amounts to a “dilution” of the probability of finding cells with intermediate _{s}(

This condition formalizes the observation that upon cell division a cell at the end of the cell cycle gives birth to two newborns. Importantly,

The above model is a generic model of the cell cycle. To make further progress, we have to specify the dynamics of

Here, _{i}_{i}_{s,i} is the deviation of the concentration _{i}_{s,i}, and _{ij}_{i}_{i}_{x}

If the composition of the cells did not fluctuate in time, then the evolution of the cell cycle parameter _{0}, and proportional to the growth rate of the population, _{0}∼_{s}^{τ}_{s}

To obtain the growth rate _{s}_{s}

The equation for the steady-state probability density _{s}(_{s}(

From now on we shall rescale the time and the _{f}−_{i} = log(2). In order to understand why such a transformation is useful, it should be noted that in the absence of protein concentration fluctuations, each cell in the population needs a constant time between birth and division _{cycle} = (_{f}−_{i})/_{0}. At the population level, _{cycle} is also the time it takes for the population to double in size, such that the growth rate of the population is _{cycle}. Clearly, in the zero fluctuation limit, the growth rate of the population of cells equals the growth rate of each single cell in the population: _{0}. In the presence of protein concentration fluctuations, however, the cell cycle times of the individual cells will fluctuate, such that even a population of cells that are initially perfectly synchronized will eventually converge towards a steady-state distribution as given by Equation 7.
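The zero-fluctuation relation between the cycle time and the growth rate can be checked numerically. The following is a minimal sketch; the names `lambda_0`, `phi_i`, `phi_f` and both function names are illustrative assumptions, not notation taken from the model.

```python
import math

def cycle_time(lambda_0, phi_i=0.0, phi_f=math.log(2)):
    """Time needed to traverse the cell cycle at a constant rate lambda_0.

    Assumption: the cell cycle coordinate runs from phi_i to phi_f,
    rescaled so that phi_f - phi_i = log(2).
    """
    return (phi_f - phi_i) / lambda_0

def population_growth_rate(t_cycle):
    """Exponential growth rate of a population that doubles every t_cycle."""
    return math.log(2) / t_cycle

lambda_0 = 0.7  # hypothetical single-cell rate, arbitrary units of 1/time
t_cycle = cycle_time(lambda_0)
# With the log(2) rescaling, the population rate coincides with lambda_0.
print(t_cycle, population_growth_rate(t_cycle))
```

With this rescaling the two rates agree without any extra conversion factor, which is why the choice log(2) is convenient.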

Our model shows that the “time average” of a quantity such as the average protein expression level or the noise in gene expression, is, in general, not equal to its “ensemble average”

Here, _{α}_{i}_{s}. In contrast, the distribution of _{s}+^{0}, where ^{0} may deviate from zero. Moreover, not only the mean, but also the variance of the two distributions will, in general, differ, as we will show now.

In order to understand the non-trivial effects of biochemical noise on the growth rate of a population of cells, it is instructive to consider a simple example. Let us consider a single metabolic enzyme X, and assume that the temporal dynamics of its concentration $X$ is given by the linear Langevin equation $\mathrm{d}X/\mathrm{d}t = -\gamma\,(X - X_s) + \eta$, where $X_s$ is the steady-state concentration, γ^{−1} is the response time, which is typically on the order of the cell cycle time, and η is a Gaussian white noise term, of zero mean and strength 2
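To make the single-enzyme example concrete, the sketch below integrates a linear Langevin equation of this type with the Euler-Maruyama scheme. It is only an illustration: the noise-strength symbol `d` (defined through the correlation 2 d δ(t − t′)), the parameter values, and the function names are assumptions that do not come from the text.

```python
import math
import random

def simulate_enzyme(gamma, d, x_s, dt, n_steps, seed=0):
    """Euler-Maruyama integration of dX/dt = -gamma * (X - x_s) + eta.

    Assumption: <eta(t) eta(t')> = 2 * d * delta(t - t').
    """
    rng = random.Random(seed)
    x = x_s
    trajectory = []
    kick = math.sqrt(2.0 * d * dt)  # standard deviation of one noise increment
    for _ in range(n_steps):
        x += -gamma * (x - x_s) * dt + kick * rng.gauss(0.0, 1.0)
        trajectory.append(x)
    return trajectory

def time_variance(values):
    """Variance of a single trajectory, i.e. a time average."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

traj = simulate_enzyme(gamma=1.0, d=0.5, x_s=10.0, dt=0.01, n_steps=200_000)
# For this process the stationary variance is expected to approach d / gamma.
print(time_variance(traj))
```

The variance computed here is a time average along one trajectory; it is not the population (ensemble) average that enters the growth rate.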

We assume that over the concentration range of interest, the growth rate of a given cell as a function of the expression level of X can be written as_{0}(_{s}) is the growth rate of the cell when the enzyme concentration equals _{s}. The growth rate of the population of cells is then given by (see

Here, ^{2} is the variance of the fluctuations in ^{2} = 〈^{2}〉−〈^{2}. This ensemble or population average is given by

The ensemble average ^{2} can be written in terms of the time average of the variance

Let us now consider the scenario in which the average expression level of the enzyme is such that the growth rate is maximal: _{s} = _{opt} (see ^{2}^{2}<0. The growth rate of the population is then _{0}(_{s})+^{2}. Since _{0}. Hence, when the composition is close to its optimum, biochemical noise always tends to reduce the overall growth of the population.
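The sign of this effect can be illustrated with a deliberately simplified calculation. The sketch assumes that the single-cell growth rate is quadratic around the optimum with no linear term, and that the deviation from the optimum takes only the two values −σ and +σ with equal probability. It shows the sign argument only; the coefficient multiplying the variance in the model also depends on the correlation time, which is not captured here. All names and values are hypothetical.

```python
def mean_single_cell_rate(lambda_0, curvature, sigma):
    """Average of lambda(x) = lambda_0 + curvature * x**2 over x = -sigma, +sigma.

    Assumption: two equally likely deviations, so the variance of x is sigma**2.
    """
    rates = [lambda_0 + curvature * x ** 2 for x in (-sigma, sigma)]
    return sum(rates) / len(rates)

lambda_0 = 1.0     # growth rate at the optimum (hypothetical)
curvature = -0.5   # negative curvature: the optimum is a maximum
sigma = 0.2        # standard deviation of the fluctuations

# The average equals lambda_0 + curvature * sigma**2, which lies below lambda_0.
print(mean_single_cell_rate(lambda_0, curvature, sigma))
```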

If the average expression level _{s} is close to the optimal expression level _{opt}, biochemical noise will always decrease the growth rate. If, however, the average expression level deviates sufficiently from the optimal expression level (i.e. if ^{2} in Equation 11), then fluctuations can enhance the growth rate of the population, even when the growth rate λ of a single cell is linear in

If the average expression level _{s} deviates significantly from the optimal expression level _{opt}, the situation is qualitatively different (see ^{−1}. If the response time is much faster than the cell cycle time, then on the relevant time scale of the cell cycle, the concentrations in all the cells will be the same and no benefit from the noise can be gained. However, both in prokaryotic ^{0} means that the time average of _{s}, is not equal to the ensemble average of _{s}+^{0}.
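Why slow fluctuations can help when the growth rate is linear in the concentration can be seen in an idealized limit. The sketch below assumes that the correlation time is infinite, so that each cell lineage keeps its concentration, and hence its growth rate, forever. This is a limiting case chosen for illustration, not the general result of the model, and all names and values are hypothetical.

```python
import math

def effective_growth_rate(rates, weights, t):
    """Growth rate (1/t) * log(N(t)/N(0)) of a mixture of subpopulations.

    Assumption: each subpopulation keeps its own rate forever (no relaxation).
    """
    total = sum(w * math.exp(r * t) for r, w in zip(rates, weights))
    return math.log(total) / t

lambda_0, slope, sigma = 1.0, 0.5, 0.2  # hypothetical values
# Linear single-cell growth rate evaluated at deviations -sigma and +sigma.
rates = [lambda_0 + slope * x for x in (-sigma, sigma)]
weights = [0.5, 0.5]

print(effective_growth_rate(rates, weights, t=5.0))     # exceeds lambda_0
print(effective_growth_rate([lambda_0], [1.0], t=5.0))  # recovers lambda_0 up to rounding
```

The fast-growing lineages come to dominate the population, so the population grows faster than a cell with the average concentration, even though the average single-cell growth rate is unchanged.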

Lastly, we note here that it is conceivable that the curvature ^{2}>4

The analysis above describes how fluctuations in the composition can affect the growth rate of a population of cells in a constant environment. We now briefly discuss how fluctuations in the environment affect the population's growth rate. As before, we consider the scenario in which cells respond to changes in the environment via the mechanism of responsive switching: they thus sense the changes in the environment and respond appropriately.

If the environmental signals are described by the vector

Here, ^{c} denote the correlated fluctuations between the different cells, while ^{u} corresponds to the fluctuations in the environmental signals that are uncorrelated from one cell to the next within the population.

The uncorrelated fluctuations in the external signals can be treated in the same spirit as the fluctuations in the internal signals. Their dynamics could be added to that of _{ij}^{u} couple to the fluctuations in the composition

The effect of the correlated fluctuations in the external signals, ^{c}, is much more difficult to treat analytically

We can estimate the time it takes for the population to relax towards a new steady state after a change in the environment has occurred. If prior to an environmental change, the cell cycle coordinate

In order to understand the design criteria that determine the magnitude of the fluctuations in the expression level of a given protein for cells that respond via responsive switching, we have to understand not only how these fluctuations affect the growth rate, as discussed above, but also the indirect energetic cost of controlling these fluctuations. Both the magnitude of the concentration fluctuations and the cost of controlling these fluctuations are determined by the design of the network that regulates the expression level of the protein of interest. We will now show, using the

We use a simple model of the

Here, _{s}, _{s}, _{X} and _{E} model the (Gaussian white) noise in their expression. The factor _{E}(_{E}(_{e}_{E}.

To make further progress, we need to know how the growth rate of each cell,

Following Dekel and Alon _{0}, due to the production of the gene regulatory protein and the metabolic enzyme relative to the growth rate in the absence of these proteins, _{0}, as:

The first term on the right-hand side encodes the gain in the growth rate due to the metabolic activity of the enzyme; importantly,

As discussed in the introduction, a given average optimal expression curve of _{opt}(_{s}, is given by that level

This expression is, in fact, the principal result of the cost-benefit analysis of the optimal enzyme expression level of Dekel and Alon _{s}, the optimal TF-L and TF-operator binding strengths—under the assumption that the steady-state enzyme expression level as a function of lactose concentration is fixed and given by Equation 20:

To obtain the growth rate at _{s}+_{s}+_{s}), we expand the growth rate around

On the left-hand side of the above equation, _{D}, which is given by

To demonstrate this explicitly, we will study in more detail the last two terms in Equation 21, which describe the contribution of the transcription factor to the growth rate:

In our model, the steady-state enzyme concentration is given by _{s} = _{opt} = _{E}(_{s},

To make further progress, we have to assume a model for the fluctuations in _{X} is the copy number of _{s}∝

This expression shows a maximum as a function of _{X}. The position of this optimum—the copy number of X that maximizes the growth rate—is related to the copy number of E by

We therefore predict that the optimal TF copy number is linear in the square root of the copy number of the enzyme it regulates. This prediction could perhaps be tested by performing a statistical analysis of the expression levels of transcription factors and the expression levels of the target genes these transcription factors regulate. Such a statistical analysis could be performed in the spirit of that of
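The square-root scaling can be reproduced with a toy trade-off. The functional form below is an assumption made purely for illustration and is not the expression derived in the text: a synthesis cost that is linear in the TF copy number, and a noise penalty proportional to the enzyme copy number divided by the TF copy number. The names `cost_per_copy` and `noise_penalty` are hypothetical.

```python
import math

def tf_contribution(n_x, n_e, cost_per_copy, noise_penalty):
    """Assumed TF contribution to the growth rate: synthesis cost plus noise penalty."""
    return -cost_per_copy * n_x - noise_penalty * n_e / n_x

def optimal_tf_copy_number(n_e, cost_per_copy, noise_penalty):
    """Maximizer of tf_contribution with respect to n_x (derivative set to zero)."""
    return math.sqrt(noise_penalty * n_e / cost_per_copy)

def scan_optimum(n_e, cost_per_copy, noise_penalty, n_max=2000):
    """Brute-force check of the analytic optimum over integer copy numbers."""
    return max(range(1, n_max + 1),
               key=lambda n: tf_contribution(n, n_e, cost_per_copy, noise_penalty))

for n_e in (100, 400, 1600):
    print(n_e, optimal_tf_copy_number(n_e, 1e-4, 1e-4), scan_optimum(n_e, 1e-4, 1e-4))
```

Under these assumptions, quadrupling the enzyme copy number doubles the optimal TF copy number.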

Dekel and Alon _{WT} is the fully induced wild-type concentration of the enzyme, and we use _{WT}. As explained in the section

The growth rate is averaged over different lactose concentrations in the environment (see Equation 17), for two different lactose concentration distributions in the environment.

Equation 21 shows that the effect of the noise in

Here, _{free} is the concentration of _{D} is the dissociation constant for ligand-TF binding. The unbound transcription factor represses the expression of _{free}(

We show these relations in _{free}(

This is illustrated for two regulatory networks of the _{free}/

To minimize the gain _{D} should be as small as possible, which corresponds to strong TF-L binding. Since the function _{opt}(_{D}). The conclusion that TF-L and TF-operator binding should be strong is supported by the experimental observation that the dissociation constant for the binding of lac repressor to its primary operator site is in the nM range, while the binding of the inducer allolactose to the repressor is on the order of 0.1 µM

Contour plot of the growth rate as a function of the repressor copy number _{D}. The weighting of the lactose levels is nonuniform. Lower binding constants allow for higher optimal growth rates at lower optimal expression levels for the repressor.

The response machinery allows a living cell to adjust its composition to a changing environment. If the response machinery is fast and operates well, then in each environment the cell's composition is optimized such that the growth rate is maximized. Our analysis suggests that under these conditions, there is an evolutionary pressure to minimize the fluctuations in the composition. However, the response machinery cannot always optimally adjust the cell's composition. When there is a drastic change in the environment, for instance, the cell probably has to change its genotype so as to change its response machinery. Our analysis suggests that along such an “evolutionary trajectory” from a sub-optimal configuration of the response machinery to a new optimal one, fluctuations in the composition could be beneficial, because cells that happen to have a composition that is closer to the new optimum will grow more rapidly and thereby increase the overall growth rate of the population. Based on this observation we predict that the periods of fast evolution (for example when a population colonizes an entirely new environment) are correlated with a positive influence of fluctuations and thus an increased variability in the population. This idea is supported by the observation that the regulatory networks that control the response to environmental changes are in general noisier than the conserved cell machinery

It has been recognized before in a different context that phenotypic variance can be detrimental under stabilizing selection for the optimal genotype and advantageous far from this optimal genotype ^{τ}

Recently, Kalisky, Dekel and Alon

Our model predicts that if the expression level of the gene regulatory protein is varied by a factor of 2 from its optimal value, the change in the growth rate would be on the order of 10^{−4}. This change is sufficient to provide a selection pressure that is large enough in a typical bacterial population with an effective size larger than 10^{6} cells; indeed, as discussed in ^{−6} are sufficient to balance the genetic drift in such a population. A change in the growth rate of 10^{−4} is thus large enough to provide a selection mechanism in a typical bacterial population for driving the transcription factor expression level to within a factor of 2 of the predicted optimal level.
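The comparison between selection and drift amounts to a one-line check. The sketch uses the common rule of thumb that selection outweighs genetic drift when the selection coefficient exceeds the inverse effective population size; treating the relative change in growth rate as the selection coefficient is an assumption made here for illustration.

```python
def selection_outweighs_drift(relative_rate_change, effective_population_size):
    """Rule of thumb: selection dominates drift when s > 1 / N_e.

    Assumption: the relative growth-rate change plays the role of s.
    """
    return relative_rate_change > 1.0 / effective_population_size

# Values quoted in the text: a change of 1e-4 in a population of 1e6 cells.
print(selection_outweighs_drift(1e-4, 1e6))
```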

Another fundamental question we can address with our model is the relative efficiency, from the fluctuations point of view, of different modes of regulation (see

In this paper, we have focused on the expression of a single protein. Yet, it is clear that the model presented in

How could our predictions be tested experimentally? Ideally one would like to perform an experiment in which the

In this section we derive the solution (Equation 7) for the stationary probability distribution _{s}(

The three terms on the right-hand side of Equation 30 describe, in order, the drift along the cell-cycle coordinate _{ij}_{i}η_{j}_{ij}_{i}η_{j}_{i}_{j}_{s}(_{f},_{s}(_{i},

The instantaneous growth rate is given by:_{i}−_{f} = log(2) we obtain

For this multidimensional polynomial equation to be satisfied for all the values of ^{0} must satisfy the set of

We can read from Equations 37 that negative curvatures of the instantaneous advancement rate (_{i}_{s}_{i}_{s}

We derive here Equation 12. As discussed in the text, we model the dynamics of enzyme X via the linearized Langevin dynamics,

If we insert this into Equation 40, we find that we have to solve the equations

We now present the derivation and the approximations leading to Equation 21. A mean-field analysis of the cost-benefit function of Dekel and Alon _{s}, is finite. Since the average transcription factor concentration, _{s}, is nevertheless small, it is reasonable to assume that the growth rate of a cell with _{s}+_{s}+

Here, _{0} is the growth rate of each single cell when the gene regulatory protein and the enzyme are not expressed _{D} is the “deterministic” growth rate, thus the growth rate when the regulatory protein and the enzyme are expressed, but fluctuations are not taken into account. It is given by:_{s} we have:

Equations 36 and 37 can now be solved using Equations 45–47 to obtain the growth rate that takes into account the noise. This leads to the following expression for the growth rate:

In deriving Equation 49 we also use the fact that the transcription factor concentration is much smaller than the typical enzyme concentration, yielding _{D}: the growth rate of the population of cells, _{D}.

The last term in Equation 49 is positive, and, interestingly,

Therefore, the last term in Equation 49 is negligible at our level of approximation.

We also have

Around the steady state

If the response times of the enzyme and the transcription factor are not equal, the same analysis gives_{X} is the degradation rate, i.e., the inverse response time, of the transcription factor and _{E} is the degradation rate (inverse response time) of the enzyme. This shows that the effect of the fluctuations in the transcription factor concentration, _{X}<_{E}), are the fluctuations in _{X}>_{E}), then the slow enzyme dynamics will effectively integrate out the fluctuations in
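The time-averaging effect can be illustrated with the standard result for a two-stage linear Langevin cascade, in which the enzyme relaxes with rate `gamma_e` towards a level set linearly by a transcription factor that itself relaxes with rate `gamma_x`. This is a textbook-style sketch and not necessarily the expression obtained in the text; the names `gain`, `var_x`, `gamma_x` and `gamma_e` are assumptions.

```python
def transmitted_variance(gain, var_x, gamma_x, gamma_e):
    """Enzyme variance caused by TF fluctuations in a linear two-stage cascade.

    Assumed model: dx/dt = -gamma_x * x + noise,
                   de/dt = -gamma_e * (e - gain * x),
    for which the stationary variance of e is
    gain**2 * var_x * gamma_e / (gamma_e + gamma_x).
    """
    return gain ** 2 * var_x * gamma_e / (gamma_e + gamma_x)

var_x = 1.0
# Slow TF (gamma_x << gamma_e): fluctuations are passed on almost fully.
print(transmitted_variance(1.0, var_x, gamma_x=0.01, gamma_e=1.0))
# Fast TF (gamma_x >> gamma_e): the slow enzyme averages them out.
print(transmitted_variance(1.0, var_x, gamma_x=100.0, gamma_e=1.0))
```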

We thank Daan Frenkel and Frank Poelwijk for a critical reading of the manuscript.