Skip to main content
  • Loading metrics

Molecular noise of innate immunity shapes bacteria-phage ecologies


Mathematical models have been used successfully at diverse scales of biological organization, ranging from ecology and population dynamics to stochastic reaction events occurring between individual molecules in single cells. Generally, many biological processes unfold across multiple scales, with mutations being the best studied example of how stochasticity at the molecular scale can influence outcomes at the population scale. In many other contexts, however, an analogous link between micro- and macro-scale remains elusive, primarily due to the challenges involved in setting up and analyzing multi-scale models. Here, we employ such a model to investigate how stochasticity propagates from individual biochemical reaction events in the bacterial innate immune system to the ecology of bacteria and bacterial viruses. We show analytically how the dynamics of bacterial populations are shaped by the activities of immunity-conferring enzymes in single cells and how the ecological consequences imply optimal bacterial defense strategies against viruses. Our results suggest that bacterial populations in the presence of viruses can either optimize their initial growth rate or their population size, with the first strategy favoring simple immunity featuring a single restriction modification system and the second strategy favoring complex bacterial innate immunity featuring several simultaneously active restriction modification systems.

Author summary

Mathematical understanding of how randomness at the molecular scale, also known as molecular noise, ultimately affects the fate of organisms and whole populations is widely recognized as a challenging problem in multi-scale modeling. Here, we develop an analytical framework for analyzing how the randomness of individual reaction events in single cells propagates to higher levels of biological organization and affects population and ecology scale dynamics. We deploy our mathematical results to study an example from the ecology of bacteria and bacteriophage viruses. Bacteria defend themselves against viruses by a simple innate immune system composed of a pair of enzymes. Due to molecular noise, however, viruses sometimes escape this immunity, causing bacterial populations to plummet. Noise can also cause the immune system to turn against its own host. By analyzing how such costs and benefits of bacterial immunity balance at the ecological level, we predict the optimal parameters for bacterial innate immune systems. While the focus of this work is on bacteria-phage ecologies, we expect that our results will generally help to better understand population and ecology scale consequences of stochastic cell fate decisions in diverse biological domains ranging from cell differentiation in developmental biology to studies of the microbiome and consequences of stochasticity in apoptotic responses for anti-cancer therapies.


One of the major challenges in biology is to understand how interactions between individual molecules shape living organisms and ultimately give rise to emergent behaviors at the level of populations or even ecosystems. At the very bottom of this hierarchy, inside single cells, interacting biomolecules such as DNA or proteins are often present in small numbers, giving rise to intrinsic stochasticity of individual reaction events [1, 2]. As a result, genetically identical organisms occupying identical environments can express different phenotypes [3, 4] and make different decisions when presented with identical environmental cues [5, 6]. This molecular noise is known to be the cause of biologically and medically important traits of bacteria such as persistence in response to antibiotics [7, 8] and competence during acquisition of heterologous DNA [9]. However, while its causes and consequences are relatively well-studied at the organismal level [1013], how molecular noise propagates to higher scales of biological organization to affect the ecology and evolution of organisms remains mostly unknown [4]. Recently it has been shown that ecosystems can follow surprisingly deterministic trajectories despite the prevalence of stochastic events [14, 15], yet these trajectories could themselves be strongly influenced by molecular noise. Thus, the extent to which ecological interactions are affected by molecular noise, and the extent to which these ecological consequences feed back to reshape individual traits, remain to be explored.

Perhaps the most prevalent biological systems in which molecular noise is thought to play an important role are restriction-modification (RM) systems [16]. Present in nearly all prokaryotic genomes [17], RM systems are a highly diverse class of genetic elements. They have been shown to play multiple roles in bacteria as well as archaea, including regulation of genetic flux [18] and stabilization of mobile genetic elements [19], but have most frequently been described as primitive innate immune systems due to their ability to protect their hosts from bacterial viruses [20]. When a virus (bacteriophage or phage) infects a bacterium carrying a RM system, the DNA of the phage gets cleaved with a very high probability, thus aborting the infection. With a very small probability, however, the phage can escape and become immune to restriction by that specific RM system through epigenetic modification, leading to its spread and potentially death of the whole bacterial population in absence of alternative mechanisms of phage resistance [21, 22]. Thus, in the context of RM systems, molecular noise occurring at the level of individual bacteria can have profound ecological and evolutionary consequences. Because RM systems are ultimately based on only two very well characterized enzymatic activities (restriction and modification) [23], they represent a simple and tractable biological system in which we can investigate propagation of effects of molecular noise across different scales of biological organization.

Here, we mathematically model the action of RM systems from individual molecular events occurring inside a single cell, through individual bacteria competing in a population, to interactions between populations of bacteria and phages in a simple ecological setting, as shown in Fig 1. We demonstrate that, by imposing a tradeoff between the efficiency and cost of immunity, molecular noise in RM systems occurring at the level of individual bacteria has consequences that propagate all the way up to the ecological scale, and that the ecological consequences in turn imply the existence of optimal bacterial defense strategies against phages.

Fig 1. Effects of molecular noise at different scales of biological organization.

In single cells (green shade), stochastic activity of restriction-modification enzymes (R—red, M—blue) may inadvertently lead to full methylation of invading phage DNA in a process known as “phage escape” from the bacterial immunity, resulting in direct consequences at the ecological scale (blue shade). Also, stochastic activity of the same enzymes may lead to accidental cutting of bacterial DNA in a process known as “self-restriction,” resulting in lowered bacterial growth rate at the population level (red shade). Trade-offs between self-restriction and phage escape at the ecological scale in turn influence R and M enzyme activities in surviving cells, thereby optimally balancing the efficiency and cost of immunity.

Methods and results

Self-restriction in single cells and in growing populations

RM systems consist of two enzymes, a restriction-endonuclease R, that recognizes and cuts specific DNA sequences (restriction sites), and a methyl-transferase M, that recognizes the same DNA sequences and ensures that only invading phage DNA can be cut by the endonuclease while the bacterial DNA remains methylated and protected. However, since chemical reactions occur stochastically, RM systems can produce errors and fully methylate invading phage DNA before it is cut and degraded (phage escape, typically occurring with a probability ranging between 10−2 and 10−8) [24, 25]. Similarly, it is possible that newly replicated restriction sites on the bacterial DNA, which are originally unmethylated, are accidentally cleaved instead of methylated (self-restriction) [26].

Inside a single cell, the probability of such self-restriction events depends on the total activity, r, of all restriction endonuclease molecules R, the total activity, m, of all methylase molecules M, as well as the bacterial replication rate λ, since λ determines the rate at which new unmethylated restriction sites are generated. To investigate how self-restriction depends on these parameters, we model the corresponding biochemical reactions at each individual restriction site on the bacterial DNA with the stochastic reaction network displayed in Fig 2a (see S1 Appendix Section S.1). The time τS until the first self-restriction event in a given cell—i.e., until that cell’s death or substantial reduction in growth rate—can be obtained as the time when the first restriction site is cut, that is as τS = mini∈{1,…,NS} τi, for bacterial DNA with NS restriction sites, where τi, i = 1,…,NS are the waiting times for cutting events at individual sites. It can be shown that all τi follow a phase-type distribution (see [27] and Fig 2b and 2c): (1) with pQ = [p0 p1 p2] being the initial methylation configuration, i.e., the proportion of restriction sites that are unmethylated (p0), hemi-methylated (p1) and doubly-methylated (p2); see S1 Appendix Section S.1.

Fig 2. Self-restriction in single cells.

(a) Model of the individual-site methylation dynamics. The restriction site can either be doubly-methylated (first circle), hemi-methylated (second circle), or unmethylated (third circle). Methylation events happen at a rate proportional to the activity m of M. During growth at rate λ, the DNA and its restriction sites are replicated, with the newly synthesized DNA having no methylation marks; for our model, growth is thus identical to demethylation reactions denoted in the figure. Restriction is assumed to be lethal (no DNA repair) and can happen when a site is unmethylated at a rate proportional to the activity r of R, leading to cell death (fourth circle). (b) Distribution of time to self restriction, f(τ), depends on enzyme activities, r and m, for a restriction site that is initially doubly-methylated. Increasing values of r lead to decreasing expected times to self-restriction for all m (cf. blue vs red). Given r, as m grows large, the site will almost never be unmethylated and cut, yielding a very broad distribution (green). λ = λref = 1.7 ⋅ 10−2 min −1; reference (red) values: mref = 0.05 min−1, rref = 0.1 min−1; “Small” (“large”) r, m values are 2e-fold lower (e-fold higher) than reference values. (c) Dependence of self-restriction on the initial methylation configuration, pQ (dark blue = doubly-methylated; blue = hemi-methylated; light blue = unmethylated) at rref, mref.

Eq [1] allows us to derive the expected time until self-restriction of a single site as (2) more generally, Fig 2b shows how the distribution of waiting times depends on the restriction rate r (increasing the probability of the site getting cut when it is unmethylated) and the magnitude of m relative to λ (which decreases the probability that the site is unmethylated in the first place).

Fig 2c shows that time to self-restriction at a single site depends essentially on an unknown quantity, the methylation configuration pQ. We will now proceed to show that when we consider an exponentially growing population of bacterial cells, the configuration pQ can no longer be freely chosen, and has to be determined self-consistently instead. Intuitively, this is because when the bacterial population is in steady-state growth, new unmethylated sites are constantly replenished by replication, while cells with more unmethylated sites are simultaneously and preferentially being removed, as illustrated in Fig 3a and required by Eq [2]. These two forces, generation of new unmethylated sites and their preferential removal, will push any initial pQ towards a unique steady state equilibrium.

Fig 3. Growth of self-restricting populations.

(a) Top: Selective cutting of cells (red ‘X’) with unmethylated restriction sites biases growing populations towards methylation configurations with larger numbers of methylated sites. Bottom: The long-term population distribution of methylation configurations, pQSD (p0—unmethylated, p1—hemi-methylated, p2—doubly-methylated; see S1 Appendix Section S.1). Increasing either r or m reduces the probability of finding unmethylated restriction sites in the population. Three example choices for r, m (blue, red, green) and λref are same as in Fig 2 and are denoted with crosses in panel b and corresponding colors in panel c; NS = 5 is chosen for illustration purposes. (b) Dependence of the self-restriction rate, μ(r, m, λref), on the enzyme activities. (c) Effective population growth rate λe = λ − μ(r, m, λ) as a function of the replication rate λ for different enzyme activities r and m. For the reference parameters (red), the self-restriction rate computed using the quasi-steady state distribution pQSD (solid) is significantly different when compared to an estimate based on fully equilibrated pQ in which restriction does not lead to an absorbing state (dashed).

Mathematically, assuming that the methylation dynamics in all cells are equilibrated and that cells cannot be distinguished, the internal methylation configuration of any randomly chosen cell at any time during growth of the population can be derived from the quasi-stationary distribution pQSD(r, m) of the individual-site methylation process in Fig 2a (see S1 Appendix Section S.1). pQSD(r, m) is the equilibrium distribution of the stochastic process conditional on it not having reached the absorbing state where the DNA is cut and the cell has died (Fig 3a); in short, methylation and growth equilibrate “in all directions except the one leading towards self-restriction”. Then, setting pQ = pQSD(r, m) in Eq [1] reduces the phase-type distribution f(τi) for the time τi until self-restriction at an individual restriction site to a single exponential, implying further that the waiting time τS = mini∈{1,…,NS} τi until self-restriction of any site in the cell is also exponentially distributed. Consequently, we are led to the main result of this section: growth with self-restriction can be rigorously modeled at the population level with a Markov birth-death process for which the expected population size n(t) follows a simple ordinary differential equation (3) where λe(r, m, λ) = λ − μ(r, m, λ) is the effective growth rate and μ(r, m, λ) is the rate of self-restriction, defined as the inverse of the per-cell expected waiting time until self-restriction (4) with γ1 being the largest eigenvalue of B (an explicit stochastic simulation validating this analytical result is provided in the S1 Appendix Section S.2).

Eq [4] allows us to straightforwardly evaluate the reduction in the population growth rate due to random self-restriction events in single cells for any given pair of enzyme activities, r and m. To study possible qualitative effects of self-restriction, we explore in Fig 3b a wide range of enzyme activities for a system with NS = 5 restriction sites (chosen, for illustration purposes, significantly smaller than the typical number of sites recognized by real RM systems). We find that the main determinant of self-restriction is the activity m of the methyl-transferase and that the effects of molecular noise can be suppressed by sufficiently increasing m. Furthermore, so long as m is large enough such that unmethylated restriction sites are only rarely available, μ(r, m, λ) lies on a large plateau of low self-restriction and changes only little with r and m, suggesting that stochastic fluctuations in enzyme activities would only have minor consequences for the population, especially when they are positively correlated, as would be the case if R and M enzymes were expressed from the same operon (S1 Appendix Section S.3).

The (r, m) plane in Fig 3b contains a transition region that separates the large plateau with low self-restriction from the plateau where self-restriction is severe enough to stop the population growth altogether. We have chosen our reference (red) parameter values (rref,mref) to lie in this transition region, and explored the regime with an e-fold higher rates (“large r & m”, indicated by green), and with 2e-fold lower rates (“small r & m”, indicated by blue) in Fig 3b and 3c. The comparison of these three regimes in Fig 3c is most clear when the effective growth rate is shown as a function of λ, the rate at which the cells, and thus the restriction sites, are replicated. In the “small r & m” regime, self-restriction is so infrequent that it can easily be outgrown by replication (except at very low λ). In the “large r & m” regime, m is sufficiently high to keep the restriction sites protected and thus self-restriction is rare, except at extremely large λ, where the green curve falls below the blue curve. In the reference regime, r is too large and m not high enough to protect, so self-restriction can not be “outgrown”; effective growth thus falls significantly below λ. Our numerical analyses further show that the self-restriction rate μ(r, m, λ) grows faster-than-linearly with λ (S1 Appendix Section S.1), causing the effective population growth to slow down and ultimately drop to zero at high enough λ.

We end this section by highlighting a non-trivial interaction between the single-cell and population-scale processes. While increasing the activity r of the endonuclease always decreases the effective growth rate of the population due to self-restriction, the effect can be smaller than expected from the single-cell analysis (dashed lines in Fig 3c). This is because high values of r feed back through the population scale to bias the steady-state distribution of methylation configurations away from cells with lots of unmethylated sites, as shown in Fig 3a, making self-restriction less likely. Implicit feedback effects of this type frequently give rise to complex dynamics in multi-scale models.

Phage escape

RM systems lower the growth rate of the population due to self restriction, especially when the activity m of the methyl-transferase is small. Upon infection by a phage, however, small values of m are advantageous, making it less likely that the unmethylated phage DNA will get methylated and escape the immune system before it can be cut by the restriction endonuclease.

Assuming that all restriction sites are identical and independent, the probability of phage escape can be calculated [28] as (5) where NV is the number of restriction sites on the phage DNA. From Eq [5] it is straightforward to see that pV(r, m) is monotonically increasing in m and decreasing in r. One might therefore expect that the balance between avoiding self-restriction that favors high m, Eq [4], and minimizing phage escape that favors low m, Eq [5], would impose a tradeoff and thus lead to an optimal value of m. However, this is not the case, because the phage escape probability pV(r, m) and the population self-restriction rate μ(r, m, λ) can both approach zero so long as r and m both increase to infinity but r does so faster. While mathematically possible, this limit is, however, not biologically relevant: large enzyme expression levels should incur a cost (metabolic or due to toxicity presumably caused by non-specific protein-DNA interactions in the case of RM systems) for the cells [29, 30], which we sought to incorporate into our model by including a growth rate penalty proportional to the activity of restriction and methylation enzymes, i.e., λe(r, m, λ) = λ − μ(r, m, λ) − crrcmm. Interestingly, it can be verified that our reasoning is valid only because two subsequent demethylation events need to occur to create a restriction-susceptible site on the bacterial DNA (S1 Appendix Section S.1). If hemi-methylated sites could be recognized by the restriction endonuclease, or if both methyl groups could be lost in a single event, our initial expectation about the existence of the tradeoff would be correct, and a particular choice of r and m values would simultaneously minimize the phage escape and self-restriction, even in the absence of the expression cost for R and M.

Our model can be generalized to multiple coexisting RM systems that recognize different restriction sites and operate in parallel, as is often observed for bacteria in the wild [17]. This provides increased protection from phages since the phage has to escape all RM systems to infect successfully. However, multiple RM systems also imply that the bacteria either have to pay higher expression and self-restriction costs or that they have to re-balance the expression levels of the enzymes such that lower self-restriction rates per RM system are obtained with the same overall enzyme activity. Allowing bacteria to have multiple RM systems, but assuming for the sake of simplicity that these systems are all equivalent in terms of enzyme activities and number of recognition sites, we obtain the phage escape probability for k RM systems as , with the corresponding growth rate being (6)

Population dynamics in the presence of phages

What is the combined effect of phage escape and self-restriction in simple bacteria-phage ecologies? To investigate this question, we first extended an established deterministic model of bacteria-phage ecology [31] to track the population dynamics of bacteria with and without RM systems and both susceptible and methylated phages (see S1 Appendix Section S.4.2). By numerically integrating this population model for more than a million parameter combinations for the activity of restriction (r) and methylation (m) enzymes, we find that whether or not phages will ultimately take over the population depends on the ecological parameters (e.g. phage adsorption rate, rate of spontaneous phage inactivation, etc.) but is completely independent of RM system efficiency. This result might seem surprising at a first glance, but closer analysis reveals that for efficient RM systems the phage population reaches levels that are so small that they should be considered as extinction from a biological perspective. Nevertheless, even in these cases methylated phage eventually takes over the population. This is because phages cannot go extinct in the mathematical sense and the phage population always remains at levels that are strictly larger than zero if ecology models based on ordinary differential equations are used for the analysis. While this clearly limits the practical relevance of such models, the finding that RM systems apparently cannot provide long term protection if phage escape probability and phage population size remain strictly larger than zero is still interesting since it suggests that the task of RM systems cannot be to prevent phage escape but only to delay it as much as possible to give bacteria enough time to develop alternative mechanisms of phage resistance through genetic mutations [21, 22].

In line with the above reasoning, we decided to represent phage escape events stochastically and to focus in more detail on how RM systems impact exponentially growing bacterial populations until the first phage escape event. To explore this question, we formulated several efficiency measures that quantify how RM systems can help bacterial populations before the first phage escape event:

  1. How much can a bacterial population grow before the first phage escape event happens?
  2. What is the probability that an immunity conferring mutation happens in any bacterium before the first phage escapes?
  3. What is the total number of mutations accumulated in the population when methylated phages start to spread?

Here we will show that questions (i)-(iii) can be answered rigorously if we assume that the size of the phage population remains approximately constant until the first phage escape event. An example of an ecological scenario where this assumption is realistic is that of bacteria colonizing a phage-dominated environment in which the number of phages is much larger than the number of bacteria such that the reduction in the phage population size due to unsuccessful infections is negligible. More generally, any ecological scenario in which the phage population size is for some reason in equilibrium at least until the first phage escapes on a bacterium carrying a RM system, will fulfill this assumption.

Mathematically, we consider a bacterial population of initial size n0 trying to colonize an environment containing a phage population of size v. As we have shown before, the bacterial population will initially grow exponentially at rate λe until the time τp at which the first phage escape event occurs. Interpreting these events as random, the crucial unknown is therefore τp, the random time to first phage escape, characterized by its probability distribution, f(τp), which we find to be given by (see S1 Appendix Section S.4.3): (7)

Specific examples of this probability distribution are visualized in Fig I in S1 Appendix. In general, larger values of any of the parameters ρ, v, n0, pV or λe will imply that phage escape is likely to occur at earlier times. Importantly, the waiting time distribution until first phage escape, f(τp), allows us to analytically answer questions (i)-(iii), as summarized in Table 1 (see S1 Appendix Section S.4.3). We note that despite the somewhat intricate form of f(τp) the “bacterial performance” metrics derived for all three efficiency criteria turn out to be remarkably simple, depending only on some of the parameters that define f(τp). More concretely, by examining these metrics, we can make two important observations:

Table 1. Efficiency criteria for bacterial populations in different ecological scenarios.

Efficiency criterion in the first column is captured by the corresponding mathematical expression for the bacteria-phage ecology in the second column, which implies the maximization of the “bacterial performance” quantity in the third column. cmut is the mutation rate per unit time per bacterial cell to gain resistance against phage (and is the normalized mutation rate); τmut denotes the (random) time at which the first mutation happens.

First, assuming a fixed mutation rate cmut, expressions for bacterial performance in Table 1 are functions of λe and pV, which depend solely on the restriction rate r, the methylation rate m, and the number of concurrently active RM systems, k. This means that optimal bacterial strategies at the ecological level can be found mathematically —and possibly tuned evolutionarily— by adjusting the three parameters, r, m, and k, defined at the single-cell level.

Second, despite the dependence of the time to phage escape on the initial population size n0, the performance of the bacterial population is independent of n0 according to all criteria. This has the important consequence that optimal defense strategies against phages do not depend on the size of the bacterial population and that there exists a single unique best defense strategy that is constant in time: if phage escape has not happened until a certain time during which the bacterial population has grown to a new size, the same defense strategy continues to be optimal with the initial size taken to be the new size, with no need to re-balance the activity levels r and m of the RM enzymes, or the number of RM systems, k. For cases (ii) and (iii), we further observe that the results are independent of the effective growth rate λe. Faster growth leads to quicker increases in the probability that immunity conferring mutations happen but this is exactly compensated by the increase in probability of a phage escape event.

An in-depth study of the consequences and implications of case (i) is presented in the following section while questions (ii) and (iii) are treated in the S1 Appendix (Section S.4.4). Taken together, these results show how the mathematical framework developed in this paper can be readily adjusted to analyze trade-offs between the efficiency and cost of immunity in different ecological contexts. For the concrete scenarios that we considered here, we find that (i) and (iii) imply overall similar results in which bacteria can relatively directly trade cost for efficiency and vice versa. The results for (ii), however, are qualitatively different since (ii) implies that increasing the cost beyond a certain point provides only diminishing returns in terms of efficiency (Fig J in S1 Appendix). We conclude that analyzing such trade-offs in practice will require careful consideration of the efficiency criteria according to which bacteria might have been shaped in a particular ecological context. Reversely, different trade-offs and optimal strategies for different efficiency criteria imply that the criterion on which evolution might have been operating in a given ecological context can, in principle, be reverse engineered from observations of phage defense strategies that are employed by the bacteria.

Tradeoffs and optimality in bacterial immunity

Can bacteria tune the single-cell parameters over evolutionary timescales in order to minimize the cost of RM systems, that is maximize the growth rate λe(r, m, k), while also maximizing their efficiency, quantified here as the increase in population size before the first phage escape, in (i), that is determined by λe(r, m, k)/pV (r, m, k)? Eq [6] and Table 1 assert that cost and efficiency are necessarily in a tradeoff and cannot be optimized simultaneously. This tradeoff is the first key result of the section. With no single optimum possible, we look instead for Pareto-optimal parameter combinations, (r, m, k), i.e., solutions for which λe cannot be further increased without reducing ns and vice versa [32, 33]. Different Pareto-optimal solutions trace out a “front” in the plot of λe vs ns in Fig 4a that jointly maximizes growth rates and population sizes to the extent possible. Points in the interior of the front are sub-optimal and could be improved by adjusting parameter values, while points beyond the front are inaccessible to any bacterial population. Which Pareto-optimal solution ultimately emerges as an evolutionary stable strategy depends on the actual bacterial and phage species considered as well as their biological context. Rather than focusing on specific examples, we next establish several general results of our analysis, contrasting in particular “fast growth” bacterial strategies that maximize λe with “large size” strategies that maximize ns.

Fig 4. Optimal tradeoffs in the presence of phages.

For all panels, we set NS = 599, NV = 5, and λ = 0.017 min−1 in line with the strain used in [26] carrying the EcoRI system, and without loss of generality we fix l/ρv = 1min such that ns and λe/pV are the same up to units. Note that for these results, we used a biologically realistic number of restrictions sites on the bacterial DNA as opposed to the small value of NS that was used for illustration purposes earlier in the paper. (a) Pareto front (thick line) for expected growth until phage escape, ns(r, m, k), and effective growth rate, λe (r, m, k), traced out by varying m, r, and the number of RM systems, k (different colors; dotted lines show individual Pareto fronts at fixed k); c = cr = cm = 10−5. Examples with specific parameter values are marked on k = 1 front with crosses. (b) Optimal enzyme activities, ropt and mopt (in min−1), on Pareto fronts for different k (color as in a), as a function of effective growth rate, λe. (c) Pareto fronts for different replication rates, λ, and cost c = cr = cm ≈ 3.7 · 10−7 chosen to make E. coli in [26] lie on the front (black dot, also in panel d); only k ≤ 6 are considered. (d) Pareto fronts for different cost c = cr = cm at fixed λ = 0.017 min−1.

We start by examining in Fig 4b the optimal enzyme activities, mopt and ropt, along the Pareto fronts. For the “large size” regime at low λe, the bacterial population primarily needs to defend against phage escape, favoring low m and high r, even at the cost of self-restriction. As we move towards the “fast growth” regime, r can drop to decrease the cost, but m must increase to protect against self-restriction, until maximal mopt is reached. For even higher λe, it is optimal to “shut down” the RM systems altogether to save on the cost, by tuning r and m simultaneously to zero. Numerical analysis (S1 Appendix S.4.5) reveals that along the Pareto front of Fig 4a, the total cost of running the RM systems varies in precise inverse linear relationship with λe. Pareto-optimal solutions are further characterized by the fact that the reduction in growth rate, λ − λe, is split equally between the cost of running RM systems, c(r + m), and self-restriction. If this were not the case and the cost were larger (or smaller) than self-restriction cost to growth, cells could always down- (or up-)regulate the RM system activity to trade cost for self-restriction and obtain an overall smaller total growth reduction. This universal equality of cost of running RM systems and self-restriction at optimality is the second key result of the section.

A detailed examination of the Pareto front in Fig 4a reveals a striking shift in the structure of optimal solutions as we move from “fast growth” to “large size” regime. In situations where fast growth is favored, we observe that a single RM system (k = 1) is optimal. In contrast, large bacterial population sizes favor kopt > 1 RM systems, with the optimal number, kopt, set by the costs, cm and cr, of operating the RM systems. These results are quantitatively robust to changes in replication rate, λ, as shown in Fig 4c, where Pareto fronts for different λ are nearly rescaled versions of each other. These results are also qualitatively robust to changes in the cost c = cr = cm so long as the cost is nonzero, as shown in Fig 4d.

Establishing that “fast growth” regime favors simple innate immunity with a single RM system while “large size” regime favors complex innate immunity with multiple RM systems is the third key result of this section. This result can be understood intuitively by considering under what conditions, if any, multiple RM systems could be optimal at “fast growth”. If costs for R and M enzymes are vanishingly small, a single RM system can provide arbitrarily good protection, as we showed previously. If the costs are not vanishingly small, multiple RM systems must be more costly than a single system at comparable phage escape and self-restriction rates: to keep self-restriction constant with k RM systems, not only does the cell require k times more M molecules than at k = 1, but their individual activities need to be higher as well, leading to a higher cost for M and thus a lower effective growth rate; thus, k > 1 cannot be optimal for “fast growth” and can only be tolerated in the “large size” regime where protection from phages is more important than fast growth.

Lastly, we sought to put our results into perspective by relating them to a typical E. coli strain. Recent measurements [26] quantified the self-restriction rate in a bacterial population with the EcoRI system replicating at λ = 0.017 min−1 to be around μ ≈ 10−3 min−1. The cost of RM systems was not detectable in WT strain but could be detected in strains overexpressing M enzymes. Treating the cost c as unknown and assuming that E. coli is Pareto-optimal in light of criterion (i) in Table 1 (black dots in Fig 4c and 4d), would lead us to predict the following parameter values for the RM systems: cost c ≈ 3.7 ⋅ 10−7, enzyme activities r ≈ 1.2 ⋅ 103 min−1, m ≈ 1.5 ⋅ 102 min−1, with the optimal number of RM systems being at the boundary between k = 1 and k = 2. Clearly, this prediction depends on the chosen measure of the efficiency of RM systems, which is determined by the considered ecological scenario and the particular objective that bacteria have in this scenario. Consequently, the concrete numbers presented here should not be understood as general results, but rather as a demonstration of how our framework can be used to calculate optimal bacterial strategies given different modeling assumptions about the phage-bacteria ecology.


Despite the ubiquity of RM systems in prokaryotic genomes [17], basic ecological and evolutionary aspects of these otherwise simple genetic elements are poorly understood [20]. Although RM systems have been discovered more than six decades ago due to their ability to protect bacteria from phage [34] and this is often assumed to be their main function [35], only a few experimental studies focused on the ecological and evolutionary dynamics of interactions between RM systems and phage [36, 37]. Similarly, effects of RM systems on their host bacteria, such as their cost in individual bacteria due to self-restriction, began to be addressed quantitatively only recently [26, 38]. In this work, we bridged these two scales using mathematical modeling. Our model captures the stochastic nature of RM systems originating at the level of interacting molecules in individual bacteria and extends it all the way to the dynamics of interactions between bacterial and phage populations.

Using this approach, we analytically described tradeoffs between the cost and the efficiency in different ecological contexts of immunity conferred by RM systems. The existence of such tradeoffs was previously indicated by quantitative single-cell experiments with two RM systems isolated from E. coli [26] and has since then been reported in the context of other RM systems [39]. We used our mathematical framework to quantify these tradeoffs and to study their ecological consequences, as well as the implications that these consequences have for optimally tuning the R and M enzymatic activities at the level of single cells. Our results for different ecological scenarios suggest that we should expect observed expression levels and enzymatic activities of naturally occurring RM systems to represent adaptations to specific environmental pressures. Such “tuning” of expression levels towards optimality has previously been directly experimentally shown in different molecular systems [29]. The expression levels of both R and M should be readily tunable by mutations in the often complex gene-regulatory regions [40].

With optimal bacterial defense strategies depending on the ecological scenario and the particular objective of the bacteria (see S1 Appendix Section S.4 and Table 1), making general predictions on R and M expression levels or numbers of concurrently active RM systems that we should expect to find in bacteria in the wild is difficult. However, we want to highlight that, in a given context, assuming optimality of the bacterial defense strategy allows one to make clear and quantitative predictions about enzymatic activities and the number of RM systems, and improving these predictions to take into account more relevant biological detail (if needed and known) remains only a technical, rather than conceptual challenge. Second, for the ecological scenario that we investigated in detail in this paper, parameter values measured for an E. coli RM system put optimal solutions into a regime that permits a large variation in the optimal number of RM systems, between one to six, with relatively small changes in the effective growth rate. This observation allows us to advance the following hypothesis: the number of RM systems in different bacterial strains and species is not a historical contingency, but an evolutionary adaptation to different ecological niches. In other words, the tradeoff between the cost and the efficiency of immunity can be partially alleviated in bacteria employing multiple RM systems. It is therefore interesting to note that many bacterial species carry multiple RM systems and the number of RM systems varies significantly among bacteria with different genome sizes and lifestyles [16, 17]. Our results indicate that different numbers of RM systems would be optimal in populations under different selection pressures (phage predation/resource limitation).

The analytical model presented here makes several simplifying assumptions. First, we consider only interactions between a single species of bacteria and a single species of phage. In natural environments, many bacterial and phage species interact and this diversity will certainly impact the resulting ecological end evolutionary dynamics [36, 4143]. Second, we assumed the key parameters such as the numbers of restriction sites in bacterial and phage genomes to be constant in time and thus disregarded the long-term evolutionary dynamics. Bioinformatic studies have shown that many bacteria and phage avoid using restriction sites in their genomes [4446]. Restriction site avoidance can represent an adaptive mechanism for increasing the probability of escape in phages [45, 47] and decreasing the probability of self-restriction in bacteria [26, 48]. The stochastic nature of RM systems observed at the level of individual cells is thus likely to critically shape the ecological and evolutionary dynamics of interactions between bacteria, RM systems and phage.

Supporting information


The authors would like to thank Moritz Lang, Gregory Batt, Eugenio Cinquemani and Christoph Zechner for helpful discussions.


  1. 1. McAdams H, Arkin A. Stochastic mechanisms in gene expression. Proceedings of the National Academy of Sciences of the USA. 1997;94(3):814–819. pmid:9023339
  2. 2. McAdams H, Arkin A. It’s a noisy business! Genetic regulation at the nanomolecular scale. Trends in Genetics. 1999;15(2):65–69.
  3. 3. Novick A, Weiner M. Enzyme induction as an all-or-none phenomenon. Proceedings of the National Academy of Sciences of the USA. 1957;43(7):553–566. pmid:16590055
  4. 4. Vilar J, Guet C, Leibler S. Modeling network dynamics: the lac operon, a case study. The Journal of Cell Biology. 2003;161(3):471–476. pmid:12743100
  5. 5. Arkin A, Ross J, McAdams H. Stochastic Kinetic Analysis of Developmental Pathway Bifurcation in Phage λ-infected Escherichia coli Cells. Genetics. 1998;149(4):1633–1648.
  6. 6. Zeng L, Skinner S, Zong C, Sippy J, Feiss M, Golding I. Decision Making at a Subcellular Level Determines the Outcome of Bacteriophage Infection. Cell. 2010;141(4):682–691. pmid:20478257
  7. 7. Balaban N, Merrin J, Chait R, Kowalik L, Leibler S. Bacterial persistence as a phenotypic switch. Science. 2004;305(5690):1622–1625. pmid:15308767
  8. 8. Wakamoto Y, Dhar N, Chait R, Schneider K, Signorino-Gelo F, Leibler S, et al. Dynamic Persistence of Antibiotic-Stressed Mycobacteria. Science. 2013;339(6115):91–95. pmid:23288538
  9. 9. Maamar H, Raj A, Dubnau D. Noise in Gene Expression Determines Cell Fate in Bacillus subtilis. Science. 2007;317(5837):526–529. pmid:17569828
  10. 10. Elowitz M, Levine A, Siggia E, Swain P. Stochastic Gene Expression in a Single Cell. Science. 2002;297(5584):1183–1186. pmid:12183631
  11. 11. Huh D, Paulsson J. Non-genetic heterogeneity from stochastic partitioning at cell division. Nature Genetics. 2011;43(2):95–100. pmid:21186354
  12. 12. Tkacik G, Bialek W. Information processing in living systems. Annu Rev Cond Matt Phys. 2015;7:12.1–12.29.
  13. 13. Ghusinga K, Dennehy J, Singh A. First-passage time approach to controlling noise in the timing of intracellular events. Proceedings of the National Academy of Sciences of the USA. 2017;114(4):693–698. pmid:28069947
  14. 14. Hekstra D, Leibler S. Contingency and Statistical Laws in Replicate Microbial Closed Ecosystems. Cell. 2012;149(5):1164–1173. pmid:22632978
  15. 15. Frentz Z, Kuehn S, Leibler S. Strongly Deterministic Population Dynamics in Closed Microbial Communities. Physical Review X. 2015;5(4):041014.
  16. 16. Vasu K, Nagaraja V. Diverse Functions of Restriction-Modification Systems in Addition to Cellular Defense. Microbiology and Molecular Biology Reviews. 2013;77(1):53–72. pmid:23471617
  17. 17. Oliveira P, Touchon M, Rocha E. The interplay of restriction-modification systems with mobile genetic elements and their prokaryotic hosts. Nucleic Acids Research. 2014;42(16):10618–10631. pmid:25120263
  18. 18. Oliveira P, Touchon M, Rocha E. Regulation of genetic flux between bacteria by restriction-modification systems. Proceedings of the National Academy of Sciences of the USA. 2016;113(20):5658–5663. pmid:27140615
  19. 19. Kobayashi I. Behavior of restriction-modification systems as selfish mobile elements and their impact on genome evolution. Nucleic Acids Research. 2001;29(18):3742–3756. pmid:11557807
  20. 20. Murray NE. Immigration control of DNA in bacteria: self versus non-self. Microbiology. 2002;148(1):3–20. pmid:11782494
  21. 21. Korona R, Korona B, Levin B. Sensitivity of naturally occurring coliphages to type I and type II restriction and modification. Journal of General Microbiology. 1993;6:1283–1290.
  22. 22. Gurney J, Pleška M, Levin B. Why put up with immunity when there is resistance: an excursion into the population and evolutionary dynamics of restriction–modification and CRISPR-Cas. Philosophical Transactions of the Royal Society B: Biological Sciences. 2019;374(1772):20180096.
  23. 23. Arber W, Linn S. DNA modification and restriction. Annual Review of Biochemistry. 1969;38:467–500. pmid:4897066
  24. 24. Tock M, Dryden D. The biology of restriction and anti-restriction. Current Opinion in Microbiology. 2005;8(4):466–472. pmid:15979932
  25. 25. Pleška M, Lang M, Refardt D, Levin B, Guet C. Phage–host population dynamics promotes prophage acquisition in bacteria with innate immunity. Nature Ecology & Evolution. 2018;2:359–366.
  26. 26. Pleška M, Qian L, Okura R, Bergmiller T, Wakamoto Y, Kussell E, et al. Bacterial Autoimmunity Due to a Restriction-Modification System. Current Biology. 2016;26(3):404–409. pmid:26804559
  27. 27. Buchholz P, Kriege J, Felko I. Input modeling with phase-type distributions and Markov models. Berlin, Germany: Springer Briefs in Mathematics; 2014.
  28. 28. Enikeeva F, Severinov K, Gelfand M. Restriction-modification systems and bacteriophage invasion: Who wins? Journal of Theoretical Biology. 2010;266:550–559. pmid:20633563
  29. 29. Dekel E, Alon U. Optimality and evolutionary tuning of the expression level of a protein. Nature. 2005;436:588–592. pmid:16049495
  30. 30. Stoebel D, Dean A, Dykhuizen D. The cost of expression of Escherichia coli lac operon proteins is in the process, not in the products. Genetics. 2008;178(3):1653–1660. pmid:18245823
  31. 31. Campbell A. Conditions for the Existence of Bacteriophage. Evolution. 1961;15(2):153–165.
  32. 32. Shoval O, Sheftel H, Shinar G, Hart Y, Ramote O, Mayo A, et al. Evolutionary trade-offs, Pareto optimality, and the geometry of phenotype space. Science. 2012;336(6085):1157–1160. pmid:22539553
  33. 33. Otero-Muras I, Banga J. Automated Design Framework for Synthetic Biology Exploiting Pareto Optimality. ACS Synthetic Biology. 2017;6(7):1180–1193. pmid:28350462
  34. 34. Bertani G, Weigle J. Host controlled variation in bacterial viruses. Journal of bacteriology. 1953;65(2):113. pmid:13034700
  35. 35. Labrie SJ, Samson JE, Moineau S. Bacteriophage resistance mechanisms. Nature Reviews Microbiology. 2010;8(5):317. pmid:20348932
  36. 36. Korona R, Levin BR. Phage-mediated selection and the evolution and maintenance of restriction-modification. Evolution. 1993;47(2):556–575. pmid:28568719
  37. 37. Pleška M, Lang M, Refardt D, Levin BR, Guet CC. Phage–host population dynamics promotes prophage acquisition in bacteria with innate immunity. Nature ecology & evolution. 2018;2(2):359.
  38. 38. Nagamalleswari E, Rao S, Vasu K, Nagaraja V. Restriction endonuclease triggered bacterial apoptosis as a mechanism for long time survival. Nucleic acids research. 2017;45(14):8423–8434. pmid:28854737
  39. 39. Mruk I, Kaczorowski T, Witczak A. Natural tuning of restriction endonuclease synthesis by cluster of rare arginine codons. Scientific Reports. 2019;9:5808. pmid:30967604
  40. 40. Mruk I, Kobayashi I. To be or not to be: regulation of restriction–modification systems and other toxin–antitoxin systems. Nucleic acids research. 2013;42(1):70–86.
  41. 41. Dillon R, Vennard C, Buckling A, Charnley A. Diversity of locust gut bacteria protects against pathogen invasion. Ecology Letters. 2005;8(12):1291–1298.
  42. 42. Kassen R, Buckling A, Bell G, Rainey PB. Diversity peaks at intermediate productivity in a laboratory microcosm. Nature. 2000;406(6795):508. pmid:10952310
  43. 43. van Houte S, Ekroth AK, Broniewski JM, Chabas H, Ashby B, Bondy-Denomy J, et al. The diversity-generating benefits of a prokaryotic adaptive immune system. Nature. 2016;532(7599):385. pmid:27074511
  44. 44. Gelfand MS, Koonin EV. Avoidance of palindromic words in bacterial and archaeal genomes: a close connection with restriction enzymes. Nucleic acids research. 1997;25(12):2430–2439. pmid:9171096
  45. 45. Sharp PM. Molecular evolution of bacteriophages: evidence of selection against the recognition sites of host restriction enzymes. Molecular biology and evolution. 1986;3(1):75–83. pmid:2832688
  46. 46. Rocha EP, Danchin A, Viari A. Evolutionary role of restriction/modification systems as revealed by comparative genome analysis. Genome research. 2001;11(6):946–958. pmid:11381024
  47. 47. Pleška M, Guet CC. Effects of mutations in phage restriction sites during escape from restriction–modification. Biology letters. 2017;13(12):20170646.
  48. 48. Qian L, Kussell E. Evolutionary dynamics of restriction site avoidance. Physical review letters. 2012;108(15):158105. pmid:22587291