## Figures

## Abstract

Mathematical models have been used successfully at diverse scales of biological organization, ranging from ecology and population dynamics to stochastic reaction events occurring between individual molecules in single cells. Generally, many biological processes unfold across multiple scales, with mutations being the best studied example of how stochasticity at the molecular scale can influence outcomes at the population scale. In many other contexts, however, an analogous link between micro- and macro-scale remains elusive, primarily due to the challenges involved in setting up and analyzing multi-scale models. Here, we employ such a model to investigate how stochasticity propagates from individual biochemical reaction events in the bacterial innate immune system to the ecology of bacteria and bacterial viruses. We show analytically how the dynamics of bacterial populations are shaped by the activities of immunity-conferring enzymes in single cells and how the ecological consequences imply optimal bacterial defense strategies against viruses. Our results suggest that bacterial populations in the presence of viruses can either optimize their initial growth rate or their population size, with the first strategy favoring simple immunity featuring a single restriction modification system and the second strategy favoring complex bacterial innate immunity featuring several simultaneously active restriction modification systems.

## Author summary

Mathematical understanding of how randomness at the molecular scale, also known as molecular noise, ultimately affects the fate of organisms and whole populations is widely recognized as a challenging problem in multi-scale modeling. Here, we develop an analytical framework for analyzing how the randomness of individual reaction events in single cells propagates to higher levels of biological organization and affects population and ecology scale dynamics. We deploy our mathematical results to study an example from the ecology of bacteria and bacteriophage viruses. Bacteria defend themselves against viruses by a simple innate immune system composed of a pair of enzymes. Due to molecular noise, however, viruses sometimes escape this immunity, causing bacterial populations to plummet. Noise can also cause the immune system to turn against its own host. By analyzing how such costs and benefits of bacterial immunity balance at the ecological level, we predict the optimal parameters for bacterial innate immune systems. While the focus of this work is on bacteria-phage ecologies, we expect that our results will generally help to better understand population and ecology scale consequences of stochastic cell fate decisions in diverse biological domains ranging from cell differentiation in developmental biology to studies of the microbiome and consequences of stochasticity in apoptotic responses for anti-cancer therapies.

**Citation: **Ruess J, Pleška M, Guet CC, Tkačik G (2019) Molecular noise of innate immunity shapes bacteria-phage ecologies. PLoS Comput Biol 15(7):
e1007168.
https://doi.org/10.1371/journal.pcbi.1007168

**Editor: **Oleg A. Igoshin,
Rice University, UNITED STATES

**Received: **January 31, 2019; **Accepted: **June 7, 2019; **Published: ** July 2, 2019

**Copyright: ** © 2019 Ruess et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability: **All relevant data are within the manuscript and its Supporting Information files.

**Funding: **JR acknowledges support from the Agence Nationale de la Recherche (ANR, www.agence-nationale-recherche.fr) under Grants No. ANR-16-CE33-0018 (MEMIP), ANR-16-CE12-444 0025 (COGEX), and ANR-18-CE91-0002-01 (CyberCircuits). CCG was supported by the HFSP Young Investigators’ grant (www.hfsp.org/funding). MP was a recipient of a DOC Fellowship of the Austrian Academy of Science (www.oeaw.ac.at) at the Institute of Science and Technology Austria. GT was supported in part by the Austrian Science Fund (www.fwf.ac.at) grant FWF P28844. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

One of the major challenges in biology is to understand how interactions between individual molecules shape living organisms and ultimately give rise to emergent behaviors at the level of populations or even ecosystems. At the very bottom of this hierarchy, inside single cells, interacting biomolecules such as DNA or proteins are often present in small numbers, giving rise to intrinsic stochasticity of individual reaction events [1, 2]. As a result, genetically identical organisms occupying identical environments can express different phenotypes [3, 4] and make different decisions when presented with identical environmental cues [5, 6]. This *molecular noise* is known to be the cause of biologically and medically important traits of bacteria such as persistence in response to antibiotics [7, 8] and competence during acquisition of heterologous DNA [9]. However, while its causes and consequences are relatively well-studied at the organismal level [10–13], how molecular noise propagates to higher scales of biological organization to affect the ecology and evolution of organisms remains mostly unknown [4]. Recently it has been shown that ecosystems can follow surprisingly deterministic trajectories despite the prevalence of stochastic events [14, 15], yet these trajectories could themselves be strongly influenced by molecular noise. Thus, the extent to which ecological interactions are affected by molecular noise, and the extent to which these ecological consequences feed back to reshape individual traits, remain to be explored.

Perhaps the most prevalent biological systems in which molecular noise is thought to play an important role are restriction-modification (RM) systems [16]. Present in nearly all prokaryotic genomes [17], RM systems are a highly diverse class of genetic elements. They have been shown to play multiple roles in bacteria as well as archaea, including regulation of genetic flux [18] and stabilization of mobile genetic elements [19], but have most frequently been described as primitive innate immune systems due to their ability to protect their hosts from bacterial viruses [20]. When a virus (bacteriophage or phage) infects a bacterium carrying a RM system, the DNA of the phage gets cleaved with a very high probability, thus aborting the infection. With a very small probability, however, the phage can escape and become immune to restriction by that specific RM system through epigenetic modification, leading to its spread and potentially death of the whole bacterial population in absence of alternative mechanisms of phage resistance [21, 22]. Thus, in the context of RM systems, molecular noise occurring at the level of individual bacteria can have profound ecological and evolutionary consequences. Because RM systems are ultimately based on only two very well characterized enzymatic activities (restriction and modification) [23], they represent a simple and tractable biological system in which we can investigate propagation of effects of molecular noise across different scales of biological organization.

Here, we mathematically model the action of RM systems from individual molecular events occurring inside a single cell, through individual bacteria competing in a population, to interactions between populations of bacteria and phages in a simple ecological setting, as shown in Fig 1. We demonstrate that, by imposing a tradeoff between the efficiency and cost of immunity, molecular noise in RM systems occurring at the level of individual bacteria has consequences that propagate all the way up to the ecological scale, and that the ecological consequences in turn imply the existence of optimal bacterial defense strategies against phages.

In single cells (green shade), stochastic activity of restriction-modification enzymes (R—red, M—blue) may inadvertently lead to full methylation of invading phage DNA in a process known as “phage escape” from the bacterial immunity, resulting in direct consequences at the ecological scale (blue shade). Also, stochastic activity of the same enzymes may lead to accidental cutting of bacterial DNA in a process known as “self-restriction,” resulting in lowered bacterial growth rate at the population level (red shade). Trade-offs between self-restriction and phage escape at the ecological scale in turn influence R and M enzyme activities in surviving cells, thereby optimally balancing the efficiency and cost of immunity.

## Methods and results

### Self-restriction in single cells and in growing populations

RM systems consist of two enzymes, a restriction-endonuclease R, that recognizes and cuts specific DNA sequences (restriction sites), and a methyl-transferase M, that recognizes the same DNA sequences and ensures that only invading phage DNA can be cut by the endonuclease while the bacterial DNA remains methylated and protected. However, since chemical reactions occur stochastically, RM systems can produce errors and fully methylate invading phage DNA before it is cut and degraded (phage escape, typically occurring with a probability ranging between 10^{−2} and 10^{−8}) [24, 25]. Similarly, it is possible that newly replicated restriction sites on the bacterial DNA, which are originally unmethylated, are accidentally cleaved instead of methylated (self-restriction) [26].

Inside a single cell, the probability of such self-restriction events depends on the total activity, *r*, of all restriction endonuclease molecules R, the total activity, *m*, of all methylase molecules M, as well as the bacterial replication rate λ, since λ determines the rate at which new unmethylated restriction sites are generated. To investigate how self-restriction depends on these parameters, we model the corresponding biochemical reactions at each individual restriction site on the bacterial DNA with the stochastic reaction network displayed in Fig 2a (see S1 Appendix Section S.1). The time *τ*_{S} until the first self-restriction event in a given cell—i.e., until that cell’s death or substantial reduction in growth rate—can be obtained as the time when the first restriction site is cut, that is as *τ*_{S} = min_{i∈{1,…,NS}} *τ*_{i}, for bacterial DNA with *N*_{S} restriction sites, where *τ*_{i}, *i* = 1,…,*N*_{S} are the waiting times for cutting events at individual sites. It can be shown that all *τ*_{i} follow a phase-type distribution (see [27] and Fig 2b and 2c):
(1)
with **p**_{Q} = [*p*_{0} *p*_{1} *p*_{2}] being the initial methylation configuration, i.e., the proportion of restriction sites that are unmethylated (*p*_{0}), hemi-methylated (*p*_{1}) and doubly-methylated (*p*_{2}); see S1 Appendix Section S.1.

**(a)** Model of the individual-site methylation dynamics. The restriction site can either be doubly-methylated (first circle), hemi-methylated (second circle), or unmethylated (third circle). Methylation events happen at a rate proportional to the activity *m* of M. During growth at rate λ, the DNA and its restriction sites are replicated, with the newly synthesized DNA having no methylation marks; for our model, growth is thus identical to demethylation reactions denoted in the figure. Restriction is assumed to be lethal (no DNA repair) and can happen when a site is unmethylated at a rate proportional to the activity *r* of R, leading to cell death (fourth circle). **(b)** Distribution of time to self restriction, *f*(*τ*), depends on enzyme activities, *r* and *m*, for a restriction site that is initially doubly-methylated. Increasing values of *r* lead to decreasing expected times to self-restriction for all *m* (cf. blue vs red). Given *r*, as *m* grows large, the site will almost never be unmethylated and cut, yielding a very broad distribution (green). λ = λ_{ref} = 1.7 ⋅ 10^{−2} min ^{−1}; reference (red) values: *m*_{ref} = 0.05 min^{−1}, *r*_{ref} = 0.1 min^{−1}; “Small” (“large”) *r*, *m* values are 2*e*-fold lower (*e*-fold higher) than reference values. **(c)** Dependence of self-restriction on the initial methylation configuration, **p**_{Q} (dark blue = doubly-methylated; blue = hemi-methylated; light blue = unmethylated) at *r*_{ref}, *m*_{ref}.

Eq [1] allows us to derive the expected time until self-restriction of a single site as
(2)
more generally, Fig 2b shows how the distribution of waiting times depends on the restriction rate *r* (increasing the probability of the site getting cut when it is unmethylated) and the magnitude of *m* relative to λ (which decreases the probability that the site is unmethylated in the first place).

Fig 2c shows that time to self-restriction at a single site depends essentially on an unknown quantity, the methylation configuration **p**_{Q}. We will now proceed to show that when we consider an exponentially growing population of bacterial cells, the configuration **p**_{Q} can no longer be freely chosen, and has to be determined self-consistently instead. Intuitively, this is because when the bacterial population is in steady-state growth, new unmethylated sites are constantly replenished by replication, while cells with more unmethylated sites are simultaneously and preferentially being removed, as illustrated in Fig 3a and required by Eq [2]. These two forces, generation of new unmethylated sites and their preferential removal, will push any initial **p**_{Q} towards a unique steady state equilibrium.

**(a) Top:** Selective cutting of cells (red ‘X’) with unmethylated restriction sites biases growing populations towards methylation configurations with larger numbers of methylated sites. **Bottom:** The long-term population distribution of methylation configurations, **p**_{QSD} (*p*_{0}—unmethylated, *p*_{1}—hemi-methylated, *p*_{2}—doubly-methylated; see S1 Appendix Section S.1). Increasing either *r* or *m* reduces the probability of finding unmethylated restriction sites in the population. Three example choices for *r*, *m* (blue, red, green) and λ_{ref} are same as in Fig 2 and are denoted with crosses in panel b and corresponding colors in panel c; *N*_{S} = 5 is chosen for illustration purposes. **(b)** Dependence of the self-restriction rate, *μ*(*r*, *m*, λ_{ref}), on the enzyme activities. **(c)** Effective population growth rate λ_{e} = λ − *μ*(*r*, *m*, λ) as a function of the replication rate λ for different enzyme activities *r* and *m*. For the reference parameters (red), the self-restriction rate computed using the quasi-steady state distribution **p**_{QSD} (solid) is significantly different when compared to an estimate based on fully equilibrated **p**_{Q} in which restriction does not lead to an absorbing state (dashed).

Mathematically, assuming that the methylation dynamics in all cells are equilibrated and that cells cannot be distinguished, the internal methylation configuration of any randomly chosen cell at any time during growth of the population can be derived from the quasi-stationary distribution **p**_{QSD}(*r*, *m*) of the individual-site methylation process in Fig 2a (see S1 Appendix Section S.1). **p**_{QSD}(*r*, *m*) is the equilibrium distribution of the stochastic process conditional on it not having reached the absorbing state where the DNA is cut and the cell has died (Fig 3a); in short, methylation and growth equilibrate “*in all directions except the one leading towards self-restriction*”. Then, setting **p**_{Q} = **p**_{QSD}(*r*, *m*) in Eq [1] reduces the phase-type distribution *f*(*τ*_{i}) for the time *τ*_{i} until self-restriction at an individual restriction site to a single exponential, implying further that the waiting time *τ*_{S} = min_{i∈{1,…,NS}} *τ*_{i} until self-restriction of any site in the cell is also exponentially distributed. Consequently, we are led to the main result of this section: growth with self-restriction can be rigorously modeled at the population level with a Markov birth-death process for which the expected population size *n*(*t*) follows a simple ordinary differential equation
(3)
where λ_{e}(*r*, *m*, λ) = λ − *μ*(*r*, *m*, λ) is the effective growth rate and *μ*(*r*, *m*, λ) is the rate of self-restriction, defined as the inverse of the per-cell expected waiting time until self-restriction
(4)
with γ_{1} being the largest eigenvalue of **B** (an explicit stochastic simulation validating this analytical result is provided in the S1 Appendix Section S.2).

Eq [4] allows us to straightforwardly evaluate the reduction in the population growth rate due to random self-restriction events in single cells for any given pair of enzyme activities, *r* and *m*. To study possible qualitative effects of self-restriction, we explore in Fig 3b a wide range of enzyme activities for a system with *N*_{S} = 5 restriction sites (chosen, for illustration purposes, significantly smaller than the typical number of sites recognized by real RM systems). We find that the main determinant of self-restriction is the activity *m* of the methyl-transferase and that the effects of molecular noise can be suppressed by sufficiently increasing *m*. Furthermore, so long as *m* is large enough such that unmethylated restriction sites are only rarely available, *μ*(*r*, *m*, λ) lies on a large plateau of low self-restriction and changes only little with *r* and *m*, suggesting that stochastic fluctuations in enzyme activities would only have minor consequences for the population, especially when they are positively correlated, as would be the case if R and M enzymes were expressed from the same operon (S1 Appendix Section S.3).

The (*r*, *m*) plane in Fig 3b contains a transition region that separates the large plateau with low self-restriction from the plateau where self-restriction is severe enough to stop the population growth altogether. We have chosen our reference (red) parameter values (*r*_{ref},*m*_{ref}) to lie in this transition region, and explored the regime with an *e*-fold higher rates (“large r & m”, indicated by green), and with 2*e*-fold lower rates (“small r & m”, indicated by blue) in Fig 3b and 3c. The comparison of these three regimes in Fig 3c is most clear when the effective growth rate is shown as a function of λ, the rate at which the cells, and thus the restriction sites, are replicated. In the “small r & m” regime, self-restriction is so infrequent that it can easily be outgrown by replication (except at very low λ). In the “large r & m” regime, *m* is sufficiently high to keep the restriction sites protected and thus self-restriction is rare, except at extremely large λ, where the green curve falls below the blue curve. In the reference regime, *r* is too large and *m* not high enough to protect, so self-restriction can not be “outgrown”; effective growth thus falls significantly below λ. Our numerical analyses further show that the self-restriction rate *μ*(*r*, *m*, λ) grows faster-than-linearly with λ (S1 Appendix Section S.1), causing the effective population growth to slow down and ultimately drop to zero at high enough λ.

We end this section by highlighting a non-trivial interaction between the single-cell and population-scale processes. While increasing the activity *r* of the endonuclease always decreases the effective growth rate of the population due to self-restriction, the effect can be smaller than expected from the single-cell analysis (dashed lines in Fig 3c). This is because high values of *r* feed back through the population scale to bias the steady-state distribution of methylation configurations away from cells with lots of unmethylated sites, as shown in Fig 3a, making self-restriction less likely. Implicit feedback effects of this type frequently give rise to complex dynamics in multi-scale models.

### Phage escape

RM systems lower the growth rate of the population due to self restriction, especially when the activity *m* of the methyl-transferase is small. Upon infection by a phage, however, small values of *m* are advantageous, making it less likely that the unmethylated phage DNA will get methylated and escape the immune system before it can be cut by the restriction endonuclease.

Assuming that all restriction sites are identical and independent, the probability of phage escape can be calculated [28] as
(5)
where *N*_{V} is the number of restriction sites on the phage DNA. From Eq [5] it is straightforward to see that *p*_{V}(*r*, *m*) is monotonically increasing in *m* and decreasing in *r*. One might therefore expect that the balance between avoiding self-restriction that favors high *m*, Eq [4], and minimizing phage escape that favors low *m*, Eq [5], would impose a tradeoff and thus lead to an optimal value of *m*. However, this is not the case, because the phage escape probability *p*_{V}(*r*, *m*) and the population self-restriction rate *μ*(*r*, *m*, λ) can both approach zero so long as *r* and *m* both increase to infinity but *r* does so faster. While mathematically possible, this limit is, however, not biologically relevant: large enzyme expression levels should incur a cost (metabolic or due to toxicity presumably caused by non-specific protein-DNA interactions in the case of RM systems) for the cells [29, 30], which we sought to incorporate into our model by including a growth rate penalty proportional to the activity of restriction and methylation enzymes, i.e., λ_{e}(*r*, *m*, λ) = λ − *μ*(*r*, *m*, λ) − *c*_{r}*r* − *c*_{m}*m*. Interestingly, it can be verified that our reasoning is valid only because two subsequent demethylation events need to occur to create a restriction-susceptible site on the bacterial DNA (S1 Appendix Section S.1). If hemi-methylated sites could be recognized by the restriction endonuclease, or if both methyl groups could be lost in a single event, our initial expectation about the existence of the tradeoff would be correct, and a particular choice of *r* and *m* values would simultaneously minimize the phage escape and self-restriction, even in the absence of the expression cost for R and M.

Our model can be generalized to multiple coexisting RM systems that recognize different restriction sites and operate in parallel, as is often observed for bacteria in the wild [17]. This provides increased protection from phages since the phage has to escape all RM systems to infect successfully. However, multiple RM systems also imply that the bacteria either have to pay higher expression and self-restriction costs or that they have to re-balance the expression levels of the enzymes such that lower self-restriction rates per RM system are obtained with the same overall enzyme activity. Allowing bacteria to have multiple RM systems, but assuming for the sake of simplicity that these systems are all equivalent in terms of enzyme activities and number of recognition sites, we obtain the phage escape probability for *k* RM systems as , with the corresponding growth rate being
(6)

### Population dynamics in the presence of phages

What is the combined effect of phage escape and self-restriction in simple bacteria-phage ecologies? To investigate this question, we first extended an established deterministic model of bacteria-phage ecology [31] to track the population dynamics of bacteria with and without RM systems and both susceptible and methylated phages (see S1 Appendix Section S.4.2). By numerically integrating this population model for more than a million parameter combinations for the activity of restriction (*r*) and methylation (*m*) enzymes, we find that whether or not phages will ultimately take over the population depends on the ecological parameters (e.g. phage adsorption rate, rate of spontaneous phage inactivation, etc.) but is completely independent of RM system efficiency. This result might seem surprising at a first glance, but closer analysis reveals that for efficient RM systems the phage population reaches levels that are so small that they should be considered as extinction from a biological perspective. Nevertheless, even in these cases methylated phage eventually takes over the population. This is because phages cannot go extinct in the mathematical sense and the phage population always remains at levels that are strictly larger than zero if ecology models based on ordinary differential equations are used for the analysis. While this clearly limits the practical relevance of such models, the finding that RM systems apparently cannot provide long term protection if phage escape probability and phage population size remain strictly larger than zero is still interesting since it suggests that the task of RM systems cannot be to prevent phage escape but only to delay it as much as possible to give bacteria enough time to develop alternative mechanisms of phage resistance through genetic mutations [21, 22].

In line with the above reasoning, we decided to represent phage escape events stochastically and to focus in more detail on how RM systems impact exponentially growing bacterial populations until the first phage escape event. To explore this question, we formulated several efficiency measures that quantify how RM systems can help bacterial populations before the first phage escape event:

- How much can a bacterial population grow before the first phage escape event happens?
- What is the probability that an immunity conferring mutation happens in any bacterium before the first phage escapes?
- What is the total number of mutations accumulated in the population when methylated phages start to spread?

Here we will show that questions (i)-(iii) can be answered rigorously if we assume that the size of the phage population remains approximately constant until the first phage escape event. An example of an ecological scenario where this assumption is realistic is that of bacteria colonizing a phage-dominated environment in which the number of phages is much larger than the number of bacteria such that the reduction in the phage population size due to unsuccessful infections is negligible. More generally, any ecological scenario in which the phage population size is for some reason in equilibrium at least until the first phage escapes on a bacterium carrying a RM system, will fulfill this assumption.

Mathematically, we consider a bacterial population of initial size *n*_{0} trying to colonize an environment containing a phage population of size *v*. As we have shown before, the bacterial population will initially grow exponentially at rate λ_{e} until the time *τ*_{p} at which the first phage escape event occurs. Interpreting these events as random, the crucial unknown is therefore *τ*_{p}, the random time to first phage escape, characterized by its probability distribution, *f*(*τ*_{p}), which we find to be given by (see S1 Appendix Section S.4.3):
(7)

Specific examples of this probability distribution are visualized in Fig I in S1 Appendix. In general, larger values of any of the parameters *ρ*, *v*, *n*_{0}, *p*_{V} or λ_{e} will imply that phage escape is likely to occur at earlier times. Importantly, the waiting time distribution until first phage escape, *f*(*τ*_{p}), allows us to analytically answer questions (i)-(iii), as summarized in Table 1 (see S1 Appendix Section S.4.3). We note that despite the somewhat intricate form of *f*(*τ*_{p}) the “bacterial performance” metrics derived for all three efficiency criteria turn out to be remarkably simple, depending only on some of the parameters that define *f*(*τ*_{p}). More concretely, by examining these metrics, we can make two important observations:

Efficiency criterion in the first column is captured by the corresponding mathematical expression for the bacteria-phage ecology in the second column, which implies the maximization of the “bacterial performance” quantity in the third column. *c*_{mut} is the mutation rate per unit time per bacterial cell to gain resistance against phage (and is the normalized mutation rate); *τ*_{mut} denotes the (random) time at which the first mutation happens.

First, assuming a fixed mutation rate *c*_{mut}, expressions for bacterial performance in Table 1 are functions of λ_{e} and *p*_{V}, which depend solely on the restriction rate *r*, the methylation rate *m*, and the number of concurrently active RM systems, *k*. This means that optimal bacterial strategies at the ecological level can be found mathematically —and possibly tuned evolutionarily— by adjusting the three parameters, *r*, *m*, and *k*, defined at the single-cell level.

Second, despite the dependence of the time to phage escape on the initial population size *n*_{0}, the performance of the bacterial population is independent of *n*_{0} according to all criteria. This has the important consequence that optimal defense strategies against phages do not depend on the size of the bacterial population and that there exists a single unique best defense strategy that is constant in time: if phage escape has not happened until a certain time during which the bacterial population has grown to a new size, the same defense strategy continues to be optimal with the initial size taken to be the new size, with no need to re-balance the activity levels *r* and *m* of the RM enzymes, or the number of RM systems, *k*. For cases (ii) and (iii), we further observe that the results are independent of the effective growth rate λ_{e}. Faster growth leads to quicker increases in the probability that immunity conferring mutations happen but this is exactly compensated by the increase in probability of a phage escape event.

An in-depth study of the consequences and implications of case (i) is presented in the following section while questions (ii) and (iii) are treated in the S1 Appendix (Section S.4.4). Taken together, these results show how the mathematical framework developed in this paper can be readily adjusted to analyze trade-offs between the efficiency and cost of immunity in different ecological contexts. For the concrete scenarios that we considered here, we find that (i) and (iii) imply overall similar results in which bacteria can relatively directly trade cost for efficiency and vice versa. The results for (ii), however, are qualitatively different since (ii) implies that increasing the cost beyond a certain point provides only diminishing returns in terms of efficiency (Fig J in S1 Appendix). We conclude that analyzing such trade-offs in practice will require careful consideration of the efficiency criteria according to which bacteria might have been shaped in a particular ecological context. Reversely, different trade-offs and optimal strategies for different efficiency criteria imply that the criterion on which evolution might have been operating in a given ecological context can, in principle, be reverse engineered from observations of phage defense strategies that are employed by the bacteria.

### Tradeoffs and optimality in bacterial immunity

Can bacteria tune the single-cell parameters over evolutionary timescales in order to minimize the cost of RM systems, that is maximize the growth rate λ_{e}(*r*, *m*, *k*), while also maximizing their efficiency, quantified here as the increase in population size before the first phage escape, in (i), that is determined by λ_{e}(*r*, *m*, *k*)/*p*_{V} (*r*, *m*, *k*)? Eq [6] and Table 1 assert that cost and efficiency are necessarily in a tradeoff and cannot be optimized simultaneously. This tradeoff is the first key result of the section. With no single optimum possible, we look instead for Pareto-optimal parameter combinations, (*r*, *m*, *k*), i.e., solutions for which λ_{e} cannot be further increased without reducing *n*_{s} and vice versa [32, 33]. Different Pareto-optimal solutions trace out a “front” in the plot of λ_{e} vs *n*_{s} in Fig 4a that jointly maximizes growth rates and population sizes to the extent possible. Points in the interior of the front are sub-optimal and could be improved by adjusting parameter values, while points beyond the front are inaccessible to any bacterial population. Which Pareto-optimal solution ultimately emerges as an evolutionary stable strategy depends on the actual bacterial and phage species considered as well as their biological context. Rather than focusing on specific examples, we next establish several general results of our analysis, contrasting in particular “fast growth” bacterial strategies that maximize λ_{e} with “large size” strategies that maximize *n*_{s}.

For all panels, we set *N*_{S} = 599, *N*_{V} = 5, and λ = 0.017 min^{−1} in line with the strain used in [26] carrying the EcoRI system, and without loss of generality we fix *l*/*ρv* = 1min such that *n*_{s} and λ_{e}/*p*_{V} are the same up to units. Note that for these results, we used a biologically realistic number of restrictions sites on the bacterial DNA as opposed to the small value of *N*_{S} that was used for illustration purposes earlier in the paper. **(a)** Pareto front (thick line) for expected growth until phage escape, *n*_{s}(*r*, *m*, *k*), and effective growth rate, λ_{e} (*r*, *m*, *k*), traced out by varying *m*, *r*, and the number of RM systems, *k* (different colors; dotted lines show individual Pareto fronts at fixed *k*); *c* = *c*_{r} = *c*_{m} = 10^{−5}. Examples with specific parameter values are marked on *k* = 1 front with crosses. **(b)** Optimal enzyme activities, *r*_{opt} and *m*_{opt} (in min^{−1}), on Pareto fronts for different *k* (color as in a), as a function of effective growth rate, λ_{e}. **(c)** Pareto fronts for different replication rates, λ, and cost *c* = *c*_{r} = *c*_{m} ≈ 3.7 · 10^{−7} chosen to make *E. coli* in [26] lie on the front (black dot, also in panel d); only *k* ≤ 6 are considered. **(d)** Pareto fronts for different cost *c* = *c*_{r} = *c*_{m} at fixed λ = 0.017 min^{−1}.

We start by examining in Fig 4b the optimal enzyme activities, *m*_{opt} and *r*_{opt}, along the Pareto fronts. For the “large size” regime at low λ_{e}, the bacterial population primarily needs to defend against phage escape, favoring low *m* and high *r*, even at the cost of self-restriction. As we move towards the “fast growth” regime, *r* can drop to decrease the cost, but *m* must increase to protect against self-restriction, until maximal *m*_{opt} is reached. For even higher λ_{e}, it is optimal to “shut down” the RM systems altogether to save on the cost, by tuning *r* and *m* simultaneously to zero. Numerical analysis (S1 Appendix S.4.5) reveals that along the Pareto front of Fig 4a, the *total* cost of running the RM systems varies in precise inverse linear relationship with λ_{e}. Pareto-optimal solutions are further characterized by the fact that the reduction in growth rate, λ − λ_{e}, is split equally between the cost of running RM systems, *c*(*r* + *m*), and self-restriction. If this were not the case and the cost were larger (or smaller) than self-restriction cost to growth, cells could always down- (or up-)regulate the RM system activity to trade cost for self-restriction and obtain an overall smaller total growth reduction. This universal equality of cost of running RM systems and self-restriction at optimality is the second key result of the section.

A detailed examination of the Pareto front in Fig 4a reveals a striking shift in the structure of optimal solutions as we move from “fast growth” to “large size” regime. In situations where fast growth is favored, we observe that a single RM system (*k* = 1) is optimal. In contrast, large bacterial population sizes favor *k*_{opt} > 1 RM systems, with the optimal number, *k*_{opt}, set by the costs, *c*_{m} and *c*_{r}, of operating the RM systems. These results are quantitatively robust to changes in replication rate, λ, as shown in Fig 4c, where Pareto fronts for different λ are nearly rescaled versions of each other. These results are also qualitatively robust to changes in the cost *c* = *c*_{r} = *c*_{m} so long as the cost is nonzero, as shown in Fig 4d.

Establishing that “fast growth” regime favors simple innate immunity with a single RM system while “large size” regime favors complex innate immunity with multiple RM systems is the third key result of this section. This result can be understood intuitively by considering under what conditions, if any, multiple RM systems could be optimal at “fast growth”. If costs for R and M enzymes are vanishingly small, a single RM system can provide arbitrarily good protection, as we showed previously. If the costs are not vanishingly small, multiple RM systems must be more costly than a single system at comparable phage escape and self-restriction rates: to keep self-restriction constant with *k* RM systems, not only does the cell require *k* times more M molecules than at *k* = 1, but their individual activities need to be higher as well, leading to a higher cost for M and thus a lower effective growth rate; thus, *k* > 1 cannot be optimal for “fast growth” and can only be tolerated in the “large size” regime where protection from phages is more important than fast growth.

Lastly, we sought to put our results into perspective by relating them to a typical *E. coli* strain. Recent measurements [26] quantified the self-restriction rate in a bacterial population with the EcoRI system replicating at λ = 0.017 min^{−1} to be around *μ* ≈ 10^{−3} min^{−1}. The cost of RM systems was not detectable in WT strain but could be detected in strains overexpressing M enzymes. Treating the cost *c* as unknown and assuming that *E. coli* is Pareto-optimal in light of criterion (i) in Table 1 (black dots in Fig 4c and 4d), would lead us to predict the following parameter values for the RM systems: cost *c* ≈ 3.7 ⋅ 10^{−7}, enzyme activities *r* ≈ 1.2 ⋅ 10^{3} min^{−1}, *m* ≈ 1.5 ⋅ 10^{2} min^{−1}, with the optimal number of RM systems being at the boundary between *k* = 1 and *k* = 2. Clearly, this prediction depends on the chosen measure of the efficiency of RM systems, which is determined by the considered ecological scenario and the particular objective that bacteria have in this scenario. Consequently, the concrete numbers presented here should not be understood as general results, but rather as a demonstration of how our framework can be used to calculate optimal bacterial strategies given different modeling assumptions about the phage-bacteria ecology.

## Discussion

Despite the ubiquity of RM systems in prokaryotic genomes [17], basic ecological and evolutionary aspects of these otherwise simple genetic elements are poorly understood [20]. Although RM systems have been discovered more than six decades ago due to their ability to protect bacteria from phage [34] and this is often assumed to be their main function [35], only a few experimental studies focused on the ecological and evolutionary dynamics of interactions between RM systems and phage [36, 37]. Similarly, effects of RM systems on their host bacteria, such as their cost in individual bacteria due to self-restriction, began to be addressed quantitatively only recently [26, 38]. In this work, we bridged these two scales using mathematical modeling. Our model captures the stochastic nature of RM systems originating at the level of interacting molecules in individual bacteria and extends it all the way to the dynamics of interactions between bacterial and phage populations.

Using this approach, we analytically described tradeoffs between the cost and the efficiency in different ecological contexts of immunity conferred by RM systems. The existence of such tradeoffs was previously indicated by quantitative single-cell experiments with two RM systems isolated from *E. coli* [26] and has since then been reported in the context of other RM systems [39]. We used our mathematical framework to quantify these tradeoffs and to study their ecological consequences, as well as the implications that these consequences have for optimally tuning the R and M enzymatic activities at the level of single cells. Our results for different ecological scenarios suggest that we should expect observed expression levels and enzymatic activities of naturally occurring RM systems to represent adaptations to specific environmental pressures. Such “tuning” of expression levels towards optimality has previously been directly experimentally shown in different molecular systems [29]. The expression levels of both R and M should be readily tunable by mutations in the often complex gene-regulatory regions [40].

With optimal bacterial defense strategies depending on the ecological scenario and the particular objective of the bacteria (see S1 Appendix Section S.4 and Table 1), making general predictions on R and M expression levels or numbers of concurrently active RM systems that we should expect to find in bacteria in the wild is difficult. However, we want to highlight that, in a given context, assuming optimality of the bacterial defense strategy allows one to make clear and quantitative predictions about enzymatic activities and the number of RM systems, and improving these predictions to take into account more relevant biological detail (if needed and known) remains only a technical, rather than conceptual challenge. Second, for the ecological scenario that we investigated in detail in this paper, parameter values measured for an *E. coli* RM system put optimal solutions into a regime that permits a large variation in the optimal number of RM systems, between one to six, with relatively small changes in the effective growth rate. This observation allows us to advance the following hypothesis: the number of RM systems in different bacterial strains and species is not a historical contingency, but an evolutionary adaptation to different ecological niches. In other words, the tradeoff between the cost and the efficiency of immunity can be partially alleviated in bacteria employing multiple RM systems. It is therefore interesting to note that many bacterial species carry multiple RM systems and the number of RM systems varies significantly among bacteria with different genome sizes and lifestyles [16, 17]. Our results indicate that different numbers of RM systems would be optimal in populations under different selection pressures (phage predation/resource limitation).

The analytical model presented here makes several simplifying assumptions. First, we consider only interactions between a single species of bacteria and a single species of phage. In natural environments, many bacterial and phage species interact and this diversity will certainly impact the resulting ecological end evolutionary dynamics [36, 41–43]. Second, we assumed the key parameters such as the numbers of restriction sites in bacterial and phage genomes to be constant in time and thus disregarded the long-term evolutionary dynamics. Bioinformatic studies have shown that many bacteria and phage avoid using restriction sites in their genomes [44–46]. Restriction site avoidance can represent an adaptive mechanism for increasing the probability of escape in phages [45, 47] and decreasing the probability of self-restriction in bacteria [26, 48]. The stochastic nature of RM systems observed at the level of individual cells is thus likely to critically shape the ecological and evolutionary dynamics of interactions between bacteria, RM systems and phage.

## Acknowledgments

The authors would like to thank Moritz Lang, Gregory Batt, Eugenio Cinquemani and Christoph Zechner for helpful discussions.

## References

- 1. McAdams H, Arkin A. Stochastic mechanisms in gene expression. Proceedings of the National Academy of Sciences of the USA. 1997;94(3):814–819. pmid:9023339
- 2. McAdams H, Arkin A. It’s a noisy business! Genetic regulation at the nanomolecular scale. Trends in Genetics. 1999;15(2):65–69.
- 3. Novick A, Weiner M. Enzyme induction as an all-or-none phenomenon. Proceedings of the National Academy of Sciences of the USA. 1957;43(7):553–566. pmid:16590055
- 4. Vilar J, Guet C, Leibler S. Modeling network dynamics: the lac operon, a case study. The Journal of Cell Biology. 2003;161(3):471–476. pmid:12743100
- 5.
Arkin A, Ross J, McAdams H. Stochastic Kinetic Analysis of Developmental Pathway Bifurcation in Phage λ-infected
*Escherichia coli*Cells. Genetics. 1998;149(4):1633–1648. - 6. Zeng L, Skinner S, Zong C, Sippy J, Feiss M, Golding I. Decision Making at a Subcellular Level Determines the Outcome of Bacteriophage Infection. Cell. 2010;141(4):682–691. pmid:20478257
- 7. Balaban N, Merrin J, Chait R, Kowalik L, Leibler S. Bacterial persistence as a phenotypic switch. Science. 2004;305(5690):1622–1625. pmid:15308767
- 8. Wakamoto Y, Dhar N, Chait R, Schneider K, Signorino-Gelo F, Leibler S, et al. Dynamic Persistence of Antibiotic-Stressed Mycobacteria. Science. 2013;339(6115):91–95. pmid:23288538
- 9.
Maamar H, Raj A, Dubnau D. Noise in Gene Expression Determines Cell Fate in
*Bacillus subtilis*. Science. 2007;317(5837):526–529. pmid:17569828 - 10. Elowitz M, Levine A, Siggia E, Swain P. Stochastic Gene Expression in a Single Cell. Science. 2002;297(5584):1183–1186. pmid:12183631
- 11. Huh D, Paulsson J. Non-genetic heterogeneity from stochastic partitioning at cell division. Nature Genetics. 2011;43(2):95–100. pmid:21186354
- 12. Tkacik G, Bialek W. Information processing in living systems. Annu Rev Cond Matt Phys. 2015;7:12.1–12.29.
- 13. Ghusinga K, Dennehy J, Singh A. First-passage time approach to controlling noise in the timing of intracellular events. Proceedings of the National Academy of Sciences of the USA. 2017;114(4):693–698. pmid:28069947
- 14. Hekstra D, Leibler S. Contingency and Statistical Laws in Replicate Microbial Closed Ecosystems. Cell. 2012;149(5):1164–1173. pmid:22632978
- 15. Frentz Z, Kuehn S, Leibler S. Strongly Deterministic Population Dynamics in Closed Microbial Communities. Physical Review X. 2015;5(4):041014.
- 16. Vasu K, Nagaraja V. Diverse Functions of Restriction-Modification Systems in Addition to Cellular Defense. Microbiology and Molecular Biology Reviews. 2013;77(1):53–72. pmid:23471617
- 17. Oliveira P, Touchon M, Rocha E. The interplay of restriction-modification systems with mobile genetic elements and their prokaryotic hosts. Nucleic Acids Research. 2014;42(16):10618–10631. pmid:25120263
- 18. Oliveira P, Touchon M, Rocha E. Regulation of genetic flux between bacteria by restriction-modification systems. Proceedings of the National Academy of Sciences of the USA. 2016;113(20):5658–5663. pmid:27140615
- 19. Kobayashi I. Behavior of restriction-modification systems as selfish mobile elements and their impact on genome evolution. Nucleic Acids Research. 2001;29(18):3742–3756. pmid:11557807
- 20. Murray NE. Immigration control of DNA in bacteria: self versus non-self. Microbiology. 2002;148(1):3–20. pmid:11782494
- 21. Korona R, Korona B, Levin B. Sensitivity of naturally occurring coliphages to type I and type II restriction and modification. Journal of General Microbiology. 1993;6:1283–1290.
- 22. Gurney J, Pleška M, Levin B. Why put up with immunity when there is resistance: an excursion into the population and evolutionary dynamics of restriction–modification and CRISPR-Cas. Philosophical Transactions of the Royal Society B: Biological Sciences. 2019;374(1772):20180096.
- 23. Arber W, Linn S. DNA modification and restriction. Annual Review of Biochemistry. 1969;38:467–500. pmid:4897066
- 24. Tock M, Dryden D. The biology of restriction and anti-restriction. Current Opinion in Microbiology. 2005;8(4):466–472. pmid:15979932
- 25. Pleška M, Lang M, Refardt D, Levin B, Guet C. Phage–host population dynamics promotes prophage acquisition in bacteria with innate immunity. Nature Ecology & Evolution. 2018;2:359–366.
- 26. Pleška M, Qian L, Okura R, Bergmiller T, Wakamoto Y, Kussell E, et al. Bacterial Autoimmunity Due to a Restriction-Modification System. Current Biology. 2016;26(3):404–409. pmid:26804559
- 27.
Buchholz P, Kriege J, Felko I. Input modeling with phase-type distributions and Markov models. Berlin, Germany: Springer Briefs in Mathematics; 2014.
- 28. Enikeeva F, Severinov K, Gelfand M. Restriction-modification systems and bacteriophage invasion: Who wins? Journal of Theoretical Biology. 2010;266:550–559. pmid:20633563
- 29. Dekel E, Alon U. Optimality and evolutionary tuning of the expression level of a protein. Nature. 2005;436:588–592. pmid:16049495
- 30. Stoebel D, Dean A, Dykhuizen D. The cost of expression of Escherichia coli lac operon proteins is in the process, not in the products. Genetics. 2008;178(3):1653–1660. pmid:18245823
- 31. Campbell A. Conditions for the Existence of Bacteriophage. Evolution. 1961;15(2):153–165.
- 32. Shoval O, Sheftel H, Shinar G, Hart Y, Ramote O, Mayo A, et al. Evolutionary trade-offs, Pareto optimality, and the geometry of phenotype space. Science. 2012;336(6085):1157–1160. pmid:22539553
- 33. Otero-Muras I, Banga J. Automated Design Framework for Synthetic Biology Exploiting Pareto Optimality. ACS Synthetic Biology. 2017;6(7):1180–1193. pmid:28350462
- 34. Bertani G, Weigle J. Host controlled variation in bacterial viruses. Journal of bacteriology. 1953;65(2):113. pmid:13034700
- 35. Labrie SJ, Samson JE, Moineau S. Bacteriophage resistance mechanisms. Nature Reviews Microbiology. 2010;8(5):317. pmid:20348932
- 36. Korona R, Levin BR. Phage-mediated selection and the evolution and maintenance of restriction-modification. Evolution. 1993;47(2):556–575. pmid:28568719
- 37. Pleška M, Lang M, Refardt D, Levin BR, Guet CC. Phage–host population dynamics promotes prophage acquisition in bacteria with innate immunity. Nature ecology & evolution. 2018;2(2):359.
- 38. Nagamalleswari E, Rao S, Vasu K, Nagaraja V. Restriction endonuclease triggered bacterial apoptosis as a mechanism for long time survival. Nucleic acids research. 2017;45(14):8423–8434. pmid:28854737
- 39. Mruk I, Kaczorowski T, Witczak A. Natural tuning of restriction endonuclease synthesis by cluster of rare arginine codons. Scientific Reports. 2019;9:5808. pmid:30967604
- 40. Mruk I, Kobayashi I. To be or not to be: regulation of restriction–modification systems and other toxin–antitoxin systems. Nucleic acids research. 2013;42(1):70–86.
- 41. Dillon R, Vennard C, Buckling A, Charnley A. Diversity of locust gut bacteria protects against pathogen invasion. Ecology Letters. 2005;8(12):1291–1298.
- 42. Kassen R, Buckling A, Bell G, Rainey PB. Diversity peaks at intermediate productivity in a laboratory microcosm. Nature. 2000;406(6795):508. pmid:10952310
- 43. van Houte S, Ekroth AK, Broniewski JM, Chabas H, Ashby B, Bondy-Denomy J, et al. The diversity-generating benefits of a prokaryotic adaptive immune system. Nature. 2016;532(7599):385. pmid:27074511
- 44. Gelfand MS, Koonin EV. Avoidance of palindromic words in bacterial and archaeal genomes: a close connection with restriction enzymes. Nucleic acids research. 1997;25(12):2430–2439. pmid:9171096
- 45. Sharp PM. Molecular evolution of bacteriophages: evidence of selection against the recognition sites of host restriction enzymes. Molecular biology and evolution. 1986;3(1):75–83. pmid:2832688
- 46. Rocha EP, Danchin A, Viari A. Evolutionary role of restriction/modification systems as revealed by comparative genome analysis. Genome research. 2001;11(6):946–958. pmid:11381024
- 47. Pleška M, Guet CC. Effects of mutations in phage restriction sites during escape from restriction–modification. Biology letters. 2017;13(12):20170646.
- 48. Qian L, Kussell E. Evolutionary dynamics of restriction site avoidance. Physical review letters. 2012;108(15):158105. pmid:22587291