## Abstract

Exposure to environmental stressors, including certain antibiotics, induces stress responses in bacteria. Some of these responses increase mutagenesis and thus potentially accelerate resistance evolution. Many studies report increased mutation rates under stress, often using the standard experimental approach of fluctuation assays. However, single-cell studies have revealed that many stress responses are heterogeneously expressed in bacterial populations, which existing estimation methods have not yet addressed. We develop a population dynamic model that considers heterogeneous stress responses (subpopulations of cells with the response *off* or *on*) that impact both mutation rate and cell division rate, inspired by the DNA-damage response in *Escherichia coli* (SOS response). We derive the mutant count distribution arising in fluctuation assays under this model and then implement maximum likelihood estimation of the mutation-rate increase specifically associated with the expression of the stress response. Using simulated mutant count data, we show that our inference method allows for accurate and precise estimation of the mutation-rate increase, provided that this increase is sufficiently large and the induction of the response also reduces the division rate. Moreover, we find that in many cases, either heterogeneity in stress responses or mutant fitness costs could explain similar patterns in fluctuation assay data, suggesting that separate experiments would be required to identify the true underlying process. In cases where stress responses and mutation rates are heterogeneous, current methods still correctly infer the effective increase in population mean mutation rate, but we provide a novel method to infer distinct stress-induced mutation rates, which could be important for parameterising evolutionary models.

## Author summary

How does environmental stress, especially from antibiotics, affect mutation rates in bacteria? This question has often been examined by estimating mutation rates using fluctuation assays, an experiment dating back to Luria and Delbrück in the 1940s. In this study, we consider variation in stress responses within bacterial populations, as revealed by recent single-cell studies, which is neglected in currently available mutation-rate estimation methods. Our approach involves a population dynamic model inspired by the DNA-damage response in *E. coli* (SOS response). It accounts for a subpopulation with high expression of the stress response, which increases the mutation rate and decreases the division rate of a cell. We use computer simulations to generate synthetic fluctuation assay data. Notably, we find that over a wide range of scenarios, existing models and our heterogeneous-response model cannot be distinguished using fluctuation assay data alone. This emphasises the need for separate experiments to uncover the true underlying processes. Nevertheless, when stress responses are known to be heterogeneous, our study offers a novel method for accurately estimating mutation rates specifically associated with the high expression of the stress response. Uncovering the heterogeneity in stress-induced mutation rates could be important for predicting the evolution of antibiotic resistance.

**Citation:** Lansch-Justen L, El Karoui M, Alexander HK (2024) Estimating mutation rates under heterogeneous stress responses. PLoS Comput Biol 20(5): e1012146. https://doi.org/10.1371/journal.pcbi.1012146

**Editor:** Mark M. Tanaka, University of New South Wales, AUSTRALIA

**Received:** November 27, 2023; **Accepted:** May 8, 2024; **Published:** May 28, 2024

**Copyright:** © 2024 Lansch-Justen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** The complete annotated documentation of the computational analyses of this study is archived on Zenodo, https://doi.org/10.5281/zenodo.11174801.

**Funding:** This work was supported by the UKRI Biotechnology and Biological Sciences Research Council (BBSRC) grant number BB/T00875X/1 and a University of Edinburgh Principal’s Career Development PhD Scholarship to LLJ, a Wellcome Trust Investigator Award 205008/Z/16/Z to MEK, and a Royal Society University Research Fellowship URF/R1/191269 to HKA. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

Bacteria are commonly exposed to adverse conditions, such as starvation, sub-optimal temperatures or toxins, including antibiotics. To cope with these conditions, bacteria have evolved a range of stress responses that enhance viability under stress, often at the expense of a lower growth rate. Some of these response pathways also promote mutagenesis by, for example, increasing the expression of error-prone DNA polymerases or down-regulating error-correcting enzymes [1, 2]. It has been proposed that this so-called ‘stress-induced mutagenesis’ (SIM) in bacterial cells could accelerate the evolution of populations that are poorly adapted to their environment [3–6]. Consequently, inhibiting bacterial stress responses has been suggested as a strategy to prevent antibiotic resistance evolution, and this idea has gained some experimental support [7–9].

Several studies report increased mutation rates in bacterial populations exposed to sub-lethal antibiotic concentrations [8, 10–15]. These mutation rates have typically been measured with fluctuation assays. This experiment (see, for example, [16] for a protocol) involves inoculating several parallel cultures at a small population size and growing them under permissive conditions for several hours, typically overnight. During this *growth phase*, mutations occur randomly, and the experiment is designed to minimise selection on mutant cells. Subsequently, each culture is plated on strongly selective media such that only mutant cells can grow and form a colony. The mutation rate to the chosen selective marker is estimated from the distribution of the number of mutant colonies on the plates, the *mutant count distribution*; see [17] for a summary of estimation methods. To quantify the mutation-rate increase associated with stress, the experiment is repeated with the cultures exposed to a stressor during the growth phase. Then, the *stress-induced* mutation rate is estimated and compared with the mutation rate under permissive conditions. However, stress impacts the growth of bacterial cells in several ways, which are neglected in commonly applied estimation methods, potentially leading to biased estimates of the mutation rate. For instance, increased cell death leads to overestimating the mutation rate [18]. Another effect that has not yet been addressed is within-population heterogeneity in stress responses.

In recent years, single-cell experiments have revealed extensive heterogeneity in the expression of stress responses in bacterial populations [8, 19–28]. Heterogeneity can arise for various reasons, including stochastic expression of genes involved in stress responses, especially where the corresponding proteins are initially present in small numbers [20–22], phenotypic variability in the stability of key regulators [25], or micro-environmental variation in cell-to-cell interactions [28]. Positive and negative feedback loops are common features of stress response regulatory networks, which can generate, amongst other features, cell-to-cell variation [29]. In some cases, a subpopulation of cells showing elevated stress responses has been directly associated with a higher rate of DNA mismatches or higher mutant frequency [8, 20–22, 24, 26].

In addition to mutagenic mechanisms, stress responses can alter cell division and death rates. For example, the widely studied SOS response, which leads to the transcriptional induction of approximately 40 genes after exposure to DNA damage, involves inhibition of cell division, filamentation and induction of error-prone DNA polymerases that could increase mutation rate [30, 31]. Single-cell studies using fluorescent reporters for the SOS response in *E. coli* have revealed that its expression is highly heterogeneous. Under certain conditions, a subpopulation of cells with a very high level of SOS compared to the rest of cells with lower expression levels has been observed, and this heterogeneity can be approximated as a bimodal response [19, 21, 27]. Overall, heterogeneously expressed stress responses are, therefore, likely to impact both bacterial population dynamics and mutational input during the growth phase of a fluctuation assay, and it is unclear whether estimation methods that neglect heterogeneity in stress responses produce reliable results.

In this study, we present a population dynamics model that considers within-population heterogeneity in stress responses. Motivated by the SOS response, we describe two discrete subpopulations of cells, where high expression of the stress response is associated with both a higher mutation rate and a lower division rate than in cells with low expression. We derive the resulting mutant count distribution in the total population and implement maximum likelihood estimation of the mutation-rate increase associated with the induction of the stress response. We test the performance of our method using stochastic simulations of fluctuation assays under permissive and stressful conditions, including robustness to biologically realistic model deviations such as mutant fitness costs and cell death. We also apply formal model comparison to assess whether within-population heterogeneity could be detected from fluctuation assays alone.

## Model and methods

Studying stress-induced mutagenesis with fluctuation assays requires a pair of experiments: one with a growth phase under permissive conditions (as a baseline for comparison) and one under ‘stressful’ conditions, where a stressor such as a low dose of antibiotic (which is supposed to induce a mutagenic stress response in the cells) is added during the growth phase. In addition to performing the experiments, researchers have to decide on a mathematical model of the underlying dynamics, including the population dynamics of non-mutants and mutants during the growth phase, and how these dynamics change under exposure to stress. This then allows them to estimate the model parameters, most importantly mutation rates, and assess increases in mutation rates due to stress. Many studies of SIM to date, for example [8, 11, 14], have implicitly assumed that the stress response is homogeneous, i.e. the stressful condition results in a population-wide elevation of the mutation rate. In contrast, our new model considers within-population heterogeneity in stress responses and mutation rates.

In the following, we recap what we call the *standard model* used in a classical fluctuation assay and extensions particularly relevant to stress. Then, we formalise the *homogeneous-response model*, a version of which is considered in the aforementioned studies of SIM, and introduce our new *heterogeneous-response model* with a detailed description of the population dynamic model under heterogeneous stress responses and the derivation of the resulting mutant count distribution. Next, we describe our model fitting and parameter estimation approach using maximum likelihood estimation and summarise the inference parameters. Finally, we describe the simulations to generate synthetic mutant count data, and how we evaluate the estimation methods using these data.

For schematics of the models used in simulation and inference, see Fig 1. The complete documentation of the computational methods can be found in the **README** at https://github.com/LucyL-J/Quantifying-SIM.

**Fig 1.** In the standard model (**A**), non-mutants are assumed to grow exponentially (rate *γ*), mutations to arise randomly (rate *ν* per cell per unit time), and mutants to divide stochastically (birth rate *γ*). In models of homogeneous stress responses (**B**), it is assumed that both fluctuation assays under permissive (superscript *p*, light blue) and stressful (superscript *s*, light red) conditions can be described by the standard model, with optional differential fitness of mutants compared to non-mutants (factor *ρ*). In our model of heterogeneous stress responses, on the other hand, we assume that the induction of the stress response (rate *α*) results in the separation into two subpopulations: response-*off* (subscript *off*, dark purple) and response-*on* (subscript *on*, red). When simulating under the heterogeneous-response model, we use the exact model (**C**), optionally extended by cell death (rate *δ*) and differential mutant fitness (*ρ*), where explicitly specified. For inference, we fit the approximate heterogeneous-response model (**D**). For the homogeneous-response model, we use the same version for both simulation and inference.

### The standard model of fluctuation assays and extensions relevant to stress

Classically, fluctuation assays have been described using what we refer to as the *standard model* (Fig 1A). In the standard model, the non-mutant population is assumed to grow exponentially during the growth phase, while the occurrence of mutations and the division of mutant cells are treated stochastically. On selective plates, it is assumed that every mutant cell (but no non-mutant cell) forms a visible colony. Many extensions of this standard model have been developed, briefly reviewed, for example, in [32]. Here, we describe two extensions particularly relevant to stress: accounting for cell death and allowing mutant cells to have a different fitness than non-mutant cells during the growth phase. The latter can become important when resistance allowing growth on the selective plates also confers an advantage to the stressor (for example, due to cross-resistance) or when mutants carry a fitness cost. Together, these two extensions result in the following population dynamic model. The non-mutant population grows exponentially,
$$N(t) = N_{i}\, e^{\lambda t} \tag{1}$$
with initial population size *N*_{i} and population growth rate λ. Mutations occur according to a time-inhomogeneous Poisson process with rate *νN*(*t*). Note that *ν* describes the mutation rate to the phenotype of interest selected on the plates in the fluctuation assay (mutations per cell per unit time, also called instantaneous mutation rate). The dynamics of each mutant cell *M* are captured by a continuous-time linear birth-death process [33] with birth rate *b* and death rate *d*:
$$M \xrightarrow{\;b\;} M + M, \qquad M \xrightarrow{\;d\;} \emptyset \tag{2}$$
implying that mutants have a different fitness than non-mutants if *b* − *d* ≠ λ.
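The dynamics described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' simulation code: it assumes no cell death (*d* = 0), a mutant birth rate *b* = *ρ*λ, and uses the known fact that a pure-birth clone of age *a* has a geometrically distributed size with success probability e^{−*ba*}. All function and parameter names are illustrative.

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_culture(mu, n_i, n_f, rho, rng, lam=1.0):
    """Final mutant count of one culture under the standard model (no death).

    mu  : per-generation mutation rate (nu / lam)
    rho : mutant fitness relative to non-mutants (mutant birth rate rho * lam)
    """
    t_f = math.log(n_f / n_i) / lam
    # Expected number of mutations: integral of nu * N(t) = mu * (N_f - N_i).
    n_mutations = sample_poisson(mu * (n_f - n_i), rng)
    mutants = 0
    for _ in range(n_mutations):
        # Mutation time with density proportional to N(t) (inverse-CDF sampling).
        t = math.log(1.0 + rng.random() * (n_f / n_i - 1.0)) / lam
        # Clone size at t_f: geometric with success probability exp(-b * age).
        p = math.exp(-rho * lam * (t_f - t))
        if p >= 1.0:
            mutants += 1
        else:
            u = 1.0 - rng.random()  # uniform on (0, 1]
            mutants += 1 + int(math.log(u) / math.log1p(-p))
    return mutants

rng = random.Random(1)
counts = [simulate_culture(1e-8, 1e2, 1e8, 1.0, rng) for _ in range(2000)]
frac_zero = sum(c == 0 for c in counts) / len(counts)
```

With these illustrative parameter values, the expected number of mutations per culture is close to one, so the fraction of cultures without mutants should be close to e^{−1}.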

For such dynamics, defining the per-generation mutation rate as *μ* = *ν*/λ, the differential fitness of mutants as *ρ* = (*b* − *d*)/λ and the extinction probability of mutants as *ϵ* = *d*/*b*, the resulting distribution of the number of mutants when the population reaches a final population size *N*_{f} has been derived [34]. Assuming *N*_{f} ≫ *N*_{i} (neglecting initial population size effects), the probability-generating function (PGF) *G*(*z*), a mathematically convenient representation of the mutant count distribution, is given by
$$G(z) = \exp\left[\mu N_{f}\,(z-1)\,F\!\left(1,\, 1;\; 1+\frac{1}{\rho};\; \frac{z-\epsilon}{1-\epsilon}\right)\right] \tag{3}$$
with *F* being the hypergeometric function. Note that *z* is a dummy variable in the PGF with no physical meaning, and *G*(*z*) does not directly give the probability of observing a specific mutant count, but the probabilities can be calculated from *G*(*z*) [34]. In the case where mutants have the same fitness as non-mutants and do not undergo cell death, the equation simplifies to:
$$G(z) = \exp\left[\mu N_{f}\left(\frac{1}{z}-1\right)\ln(1-z)\right] \tag{4}$$
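The statement that probabilities can be calculated from the PGF can be made concrete for the simplest case (mutants as fit as non-mutants, no cell death). The sketch below assumes the classical Luria-Delbrück form of the mutant count distribution with expected number of mutations *m* = *μN*_{f}, and uses a well-known recursion for this distribution rather than the more general formulas of [34]. The function name is illustrative.

```python
import math

def ld_probabilities(m, n_max):
    """Return [p_0, ..., p_n_max] of the Luria-Delbrueck distribution.

    m is the expected number of mutations per culture (mu * N_f).
    Recursion: p_0 = exp(-m), p_n = (m / n) * sum_{k<n} p_k / (n - k + 1).
    """
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        s = sum(p[k] / (n - k + 1) for k in range(n))
        p.append(m / n * s)
    return p

probs = ld_probabilities(m=1.0, n_max=500)
```

Because the distribution is heavy-tailed, the probabilities up to any finite `n_max` sum to slightly less than one.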

### Formalisation of the homogeneous-response model

The *homogeneous-response model* assumes that stress and stress responses impact mutation, division and death rates on a population-wide level. This implies that the dynamics under stressful conditions can, as under permissive conditions, be captured by the standard model (with optional extensions) as described above, simply substituting different parameter values (Fig 1B). Under permissive conditions (parameters denoted with a superscript *p*), assuming no cell death, the non-mutant population grows exponentially, $n^{p}(t) = n^{p}(0)\,e^{\gamma^{p} t}$, with division rate *γ*^{p}; mutations occur at rate *ν*^{p}*n*^{p}(*t*) and mutants develop according to a pure birth process with rate *ρ*^{p} ⋅ *γ*^{p}.

Under stressful conditions (parameters denoted with a superscript *s*), the population grows as $n^{s}(t) = n^{s}(0)\,e^{(\gamma^{s} - \delta^{s}) t}$ with a different growth rate caused by a change in division rate *γ*^{s} or a non-zero death rate *δ*^{s} or both. Mutations also occur at a different rate *ν*^{s}*n*^{s}(*t*), and the dynamics of mutants are given by a birth-death process with birth rate *ρ*^{s} ⋅ *γ*^{s} and death rate *δ*^{s}.

Therefore, the stress response results in a population-wide change in the per-division mutation rate, *μ*^{p} → *μ*^{s}; potentially the differential fitness of mutants, *ρ*^{p} → *ρ*^{s}; and potentially a non-zero extinction probability of mutants, *ϵ*^{s}. The PGFs for the mutant count distributions under permissive and stressful conditions in the homogeneous-response model are thus given by
(5) (6)

By applying standard mutation-rate estimation methods to both the fluctuation assay under permissive and the one under stressful conditions, studies of SIM to date have implicitly applied such a homogeneous-response model.

### Detailed description of the heterogeneous-response model

In contrast, our *heterogeneous-response model* considers within-population heterogeneity in the expression of the stress response under stressful conditions. Specifically, we suppose the population can be divided into two subpopulations: one with a low expression level of the stress response (here referred to as response switched *off*, even if strictly speaking the response is not fully *off* but very low) and the other with a high expression level (here referred to as response switched *on*). Each sub-population is associated with its own mutation rate and division rate. We adopt most of the same assumptions of the standard model while focusing on the specific effect of within-population heterogeneity upon induction of stress responses (Fig 1C).

Under permissive conditions, we assume that all cells have the response switched *off*, neglecting any stochastic switching in the absence of a stressor, and therefore, continue to use the standard model (with optional differential mutant fitness). In particular, the population grows exponentially, with growth rate given by the division rate , mutations arise at rate and mutants develop according to a pure birth process with rate . The PGF of the mutant count distribution is given by
(7)
where *μ*_{off} describes the *per-division* mutation rate, which equals the *per-generation* rate as, under permissive conditions, the population growth is solely determined by cell division.

#### Population dynamic model under heterogeneous stress responses.

Upon exposure to stressful conditions, cells induce a stress response with a constant switching rate *α*, leading to the emergence of a response-*on* subpopulation. Inducing the stress response alters the mutation rate of the cells but potentially also their division and death rates. We assume that, as long as the stress persists, cells do not switch the response *off* again.

#### Non-mutant population dynamics.

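We model the population sizes over time of the non-mutant response-*off* and response-*on* subpopulations, *n*_{off} and *n*_{on}, respectively, with coupled differential equations:
$$\frac{\mathrm{d}n_{\mathrm{off}}}{\mathrm{d}t} = \left(\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - \alpha\right) n_{\mathrm{off}} \tag{8}$$
$$\frac{\mathrm{d}n_{\mathrm{on}}}{\mathrm{d}t} = \alpha\, n_{\mathrm{off}} + \left(\gamma_{\mathrm{on}} - \delta_{\mathrm{on}}\right) n_{\mathrm{on}} \tag{9}$$
Here, $\gamma_{\mathrm{off}}^{s}$ is the division rate of the response-*off* subpopulation under stress (which can differ from that under permissive conditions, $\gamma_{\mathrm{off}}^{p}$) and $\delta_{\mathrm{off}}^{s}$ its death rate, *γ*_{on} and *δ*_{on} are the division and death rates of the response-*on* subpopulation, and *α* is the switching rate. The solution to these equations is given by
$$n_{\mathrm{off}}(t) = n_{\mathrm{off}}(0)\, e^{(\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - \alpha)\,t} \tag{10}$$
$$n_{\mathrm{on}}(t) = n_{\mathrm{on}}(0)\, e^{(\gamma_{\mathrm{on}} - \delta_{\mathrm{on}})\,t} + \frac{\alpha\, n_{\mathrm{off}}(0)}{\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - \alpha - (\gamma_{\mathrm{on}} - \delta_{\mathrm{on}})}\left[e^{(\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - \alpha)\,t} - e^{(\gamma_{\mathrm{on}} - \delta_{\mathrm{on}})\,t}\right] \tag{11}$$
with *n*_{off}(0) and *n*_{on}(0) denoting the initial numbers of response-*off* and response-*on* cells, respectively.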

This approach assumes that the non-mutants, including the initially small response-*on* subpopulation, can be treated deterministically. We test the validity of this assumption using stochastic simulations (section A in S1 File): we simulate switching *on* of the response as a time-inhomogeneous Poisson process and the growth dynamics of the response-*on* subpopulation as a continuous-time linear birth-death process. Then, we compare the resulting population size with Eq 11. We find that deviations from the deterministic prediction are negligible for a wide range of switching rates and division rates of response-*on* cells and for zero and small initial sizes of the response-*on* subpopulation (Fig A in S1 File). Therefore, throughout the rest of this study, we treat non-mutants deterministically.
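As a numerical sanity check of the deterministic description, the closed-form solution can be compared with a direct numerical integration. The sketch below is an illustration only and differs from the stochastic validation described above: it assumes the linear form of the coupled equations (response-*off* cells grow at net rate *γ*_{off} − *δ*_{off} − *α*; response-*on* cells gain *α n*_{off} and grow at net rate *γ*_{on} − *δ*_{on}) and integrates it with a classical fourth-order Runge-Kutta scheme. All names and parameter values are hypothetical.

```python
import math

def exact_solution(t, n_off0, n_on0, g_off, d_off, g_on, d_on, alpha):
    """Closed-form subpopulation sizes at time t."""
    a = g_off - d_off - alpha   # net growth rate of response-off cells
    b = g_on - d_on             # net growth rate of response-on cells
    n_off = n_off0 * math.exp(a * t)
    n_on = (n_on0 * math.exp(b * t)
            + alpha * n_off0 / (a - b) * (math.exp(a * t) - math.exp(b * t)))
    return n_off, n_on

def rk4_solution(t_end, steps, n_off0, n_on0, g_off, d_off, g_on, d_on, alpha):
    """Integrate the coupled equations with a fixed-step RK4 scheme."""
    a = g_off - d_off - alpha
    b = g_on - d_on

    def rhs(x, y):
        return a * x, alpha * x + b * y

    h = t_end / steps
    x, y = n_off0, n_on0
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

# Hypothetical parameter values: non-dividing response-on cells, no death.
params = dict(n_off0=1e4, n_on0=0.0, g_off=1.0, d_off=0.0,
              g_on=0.0, d_on=0.0, alpha=0.01)
exact = exact_solution(10.0, **params)
numeric = rk4_solution(10.0, 10_000, **params)
```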

#### Mutant population dynamics.

We consider mutations in the response-*off* and the response-*on* subpopulation to occur according to time-inhomogeneous Poisson processes and treat the dynamics of the resulting mutants stochastically. Mutations arise in each subpopulation at rates and *ν*_{on}*n*_{on}(*t*), respectively. Importantly, mutation is not linked to cell division, but rather to chromosome replication. Expression of the SOS response, for example, inhibits cell division, but cells continue growing, leading to filamentation. Due to the continuation of chromosome replication, filamented cells may contain multiple chromosomes [35]. We neglect the possibility that these intracellular dynamics introduce heterogeneities amongst cells within the response-on subpopulation or over time and assume that the per-cell mutation rate (*ν*_{on}) is constant. Experiments show that under prolonged low-level stress, multinucleated filamented cells can ‘bud’ viable, normal-sized progeny cells from their tips, some of which contain mutated chromosomes [35]. Although our model remains a simplification of this process, the experimental evidence indicates that response-*on* cells, even if largely non-dividing, can generate mutant offspring.

At the same time, since the selective agent on the plates is normally chosen to be unrelated to the stressor applied in the growth phase (e.g. two different antibiotics with no cross-resistance), we assume that mutation itself does not alter the stress response. For response-*off* cells, this implies that mutants can induce the response equivalently to non-mutants. Nonetheless, mutations might affect the fitness during the growth phase. Together, these assumptions result in a continuous-time two-type branching process describing the mutant response-*off* and response-*on* subpopulations, defined by the respective birth rates and *ρ*_{on}*γ*_{on}, respective death rates and *δ*_{on}, and switching at rate *α*:
(12)

On selective plates, where the stressor (which was applied during the growth phase) is no longer present, we assume that response-*on* mutant cells can resume division. Therefore, we continue to adopt the standard model assumption that every mutant cell forms a visible colony upon selective plating.

#### Derivation of the mutant count distribution.

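To derive an analytical expression for the mutant count distribution, we make several approximations, resulting in the approximate heterogeneous-response model depicted in Fig 1D. First, we approximate Eq 11 as
$$n_{\mathrm{on}}(t) \approx \frac{\alpha\, n_{\mathrm{off}}(0)}{\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - \alpha - (\gamma_{\mathrm{on}} - \delta_{\mathrm{on}})}\, e^{(\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - \alpha)\,t} \tag{13}$$

This approximation is valid when the initial population size of the response-*on* subpopulation is comparably small, *n*_{on}(0)≪*n*_{off}(0), and its growth is slower than the growth of the response-*off* subpopulation, $\gamma_{\mathrm{on}} - \delta_{\mathrm{on}} < \gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - \alpha$. As a consequence of this approximation, the total population grows exponentially with a *population growth rate* of
$$\lambda^{s} = \gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - \alpha \tag{14}$$
and the response-*on* subpopulation makes up a constant fraction of
$$f_{\mathrm{on}} = \frac{\alpha}{\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s} - (\gamma_{\mathrm{on}} - \delta_{\mathrm{on}})} \tag{15}$$

In the exact model given by Eqs 10 and 11, the fraction of the response-*on* subpopulation changes with time until the stationary fraction is reached, but we assume that the fraction at the end of the growth phase, *f*_{on}(*t*_{f}), is a good approximation of this stationary fraction, and for the rest of this study, we refer to it simply as the fraction of response-*on* cells *f*_{on}. Note that, even if response-*on* cells have zero division rate, the response-*on* subpopulation grows exponentially with the population growth rate λ^{s} due to the induction of the stress response in response-*off* cells.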

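We define the *relative switching rate* as
$$\tilde{\alpha} = \frac{\alpha}{\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s}} \tag{16}$$
and the *relative fitness* of response-*on* compared to response-*off* cells under stressful conditions as
$$r_{\mathrm{on}} = \frac{\gamma_{\mathrm{on}} - \delta_{\mathrm{on}}}{\gamma_{\mathrm{off}}^{s} - \delta_{\mathrm{off}}^{s}} \tag{17}$$
and thereby obtain
$$f_{\mathrm{on}} = \frac{\tilde{\alpha}}{1 - r_{\mathrm{on}}} \tag{18}$$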
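The relation between the switching rate, the relative fitness and the fraction of response-*on* cells can be checked numerically. The sketch below assumes that both the relative switching rate and the relative fitness are normalised by the net growth rate of response-*off* cells (division minus death rate), and compares the resulting stationary fraction with the time-dependent fraction obtained from the exact deterministic solution. Function names and parameter values are hypothetical.

```python
import math

def fraction_on_exact(t, n_off0, n_on0, g_off, d_off, g_on, d_on, alpha):
    """Fraction of response-on cells at time t from the exact solution."""
    a = g_off - d_off - alpha   # net growth rate of response-off cells
    b = g_on - d_on             # net growth rate of response-on cells
    decay = math.exp((b - a) * t)
    ratio = (n_on0 / n_off0) * decay + alpha / (a - b) * (1.0 - decay)
    return ratio / (1.0 + ratio)

def fraction_on_stationary(g_off, d_off, g_on, d_on, alpha):
    """Stationary fraction expressed through the relative parameters."""
    alpha_rel = alpha / (g_off - d_off)          # relative switching rate
    r_on = (g_on - d_on) / (g_off - d_off)       # relative fitness
    return alpha_rel / (1.0 - r_on)

rates = dict(g_off=1.0, d_off=0.0, g_on=0.1, d_on=0.0, alpha=0.01)
f_early = fraction_on_exact(2.0, 1e4, 0.0, **rates)
f_late = fraction_on_exact(30.0, 1e4, 0.0, **rates)
f_stat = fraction_on_stationary(**rates)
```

Starting without response-*on* cells, the fraction increases monotonically towards the stationary value, so the quality of the approximation depends on the duration of the growth phase relative to the difference in net growth rates.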

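This allows us to rewrite the population growth rate as a function of the division rate $\gamma_{\mathrm{off}}^{s}$ and the extinction probability $\epsilon_{\mathrm{off}}^{s} = \delta_{\mathrm{off}}^{s}/\gamma_{\mathrm{off}}^{s}$ of response-*off* cells, and the fraction *f*_{on} and relative fitness *r*_{on} of the response-*on* subpopulation
$$\lambda^{s} = \gamma_{\mathrm{off}}^{s}\,\left(1 - \epsilon_{\mathrm{off}}^{s}\right)\left[1 - f_{\mathrm{on}}\left(1 - r_{\mathrm{on}}\right)\right] \tag{19}$$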
and to calculate the per-generation mutation rates of response-*off* and response-*on* cells
(20) (21)
with . Importantly, we assume here that the per-division mutation rate of response-*off* cells is the same under stressful as under permissive conditions, .

In an additional approximation to derive the mutant count distribution, we neglect the induction of the stress response in the mutants and assume that mutations have no fitness effect. For mathematical convenience, we consider switching as a reduction in the division rate of response-*off* mutants by *α* instead (birth rate equal to $\gamma_{\mathrm{off}}^{s} - \alpha$). With this assumption, the dynamics of response-*off* and response-*on* mutant lineages are independent birth-death processes. For response-*on* cells, the relative fitness of mutants *ρ*_{on} (relative to the population growth rate) can be expressed via
(22)

With these approximations (see Fig 1D), the mutant counts in the response-*off* and response-*on* subpopulations are two independent stochastic processes, each following the standard model, with differential mutant fitness in the case of the response-*on* cells. Therefore, we can substitute the appropriate parameters into Eq 3 to obtain PGFs for the mutant counts, *G*_{off}(*z*) and *G*_{on}(*z*), respectively:
(23) (24)

Finally, the total mutant count distribution is given by the sum of the contributions of response-*off* and response-*on* subpopulations, with its PGF given by the product *G*_{off}(*z*) ⋅ *G*_{on}(*z*).

In the case of *γ*_{on} = 0, the contribution to the mutant count from the response-*on* subpopulation follows a Poisson distribution, and (without cell death, implying *r*_{on} = 0) the PGF of total mutant count distribution reduces to
(25) (26)

no longer depends on the mutation rate (*μ*_{on}) and fraction (*f*_{on}) of response-*on* cells separately, but rather on the composite parameter
(27)
which gives the ratio of mutation supply coming from the response-*on* compared to the response-*off* subpopulation, with implying no heterogeneity in mutation rates.
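For non-dividing response-*on* cells without cell death, the total mutant count distribution can be sketched numerically as a convolution, since the product of two PGFs corresponds to the sum of two independent counts. The sketch below rests on two assumptions that should be checked against Eqs 25 to 27: the response-*off* contribution follows the classical Luria-Delbrück distribution with expected number of mutations `m_off`, and the response-*on* contribution is Poisson with mean equal to the mutation-supply ratio times `m_off`. All names are illustrative.

```python
import math

def ld_probabilities(m, n_max):
    """Luria-Delbrueck probabilities p_0..p_n_max for m expected mutations."""
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        p.append(m / n * sum(p[k] / (n - k + 1) for k in range(n)))
    return p

def poisson_probabilities(mean, n_max):
    """Poisson probabilities p_0..p_n_max, built up iteratively."""
    p = [math.exp(-mean)]
    for n in range(1, n_max + 1):
        p.append(p[-1] * mean / n)
    return p

def mutant_count_probabilities(m_off, supply_ratio, n_max):
    """Distribution of the sum of response-off and response-on mutant counts.

    Assumption: the response-on count is Poisson with mean supply_ratio * m_off.
    """
    off = ld_probabilities(m_off, n_max)
    on = poisson_probabilities(supply_ratio * m_off, n_max)
    return [sum(off[k] * on[n - k] for k in range(n + 1))
            for n in range(n_max + 1)]

homogeneous = mutant_count_probabilities(m_off=1.0, supply_ratio=0.0, n_max=50)
heterogeneous = mutant_count_probabilities(m_off=1.0, supply_ratio=2.0, n_max=50)
```

Under these assumptions, a larger supply ratio shifts probability mass away from cultures without mutants while leaving the heavy tail of large clones, which stems from dividing response-*off* mutants, unchanged in shape.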

For the purpose of comparison, we also define the increase in population mean mutation rate under stressful compared to permissive conditions:
(28)
which is directly comparable to the increase in mutation rate in the homogeneous-response model, *μ*^{s}/*μ*^{p}, since *μ*^{p} and *μ*^{s} are population-wide rates.

Example mutant count distributions for the homogeneous and heterogeneous-response models are shown in Fig B in S1 File.

### Model fitting and parameter estimation using maximum likelihood

We use a maximum likelihood approach to estimate the model parameters from fluctuation assay data. For a given model (homogeneous or heterogeneous), we find the set of model parameters *θ* for which the observed mutant counts are most likely. Importantly, we consider mutant count data concurrently from a pair of fluctuation assays: one under permissive and the other under stressful conditions. In our heterogeneous-response model, there is at least one shared parameter between conditions (*μ*_{off}); therefore, we consider the joint likelihood function. Note, however, that if there are no shared parameters between conditions (as is the case for some of the homogeneous-response models), the inference can be carried out separately.

We define a log-likelihood function
$$\ln \mathcal{L}\left(\theta \mid x^{p}, x^{s}\right) = \sum_{i=0}^{m^{p}} x^{p}(i)\,\ln p_{i}^{p}(\theta) + \sum_{i=0}^{m^{s}} x^{s}(i)\,\ln p_{i}^{s}(\theta) \tag{29}$$
as the natural logarithm of the probability of observing the mutant count distributions *x*^{p} and *x*^{s} under permissive and stressful conditions, respectively, for a given model with parameters *θ*. Here, *m*^{p} and *m*^{s} represent the maximal observed numbers of mutant colonies, and *x*^{p}(*i*) and *x*^{s}(*i*) are the number of plates with *i* mutant colonies under permissive and stressful conditions, respectively. The $p_{i}^{p}(\theta)$ and $p_{i}^{s}(\theta)$ give the probabilities of observing *i* mutant colonies under permissive and stressful conditions, respectively, calculated from the PGFs of the mutant count distributions using recursive formulas described in [34]. Then, we use the default optimisation algorithm implemented in the Julia [36] package Optim.jl (https://julianlsolvers.github.io/Optim.jl/stable/) to find the parameters that maximise this log-likelihood function. The parameters that are estimated depend on the specific model that is considered, as described below and summarised in Table 1.
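The structure of the likelihood calculation can be sketched in a few lines. The example below is not the authors' Julia implementation: it is written in Python, covers only the simplest case (the standard model under both conditions, with no shared parameters), and swaps the Optim.jl optimiser for a plain golden-section search. The mutant counts are made up purely for illustration, and all function names are hypothetical.

```python
import math
from collections import Counter

def ld_probabilities(m, n_max):
    """Luria-Delbrueck probabilities p_0..p_n_max for m expected mutations."""
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        p.append(m / n * sum(p[k] / (n - k + 1) for k in range(n)))
    return p

def log_likelihood(m, counts):
    """Sum over observed counts i of (number of plates with i) * ln p_i."""
    hist = Counter(counts)
    p = ld_probabilities(m, max(hist))
    return sum(x * math.log(p[i]) for i, x in hist.items())

def joint_log_likelihood(m_p, m_s, counts_p, counts_s):
    """Joint log-likelihood of a permissive and a stressful assay."""
    return log_likelihood(m_p, counts_p) + log_likelihood(m_s, counts_s)

def maximise(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximum of a unimodal function."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) > f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2

# Made-up mutant counts per plate, for illustration only.
counts_p = [0, 1, 0, 2, 0, 0, 5, 1, 0, 3]
counts_s = [2, 0, 7, 1, 3, 0, 12, 4, 1, 2]

# Without shared parameters the joint likelihood separates into two searches.
m_p_hat = maximise(lambda m: log_likelihood(m, counts_p), 1e-3, 20.0)
m_s_hat = maximise(lambda m: log_likelihood(m, counts_s), 1e-3, 20.0)
```

Dividing each estimated number of mutations by the corresponding final population size would give the mutation rate per condition. When parameters are shared between conditions, as in the heterogeneous-response model, the two terms can no longer be maximised separately and a multi-parameter optimiser is needed.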

The complete documentation of all inference algorithms can be found in the file called **inference.jl** at https://github.com/LucyL-J/Quantifying-SIM.

#### Homogeneous-response model.

In the homogeneous-response model, the mutant count distributions (Eqs 5 and 6) depend on the per-division mutation rates, final population sizes, and, optionally, differential mutant fitness under permissive and stressful conditions, as well as the extinction probability of mutants under stress. All parameters must either be set as inference parameters, or set to a fixed value, which could be the default value or as measured in a separate experiment. In our implementation, the per-division mutation rates, *μ*^{p} and *μ*^{s}, are inferred to calculate the increase in mutation rate associated with the stress, that is, *μ*^{s}/*μ*^{p}. The final population sizes under permissive and stressful conditions, *N*_{f}^{p} and *N*_{f}^{s}, are set to fixed values, as they would typically be measured through plating a few cultures on non-selective media and colony counting. Moreover, we set the extinction probability of mutants under stress, *ϵ*^{s}, to zero because we neglect cell death in the inference, as do most existing approaches (but see [18]).

For the differential fitness of mutants compared to non-mutants, *ρ*^{p} and *ρ*^{s}, we consider three cases corresponding to different versions of the homogeneous-response model: (a) mutants have the same fitness as non-mutants, *ρ*^{p} = *ρ*^{s} = 1; (b) mutants have a different fitness than non-mutants, but the effect is constrained to be equal under permissive and stressful conditions, *ρ*^{p} = *ρ*^{s}; or (c) mutants have a different fitness than non-mutants (*unconstrained*) and two separate values, *ρ*^{p} and *ρ*^{s}, are inferred. For models (a) and (c), the mutant count distributions under permissive and stressful conditions have no joint parameters and can, therefore, be considered separately by using existing estimation methods: (a) corresponds to the standard model and (c) to the standard model with differential mutant fitness (implemented, for example, in [37]). Studies to date have followed such an approach to estimate the increase in mutation rate. Model (b), in contrast (with constrained differential mutant fitness, arguably a reasonable null model), represents a new version of the homogeneous-response model, which is implemented here for the first time. In this case, we estimate the model parameters by jointly maximising the log-likelihood function given in Eq 29. In the main Results, we consider all three homogeneous-response models (a-c); in section M in S1 File, we repeat the analysis for constrained mutant fitness only, i.e. models (a-b).

#### Heterogeneous-response model.

In the heterogeneous-response model, the mutant count distributions under permissive (Eq 7) and stressful conditions (Eqs 23 and 24) depend on the per-division mutation rates of response-*off* and response-*on* cells, the extinction probabilities of response-*off* and response-*on* mutants under stress, the fraction and relative fitness of response-*on* compared to response-*off* cells, and the total final population sizes under permissive and stressful conditions. The mutation rate of response-*off* cells, *μ*_{off}, appears as a parameter in the mutant count distributions under both permissive and stressful conditions. The joint inference crucially relies on our assumption that, even though the division rate itself might change under stress, the *per-division* mutation rate of response-*off* cells is the same under both conditions. As for the homogeneous-response model, we assume that the final population sizes under permissive and stressful conditions are known from plating on non-selective medium. Moreover, we neglect cell death, which implies that the extinction probabilities of mutants are zero, but we test the robustness of the inference to this assumption.

For the relative fitness of response-*on* cells, we consider two different model versions of the approximate heterogeneous-response model: (d) as a default, we set *r*_{on} = 0, inspired by the SOS response in *E. coli*, which inhibits cell division, or to a small non-zero value that we assume is measured in a separate experiment (section H in S1 File); and (e) we infer *r*_{on}. Similarly, for the fraction of the response-*on* subpopulation, we (i) assume that it is a known quantity measured in a separate experiment (e.g. microscopy or flow cytometry), or (ii) set it as an inference parameter.

Ultimately, we are interested in quantifying the relative increase in mutation rate associated with induction of the stress response, that is, *μ*_{on}/*μ*_{off}. To do so, we need to estimate the mutation rate of response-*on* cells, *μ*_{on}. However, we do not infer this parameter directly; instead, we infer the mutation-supply ratio defined in Eq 27, a composite parameter from which we calculate *μ*_{on}. The reason for this approach is that, for *r*_{on} = 0, the mutant count distribution under stress does not depend separately on *μ*_{on} and *f*_{on} but only on the mutation-supply ratio (together with *μ*_{off} and the final population size under stress); see Eq 26. This also implies that, for *r*_{on} = 0 and when the fraction of the response-*on* subpopulation is unknown, *μ*_{on}, and thus the mutation-rate increase, cannot be calculated. In this case, we report estimates of the mutation-supply ratio instead as an indicator of heterogeneity.

#### Confidence intervals using profile likelihood.

In addition to the maximum-likelihood estimates, we calculate 95% confidence intervals using a profile likelihood approach (section E in S1 File). The confidence interval for each parameter contains all values such that, after optimisation over the remaining inference parameters, the likelihood does not significantly drop according to a likelihood ratio test.
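The profile likelihood construction can be illustrated with a minimal Python sketch. It does not use the mutant count distributions of this study. As a loudly simplified stand-in, counts under permissive conditions are assumed Poisson with mean `m` and counts under stress Poisson with mean `m * phi`, so that `phi` plays the role of the parameter of interest and `m` the role of the remaining inference parameter. All function names are hypothetical, and the threshold 3.841 is the 95% quantile of the chi-squared distribution with one degree of freedom.

```python
import math

CHI2_95_DF1 = 3.841  # 95% quantile of chi-squared with 1 degree of freedom


def log_lik(m, phi, x, y):
    """Toy Poisson log-likelihood (up to a constant); NOT the mutant count model."""
    return (sum(x) * math.log(m) - len(x) * m
            + sum(y) * math.log(m * phi) - len(y) * m * phi)


def profile_log_lik(phi, x, y):
    """Maximise over the nuisance parameter m for fixed phi (closed form here)."""
    m_hat = (sum(x) + sum(y)) / (len(x) + len(y) * phi)
    return log_lik(m_hat, phi, x, y)


def profile_ci(x, y):
    """Return (phi_hat, lower, upper); requires sum(x) > 0 and sum(y) > 0."""
    phi_hat = (sum(y) / len(y)) / (sum(x) / len(x))
    l_max = profile_log_lik(phi_hat, x, y)

    def drop(phi):
        # Positive once the likelihood-ratio statistic exceeds the threshold
        return 2.0 * (l_max - profile_log_lik(phi, x, y)) - CHI2_95_DF1

    def bisect(lo, hi):
        # Bisection on a log scale; drop(lo) and drop(hi) have opposite signs
        for _ in range(200):
            mid = math.sqrt(lo * hi)
            if (drop(mid) > 0) == (drop(lo) > 0):
                lo = mid
            else:
                hi = mid
        return math.sqrt(lo * hi)

    lower = bisect(phi_hat * 1e-6, phi_hat)
    upper = bisect(phi_hat, phi_hat * 1e6)
    return phi_hat, lower, upper


x = [4, 6, 5, 5]        # toy counts, permissive
y = [48, 52, 50, 50]    # toy counts, stress
phi_hat, lower, upper = profile_ci(x, y)
print(phi_hat, lower, upper)
```

In the actual inference, the inner maximisation over the remaining parameters would generally be numerical rather than closed-form; the interval construction itself is unchanged.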

### Evaluating inference methods on simulated data

We test our estimation method using simulated fluctuation assay data: For chosen ranges of parameter values, we perform stochastic simulations of the population dynamics during the growth phases of a pair of fluctuation assays, one under permissive and the other under stressful conditions. From the resulting mutant count distributions, we infer the respective parameters under heterogeneous and homogeneous-response models and compare the estimated with the true simulated parameters, as well as perform model selection. For most of this study, we simulate under the heterogeneous-response model, but we repeat part of the analysis for simulation under the homogeneous-response model (sections K and N in S1 File).

The complete documentation of all population dynamic functions can be found in the file called **population_dynamics.jl** at https://github.com/LucyL-J/Quantifying-SIM.

#### Simulation methods.

To simulate the growth phase under permissive conditions, we consider exponential growth of the non-mutant population (Eq 1) and implement the occurrence of mutations as a time-inhomogeneous Poisson process with rate proportional to the population size using a standard approach described, for example, in [38]. We treat mutant cells stochastically by using Gillespie’s algorithm [39] to simulate the pure birth process described by Eq 2 with zero death rate. In the case of the homogeneous-response model, the population growth rate is given by the division rate, *γ*^{p}, the mutation rate per cell per unit time by *ν*^{p} and the birth rate of mutants by *ρ*^{p} ⋅ *γ*^{p}. Similarly, for the heterogeneous-response model, the rates are given by the respective rates of response-*off* cells.
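The permissive-condition simulation described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code in **population_dynamics.jl**: the function names are hypothetical, the non-mutant population is taken as deterministic exponential growth *n*(*t*) = *n*(0) e^{*γt*}, and mutation times are generated by a time change of a unit-rate Poisson process, which is one standard way to sample a time-inhomogeneous Poisson process.

```python
import math
import random


def yule_clone_size(birth_rate, duration, rng):
    """Gillespie simulation of a pure birth process started from one mutant cell."""
    if birth_rate <= 0.0:
        return 1  # non-dividing mutant
    size, t = 1, 0.0
    while True:
        t += rng.expovariate(birth_rate * size)  # waiting time to the next division
        if t > duration:
            return size
        size += 1


def simulate_culture(n0, gamma, nu, rho, t_f, rng):
    """Final mutant count of one culture (assumed: deterministic non-mutant growth)."""
    # Expected cumulative number of mutations by t_f: nu * n0 * (exp(gamma*t_f) - 1) / gamma
    total = nu * n0 * math.expm1(gamma * t_f) / gamma
    mutants = 0
    s = rng.expovariate(1.0)  # arrivals of a unit-rate Poisson process
    while s < total:
        # Map the arrival back to real time by inverting the cumulative rate
        t = math.log1p(s * gamma / (nu * n0)) / gamma
        mutants += yule_clone_size(rho * gamma, t_f - t, rng)
        s += rng.expovariate(1.0)
    return mutants


rng = random.Random(2024)
t_f = math.log(10001.0)  # gives one expected mutation for the values below
counts = [simulate_culture(1e4, 1.0, 1e-8, 1.0, t_f, rng) for _ in range(50)]
print(sorted(counts))
```

The parameter values in the usage lines are illustrative only; they are chosen so that the expected number of mutations per culture is one.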

For the homogeneous-response model, we simulate the growth phase under stressful conditions using the same algorithm but with different rates (*γ*^{s}, *ν*^{s}, *ρ*^{s} ⋅ *γ*^{s}). To simulate stressful conditions under the heterogeneous-response model, we use Eqs 10 and 11 (setting *n*_{on}(0) = 0) to describe the growth of the response-*off* and response-*on* subpopulations, and implement the occurrence of mutations as two independent time-inhomogeneous Poisson processes with rates proportional to the subpopulation sizes (*n*_{off}(*t*) and *n*_{on}(*t*), respectively) and the mutation rates per cell per unit time (*ν*_{off} and *ν*_{on}). We simulate the mutant dynamics stochastically as a two-type branching process described by Eq 12 using Gillespie’s algorithm.

In all simulations, we set the duration *t*_{f} of the growth phase such that the expected number of *mutations* (not *mutants*) is constant across the considered parameter ranges (section C in S1 File). In our main results, we take *c* = 50 parallel cultures per assay, which is readily feasible if culturing on a 96-well plate; see, for example, [16] for a protocol. In sections D and N in S1 File, we also examine the sensitivity of the results to the number of parallel cultures by considering smaller numbers *c*.
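For the simplest case of a single exponentially growing population, the duration that yields a target expected number of mutations has a closed form. The Python sketch below covers only this case and is an assumption-laden illustration, not the calculation of section C in S1 File: with deterministic growth *n*(*t*) = *n*(0) e^{*γt*} and mutation rate *ν* per cell per unit time, the expected number of mutations by time *t* is *ν* *n*(0) (e^{*γt*} − 1)/*γ*, which can be inverted for *t*. The function names are hypothetical.

```python
import math


def expected_mutations(n0, gamma, nu, t):
    """Expected number of mutations by time t under deterministic exponential growth."""
    return nu * n0 * math.expm1(gamma * t) / gamma


def growth_duration(n0, gamma, nu, target=1.0):
    """Duration t_f such that the expected number of mutations equals `target`."""
    return math.log1p(target * gamma / (nu * n0)) / gamma


# Illustrative values: initial size 1e4, division rate 1 per hour
for nu in (1e-8, 1e-9):
    t_f = growth_duration(1e4, 1.0, nu)
    print(nu, round(t_f, 3), expected_mutations(1e4, 1.0, nu, t_f))
```

A lower mutation rate requires a longer growth phase, and hence a larger final population, to reach the same expected number of mutations.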

#### Accuracy and precision of parameter estimates.

Generally, we evaluate the accuracy and precision of all estimation methods by simulating pairs of fluctuation assays, estimating the parameters of the inference model, and comparing the estimates with the true values. We repeat this procedure *R* = 100 times for each simulated parameter set. We take the deviation of the median estimate from the true value as a measure of the accuracy of the estimation and the variation across estimates as a measure of the precision. In particular, writing $\hat{\theta}_i$ for the estimate from replicate *i* and $\theta$ for the true value, we calculate the median of the relative error across the *R* replicates,

$$\operatorname{median}_{i = 1, \dots, R} \left( \frac{\hat{\theta}_i - \theta}{\theta} \right). \tag{30}$$

Here, a positive or negative relative error implies over- or underestimation, respectively. Moreover, we calculate the coefficient of variation across the *R* replicates,

$$\mathrm{CV} = \frac{\mathrm{std}(\hat{\theta}_1, \dots, \hat{\theta}_R)}{\mathrm{mean}(\hat{\theta}_1, \dots, \hat{\theta}_R)}, \tag{31}$$

where ‘std’ denotes standard deviation. Where we calculate confidence intervals (section E in S1 File), we also use the median width of the confidence intervals as a measure of precision.
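The two summary statistics can be computed as in the following Python sketch. The function names are hypothetical, and the use of the sample standard deviation (divisor *R* − 1) is an assumption, since the text does not specify which variant is used.

```python
import statistics


def median_relative_error(estimates, true_value):
    """Median over replicates of (estimate - true) / true; > 0 means overestimation."""
    return statistics.median((e - true_value) / true_value for e in estimates)


def coefficient_of_variation(estimates):
    """Standard deviation divided by mean (sample standard deviation assumed)."""
    return statistics.stdev(estimates) / statistics.mean(estimates)


# Four hypothetical estimates of a mutation-rate increase whose true value is 10
estimates = [8.0, 10.0, 12.0, 20.0]
print(median_relative_error(estimates, 10.0))
print(coefficient_of_variation(estimates))
```

Because the median is used for accuracy, a single extreme estimate (here 20) shifts the relative error much less than it would shift a mean-based measure.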

To plot the estimated parameters across the *R* = 100 simulations, we use boxplots, where each box shows the median and interquartile range with whiskers extending to 1.5 times the interquartile range and any outliers outside that range represented as dots. To summarise the confidence intervals on each of the *R* = 100 estimates, we plot the median maximum likelihood estimate with error bars extending to the medians of the lower and upper bounds of the 95% confidence intervals.

#### Model selection: Heterogeneous versus homogeneous response.

We also evaluate whether it is possible to identify the heterogeneity of stress responses from mutant count data alone. In this case, we suppose we do not have separate experimental data showing heterogeneity and, therefore, do not have an estimate of *f*_{on}. For this purpose, we simulate fluctuation assays under the (exact) heterogeneous-response model and under the homogeneous-response model (sections K and N in S1 File) and, then, fit the different homogeneous (a-c) and (approximate) heterogeneous-response models (d-e). For model selection, we use a combination of likelihood-ratio test (LRT) and Akaike information criterion (AIC). The AIC is defined as
$$\mathrm{AIC} = 2k - 2 \ln \hat{L}, \tag{32}$$

where *k* is the number of inferred parameters of the model and $\hat{L}$ is the maximised likelihood. Within the set of models (a-c) and within the set (d-e), the models are nested and we use LRTs to determine the best heterogeneous/homogeneous-response model within each set first. However, we cannot use LRT to select between the sets (a-c) and (d-e) because these models are not nested. Therefore, between the two best models, we select the one with the lowest AIC. However, if the difference in AIC is within ±2, we say that the AICs are comparable, and neither of the models can be selected. We also consider the Bayesian information criterion (BIC) as an alternative second selection step (section N in S1 File).
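The two-step selection procedure can be sketched in Python as follows. This is a sketch under assumptions rather than the implementation used in this study: the function names are hypothetical, the LRTs are applied sequentially from the simplest to the most complex model at the 5% level using tabulated chi-squared quantiles, and the log-likelihood values in the usage lines are made up for illustration.

```python
# 95% quantiles of the chi-squared distribution for 1, 2 and 3 degrees of freedom
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815}


def aic(log_lik, k):
    """Akaike information criterion for maximised log-likelihood and k parameters."""
    return 2 * k - 2 * log_lik


def best_nested(models):
    """Pick the best of nested models given as (name, log_lik, k), simplest first.

    A more complex model replaces the current best only if the likelihood-ratio
    test rejects the current best at the 5% level (sequential testing assumed).
    """
    best = models[0]
    for candidate in models[1:]:
        df = candidate[2] - best[2]
        statistic = 2.0 * (candidate[1] - best[1])
        if statistic > CHI2_95[df]:
            best = candidate
    return best


def select_model(hom_models, het_models, threshold=2.0):
    """Return the selected model name, or 'none' if the AICs are comparable."""
    hom = best_nested(hom_models)
    het = best_nested(het_models)
    delta = aic(het[1], het[2]) - aic(hom[1], hom[2])
    if abs(delta) <= threshold:
        return "none"
    return het[0] if delta < 0 else hom[0]


# Made-up log-likelihoods; parameter counts follow models (a-c) and (d-e)
hom = [("a", -100.0, 2), ("b", -99.5, 3), ("c", -99.0, 4)]
het = [("d", -95.0, 2), ("e", -94.5, 4)]
print(select_model(hom, het))
```

With these illustrative numbers, neither set gains significantly from its extra parameters, so models (a) and (d) are compared by AIC and the model with the lower AIC is reported.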

## Results

We aim to estimate the increase in mutation rate associated with the induction of the stress response when this response is heterogeneously expressed across the bacterial population. In particular, we consider cases in which the population can be divided into two discrete subpopulations: one with a low expression level (response *off*) and the other with a high expression level (response *on*). The key principle of our method is to jointly infer their mutation rates from mutant count data obtained from a pair of fluctuation assays under permissive and stressful conditions. For the latter, we need to disentangle the contributions from the response-*off* and response-*on* subpopulations. The success of this method relies on the changing shape of the mutant count distribution under stress, which occurs if there is a highly mutating but slowly dividing response-*on* subpopulation (Fig B in S1 File).

To evaluate the performance of our method, we use simulated mutant count data to compare the estimated parameters with the true simulated ones. First, we explore how the accuracy and precision of our method depend on the model parameters by simulating and inferring under the same model. Then, we test the robustness of our method to model deviations by simulating under a more complex model than used in the inference. Finally, we determine under what conditions the heterogeneous-response model can be distinguished from the homogeneous-response model assumed in currently available methods by inferring under both models and comparing how well they fit simulated data.

In all simulations, we set the initial population size to 10^{4} and the initial fraction of the response-*on* subpopulation to zero. Moreover, we set the duration of the growth phase such that the expected number of mutations equals one. This way, the resulting number of resistant mutant colonies on each selective plate is usually within an experimentally countable range of zero to a couple of hundred (section C in S1 File). Table 2 summarises the default parameters used in the simulations, while parameters that vary are specified in the relevant Results section. For each parameter combination, we simulate *R* = 100 pairs of fluctuation assays under permissive and stressful conditions, with *c* = 50 parallel cultures per assay. We also test the sensitivity of our method’s performance to the number of parallel cultures (section D in S1 File) and repeat the model selection analysis for smaller numbers of parallel cultures (*c* = 20, 10) in section N in S1 File. Generally, we assume that the final population sizes in permissive and stressful conditions and the fraction of the response-*on* subpopulation under stress (*f*_{on}) are known from separate experimental measurements, except for the last Results section where we infer without a separate estimate of *f*_{on}.

### Estimation of the mutation-rate increase is accurate and precise for sufficiently large response-*on* mutation supply

First, we evaluate our novel inference method’s performance in the best-case scenario; that is, we simulate and infer under the same model: a model of heterogeneous stress responses without cell death, with mutant fitness equal to non-mutant fitness, and with zero division rate of response-*on* cells. We simulate for a range of mutation rates in response-*on* cells, *ν*_{on} ∈ [10^{−8}, 10^{−5}] *h*^{−1}, and switching rates, *α* ∈ [0.001, 0.1] *h*^{−1}. Note that the per-division mutation rate *μ*_{on} and the per-unit-time rate *ν*_{on} are equivalent here because we set the division rate to one.

For the inference, we consider the same model as used in the simulations, with the only exception that we neglect switching *on* the stress response in mutants and initial population size effects; for a comparison of the exact and approximated heterogeneous-response model, see Model and methods. For each set of mutant count data, we infer the mutation rate of response-*off* cells (*μ*_{off}) and the mutation-supply ratio, which defines the relative contribution of response-*on* cells to the total mutation supply (Eq 27). From these estimates, we calculate the stress-induced mutation-rate increase, i.e. *μ*_{on}/*μ*_{off}. To quantify the accuracy of our method, we calculate the median of the relative error of our estimated mutation-rate increases (Eq 30). Additionally, we use the coefficient of variation across the estimates (Eq 31) to quantify our method’s precision.

Comparing the estimated with the true mutation-rate increase *μ*_{on}/*μ*_{off}, we find that the accuracy and precision improve with increasing true mutation-rate increase and relative switching rate (Fig 2B and 2C). For example, at the relative switching rate shown in Fig 2A and for a sufficiently large true increase, 95% of estimates lie within 2-fold of the true mutation-rate increase and the estimation is unbiased (Fig 2A). For smaller values, on the other hand, the variation in the estimates becomes large. Nonetheless, if the mutation-rate increase is estimated to be >25, the true increase is very likely to be >10, and conversely, if the mutation-rate increase is estimated close to zero (<10^{−3}), it is very likely to be <10.

Fig 2. We simulate using the simplest model of heterogeneous stress responses (without cell death or differential mutant fitness and with zero division rate of response-*on* cells) and infer the mutation-rate increase assuming the same model in the inference. **A** Estimated compared to true mutation-rate increase for a range of true values at a fixed relative switching rate. **B** Median relative error of the estimated compared to the true mutation-rate increase across the simulated ranges of mutation rate and switching rate. Over/underestimation is shown in red/blue, and diamonds indicate a relative error greater than one. **C** Coefficient of variation across estimates. The parameter ranges used in the simulations are *ν*_{on} ∈ [10^{−8}, 10^{−5}] *h*^{−1} and *α* ∈ [0.001, 0.1] *h*^{−1}.

The mutation-supply ratio, which is defined (approximately given by the product of and for small *α* and small *r*_{on}) determines our method’s performance. For , the contribution of the response-*on* subpopulation is dominating. In contrast, for , the response-*on* subpopulation contributes very little to the total mutant count. Overall, in the best-case scenario and for the parameter regime considered here, or greater is sufficient for an accurate and precise estimate of the mutation-rate increase.

We also evaluate the sensitivity of our method’s performance to the number of parallel cultures (Fig C in S1 File). For smaller *c*, the precision of our method worsens compared to *c* = 50, but the estimation of remains accurate for sufficiently large or greater.

In addition to the maximum likelihood estimates, we calculate 95% profile likelihood confidence intervals on the estimates of the mutation-rate increase (Fig D in S1 File). We find that the median width of the confidence intervals increases with decreasing and , in a similar way as the CV of the estimates for *R* = 100 repeated simulations shown in Fig 2C. Moreover, we confirm that the true value for the mutation-rate increase lies outside of the confidence interval in < 5% of the simulations.

### Cell death has a limited impact on estimates

Our inference model accounts for changes in mutation and division rates upon induction of the stress response but neglects other potential consequences of the stress, such as cell death. Previous work showed that cell death, if neglected in the inference model, leads to overestimation of the mutation rate [18]. Therefore, we asked whether neglecting cell death has a similar effect in the case of heterogeneous stress responses. For this purpose, we simulate fluctuation assays under an extended model of heterogeneous stress responses with cell death. We consider the cases that (i) only response-*off* cells are affected by cell death, (ii) only response-*on* cells are affected, or (iii) all cells are affected equally, over a range of death rates (in units of *h*^{−1}).

Interestingly, we find that any biases in the estimated mutation-rate increase depend on which subpopulation is affected by cell death. If only response-*off* cells die, the mutation-rate increase is overestimated for sufficiently large death rates (Fig 3A). On the other hand, if only response-*on* cells die, the mutation-rate increase is underestimated (Fig 3B). The estimation remains unbiased if both subpopulations are equally affected by cell death. However, the variation in the estimates increases for large death rates (Fig 3C). From the contribution of the response-*on* subpopulation to the mutant count given in Eq 24, it can be seen that the effects of cell death in response-*off* and response-*on* cells partly cancel each other out. This result also holds for a smaller switching rate (Fig E in S1 File).

Fig 3. We simulate using the heterogeneous-response model (without differential mutant fitness and with zero division rate of response-*on* cells) but with cell death. However, we neglect cell death in the model used for inference. The black solid lines indicate the true mutation-rate increase used in the simulations. **A** Estimated mutation-rate increase when only response-*off* cells are affected by cell death, **B** when only response-*on* cells are affected by cell death, and **C** when all cells are affected by cell death equally. Death rates are given in units of *h*^{−1}.

We test another biologically realistic model deviation: the fitness of mutants differs from non-mutants during the growth phase (Fig F in S1 File). We find that neglecting this effect in the inference causes very little bias in the estimates for either a fitness advantage or a fitness cost of mutations.

### Estimation remains accurate when response-*on* cells have a small to moderate division rate

So far, we considered response-*on* cells not to divide at all during the growth phase, motivated by the SOS response. However, the division rate of response-*on* cells might be non-zero, especially if cells are exposed to a very low level of DNA damage (in the case of SOS) or for other stress responses. As a default setting, our inference method sets the relative fitness of response-*on* cells to zero (*r*_{on} = 0), but it also allows us to estimate *r*_{on} as an inference parameter. In the following, we evaluate the performance of our inference method when true *r*_{on} > 0, specifically the impact on the estimated mutation-rate increase.

We simulate under the heterogeneous-response model with a non-zero division rate of response-*on* cells, considering a parameter range of *γ*_{on} ∈ [0, 1] *h*^{−1} (with *γ*_{off} = 1 *h*^{−1}). Note that the relative fitness of response-*on* cells *r*_{on} is equivalent to their division rate *γ*_{on} because we consider no cell death here. We consider two different inference approaches. In one case, we infer the mutation rate *μ*_{off}, the mutation-supply ratio and the relative fitness of response-*on* cells *r*_{on}. Alternatively, we infer only *μ*_{off} and the mutation-supply ratio while setting *r*_{on} = 0. We estimate the mutation-rate increase in both cases and compare it to the true value. In the first case, we also compare the estimated with the true *r*_{on}.

We find that the estimation of the mutation-rate increase remains accurate for small to moderate relative fitness *r*_{on}. For larger *r*_{on} → 1, on the other hand, the mutation-rate increase is underestimated, yet more accurate and precise if *r*_{on} is also inferred (Fig 4A). The estimate of *r*_{on} itself is also underestimated for larger *r*_{on} (Fig 4B).

Fig 4. We simulate using the heterogeneous-response model (without cell death or differential mutant fitness) with *r*_{on} ≥ 0 being the relative fitness of response-*on* cells compared to response-*off* cells. We consider two cases for the inference: (i) setting *r*_{on} to zero and only inferring *μ*_{off} and the mutation-supply ratio, and (ii) inferring *r*_{on} in addition. **A** Estimated mutation-rate increase for both cases and a range of relative fitness of response-*on* cells. The solid black line indicates the true value of the mutation-rate increase. **B** Estimated compared to true relative fitness of response-*on* cells in inference case (ii). The parameter range used in the simulations is *γ*_{on} ∈ [0, 1] *h*^{−1}.

We also evaluate the performance when *r*_{on} is set to the true value in the inference. Interestingly, this increases the accuracy and precision of the estimated mutation-rate increase only slightly compared to when *r*_{on} is inferred (Fig G in S1 File). The reason lies in the approximation made to derive the mutant count distribution (Eq 13), which is no longer valid as *r*_{on} → 1.

### Model selection between heterogeneous and homogeneous response is often inconclusive

In many cases, it may not be known *a priori* whether the stress response is heterogeneously expressed across the population or whether, in contrast, all cells respond similarly. We want to determine whether distinguishing these two models is possible using mutant count data from fluctuation assays alone. To do so, we simulate fluctuation assays under the heterogeneous-response model for a range of relative fitness of response-*on* cells, *r*_{on}. For the inference, we use both the heterogeneous- and homogeneous-response models and compare how well they fit the data. We use the same simulation data as in the previous section (parameter range *γ*_{on} ∈ [0, 1] *h*^{−1}). However, we suppose that the fraction of the response-*on* subpopulation *f*_{on} is unknown. Therefore, when using the heterogeneous-response model in the inference, we either set *r*_{on} = 0 (in which case *f*_{on} drops out of the equations) and infer only *μ*_{off} and the mutation-supply ratio, or set *f*_{on} and *r*_{on} as additional inference parameters. Note that, if *f*_{on} is not inferred, the mutation-rate increase can no longer be calculated; see Eq 26.

We perform model selection between the heterogeneous and the homogeneous-response models using a combination of the likelihood ratio test (LRT) and the Akaike Information Criterion (AIC), which consider how well the models reproduce the data while penalising the number of model parameters. For the homogeneous-response model, we consider three cases: (a) without differential mutant fitness (inference parameters: *μ*^{p} and *μ*^{s}), (b) with differential mutant fitness, but constrained to be equal under permissive and stressful conditions (inference parameters: *μ*^{p}, *μ*^{s} and *ρ*^{p} = *ρ*^{s}) and (c) with unconstrained differential mutant fitness (inference parameters: *μ*^{p}, *μ*^{s}, *ρ*^{p} and *ρ*^{s}). For the heterogeneous-response model, we consider two cases: (d) zero relative fitness of response-*on* cells (inference parameters: *μ*_{off} and the mutation-supply ratio) and (e) non-zero relative fitness of response-*on* cells (inference parameters: *μ*_{off}, the mutation-supply ratio, *f*_{on} and *r*_{on}). We use LRTs to select the best homogeneous model (a-c) and the best heterogeneous model (d-e) within each of these sets of nested models. Then, we use the AIC to select between the best homogeneous and the best heterogeneous-response model, which are not nested. If the difference in AIC is less than two, we say neither model is clearly selected.

After applying this two-step model selection, we find that the heterogeneous-response model is selected in the majority of cases (around 75% for *r*_{on} = 0) when the relative fitness of response-*on* cells is small (Fig 5A). For increasing *r*_{on}, however, the homogeneous-response model without differential mutant fitness is selected with increasing frequency until it is selected for the majority of simulations for large *r*_{on}. The other models are selected for only a small number of simulations. Over the whole parameter range, there is a substantial fraction of cases in which no model can be selected, with the highest proportion of ≈50% for intermediate *r*_{on}. This implies that heterogeneity in stress responses with a sufficiently large mutation-rate increase can in principle be detected, but only if the division rate of response-*on* cells is very small (or zero).

Fig 5. We simulate under the heterogeneous-response model for a range of relative fitness of response-*on* cells, *r*_{on}. In the inference, we use (d) the heterogeneous-response model with *r*_{on} = 0 (yellow) and (e) the heterogeneous-response model with *r*_{on} and *f*_{on} as inference parameters (dark green) to infer the mutation rate of response-*off* cells, *μ*_{off}, and the mutation-supply ratio. We also use the homogeneous-response model (a) without differential mutant fitness (*ρ*^{p} = *ρ*^{s} = 1; light purple), (b) with constrained differential mutant fitness (*ρ*^{p} = *ρ*^{s} inferred; dark purple) and (c) with unconstrained differential mutant fitness (*ρ*^{p}, *ρ*^{s} inferred; blue) to infer the population-wide mutation rates under permissive and stressful conditions, *μ*^{p} and *μ*^{s}, and, for (b) and (c), additionally *ρ*^{p} and *ρ*^{s}. **A** Model selection using LRT and AIC. **B** Mutation-supply ratio estimated by the best heterogeneous-response model. **C** Increase in mutation rate, *μ*^{s}/*μ*^{p}, estimated by the best homogeneous-response model. The black lines in **B** and **C** indicate the true values of the mutation-supply ratio and the increase in population mean mutation rate, respectively. The parameter range used in the simulations is *γ*_{on} ∈ [0, 1] *h*^{−1}.

Comparing the mutation-supply ratio estimated by the best heterogeneous-response model with its true value, we find that the estimate is accurate and precise for small *r*_{on}, with a slight loss in precision for larger *r*_{on} (Fig 5B). This means that even without a separate estimate of *f*_{on}, the magnitude of the heterogeneity in mutation rates (in the form of the mutation-supply ratio) can be captured.

We also compare the estimated increase in mutation rate from the best homogeneous-response model with the true increase in population mean mutation rate under the simulated heterogeneous-response model. Interestingly, the inferred *μ*^{s}/*μ*^{p} is an accurate and precise estimate of this population-mean increase over the whole range of *r*_{on}, with only a slight underestimation for large *r*_{on} (Fig 5C). For *r*_{on} → 1 and assuming no cell death, accurate estimation of the population-mean increase is expected because the probability generating function (PGF) of the mutant count distribution reduces to
(33)

This distribution is equivalent to the homogeneous-response model without differential mutant fitness (Eq 6), which is selected as the best homogeneous-response model for most simulations. For small *r*_{on}, on the other hand, homogeneous-response models with differential mutant fitness are selected more often, and they (falsely) infer an increasingly severe mutant cost under stressful conditions as *r*_{on} → 0 (Fig H in S1 File), despite mutations not having a cost in the simulations.

We repeat the analysis for a smaller mutation-rate increase and find that the heterogeneous-response model is selected less often, in only around 25% of the simulations for *r*_{on} = 0 (Fig I in S1 File), implying that small mutation-rate increases are most likely not picked up through model selection. We also perform model selection when using smaller numbers of parallel cultures (*c* = 20, 10) in the inference, and find that, overall, model selection is less informative for smaller *c* (Fig M in S1 File).

Finally, we check for the rate of false positives where the heterogeneous-response model is incorrectly selected in the absence of heterogeneity, by simulating under versions of the homogeneous-response model and performing model selection. We find that, when simulating under the homogeneous-response model with constrained mutant fitness, the homogeneous-response model is selected in almost all cases independent of the increase in mutation rate (Fig N in S1 File). Therefore, if the mutant fitness is the same under permissive and stressful conditions, the risk of false positives is negligible. When simulating under the homogeneous-response model with unconstrained mutant fitness, on the other hand, there are more cases in which no model or the heterogeneous-response model is selected (Fig O in S1 File).

When simulating under the homogeneous-response model without an increase in mutation rate (*μ*^{s}/*μ*^{p} = 1) and with small mutant fitness costs, no model can be selected in most cases, but both heterogeneous and homogeneous-response models correctly infer that there is no increase in mutation rate (Fig J in S1 File).

## Discussion

Since its introduction 80 years ago, the standard model behind the fluctuation assay has been extended numerous times to overcome limitations and make it more biologically realistic. Extensions particularly relevant for quantifying stress-induced mutagenesis include considering cell death [18, 43] and differential mutant fitness [44]. In this study, we addressed a so-far overlooked limitation of fluctuation analysis: heterogeneity in the expression of stress responses, which single-cell studies have recently demonstrated. Our population dynamic model considers that only a subpopulation of cells (fraction *f*_{on}) has the stress response switched *on*, while the remainder of the population has it switched *off*. This allows us to estimate the relative increase in mutation rate associated with the induction of the stress response, *μ*_{on}/*μ*_{off}.

We tested our estimation method with simulated mutant count data, which confirmed accurate and precise estimation of *μ*_{on}/*μ*_{off} for a sufficiently large mutation-supply ratio (Eq 27; Fig 2). The mutation-supply ratio depends on the mutation-rate increase itself and the fraction of the response-*on* subpopulation. While the mutation-rate increase is inherent to the stress response, *f*_{on} could potentially be increased through experimental design. For example, increasing the antibiotic concentration has been shown to increase the rate of switching *on* the stress response and thus the fraction of the response-*on* subpopulation [27]. Our results suggest that mutation rate estimates would be more accurate at higher antibiotic concentrations, all else being equal. Increasing antibiotic concentration could, however, also increase cell death. We neglect cell death in our inference, but we showed that our method is robust to this model deviation up to moderate death rates when cell death affects response-*off* and response-*on* subpopulations equally (Fig 3C).

We used model selection with a combination of likelihood-ratio tests and AIC to evaluate whether a signal of mutation-rate heterogeneity can be detected from fluctuation assays alone. The chances of detecting heterogeneity are highest when response-*on* cells are non-dividing or only slowly dividing (*r*_{on} ≪ 1). For moderate switching rates and a sufficiently large mutation-rate increase, the heterogeneous-response model is selected over homogeneous-response models in the majority of the simulated experiments (Fig 5A). However, with increasing *r*_{on} (> 0.25), the heterogeneous-response model is only rarely selected. A smaller mutation-rate increase also cannot effectively be detected by model selection (Fig I in S1 File). Generally, model selection with fewer than *c* ≈ 50 parallel cultures per fluctuation assay will be very difficult to interpret even for the best-case parameter range (Fig M in S1 File).

Our results suggest that heterogeneity in stress responses may have been overlooked when using fluctuation assays, and these data should be complemented with additional experiments to support or rule out alternative explanations. For example, mutants arising in the fluctuation assay can be isolated and their relative fitness compared to non-mutants measured with a pair of competitive fitness assays under permissive and stressful conditions. This measurement would allow researchers to check whether mutant fitness values estimated from the homogeneous-response model fit (*ρ*^{p} and *ρ*^{s}) are reasonable. In particular, a large difference in estimated *ρ*^{p} and *ρ*^{s} may alternatively indicate the presence of a slowly-dividing and highly-mutating subpopulation (Fig H in S1 File). Constraining *ρ*^{p} = *ρ*^{s}, arguably a reasonable null model, increases the fraction of simulated experiments in which the heterogeneous model is selected (Fig L in S1 File).

If there is reason to suspect heterogeneity in the stress response, experimentalists can test this hypothesis directly by engineering fluorescent reporters into the bacterial strain of interest and measuring the response on a single-cell level, e.g. by flow cytometry [8, 9, 24] or microscopy [8, 19, 20]. These experiments would provide an independent estimate of the fraction of the response-*on* subpopulation to further constrain the heterogeneous-response model and allow calculation of the mutation-rate increase associated with the response. In reality, multiple factors causing deviation from the standard fluctuation assay model (e.g. heterogeneous stress responses, differential mutant fitness, and cell death) will likely operate simultaneously. Since it is not feasible to reliably estimate a large number of parameters from fluctuation assay data alone, separate experiments become important to decide which deviation(s) are most relevant to incorporate into the fluctuation analysis.

Interestingly, the homogeneous-response model performs well in estimating the increase in population mean mutation rate (Fig 5C). Therefore, mutation rate estimates from previous studies that neglect heterogeneity in stress-induced mutagenesis, such as [8, 11, 14], can simply be interpreted as population-wide averages. However, these studies may underestimate the true extent of mutagenesis associated with the expression of the stress response if it is only induced by a subpopulation of cells. Estimating not only the increase in population mean but also heterogeneity in mutation rate, as is possible with our method, could be important for parameterising evolutionary models, such as predictions of antibiotic resistance evolution. Theoretical modelling suggests that single-locus adaptation can be accurately captured by the population mean mutation rate, but within-population variation (even for a fixed population mean) can speed up multi-locus adaptation [45]. However, this previous model did not incorporate any coupling of changes in mutation rate to changes in cell division or death rates, as would be expected in the case of stress responses. Therefore, an important direction for future work is to assess when the pleiotropic effects of realistic stress responses truly accelerate evolution.
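The gap between a population-wide average and the increase specific to response-*on* cells can be made concrete with a small calculation. The Python sketch below assumes, for illustration only, that the population mean mutation rate is the average of the two subpopulation rates weighted by their fractions, and it ignores differences in division rate between the subpopulations. The function names and the numerical values are hypothetical and are not taken from the study.

```python
def population_mean_increase(f_on: float, rel_increase: float) -> float:
    """Fold change of the population mean mutation rate relative to
    response-off cells, assuming a fraction-weighted average:
    (1 - f_on) * 1 + f_on * rel_increase.
    """
    if not 0.0 <= f_on <= 1.0:
        raise ValueError("f_on must lie between 0 and 1")
    if rel_increase < 0.0:
        raise ValueError("rel_increase must be non-negative")
    return (1.0 - f_on) + f_on * rel_increase


def response_specific_increase(f_on: float, mean_increase: float) -> float:
    """Invert the weighted average: recover the increase specific to
    response-on cells from the population mean fold change, given an
    independent estimate of f_on.
    """
    if not 0.0 < f_on <= 1.0:
        raise ValueError("f_on must be positive and at most 1")
    return (mean_increase - (1.0 - f_on)) / f_on


if __name__ == "__main__":
    # Hypothetical values: 5% of cells with a 100-fold higher rate.
    mean_fold = population_mean_increase(f_on=0.05, rel_increase=100.0)
    print("population mean fold change:", mean_fold)
    print("recovered response-specific increase:",
          response_specific_increase(f_on=0.05, mean_increase=mean_fold))
```

Under these assumptions, a small response-*on* subpopulation with a strongly elevated mutation rate yields only a modest increase in the population mean, which is why a population-wide estimate can understate the mutagenesis associated with the response itself.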

Our approach to quantifying stress-induced mutagenesis assumes that the expression of the stress response is bimodal and can reasonably be modelled as either switched *off* or *on*. To a reasonable approximation, this expression pattern has been observed for the SOS response, particularly in slow-growth conditions [27]. In other conditions or for other stress responses, it might be more appropriate to model the expression as a unimodal distribution. We expect, however, that this increase in model complexity would make parameter inference more challenging. Similarly, for simplicity, we neglect stochastic induction of the stress response under permissive conditions. Low levels of stress-response expression have been reported, for example, due to spontaneous DNA breakage [46, 47]. We expect our method to be robust to low levels of stress response induction under permissive conditions since, in this case, the subpopulation with elevated stress response level will be negligibly small. This also implies, though, that with our method we cannot effectively quantify heterogeneity in mutation rates in unstressed conditions.

To derive an analytical expression for the mutant count distribution, we made a series of approximations, the most important one being that cells with the stress response switched *on* have a net growth rate (*γ*_{on} − *δ*_{on}) much lower than that of *off* cells (*γ*_{off} − *δ*_{off}). For the SOS response, this approximation is valid, as induction of the response inhibits cell division. However, it might be violated for other stress responses, particularly if they protect cells from death, resulting in *δ*_{on} < *δ*_{off}. In this case, our approximation is no longer valid, and therefore, parameter estimation using our method is expected to be less accurate. Nonetheless, the estimated mutation-rate increase is robust to relative fitness of response-*on* cells up to ≈75% and is only marginally improved by inferring *r*_{on} rather than setting it to zero (Fig 4A).

We also assume response-*on* cells do not switch the response *off* so long as the stress remains present during a fluctuation assay’s comparably short growth phase. In particular, this assumption implies that the model cannot capture stress responses that are transiently expressed and associated with pulse-like mutagenesis even under continued exposure to the stressor, such as the oxidative stress response [48]. Our model could be adapted for stress responses where induction of the response is associated with a *decrease* in mutation rate along with increased cell viability, such as the adaptive (Ada) response to DNA alkylation damage, which also exhibits within-population heterogeneity in timing of induction [22, 23]. However, this situation would require a different parameterisation of the model, in which our current analytical approximations break down and the potential for parameter inference would need to be re-tested. Overall, developing models tailored to other stress responses offers an interesting direction for future work.

In summary, we have presented and validated a new method for inferring stress-induced increases in mutation rate from fluctuation assays. Importantly, however, both a heterogeneous stress response and a homogeneous response with mutant fitness costs can generate similar patterns in fluctuation assay data (Fig B in S1 File), which calls for further experiments to distinguish these models. While the homogeneous-response model can estimate the increase in population mean mutation rate, our new method of inferring heterogeneous mutation rates would be crucial for accurately characterising the mutagenic effects of stress responses and parameterising models of multi-locus adaptation. In future work, we aim to incorporate our new method into user-friendly tools for application to experimental data, similar to existing R packages [37, 49] and web tools [16, 32, 50, 51] for fluctuation analysis.

## Supporting information

### S1 File. Supplementary information.

Mathematical derivations, example mutant count distributions, sensitivity analysis, 95% confidence intervals, parameter estimation and model selection for additional parameter ranges, and comparison of model selection procedures.

https://doi.org/10.1371/journal.pcbi.1012146.s001

(PDF)

## Acknowledgments

The authors are grateful for helpful feedback on the mathematical results received from Tibor Antal and for the discussion and inspiration provided by the Alexander and El Karoui labs.

## References

- 1. Friedberg EC, Walker GC, Siede W, Wood RD, Schultz RA, Ellenberger T. Mutagenesis and Translesion Synthesis in Prokaryotes. In: DNA Repair and Mutagenesis. Washington, DC, USA: ASM Press; 2014. p. 509–568. Available from: http://doi.wiley.com/10.1128/9781555816704.ch15.
- 2. Foster PL. Stress-Induced Mutagenesis in Bacteria. Critical Reviews in Biochemistry and Molecular Biology. 2007;42(5):373–397. pmid:17917873
- 3. Bjedov I, Tenaillon O, Gérard B, Souza V, Denamur E, Radman M, et al. Stress-Induced Mutagenesis in Bacteria. Science. 2003;300(5624):1404–1409. pmid:12775833
- 4. Tenaillon O, Denamur E, Matic I. Evolutionary significance of stress-induced mutagenesis in bacteria. Trends in Microbiology. 2004;12(6):264–270. pmid:15165604
- 5. Ram Y, Hadany L. The evolution of stress-induced hypermutation in asexual populations. Evolution. 2012;66(7):2315–2328. pmid:22759304
- 6. Ram Y, Hadany L. Stress-induced mutagenesis and complex adaptation. Proceedings of the Royal Society B: Biological Sciences. 2014;281 (1792). pmid:25143032
- 7. Cirz RT, Chin JK, Andes DR, de Crécy-Lagard V, Craig WA, Romesberg FE. Inhibition of Mutation and Combating the Evolution of Antibiotic Resistance. PLoS Biology. 2005;3(6):e176. pmid:15869329
- 8. Pribis JP, García-Villada L, Zhai Y, Lewin-Epstein O, Wang AZ, Liu J, et al. Gamblers: An Antibiotic-Induced Evolvable Cell Subpopulation Differentiated by Reactive-Oxygen-Induced General Stress Response. Molecular Cell. 2019;74(4):785–800.e7. pmid:30948267
- 9. Zhai Y, Pribis JP, Dooling SW, Garcia-Villada L, Minnick PJ, Xia J, et al. Drugging evolution of antibiotic resistance at a regulatory network hub. Science Advances. 2023;9(25):eadg0188. pmid:37352342
- 10. Wu YL, Scott EM, Po ALW, Tariq VN. Development of resistance and cross‐resistance in Pseudomonas aeruginosa exposed to subinhibitory antibiotic concentrations. APMIS. 1999;107(1-6):585–592. pmid:10379686
- 11. Gillespie SH, Basu S, Dickens AL, O’Sullivan DM, McHugh TD. Effect of subinhibitory concentrations of ciprofloxacin on Mycobacterium fortuitum mutation rates. Journal of Antimicrobial Chemotherapy. 2005;56(2):344–348. pmid:15956099
- 12. Henderson-Begg SK, Livermore DM, Hall LMC. Effect of subinhibitory concentrations of antibiotics on mutation frequency in Streptococcus pneumoniae. Journal of Antimicrobial Chemotherapy. 2006;57(5):849–854. pmid:16531433
- 13. Baharoglu Z, Mazel D. Vibrio cholerae Triggers SOS and Mutagenesis in Response to a Wide Range of Antibiotics: a Route towards Multiresistance. Antimicrobial Agents and Chemotherapy. 2011;55(5):2438–2441. pmid:21300836
- 14. Kohanski MA, DePristo MA, Collins JJ. Sublethal Antibiotic Treatment Leads to Multidrug Resistance via Radical-Induced Mutagenesis. Molecular Cell. 2010;37(3):311–320. pmid:20159551
- 15. Gutierrez A, Laureti L, Crussard S, Abida H, Rodríguez-Rojas A, Blázquez J, et al. *β*-lactam antibiotics promote bacterial mutagenesis via an RpoS-mediated reduction in replication fidelity. Nature Communications. 2013;4. pmid:23511474
- 16. Krašovec R, Richards H, Gomez G, Gifford DR, Mazoyer A, Knight CG. Measuring Microbial Mutation Rates with the Fluctuation Assay. Journal of Visualized Experiments. 2019;2019(153):1–9. pmid:31840662
- 17. Foster PL. Methods for determining spontaneous mutation rates. Methods in enzymology. 2006;409:195–213. pmid:16793403
- 18. Frenoy A, Bonhoeffer S. Death and population dynamics affect mutation rate estimates and evolvability under stress in bacteria. PLOS Biology. 2018;16(5):e2005056. pmid:29750784
- 19. McCool JD, Long E, Petrosino JF, Sandler HA, Rosenberg SM, Sandler SJ. Measurement of SOS expression in individual Escherichia coli K‐12 cells using fluorescence microscopy. Molecular Microbiology. 2004;53(5):1343–1357. pmid:15387814
- 20. Mrak P, Podlesek Z, van Putten JPM, Žgur-Bertok D. Heterogeneity in expression of the Escherichia coli colicin K activity gene cka is controlled by the SOS system and stochastic factors. Molecular Genetics and Genomics. 2007;277(4):391–401. pmid:17216493
- 21. Kamenšek S, Podlesek Z, Gillor O, Žgur-Bertok D. Genes regulated by the Escherichia coli SOS repressor LexA exhibit heterogenous expression. BMC Microbiology. 2010;10(1):283. pmid:21070632
- 22. Uphoff S, Lord ND, Okumus B, Potvin-Trottier L, Sherratt DJ, Paulsson J. Stochastic activation of a DNA damage response causes cell-to-cell mutation rate variation. Science. 2016;351(6277):1094–1097. pmid:26941321
- 23. Uphoff S. Real-time dynamics of mutagenesis reveal the chronology of DNA repair and damage tolerance responses in single cells. Proceedings of the National Academy of Sciences. 2018;115(28):E6516–E6525. pmid:29941584
- 24. Woo AC, Faure L, Dapa T, Matic I. Heterogeneity of spontaneous DNA replication errors in single isogenic Escherichia coli cells. Science Advances. 2018;4(6):2–10. pmid:29938224
- 25. Jones EC, Uphoff S. Single-molecule imaging of LexA degradation in Escherichia coli elucidates regulatory mechanisms and heterogeneity of the SOS response. Nature Microbiology. 2021. pmid:34183814
- 26. Vincent MS, Uphoff S. Cellular heterogeneity in DNA alkylation repair increases population genetic plasticity. Nucleic Acids Research. 2021;49(21):12320–12331. pmid:34850170
- 27. Jaramillo‐Riveri S, Broughton J, McVey A, Pilizota T, Scott M, El Karoui M. Growth‐dependent heterogeneity in the DNA damage response in Escherichia coli. Molecular Systems Biology. 2022;18(5):1–14. pmid:35620827
- 28. Choudhary D, Lagage V, Foster KR, Uphoff S. Phenotypic heterogeneity in the bacterial oxidative stress response is driven by cell-cell interactions. Cell Reports. 2023;42(3):112168. pmid:36848288
- 29. Lagage V, Uphoff S. Pulses and delays, anticipation and memory: seeing bacterial stress responses from a single-cell perspective. FEMS microbiology reviews. 2020;44(5):565–571. pmid:32556120
- 30. Friedberg EC, Walker GC, Siede W, Wood RD, Schultz RA, Ellenberger T. The SOS Responses of Prokaryotes to DNA Damage. In: DNA Repair and Mutagenesis. Washington, DC, USA: ASM Press; 2014. p. 463–508. Available from: http://doi.wiley.com/10.1128/9781555816704.ch14.
- 31. Baharoglu Z, Mazel D. SOS, the formidable strategy of bacteria against aggressions. FEMS Microbiology Reviews. 2014;38(6):1126–1145. pmid:24923554
- 32. Łazowski K. Efficient, robust, and versatile fluctuation data analysis using MLE MUtation Rate calculator (mlemur). Mutation Research—Fundamental and Molecular Mechanisms of Mutagenesis. 2023;826(April). pmid:37104996
- 33. Feller W. Die Grundlagen der Volterraschen Theorie des Kampfes ums Dasein in wahrscheinlichkeitstheoretischer Behandlung. Acta Biotheoretica. 1939;5(1):11–40.
- 34. Keller P, Antal T. Mutant number distribution in an exponentially growing population. Journal of Statistical Mechanics: Theory and Experiment. 2015;2015(1):P01011.
- 35. Bos J, Zhang Q, Vyawahare S, Rogers E, Rosenberg SM, Austin RH. Emergence of antibiotic resistance from multinucleated bacterial filaments. Proceedings of the National Academy of Sciences. 2015;112(1):178–183. pmid:25492931
- 36. Bezanson J, Edelman A, Karpinski S, Shah VB. Julia: A Fresh Approach to Numerical Computing. SIAM Review. 2017;59(1):65–98.
- 37. Mazoyer A, Drouilhet R, Despréaux S, Ycart B. flan: An R Package for Inference on Mutation Models. The R Journal. 2017;9(1):334.
- 38. Gabbiani F, Cox SJ. Stochastic Processes. In: Mathematics for Neuroscientists. Elsevier; 2010. p. 251–266. Available from: http://www.worldscientific.com/doi/abs/10.1142/9789813148963_0008 https://linkinghub.elsevier.com/retrieve/pii/B9780123748829000162.
- 39. Gillespie DT. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry. 1977;81(25):2340–2361.
- 40. Garibyan L. Use of the rpoB gene to determine the specificity of base substitution mutations on the Escherichia coli chromosome. DNA Repair. 2003;2(5):593–608. pmid:12713816
- 41. Lee H, Popodi E, Tang H, Foster PL. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proceedings of the National Academy of Sciences of the United States of America. 2012;109(41). pmid:22991466
- 42. Marinus MG. DNA methylation and mutator genes in Escherichia coli K-12. Mutation Research/Reviews in Mutation Research. 2010;705(2):71–76. pmid:20471491
- 43. Vasse M, Bonhoeffer S, Frenoy A. Ecological effects of stress drive bacterial evolvability under sub-inhibitory antibiotic treatments. ISME Communications. 2022;2(1). pmid:37938266
- 44. Zheng Q. Estimation of Rates of Non-neutral Mutations When Bacteria are Exposed to Subinhibitory Levels of Antibiotics. Bulletin of Mathematical Biology. 2022;84(11):131. pmid:36178523
- 45. Alexander HK, Mayer SI, Bonhoeffer S. Population Heterogeneity in Mutation Rate Increases the Frequency of Higher-Order Mutants and Reduces Long-Term Mutational Load. Molecular Biology and Evolution. 2016;34(2):244.
- 46. Pennington JM, Rosenberg SM. Spontaneous DNA breakage in single living Escherichia coli cells. Nature Genetics. 2007;39(6):797–802.
- 47. Vincent MS, Uphoff S. Bacterial phenotypic heterogeneity in DNA repair and mutagenesis. Biochemical Society Transactions. 2020;48(2):451–462. pmid:32196548
- 48. Lagage V, Chen V, Uphoff S. Adaptation delay causes a burst of mutations in bacteria responding to oxidative stress. EMBO reports. 2023;24(1). pmid:36397732
- 49. Zheng Q. rSalvador: An R package for the fluctuation experiment. G3: Genes, Genomes, Genetics. 2017;7(12):3849–3856. pmid:29084818
- 50. Hall BM, Ma CX, Liang P, Singh KK. Fluctuation AnaLysis CalculatOR: a web tool for the determination of mutation rate using Luria–Delbrück fluctuation analysis. Bioinformatics. 2009;25(12):1564–1565. pmid:19369502
- 51. Gillet-Markowska A, Louvel G, Fischer G. bz-rates: A Web Tool to Estimate Mutation Rates from Fluctuation Analysis. G3 Genes|Genomes|Genetics. 2015;5(11):2323–2327. pmid:26338660