Phenotypic-dependent variability and the emergence of tolerance in bacterial populations

Ecological and evolutionary dynamics have been historically regarded as unfolding at broadly separated timescales. However, these two types of processes are nowadays well-documented to intersperse much more tightly than traditionally assumed, especially in communities of microorganisms. Advancing the development of mathematical and computational approaches to shed novel light onto eco-evolutionary problems is a challenge of utmost relevance. With this motivation in mind, here we scrutinize recent experimental results showing evidence of rapid evolution of tolerance by lag in bacterial populations that are periodically exposed to antibiotic stress in laboratory conditions. In particular, the distribution of single-cell lag times—i.e., the times that individual bacteria from the community remain in a dormant state to cope with stress—evolves its average value to approximately fit the antibiotic-exposure time. Moreover, the distribution develops right-skewed heavy tails, revealing the presence of individuals with anomalously large lag times. Here, we develop a parsimonious individual-based model mimicking the actual demographic processes of the experimental setup. Individuals are characterized by a single phenotypic trait: their intrinsic lag time, which is transmitted with variation to the progeny. The model—in a version in which the amplitude of phenotypic variations grows with the parent’s lag time—is able to reproduce quite well the key empirical observations. Furthermore, we develop a general mathematical framework allowing us to describe with good accuracy the properties of the stochastic model by means of a macroscopic equation, which generalizes the Crow-Kimura equation in population genetics. Even if the model does not account for all the biological mechanisms (e.g., genetic changes) in a detailed way—i.e., it is a phenomenological one—it sheds light onto the eco-evolutionary dynamics of the problem and can be helpful to design strategies to hinder the emergence of tolerance in bacterial communities. From a broader perspective, this work represents a benchmark for the mathematical framework designed to tackle much more general eco-evolutionary problems, thus paving the road to further research avenues.

Consider the dynamics of a population of N individual bacteria cyclically exposed to antibiotics, as described in the main text. Each bacteria is characterized by its intrinsic lag time τ i ∈ P = R + , where P denotes the continuous one−dimensional phenotype space of possible lag time values. In addition, each individual can be in two different states, as indicated by the variable y i : if y i = 1 the individual is in the awake state, such that it can reproduce, while if y i = −1 it is in dormant state, where no activity happens. Finally, the environment, represented by the parameter η, varies between two states characterized by the presence (η = −1) or absence (η = 1) of antibiotics. The environmental dynamics is considered to be cyclical with period T max , η(t) = η(t + T max ), and within each cycle: if T a < T ≤ T max (S1) where T ∈ [0, T max ] is the time within a cycle and T a is the duration of the antibiotic phase. The whole population state is thus represented by the vectors τ N = (τ 1 , τ 2 , .., τ i , .., τ N ) and y N = (y 1 , y 2 , .., y i , .., y N ). The vector subindex indicates the dimensionality, i.e., the number of bacteria: dim(τ N ) = N, dim(τ i,ĵ N −1 ) = N − 1. To further proceed, let us consider that all individuals are indistinguishable, i.e., that the system state is invariant under individual permutations; for example: τ N = (τ 1 , τ 2 , .., τ i , .., τ N ) = τ N = (τ i , τ 1 , τ N , . . . , τ 2 , . . . , τ 3 ). The following notation is implemented in order to account for changes in the population; consider a reference state τ N = (τ 1 , τ 2 , . . . , τ i , . . . , τ j , . . . , τ N ); then a "modified" state with respect to τ N is defined asτ i,ĵ N −1 = (τ 1 , τ 2 , . . . ,τ i , . . . , τ j−1 , τ j+1 , .., τ N ) = (τ 1 , τ 2 , . . . ,τ i , . . . , τ N −1 ), where the i individual has a different trait,τ i = τ i , and the j one is absent, while the rest of individuals have the same traits in both states. Note that the property of indistinguishability has been used to derive the last equality.
Having defined the states, one can define the dynamics. In particular, each individual cell can perform diverse stochastic "reactions" (using the jargon of stochastic processes), that are illustrated in Fig. 1 of the main text.
Any individual i in the growing state(y i = 1) with traitτ i is exposed to a demographic-adaptive process (∆ D ): • It can try to reproduce asexually at rate g(τ i , τ N ) = b ∈ R + . On the one hand, if the antibiotics are present (η = −1), the reproduction does not happen and the individual is killed. On the other hand, in fresh environment, (η = 1), the reproduction is successful and the individual is replaced by two offspring with phenotypes τ i =τ i +ξ i and τ j =τ i +ξ j , where ξ i , ξ j are the variations, which are sampled from probability distribution β(ξ i = τ i −τ i ;τ i ) and β(ξ j = τ j −τ i ;τ i ) respectively. If the population size N has reached the carrying capacity K this reproduction event must be accompanied by the death of individual j with trait τ j , chosen at random (i.e. with rate µ = 1 K−1 ), much as in the Moran process. In principle, β is an arbitrary probability distribution normalized in the jump amplitude (the first argument): ∞ −τ β(τ −τ ;τ )d(τ −τ ) = 1, but it can depend on the initial phenotype, that is indicated with the second argument. In the following, for the sake of notation, the second argument will be omitted when not necessary.
• If in the awake state, it can die by natural causes at rate µ(τ i , τ N ) = d ∈ R + . Furthermore, any individual with trait τ i can switch on and off from dormancy with the following processes (∆ GD ): • If dormant (y i = 1), it can wake up at rate A(τ i , τ N ) = 1/τ i .
• If awake (y i = 1) and in the presence of antibiotics (η = −1) it can sense their presence and switch to dormancy at rate S(τ i , τ N ) = s ∈ R + .

A. Master equation
The general trait-dependent expression of the rates will be used for the sake of theoretical comprehensiveness, and it will be reduced to the particular one used in the main text when necessary. From these processes it is possible to write down a master equation for probability of the system configurations P (τ N , y N , t): where ∆ BD stands for the time change with respect to the demographic part of the dynamics, and the ∆ GD for the growing-dormant state transitions. In addition, it is useful to separate the demographic change into two contributions, the birth-death part, i.e. the set of processes that change the population size N, and the Moran processes part that drives the dynamics when the carrying capacity K is reached and does not change N : Birth-death contribution To take into account the change in population size, it is assumed that the probability of a state (τ N , y N , t) can be written as the ratio of some population state function ψ(τ N , y N , t), ψ : P → R 0,+ , and its normalization: ψ is non-negative, and its integral is heuristically assumed not to diverge, but it is not a probability function since it is not normalized. It can be thought as a scalar field generalizing, to population configurations, the concept of the number of individuals of a particular species in a discrete set of them (e.g. in evolutionary game theory, where the frequency of species i is written as p i = ni N , where n i is the number of individual of species i and N the population size). The time evolution of the probability density is specified by: where we have defined the normalizing factor: Note that in (S5) the probability is conserved, so once the equation of motion for ψ is written down, one has an equation for P . Thus, for the ∆ BD term it is mandatory to take into account this normalization, whereas it is not necessary for the ∆ GD and ∆ M terms, which do not change N. Let us consider the various processes that change the population size: • Birth, that increases the population's size of one, e.g. from a stateτ i,ĵ N −1 to τ N , with rate: where the i individual with traitτ i produces two offspring with traits τ i , τ j . Naturally, to reproduce individual i must be in the growing state and antibiotics must be absent.
• Death of the i individual by the effect of antibiotics, which reduces the population size from N to N − 1 at rate: where the indexî represent the absence of the i individual with respect to τ N .
• Natural death of individual i at rate: Thus, by adding these two rates (eq.S8-S9) one gets the full death rate: Putting together all these contributions, ∆ BD can be written as: (S11) Note that in the first three lines it is necessary to normalise the sums by the initial population size to respect individuals' indistinguishability. Consider for example the first line: it quantifies the probabilities that the state τ N is the result of a birth event in the stateτ i,ĵ N −1 where the i individual has reproduced. One of the offspring is indicated as x i but there are N − 1 possibilities of assigning the index j to the second one. Due to individual indistinguishability all these possibilities are equivalent, and therefore the sum must be normalized by N − 1.
To rewrite this expression in terms of probabilities, we normalize it in the following way: where the re-scaling factor is defined as: . (S13) Thanks to this definition, ∆ BD reads: Let us now calculate explicitly the normalization factor Λ N (t): The expression can be split in four terms: and each of them can be calculated separately. The first term is: (S17) Using the fact that individuals are indistinguishable the sums N i=1 , N j=1,i =j can be replaced with a factor N ·(N −1) by fixing i and j, e.g. i = 1, j = 2: ,τ 1,2 N −1 )δ η,1 δỹ 1 ,1 δ y1,1 δ y2,1 P (τ 1,2 N ,ỹ 1,2 N ).
(S18) By using the normalization of the β's in the variable δ = τ −τ , ∞ 0 dτ 1 β(τ 1 −τ 1 ) = ∞ −τ1 dδβ(δ) = ∞ 0 dτ 2 β(τ 2 −τ 1 ) = 1, one obtains: By performing a marginalization y∈{±1} P (τ 1,2 N ,ỹ 1,2 N ) = P (τ 1,2 N ,ỹ 1 ) and defining the average birth and death rates: one readily obtains: Similarly, one can work out the second term: where i has been fixed to N + 1. The third one reads: where the indices has been fixed as i = 1, j = N + 1. Finally, the fourth one term can be written as: Thus, by collecting the four terms (S22, S23, S24, S26) one finally obtains an explicit expression for the normalization factor: Moran process contribution. When the systems reaches the carrying capacity K each birth must be compensated by a random death. This part of the dynamics follows a master equation that conserves the system size N = K: where the transition rate is given by: Growing and Dormant state contribution The part representing the transitions between growing and dormant states is relatively easy to define given that N is conserved: where the transition rateỹ i → y i is given by: the δ stand for Kronecker deltas introduced because each term only contributes in a specific medium or state.

B. Gillespie algorithm
To integrate the master equation (S2) numerically we use the Gillespie algorithm. The Gillespie algorithm is a kinetic Monte Carlo Method [7] initially proposed in the field of chemistry to simulate individual trajectories of a set of chemical reactions [3]. However, since its initial formulation, the Gillespie algorithm has gained great importance e.g. statistical physics and biophysics [4]. In general each individual simulation of the Gillespie algorithm represents an exact trajectory of the probability density function that is ruled by the master eq. (S2) [7].
At each time step one possible event is randomly chosen with a probability proportional to its rate. In our case there are four different events in the antibiotic-exposure phase and five events in the fresh medium phase. Moreover, each bacteria can be in two different states: growing state and dormant state.
Antibiotic-exposure phase: 1) A bacteria in the growing state can be randomly chosen to die with rate d = const. Overall, bacteria death at rate ϕ death = N G · d, where N G is the number of growing individuals. 2) A bacteria in growing state can be randomly chosen to die while trying to reproduce with rate b = const. Overall Fresh medium phase: 1) A bacteria in the growing state can be randomly chosen for die at a constant rate d. Overall, bacteria death at rate ϕ death . 2) A bacteria in the growing state can be randomly chosen to reproduce at constant rate b. Overall, bacteria reproduce at rate ϕ birth = b · N G . Bacteria reproduce asexually by duplication, so both offspring will inherit the same phenotype τ i lag plus a variation. Here we explore two possibilities for the variation σ(τ ), additive amplitude σ A (τ ) ∼ α A = const and multiplicative amplitude σ m (τ ) ∼ α M · τ , as defined in S3. . Importantly, there is also a maximum carrying capacity K, so once it is reached for each reproduction event , a randomly chosen member of the population is deleted (following a Moran process).
Since the rate ϕ i exit d.s. is specific to each bacteria i, the search of the randomly chosen reaction could be computationally expensive, thus a binary search is implemented. In the worst case, binary search needs O (Ln (N )) comparisons, whereas direct search needs O (N ) comparisons.

A. Marginalization
The master equation (S2,S14, S28, S30) rules the system's evolution at the microscopic level (individual-based). Our aim now is to derive the "one-cell" probability density [2], φ (1) (τ, y, t): given a certain pair (τ, y), φ (1) gives the probability of finding an individual in that state at a time t, i.e. it is the one-particle probability density in the jargon of statistical physics. φ (1) is an experimentally measurable magnitude that provides a macroscopic description (population-based) of the system. To obtain the one-cell density one needs to marginalize P (τ N , y N , t), i.e. to remove the dependence of P in all individuals but one [5]. Mathematically, the one-cell probability density is defined as: and, by writing down explicitly the average operator . . . , it does coincide with the marginalized phenotypic distribution for the first individual: In the same way the n−cells probability density can be defined: Taking the time derivative of φ (1) , by using the master eq.(S2) one obtains: Here on, we consider separately the birth-death (∆ BD , S14), Moran (∆ M , S28) and growing-dormant (∆ GD , S31) contributions.
Birth and death contribution. The Birth-Death is defined as: The birth-death contribution of the master eq.(S14) is composed by five terms: the four positive/negative contribution of birth and death, plus the normalizing factor. This last one does not depend on the variables, and thus it is not relevant for the marginalization. Therefore, we focus here on the first four terms one at the time: The first one takes into account the positive contribution of reproduction: (S39) Let us split the i, j sums in three contributions: N j=2 =i , to decompose the expression in three contributions: The first two terms, B a 1 y B b 1 give a similar contribution: in the first sum, the individual 1 duplicates and gives birth to the new individual 1 and the j one, while the second term is just the reversed process. Indeed, there are (N − 1) ways of doing so, because when the first individual is chosen there are other N − 1 that could be its offspring. Hence, the first term is then computed as follows: and using the fact that individuals are indistinguishable, one can rewrite the N j=2 by fixing j = 2 and pulling out a N − 1 factor: The second contribution of the sum has a very similar form: where the N i=2 pulls out a (N − 1) factor by fixing i = 2. By renamingτ 2 →τ 1 one obtains: Finally, the third contribution of the sum, N i=2 N j=2 =i reads: where first sum, N i=2 , pulls a N − 1 factor by fixing i = 2, whereas the second sum, N j=2 =i , pulls a N − 1 factor by fixing j = 3. By splitting the following factor dτ1 N −1 as dτ1 ,2 N −3 dτ 2 dτ 3 and performing the integrals P dτ 2 β(τ 2 −τ 2 ) = P dτ 3 β(τ 3 −τ 2 ) = 1 one gets to: By summing all the three contributions (S42, S44, S46) one obtains the marginalization of the positive birth contribution: The marginalization of second term of the birth-death process, i.e. positive contribution of death, is straightforward: where the index i has been fixed to i = N + 1. The third term of the master equation quantifies the negative contribution of reproduction where the index j has been fixed to N + 1. By splitting the sum , and integrating the variations kernels one obtains: where in the second term we have fixed i = 2. Finally, the fourth term, i.e. the negative contribution of death, is very similar to the second one: by splitting the sum as , one obtains: By putting all terms together (S38, S46,S47, S49, S51) and marginalizing also on the y 1 variable one finally obtains the marginalized birth-death contribution: Moran Process Contribution The marginalization of the Moran process contribution (when N = K) is analogous to the birth-death one. It is necessary to consider separately the positive and negative contributions, The first one reads: As done before, one can split the sum as N j=2,j =i to respectively separate three contributions: The calculation is identical to the first term of the birth-death part (S46). The three contributions read: By summing them all, one obtains: The marginalization of the negative term is analogous to the birth-death one (S49), leading to: By summing the two terms and summing over y 1 one obtains the marginalized contribution of the Moran process: Growing and Dormant state contribution Using the definition of this part of the process from the master eq.(S30, S31): and splitting the sum in i as , the expression is composed by two terms: as defined now. Let us consider first the second term, which takes into account the transitions in which individual 1 is not involved: Using individual indistinguishability, the index i is fixed to i = 2 and removing the sum N i=2 by pulling out a factor (N − 1): and by marginalizing this expression in y j , j = 3, . . . , N one obtains: Obviously, if the individual before the transition was in a state,ỹ 2 in {±1}, after it the state is opposite one y 2 in {±1}, implying that the sum of the contribution is zero: Thus, the first term N i=1 δ i,1 , which takes in count the transitions in which individual 1 is involved, is the only net contribution to the growing-dormant part: This is because index 1 is the only one that remains when function P (τ N , y N , t) is marginalized, i.e., τ = τ 1 . It is possible to separate the previous equation in two contributions: where and Given that the rates do not depend on the variable y, one can marginalize in the variables using the sum yj in{±1},j∈ [2,N ] and obtain: where the growing and dormant densities have been defined as: while their evolution is given by:

B. Mean Field Approximation
To proceed further, one can first apply the individual rate approximation, i.e the rates depend just on the single individual traits: Thanks to this approximation it is possible to integrate over most of individuals traits, but it leaves the dependence on the two-cells density as a kind of "correlation kernel" in the birth-death contribution eq.(S52): where we have used the definition of growing/dormant densities (S73), the 2 − cells density function (S35) and the fact that 1 = δ η,1 + δ η,−1 .
Similarly for the Moran process contribution (S61) reads: On the contrary, in the marginalized growing-dormant part, eq.(S75-S76) , the rate approximation (S77) is enough to reach the final expression: To simplify further the birth-death (S78) and Moran (S79) part one proceeds with a second mean field approximation. Specifically, we assume that the correlation between individuals is negligible. This is expected to be true in the thermodynamic limit N → ∞ [2,6]. In such a case one can factorize the probability distribution: and the 1-cell density can be replaced by its mean-field approximation (removing also the individual-subindexτ 1 ): An important consequence of this simplification is that the normalization coefficients a N,M fulfills the relationship: and, moreover, the different marginal averages no longer dependent on the system size: Hence, the birth-death term (S78) becomes: In practical terms φ(τ, t) is a frequency, and, for this reason, one can evaluate the normalization constant a as 1 a = N −1 N . Furthermore, by implementing the mean-field approximation it has been assumed the thermodynamic limit N → ∞, i.e. infinite population size, hence the the normalization constant can be approximated as: N −1 N ≈ 1. Thanks to all of these, equation (S87) simplifies to: Importantly, if one applies the mean-field approximation (S82) to the Moran process contribution (S79) it coincides with the contribution of the birth-death process in the fresh medium (see [5]): Thus, in the mean-field description, i.e. in the thermodynamics limit, it is not necessary to take into account if the population size is constant or varying: ∆ D φ(τ, t) = ∆ BD φ(τ, t). By using the decomposition in growing and dormant densities (S73) in the birth-death part, it is possible to separate such contribution in two equations: By summing the birth-death (S90)and the growing-dormant (S80) contributions, and by rewriting the deltas as one finally obtains the general mean-field equations : By adding these two eqs. one can easily compute the total marginalized distribution : Finally, fixing the rates as and by inserting them in the mean-field equation (S93) one finds the equations (1) and (2) of the main text.

S3. SMALL VARIATION APPROXIMATION
When variations are small, it is possible to perform a diffusive approximation to equation (S95). Specifically, we consider the term proportional to the variation kernel, β(τ −τ ;τ ), which regulates the probability of changing from traitτ to τ during reproduction, and expand it aroundτ ∼ τ . Note that β depends on the magnitude of variation, δ = τ −τ , but can also depend explicitly on the value of the starting trait,τ , (i.e. state dependent, or multiplicative variation). Given that β is a probability distribution of variation, hence: ∞ −τ dδβ(δ;τ ) = 1. Furthermore, one can assume it to be symmetric in δ: β(δ;τ ) = β(−δ;τ ); and with finite first and second moments: Let us separate the mean-field equation for the birth-death change of the growing individual as(S90) in two parts: where the first part involves the local variation events: and the second involves the remaining rates: where the deltas have been replaced as in eq.(S92). Considering now ∆ 1 (S99), which does conserve the probability and has the typical structure of a master equation [8]): with the "effective" stochastic transition rate from phenotypeτ to τ : Thus it is possible to perform a Kramers-Moyal expansion of eq.(S101) around τ =τ . The final trait can be written as the initial one plus the "jump" amplitude δ: τ =τ + δ, allowing one to rewrite the transition rate as function just of the initial trait and of the jump amplitude: Hence, equation(S99) can be rewritten as: where we have performed the following change of variable: One can consider δ as a small change in the first variable of the rate, and Taylor expand uo to second order: Inserting it back in the equation and noting that W (τ ; δ) = W (τ ; −δ), one obtains: where Plugging the eq. for ∆ 1 (S107) into eq.(S98), and then inserting it back in the mean-field equation for the growing population (S95), one obtains: By adding ∂ t φ G (S110) and ∂ t φ D (S94) one obtains the following expression for ∂ t φ(τ, t) within the small-variation approximation: By choosing the rates as in main text(S96), one obtains the equations for the growing and dormant densities: In addition, one can define an effective fitness function f (τ, t) ≡ b φ G (τ,t) φ(τ,t) , leading to the following expression of eq.(S111) : This expression for the marginal probability distribution is a version of the celebrated Crow-Kimura selection-mutation equation for "infinite alleles" in population genetics [1,5]. From here on, we refer eq. (S114) as the generalized Crow-Kimura equation (GCK). The first term of this equation, φ (τ, t) f (τ, t) − f , is a replicator-like term for continuous phenotypes. The replicator dynamics captures the essence of selection. If a phenotype τ has fitness which is below the average fitness f it will tend to disappear, whereas if it is above f it will tend to expand through the population. The second term, that resembles the divergence of a current from a Fokker-Planck eq.( [8]), captures the effects of mutation/variation: Note that this term has the typical structure of a conserved local probability current of a Fokker-Planck equation, modeling variations as a reaction-diffusion dynamics [5] and then allowing φ(τ, t) to move across the phenotype space. Let us note that the main structural difference with the classic Crow-Kimura eq. is the presence of the fitness function in the variation terms. This dependence is typical of phenotypic evolution and couple together ecological and evolutionary timescales.
Consider the two types of variation kernel presented in the main text. The additive case has no direct dependence on the initial phenotype; so we choose : where, due to the truncation in the domain [−τ , ∞], the normalization function Z A presents a dependence on the phenotypic state and Erf c stands for the complementary error function. Its first two cumulants are: (S119) For sufficiently small α it is possible to ignore the Erf c function, such that: (S120) On the other hand, is the multiplicative case we consider a Gaussian with the variance proportional to the phenotype: measures, τ G.M.F. j and τ G.C.K. j are the mean of the general mean field and the generalized Crow-Kimura equation respectively at time j · ∆t, and super-index as. in δ as. (α) points out that comparison is performed in the asymptotic state (see fig. (S4 B) for a visual definition of δ as. ).
On the one hand, the numerical integration in the additive amplitude case (S117) is straightforward. Fig. S1 shows the time evolution of the first three cumulants (K 1 ,K 2 and K 3 ) of φ(t, τ ) along a cycle in the asymptotic state both for the general mean field and the GCK equation. Moreover, in fig. S2 the parameter δ is plotted for different variation amplitudes α A , showing the range of applicability of the small-variation approximation. On the other hand, the multiplicative case presents some subtleties since the multiplicative variation kernel (S121) FIG. S2. Range of validity of the small variation approximation in the additive scenario. Systematic comparison of equations (S114) and (S95) via parameter (S124) for different αA values, both axis in log−scale. Note that the deviation monotonically increases with αA. In particular for αA = 0.16h, the one used in the main text results, the deviation δst. = (3.5 ± 0.2) · 10 −2 h, (S124) is small enough to use equation (S114). Parameter values: Ta = 3h, the other as in the main text.
has a singularity in τ = 0. In addition, since its variance is proportional to α ·τ , one has to numerically impose that: (S125) Individuals in the vicinity ofτ = 0 experience either negligible or null variations after reproduction, thus one can safely decompose the variation contribution of the general mean field equation (S95) in two terms: where is a small parameter that delimits the vicinity ofτ = 0, such that variations of individuals withτ < can be considered as null. The initial condition φ(t = 0, τ ) = φ G (t = 0, τ ) is defined by cases: it is a truncated Gaussian (see methods in the main text), with the peak µ , and standard deviation is σ = 1h 16min for τ ≥ , and it is set to zero for 0 < τ < . Fig. (S3) shows the integration of the mean-field equation with this numerical prescription and it is compared with the GCK eq. In conclusion, the validity of the small-variation approximation both in the additive and multiplicative case is confirmed.
Border effects. We also study the importance of border effects, i.e the dependence onτ (initial phenotype) in the normalization/moments of β given by the truncation of the probability density in ] −τ , ∞[. Equivalently, we check if their negligence, eqs.(S120,S123), is a good approximation in the range of variation amplitudes α A and α M that fits the experimental results. In fig. (S4) the exact moments, (S118-S122), are compared with the approximated ones,(S120,S123), both in the additive and multiplicative case. Clearly, the results indicate the border effects are considerable in the additive case, but negligible in the multiplicative one.
where M + 1 = Tmax ∆t + 1 is the number of measures taken through a whole cycle (note that +1 is added since measures starts at t = 0), T max = 23h is the total duration of one single experimental cycle, ∆t = 0.5h is the waiting interval between measures, τ sim. j and τ theo. j are the mean lag time calculated from simulations and the generalized Crow-Kimura equation at time j · ∆t, respectively and the super-index as indicates that it is calculated in the asymptotic state.

S5. SPONTANEOUS SHIFTING TO THE DORMANT STATE
In this section we investigate how the main results are changed when there is a non-vanishing spontaneous rate to enter the dormant state in the absence of antibiotics, e.g. when starving. Let us call such a generalized rate S(τ i , τ N ); in the main text we use where the sub-indexes k and f denote "killing medium" and "fresh medium", respectively. Given that bacteria can enter in dormant state when they sense lack of resources, i.e. when they approach starvation (i.e. the system carrying capacity), we consider a small, but non-zero, value for the rate to enter the dormant state in the fresh medium. In particular, we keep s k constant as in the main text but consider two possible scenarios for s f : (i) a constant rate s f =s = 0.01 all across the fresh-medium phase, or (ii) a state-dependent rate, increasing as the carrying capacity is approached; in particular we consider a sigmoidallike function s f =s (tanh[−cN G /K]), (s = 0.01).
Computational results for the variants of the model obtained by implementing these types of rate are displayed in Figures S6 and S7, corresponding to the end of the 10th exposure cycle (results are presented only for the multiplicative version of the model, but similar curves and conclusions can be obtained for the additive version). Observe that, on the one hand, a non-vanishing s f increases the values of K 1 , K 2 , K 3 , but maintains the functional form of the lag-time distributions, see S6 (b) and (c). Thus, one can recover the results of the main text by just tuning the mutational amplitude parameter, α M . On the other hand, the dynamics of the total number of bacteria in both growing, N G , and dormant state, N D , along the cycles changes significantly as shown in Fig. S7. In particular, there is a larger number of bacteria in the dormant state -with respect to the scenario in the main text, with s f = 0-when the killing phase starts, as bacteria enter the dormant state during the fresh medium of the previous cycle. Indeed, N D decreases until a minimum is reached, as bacteria die in presence of the antibiotics. N G maintains a very similar dynamics, but there is an overall reduction as the number of bacteria approaches the maximum carrying capacity. A similar behavior is observed in Fig.S6. In conclusion, considering s f = 0 -though not a completely realistic assumption-is a reasonable approximation which eases the tractability of the model by reducing the computational times without significantly modifying the qualitative results. At the beginning of the cycle ND is non-vanishing, since bacteria entered this state during the fresh phase of the previous cycle. During the antibiotic exposure phase both, ND and NG, decrease to a minimum as the bacteria die. When the antibiotic is removed and a fresh medium is added, NG grows towards the system's carrying capacity. Observe that the bacteria still enter the dormant state, but the reproduction rate is much higher, in such a way that an overall reduction of NG is only observed as the system approaches the carrying capacity. 16h for one single cycle. In particular, the mean lag time, K1, is shown along a cycle in the asymptotic regime. At T = 0 the antibiotic is added and the system enters in the "killing phase" (T ∈ [0, Ta]). When the antibiotic is present, the system experiences a selection pressure towards longer lag times, in consequence, K1 increases. At T = Ta the antibiotic is washed and the fresh medium is added (T ∈ [Ta, Tmax]). In this regime, the selection pressure is towards shorter lag times and K1 relaxes back to the initial value. (C) Lag-time probability distribution at T = 0 (leftmost curve) and T = Ta = 3h (rightmost curve) as derived theoretically (Eq.(S114), dashed lines) and computationally (dots). In the asymptotic state the system oscillates between these two limiting probability distributions, both of them exhibiting weak tails. (D) Evolution of the first three cumulants, K1, K2 and K3, (mean, variance , and skewness respectively) along a cycle in the asymptotic state (both theoretical and computational results are displayed). Observe that in C/D the theory correctly predicts the properties of the distribution, however there are deviations due to finite size effects. FIG. S11. Comparison of experimental and simulated M DK99 of the evolved population after 10 cycles of exposure. For both the additive and multiplicative cases, we observe that the simulated M DK99 falls within -or it is very close-the experimental values (i.e. the mean plus error) for Ta = 3h and 5h, but outside for the case of Ta = 8h. This result is to be expected given the higher noise of the experimental measurements. In particular the experimental mean is τ exp. Ta=8h = 10 ± 1h higher than the theoretical prediction of 8h.

S7. MOVIES
We provide three videos to help visualize the system dynamics: • (V1) https://github.com/MCMateu/Phenotypic-dependent-variability/blob/main/AsymptoticAdditi ve.mp4 • (V2) https://github.com/MCMateu/Phenotypic-dependent-variability/blob/main/AsymptoticMultip licative.mp4 • (V3) https://github.com/MCMateu/Phenotypic-dependent-variability/blob/main/TransitoryAdditi ve.mp4 The first two videos show the evolution of φ(τ, t) along an antibiotic exposure cycle in the asymptotic state, calculated both by the simulation via Gillespie algorithm and by numerical integration of generalized Crow-Kimura equation (S111) in the additive (V1) and multiplicative (V2) case respectively. Note that, as in fig. 4 of the main text, there exists a small deviation between theory and simulation, due to finite-size effects. The duration of the killing phase is T a = 3h, and the duration of the whole cycle is T = 23h. The full length of the video is 23 s, thus each second approximately matches 1h of the cycle. During the first three seconds the distributions move towards longer lag times (to the right side), since individuals with longer lag times are the one that survives, until the mean lag time reaches a maximum value when antibiotics are removed. Then, the distribution moves back towards the initial configuration. Numerical value the of parameters: α A = 0.16h, α M = 0.048, b = 2.4h −1 , s = 0.12h −1 , d = 3.6 · 10 −5 h −1 .
On the other hand, the third video (V3) shows the transient dynamics of φ(τ, t) as calculated by the numerical integration of the general mean-field equation (S95) and the generalized Crow-Kimura (S111) in the additive case. The full length of the video is 50s at a speed of 25f ps, the full video counts with 1250 frames. Each cycle has 50 frames, so the total duration of each cycle is approximately 2s. The duration of the killing phase is T a = 5h. The oscillations observed in the video correspond with the killing phase, when the density moves to the right, and with the growing phase, when it moves to the left. Note that the two distributions matches almost perfectly. Numerical value the of parameters: α A = 0.035h , b = 2.4h −1 , s = 0.12h −1 , d = 3.6 · 10 −5 h −1 .