Effective polyploidy causes phenotypic delay and influences bacterial evolvability

Whether mutations in bacteria exhibit a noticeable delay before expressing their corresponding mutant phenotype was discussed intensively in the 1940s to 1950s, but the discussion eventually waned for lack of supportive evidence and perceived incompatibility with observed mutant distributions in fluctuation tests. Phenotypic delay in bacteria is widely assumed to be negligible, despite the lack of direct evidence. Here, we revisited the question using recombineering to introduce antibiotic resistance mutations into E. coli at defined time points and then tracking expression of the corresponding mutant phenotype over time. Contrary to previous assumptions, we found a substantial median phenotypic delay of three to four generations. We provided evidence that the primary source of this delay is multifork replication causing cells to be effectively polyploid, whereby wild-type gene copies transiently mask the phenotype of recessive mutant gene copies in the same cell. Using modeling and simulation methods, we explored the consequences of effective polyploidy for mutation rate estimation by fluctuation tests and sequencing-based methods. For recessive mutations, despite the substantial phenotypic delay, the per-copy or per-genome mutation rate is accurately estimated. However, the per-cell rate cannot be estimated by existing methods. Finally, with a mathematical model, we showed that effective polyploidy increases the frequency of costly recessive mutations in the standing genetic variation (SGV), and thus their potential contribution to evolutionary adaptation, while drastically reducing the chance that de novo recessive mutations can rescue populations facing a harsh environmental change such as antibiotic treatment. Overall, we have identified phenotypic delay and effective polyploidy as previously overlooked but essential components in bacterial evolvability, including antibiotic resistance evolution.

culture, which was roughly 2% of the initial volume. The samples were diluted in PBS before plating on non-selective media (for determining total cell density and genotypic mutant frequency) or selective media plates (for determining phenotypic mutant frequency). The dilution factor was adjusted accordingly. At the very first sampling for example, 90 µl of 10,000-fold diluted culture could yield a few hundred CFUs on non-selective media that allowed optimal counting on a normal 90 mm petri dish. This CFU-count from non-selective media was used to determine the bacterial population size in the culture, based on the known dilution factor and plating volume. The population size at each sampling time point was then compared to the population size sampled at time zero (right after the 30-min incubation following electroporation) to determine the number of population doublings, or bacterial generations, elapsed.
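The conversion from plate counts to generations elapsed can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names and the example counts are hypothetical, and it assumes that culture volume is the same at both time points so that the ratio of cell densities equals the ratio of population sizes.

```python
import math

def cell_density(cfu_count, dilution_factor, plated_volume_ul):
    """Cells per microlitre of undiluted culture, from one plate count."""
    return cfu_count * dilution_factor / plated_volume_ul

def generations_elapsed(density_t, density_0):
    """Population doublings between time zero and a later sampling point."""
    return math.log2(density_t / density_0)

# Hypothetical counts: 300 CFU from 90 ul of a 1e4-fold dilution at time zero,
# 240 CFU from 90 ul of a 1e5-fold dilution at a later time point.
d0 = cell_density(300, 1e4, 90)
dt = cell_density(240, 1e5, 90)
print(f"{generations_elapsed(dt, d0):.2f} generations")
```

In this example the density increases eight-fold, which corresponds to three population doublings.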
For selective plating from the first sampling time point, 90 µl of 10-fold diluted culture may or may not yield even a single phenotypic mutant due to phenotypic delay. At later time points we diluted the samples more as cell density increased and genotypic mutants eventually developed their phenotype. We recommend testing the exact dilution factor and plating volume in smaller pilot experiments beforehand. (For the resistance mutations, we performed five pilot/prior experiments that showed consistent patterns of phenotypic delay, before we proceeded to the experiments reported here, with more sampling time points, samples and replicates per experiment. Similarly, for the lacZ-reporter assay, we established the method with two pilot experiments beforehand.) When working with the lacZ-reporter assay we used non-selective media plates containing 1 mM IPTG and 40 µg/ml X-gal. As visible colonies formed after overnight incubation, the plates were left at 4 °C for about a week for the blue colour to fully develop. Genotypic mutants could be directly visualised by colonies that contain blue colour. Phenotypic lacZ-mutants were scored on M9 agar plates with 0.4% lactose as carbon source.
All plates were incubated at 30 °C. The non-selective media plates were incubated overnight and used to quantify the bacterial population size and to assess the frequency of genotypic mutants (see Day 4). Selective plates were incubated for up to 48 hours, giving the mutants sufficient time to grow in the presence of selection before the phenotypic mutant colonies were counted. At the same sampling time points we also plated cells from the two controls onto selective media. We observed no colonies resulting from either spontaneous mutations or drug tolerance in these controls.
Day 4: CFU on non-selective plates incubated from Day 3 were counted to assess the population size of the bacteria at each sampling time point. The change in bacterial cell density over time was translated into growth expressed in number of bacterial generations. After counting CFU, entire plates of colonies were replica-plated onto selective media plates. Since the MIC of fully developed phenotypic mutants is at least ∼100 times higher than that of wild-type cells, colonies were replicated onto selective media with a drug concentration of 8x MIC rather than 2x MIC. The fraction of colonies that subsequently grew on selective media among all colonies from the same plate was recorded as the frequency of genotypic mutants (see Day 5).
We replica-plated 100-1000 colonies from each replicate at each sampling time point to ensure a decent estimate of genotypic mutant frequency. The replicated plates were incubated at room temperature overnight as doing so allows sufficient growth for replicated mutant colonies but limits the growth of rare, spontaneous mutants in wild-type colonies. Finally, the 48-hour time point was sampled on Day 5. Procedures for scoring genotypic and phenotypic mutants, as described above, were repeated as required on the following days for the 24-hour and 48-hour samples.
2 Fluctuation tests

2.1 Detailed simulation methods
Spontaneous de novo mutations: We adopted the standard "Lea-Coulson" formulation 4;5 to model the appearance of mutants during culture growth. In time $t_f$, the wild-type population grows deterministically from size $N_0$ to $N_f = N_0 2^{t_f} = N_0 e^{\beta t_f}$, where $\beta = \log 2$ is the exponential growth rate per unit time, with one time unit taken as the population doubling time. De novo mutants appear according to a non-homogeneous Poisson process with instantaneous rate $\beta \mu N_0 e^{\beta t}$ at time $t$. The mutation rate $\mu$ to be estimated has the interpretation of mutations generated per wild-type cell division in this standard approach 6. (Note that our $\mu$ corresponds to $\mu/\beta$ in certain key references 7;6.) We then applied the results 8;9 that, to a good approximation, the total number of mutation events occurring is Poisson-distributed with mean $\mu N_0 (2^{t_f} - 1)$, and each clone descending from a de novo mutant has "developing time" (time remaining until $t_f$) exponentially distributed with mean $1/\beta = 1/\log 2$. We drew random values from these distributions to determine the number of mutant clones to simulate and for how long. To account for polyploidy, we deconstructed the mutation rate as $\mu = c\mu_c$, where $c$ is ploidy and $\mu_c$ is the mutation rate per target copy. We assumed that a mutation arises on a single chromosome within a cell, and neglected the chance (of order $\mu^2$) that more than one chromosome mutates either simultaneously in one cell or later in the same mutant clone.
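The sampling step described above can be sketched as follows. This is an illustrative sketch rather than the authors' code (which was written in R): the function names and parameter values are hypothetical, the Poisson sampler is a simple textbook method suitable only for small means, and developing times are drawn from the untruncated exponential approximation.

```python
import math
import random

BETA = math.log(2)  # growth rate when one time unit equals one doubling time

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_mutant_clones(mu_c, ploidy, n0, t_f, rng):
    """Return the developing time of each mutant clone arising in one culture."""
    mu = ploidy * mu_c                        # per-cell rate, mu = c * mu_c
    mean_events = mu * n0 * (2 ** t_f - 1)    # mean number of mutation events
    n_events = sample_poisson(mean_events, rng)
    # Developing times are approximately exponential with rate beta.
    return [rng.expovariate(BETA) for _ in range(n_events)]

rng = random.Random(1)
clones = sample_mutant_clones(mu_c=1e-7, ploidy=2, n0=1000, t_f=12, rng=rng)
print(len(clones), [round(t, 2) for t in clones])
```

Each developing time would then be passed to a clone-level simulation to count the phenotypic mutants present at the end of culture growth.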
Unlike recombineering, which introduces mutations in the anti-sense DNA strand (see section 1), natural mutations may arise by a number of mechanisms. For instance, mutations immediately affecting both strands may arise when DNA double-strand breaks occur and introduce genome rearrangements such as inversions, insertions or deletions upon repair of the break. Mutations can also arise by damage or copying error affecting only a single strand. This results in a mismatch which, if not repaired, will lead to a fully fledged double-stranded mutation in one of the two descendant chromosomes only after another round of DNA replication.
By assuming that all descendants of a de novo mutant are also mutants, the standard model implicitly supposes that mutations arise in double-stranded form. In order to have agreement with the standard model in the monoploid case, and thus focus our investigation on the effect of ploidy, we also assumed that mutations arise in double-stranded form in our simulation model.
If mutations actually arise in single-stranded form, in most cases this will give equivalent results to our double-stranded polyploid model with adjusted parameter settings (see section 2.3 below).
Mutant clone dynamics: We tracked every cell in a mutant clone individually, including its chromosome composition and timepoints at which it is produced by division and in turn divides again. The interdivision time of each cell was either taken to be constant, or drawn independently at random from an exponential distribution. By assuming independence, we neglect any possible correlations of interdivision time between sister cells or across generations.
Other interdivision time distributions could readily be implemented, but these two cases capture the extremes of variance between which realistic interdivision time distributions will lie.
The exponential case corresponds to the standard Lea-Coulson formulation, in which the development of mutant clones follows a Yule process. A more complete investigation of interdivision time distributions in the context of fluctuation tests, though limited to monoploidy, has been undertaken by other authors 9;10 , and our observation of similar estimated mutation rate for the two different interdivision time distributions (S2 Fig) is consistent with these previous studies.
Also as in the standard model, we assumed that the mutant has the same fitness as the wild-type in non-selective media, in the sense that expected population doubling time is equal.
To achieve this, we fixed median interdivision time of mutant cells to one time unit (requiring interdivision time to be exactly one in the constant case, or exponentially distributed with rate $\beta$ in the exponential case). Mutant clones were simulated until no more divisions before $t_f$ occurred, then cells existing at time $t_f$ were scored for phenotype according to their chromosome composition and assumed dominance.
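A minimal sketch of such a clone simulation is given below. It is not the authors' implementation: the function name and arguments are hypothetical, and it assumes the segregation rule used elsewhere in this text, in which a heterozygous cell with $j$ mutant copies produces one daughter with $2j$ mutant copies and one wild-type daughter. Wild-type daughters are dropped because their descendants can never be phenotypic mutants.

```python
import math
import random

BETA = math.log(2)

def simulate_clone(developing_time, ploidy, dominant, exponential_times, rng):
    """Count phenotypic mutants in one clone at the end of culture growth."""
    stack = [(0.0, 1)]   # (birth time, mutant copies); progenitor has one copy
    phenotypic = 0
    while stack:
        birth, copies = stack.pop()
        lifetime = rng.expovariate(BETA) if exponential_times else 1.0
        division = birth + lifetime
        if division > developing_time:
            # Cell still exists at the end of growth: score its phenotype.
            if copies == ploidy or (dominant and copies > 0):
                phenotypic += 1
        elif copies == ploidy:
            # Homozygous mutant: both daughters are homozygous mutants.
            stack.append((division, ploidy))
            stack.append((division, ploidy))
        else:
            # Heterozygote: keep only the daughter carrying mutant copies.
            stack.append((division, 2 * copies))
    return phenotypic

rng = random.Random(0)
print(simulate_clone(3.5, ploidy=4, dominant=False, exponential_times=False, rng=rng))
```

With constant interdivision times the count is deterministic, which makes this case convenient for checking the logic before switching to exponentially distributed times.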
Our method of tracking cell "lifetimes" computationally was in part inspired by a previous approach 9 , and these authors' consideration of general interdivision time distributions, along with other extensions such as different growth rates of wild-type and mutant in non-selective media, has recently been implemented in an R package 10 . Our consideration of multiple chromosome copies here is (to the best of our knowledge) novel, and could potentially be integrated into R packages in the future.
Maximum likelihood estimation: The calculation of the likelihood under the standard model has been described previously 11;7. In brief, for any value of the composite parameter $m = \mu(N_f - N_0)$ (representing the mean number of mutation events occurring during culture growth), the probability mass function of the number of mutants in a culture at the end of growth can readily be calculated by recursive equations (Eq. 8 in ref. 7), which we implemented in R. We thereby computed the likelihood as a function of $m$, given the number of phenotypic mutants observed in each parallel culture. Numerical optimization yielded the maximum likelihood estimate of $m$. The endpoints of the 95% profile likelihood confidence interval were found by solving numerically for the values of $m$ at the threshold for rejection in a likelihood ratio test at 5% significance level. Finally, the estimate of $m$ was converted to an estimate of $\mu$ by dividing by $(N_f - N_0)$, assumed as usual to be a known and fixed value.
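The estimation procedure can be illustrated with the sketch below. It assumes that the recursive equations take the widely used form $p_0 = e^{-m}$, $p_n = \frac{m}{n}\sum_{j=0}^{n-1} \frac{p_j}{n-j+1}$ for the Lea-Coulson distribution; the text above cites the recursion without reproducing it, so this form is an assumption. The function names, the search bounds and the example counts are hypothetical, and a golden-section search stands in for whatever optimizer was actually used.

```python
import math

def mutant_pmf(m, n_max):
    """Probabilities of 0..n_max mutants under the assumed recursion."""
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        p.append(m / n * sum(p[j] / (n - j + 1) for j in range(n)))
    return p

def log_likelihood(m, counts):
    """Log-likelihood of m given mutant counts from parallel cultures."""
    p = mutant_pmf(m, max(counts))
    return sum(math.log(p[k]) for k in counts)

def mle_m(counts, lo=1e-6, hi=50.0, tol=1e-8):
    """Golden-section search, assuming a unimodal log-likelihood on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if log_likelihood(c, counts) > log_likelihood(d, counts):
            b = d
        else:
            a = c
    return (a + b) / 2

def mutation_rate(m_hat, n_f, n_0):
    """Convert the estimate of m to a per-division mutation rate."""
    return m_hat / (n_f - n_0)

counts = [0, 1, 0, 2, 0, 0, 5, 1, 0, 3]   # hypothetical parallel cultures
m_hat = mle_m(counts)
print(m_hat, mutation_rate(m_hat, n_f=1e8, n_0=1e3))
```

A profile likelihood confidence interval could be obtained in the same framework by solving for the values of $m$ at which the log-likelihood falls below its maximum by half the 95% quantile of the chi-squared distribution with one degree of freedom.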

2.2 Mathematical explanation for the distribution of phenotypic mutants
While it has occasionally been recognized in the literature that the existence of multiple chromosome copies and resulting "segregation lag" could affect fluctuation analysis 12;13;14;15 , the precise impact on the observed total mutant counts and resulting effects on mutation rate estimation under modern maximum likelihood methods have, to the best of our knowledge, never been elucidated. Our simulation study numerically illustrated the impact of effective polyploidy on mutation rate estimation. Here we provide a more detailed mathematical explanation of how effective polyploidy affects the distribution of the number of phenotypic mutants, thereby clarifying the numerical results.

Recessive mutations
When mutations are recessive, we observed numerically that the estimated mutation rate $\hat{\mu}$ was close to the actual per-copy mutation rate $\mu_c$, regardless of ploidy, and that the distribution of mutant counts predicted by the standard model parameterized by $\hat{\mu}$ provided a good match to the observed distribution from simulations of polyploid populations. These observations can be explained mathematically.
After the appearance of a mutation in one out of $c$ copies in a cell, according to our model of segregation there will be a delay of $n = \log_2 c$ cell divisions until the first homozygous mutant appears; all other descendants up to this time are heterozygotes or homozygous wild-type cells (S1 Fig). Since the mutation is recessive, no mutants are "visible" until homozygosity is achieved. (Recall that phenotypically wild-type cells are assumed to have no chance of dividing and forming visible colonies on selective plates in the fluctuation test.) After this lag, a phenotypic mutant lineage appears in which all further descendants are phenotypic mutants, regardless of ploidy. That is, the number of phenotypic mutants doubles in each subsequent generation, the same pattern as observed (without any lag) in a monoploid population.
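The segregation lag can be made concrete with a small closed-form sketch for the case of synchronous divisions. The function name is hypothetical, ploidy is assumed to be a power of two as in our model, and the dominant case is included only for comparison.

```python
def phenotypic_mutants(generations, ploidy, dominant):
    """Phenotypic mutants in one clone after a number of synchronous divisions."""
    n = ploidy.bit_length() - 1   # n = log2(c) for c a power of two
    if generations >= n:
        # Homozygosity reached: the mutant lineage doubles every generation.
        return 2 ** (generations - n)
    # During the lag exactly one cell carries mutant copies.
    return 1 if dominant else 0

print("gen  recessive(c=4)  dominant(c=4)  monoploid")
for g in range(6):
    print(g, phenotypic_mutants(g, 4, False),
          phenotypic_mutants(g, 4, True),
          phenotypic_mutants(g, 1, False))
```

For ploidy 4 the recessive clone is invisible for the first two divisions and thereafter follows the monoploid pattern shifted by two generations.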
Let $T_{\mathrm{clone}}$ denote the developing time of a randomly chosen mutant clone, i.e. the time between a de novo mutation arising in a single chromosome copy and the end time of culture growth ($t_f$) at which phenotypic mutants are counted. Since the timing of mutations is stochastic, $T_{\mathrm{clone}}$ is a random variable. The distribution of $T_{\mathrm{clone}}$ is approximately exponential with rate parameter $\beta = \log 2$ (ref. 8). In our model, exactly $n$ divisions are required to achieve homozygosity; however, the actual time between divisions can be variable. Thus the lag time until a clone achieves homozygosity, say $L$, is in general a random variable, independent of $T_{\mathrm{clone}}$ and equal in distribution to the sum of $n$ independent draws from the interdivision time distribution. A clone will produce visible descendants by $t_f$ if and only if $T_{\mathrm{clone}} \ge L$; in this case, we need to determine the number of visible descendants produced in the remaining time $T_{\mathrm{clone}} - L$.
Firstly, we determine the probability that a randomly chosen clone produces visible descendants by $t_f$, i.e. $\Pr(T_{\mathrm{clone}} \ge L)$. Recall that $T_{\mathrm{clone}}$ is approximately exponentially distributed,

$$\Pr(T_{\mathrm{clone}} \ge t) = e^{-\beta t} = 2^{-t}. \qquad (1)$$

If interdivision times are constant, then $L = n$ and $\Pr(T_{\mathrm{clone}} \ge L) = 2^{-n} = 1/c$. The same value is obtained for variable interdivision times (see the proof below).
Secondly, we require the distribution of remaining developing time in clones once they cross their lag time. Recalling that $T_{\mathrm{clone}}$ is (approximately) exponentially distributed, however, we can simply apply the memoryless property:

$$\Pr(T_{\mathrm{clone}} - L \ge t \mid T_{\mathrm{clone}} - L \ge 0) = \Pr(T_{\mathrm{clone}} \ge t).$$

In words, the remaining developing time after achieving homozygosity ($T_{\mathrm{clone}} - L$), conditioned on homozygosity being reached before the end of the developing time ($T_{\mathrm{clone}} - L \ge 0$), is equal in distribution to the total developing time of a randomly chosen clone ($T_{\mathrm{clone}}$). In this remaining time, the mutant clone produces only homozygous mutant descendants, with a pattern of growth that is independent of ploidy. Therefore, provided a mutant clone does produce visible descendants by time $t_f$, the number of these descendants is statistically no different to the monoploid case.
In summary, a ploidy-$c$ population is expected to produce a Poisson-distributed number of "visible" clones with mean $\mu_c (N_f - N_0)$ by observation time $t_f$, each consisting of a number of phenotypic mutants equal in distribution to the monoploid case. (All other mutation events do not produce any phenotypic mutants by $t_f$ and can thus be ignored.) The total number of phenotypic mutants counted is thus equal in distribution to a monoploid population with mutation rate $\mu_c$ in the standard model, explaining why we consistently estimate $\hat{\mu}$ close to $\mu_c$, regardless of ploidy, and find that the standard model provides a good fit to the observed mutant count distribution across simulated parallel cultures. In the case of recessive mutations, then, effective polyploidy leaves no signature in fluctuation assay data.
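The summary above implies that each mutation event yields a visible clone with probability $1/c$, since the mean number of clones drops from $c\mu_c(N_f - N_0)$ to $\mu_c(N_f - N_0)$. A quick Monte Carlo check of this probability, under the two interdivision time distributions considered in our simulations, might look as follows. The function name and trial count are hypothetical, and the developing time is drawn from the exponential approximation.

```python
import math
import random

BETA = math.log(2)

def prob_visible(ploidy, exponential_times, trials, rng):
    """Estimate Pr(T_clone >= L) for a clone needing n = log2(c) divisions."""
    n = ploidy.bit_length() - 1
    hits = 0
    for _ in range(trials):
        t_clone = rng.expovariate(BETA)   # developing time of the clone
        if exponential_times:
            lag = sum(rng.expovariate(BETA) for _ in range(n))
        else:
            lag = float(n)                # constant interdivision time of 1
        if t_clone >= lag:
            hits += 1
    return hits / trials

rng = random.Random(2)
for c in (1, 2, 4, 8):
    print(c, 1 / c,
          prob_visible(c, False, 100000, rng),
          prob_visible(c, True, 100000, rng))
```

Both distributions should give estimates close to $1/c$, consistent with the insensitivity of the estimated mutation rate to the interdivision time distribution.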
Proof for variable interdivision times: Here we derive $\Pr(T_{\mathrm{clone}} \ge L)$ for variable interdivision times. We impose the constraint that the asymptotic population growth rate (i.e. once the population reaches a stable age distribution) is equal to $\beta$, yielding a population doubling time of one unit, equal to the wild-type. Allowing an otherwise arbitrary interdivision time distribution, with probability density function $f(t)$, we thus require 16:

$$\int_0^\infty e^{-\beta t} f(t)\, dt = \frac{1}{2}. \qquad (2)$$

By comparison to Equation 1, $\Pr(T_{\mathrm{clone}} \ge t) = e^{-\beta t}$, the left hand side of Equation 2 is seen to be equivalent to $\Pr(T_{\mathrm{clone}} \ge X)$, where $X$ is a random interdivision time. That is,

$$\Pr(T_{\mathrm{clone}} \ge X) = \frac{1}{2}.$$

Next, the lag time can be expressed as $L = \sum_{i=1}^{n} X_i$, where the $X_i$ are independent and identically distributed (iid) random variables with density $f(t)$. Then we have:

$$\Pr\Big(T_{\mathrm{clone}} \ge \sum_{i=1}^{n} X_i\Big) = \Pr(T_{\mathrm{clone}} \ge X_1)\, \Pr\Big(T_{\mathrm{clone}} \ge \sum_{i=1}^{n} X_i \,\Big|\, T_{\mathrm{clone}} \ge X_1\Big) = \frac{1}{2} \Pr\Big(T_{\mathrm{clone}} \ge \sum_{i=2}^{n} X_i\Big),$$

where we have used the memoryless property of the exponential distribution of $T_{\mathrm{clone}}$ in the last step. Continuing inductively yields:

$$\Pr(T_{\mathrm{clone}} \ge L) = \Big(\frac{1}{2}\Big)^{n} = \frac{1}{c}.$$

Dominant mutations
We can readily put bounds on the range of "apparent" mutation rate, which would be estimated by the standard model, in a polyploid population. Firstly, whereas a recessive mutation yields a clone containing zero phenotypic mutants for the first $n$ generations, a dominant mutation yields one phenotypic mutant during this segregation lag phase. Subsequently, the number of phenotypic mutants increases according to the same pattern regardless of dominance. Thus the apparent mutation rate in the dominant population must be higher than in the recessive population, which was determined above to be $\mu_c$. Secondly, due to the segregation lag before the number of phenotypic mutants starts doubling, the number of phenotypic mutants observed in a ploidy-$c$ population must fall short of a monoploid population with mutation rate $c\mu_c$.
Our numerical results indicated that the estimated mutation rate ($\hat{\mu}$) for a dominant mutation in a ploidy-$c$ population does indeed typically fall between $\mu_c$ and $c\mu_c$ (S2 Fig). However, there was no constant re-scaling factor that would allow a clear interpretation of $\hat{\mu}$ in terms of the true per-copy or per-cell mutation rate. When the actual per-cell rate $c\mu_c$ was low (either low $c$ or low $\mu_c$), the estimate $\hat{\mu}$ was closer to $c\mu_c$, but as $c\mu_c$ increased, $\hat{\mu}$ fell increasingly short (S2 Fig). Furthermore, the confidence intervals on the estimate became larger as ploidy increased. These results can be traced to the fundamental difference in clone size distribution: due to the segregation phase, polyploidy leads to an excess of singletons in individual clones relative to the standard model. The standard model thus provided a poor fit to the observed mutant counts, even when parameterized by $\hat{\mu}$ (S3 Fig).
The dynamics of a dominant mutation in effectively polyploid cells, along with implications for observed mutant distributions and estimated mutation rates, were actually pointed out quite early in the literature 12 , but have rarely been addressed since. A more precise mathematical description of the distribution of phenotypic mutant counts for a dominant mutation in a polyploid population, and thus a basis for accurately estimating mutation rate, is a subject of our ongoing work.

2.3 Effect of single-stranded mutations
Recall that our model, like the standard model for fluctuation analysis, assumes mutations arise in double-stranded form. In reality, as explained in section 2.1 above, mutations may also arise in single-stranded form (i.e. mismatches), requiring one additional generation before producing one daughter cell with a fully fledged double-stranded mutation and one non-mutant daughter cell. To understand what difference this would make to our results, we must take into account the strand on which the mismatch arises. mRNA is transcribed from the anti-sense DNA strand. There is evidence that transcription can occur even from a mismatched heteroduplex, producing mutant mRNA and protein in the progenitor cell in which the mismatch arose 17 .
As usual, we suppose that the phenotype of the cell is fully determined by its present genome content (more specifically on the anti-sense DNA strand of each chromosome copy), i.e. we neglect any additional delays in mRNA and protein turnover.
We must consider several cases depending on ploidy, dominance, and the strand on which the mismatch arises. Suppose that mismatches arise in the sense strand at per-copy rate $\mu_s$ and in the anti-sense strand at per-copy rate $\mu_a$. For comparison to our polyploid model where mutations arise in double-stranded form, see S1 Fig.
• Cells are monoploid ($c = 1$) and the mismatch arises on the anti-sense strand. Then the progenitor cell is a phenotypic mutant, but there is a delay of one generation until the mutation segregates into double-stranded form in one daughter lineage, while the other daughter lineage is wild-type. This is equivalent to our ploidy-2 model with dominant mutations arising at per-copy rate $\mu_c = \mu_a/2$.
• Cells are monoploid and the mismatch arises on the sense strand. Then the mutation will not be expressed in the progenitor, but only in the daughter lineage inheriting the double-stranded mutation. This is equivalent to our ploidy-2 model with recessive mutations arising at per-copy rate $\mu_c = \mu_s/2$.
• Cells are polyploid ($c \ge 2$) and mutations are recessive. Then regardless of the strand on which they arise, mismatches will not have a phenotypic effect on the progenitor due to masking by the other genome copies. The necessity to first segregate into double-stranded form adds one generation until achieving homozygosity and thus phenotypic expression. This is equivalent to our ploidy-$2c$ model with recessive mutations arising at per-copy rate $\mu_c = (\mu_s + \mu_a)/2$.
• Cells are polyploid, mutations are dominant and arise on the anti-sense strand. Then mismatches have immediate phenotypic effects, but there is a delay of $1 + \log_2 c = \log_2(2c)$ generations until achieving homozygosity. This is equivalent to our ploidy-$2c$ model with dominant mutations arising at per-copy rate $\mu_c = \mu_a/2$.
• Cells are polyploid, mutations are dominant and arise on the sense strand. Then the mutant phenotype is not expressed in the progenitor, but will be expressed in the following generation in the daughter inheriting the double-stranded mutation. That is, there is a delay of one generation to produce phenotypic mutants, but $\log_2 c$ further generations until reaching homozygosity. This is not precisely equivalent to any case of our model.
To summarize, in all but the last case, a model in which mutations initially arise as single-stranded mismatches can be converted to some form of our polyploid model in which mutations arise in double-stranded form. In reality, a mixture of mutation types will arise and contribute to the apparent mutation rate that would be estimated in a fluctuation test.
3 Effects of polyploidy on mutation rate estimation via whole-genome sequencing
In recent years, bacterial mutation rates have been estimated not only by fluctuation tests but also by whole-genome sequencing (WGS) 18;19;20;21;22. We therefore asked whether effective polyploidy also impacts WGS-based mutation rate estimates, in particular whether WGS yields the per-cell or per-copy rates. Mutation rates estimated by WGS are typically referred to as "per-genome" rates, which equal per-cell rates under the tacit assumption that bacteria are monoploid. With ploidy level taken into account, if WGS truly estimates the per-genome rates, then it has the same issue as fluctuation tests using recessive mutations. However, given that WGS-based mutation rates can be several times (up to one order of magnitude) higher than rates from fluctuation tests 18;19;20, it is also plausible that WGS actually yields per-cell rates that reflect the underlying ploidy level. To our knowledge, however, the impact of polyploidy on WGS-based mutation rates has never been explored. In the mutation accumulation (MA) assay, as described above, a single colony (assumed to be founded by a single cell) is picked at each bottleneck to continue propagation, and at the final time point for sequencing.
Thus, one cell is chosen, presumably uniformly at random, among all existing cells in the population at each sampling time. In order to quantify the mutations accumulated along its direct lineage, we would need to follow this lineage backwards in time to the common ancestor that founded the population. In general, a typical "backward lineage" does not show the same statistical properties (particularly the number of cell divisions elapsed) as a typical "forward lineage" 23. However, if we make the important simplifying assumptions that interdivision time is fixed (in particular, not affected by any mutations) and there is no cell death, then it is equivalent to choose a lineage at random either backwards or forwards in time. Thus, we simplify the process by considering cell divisions forwards in time along a single focal lineage throughout the entire experiment, from the founder cell to the cell forming the colony picked for sequencing. At each cell division, we thus randomly choose one daughter cell to be the ancestor of the chosen cell, i.e. to remain in the sampled lineage, and "discard" the other daughter cell from consideration (S4 Fig, A).
In this model, the probability of a cell acquiring a mutation on any of its genome copies within one generation is given by $c\mu_g$, with $\mu_g$ the per-genome-copy mutation rate and $c$ the effective ploidy. For a mutation to be registered in WGS, it must fix in the sampled lineage.
If cells are monoploid, then the mutant progenitor and its offspring are all homozygous, and all mutations occurring in the sampled lineage will therefore be passed down the lineage and registered in WGS. For a polyploid cell that acquires one mutation, however, only a fraction $1/c$ of its descendants will become homozygous mutants while the rest become genetically wild-type, due to asymmetric inheritance that we showed experimentally (main text Fig 3). To register a mutation requires that the sampled lineage passes through a homozygous mutant descendant of the mutant progenitor (S4 Fig, B-C), which occurs with probability $1/c$. Therefore, although polyploidy increases the influx of mutations by a factor $c$, this is cancelled out by a decrease of factor $c$ in the probability that a mutation fixes in the sampled lineage, leaving $\mu_g$ as the apparent mutation rate. This result may not be exact if cell division times are variable, since our model of sampling one cell at random at each cell division forwards in time is then not precisely equivalent to sampling at random from all cells existing at a given time point in the real MA assay. However, the argument shows that we will not in general recover the per-cell mutation rate, and instead expect something close to the per-genome-copy mutation rate.
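The cancellation argument can be checked with a simple Monte Carlo sketch of a single focal lineage. The function name and parameter values are hypothetical, the mutation rate is inflated for illustration, and the sketch makes two simplifying assumptions beyond those in the text: at most one new mutation arises per generation, and each mutation segregates independently of any others in the lineage.

```python
import random

def apparent_rate(mu_g, ploidy, generations, rng):
    """Mutations fixed per generation along one randomly sampled lineage."""
    n = ploidy.bit_length() - 1          # divisions needed to reach homozygosity
    fixed = 0
    for _ in range(generations):
        if rng.random() < ploidy * mu_g:   # mutation on one of the c copies
            # At each of the next n divisions the sampled daughter inherits
            # the mutant copies with probability 1/2; otherwise it is lost.
            if all(rng.random() < 0.5 for _ in range(n)):
                fixed += 1
    return fixed / generations

rng = random.Random(3)
for c in (1, 2, 4, 8):
    print(c, apparent_rate(mu_g=0.01, ploidy=c, generations=200000, rng=rng))
```

Under these assumptions the apparent rate should stay close to $\mu_g$ for every ploidy, rather than scaling with $c$.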
In the end, then, mutation rate estimates derived by WGS applied to MA assays represent (approximately) per-genome-copy rates, comparable to fluctuation tests with recessive mutations returning per-target-copy rates. Effective polyploidy thus does not explain the aforementioned discrepancy of up to an order of magnitude between WGS-based and fluctuation test-based estimates of mutation rate. Although both methods are impacted similarly by effective polyploidy, the mechanisms are fundamentally different. In fluctuation tests, effective polyploidy introduces phenotypic delay and masks heterozygous mutants from selection. In mutation accumulation, mutations are assumed to be neutral, but asymmetric inheritance caused by effective polyploidy decreases their fixation probability.
Although we focused on MA, the two aforementioned alternative WGS methods, based on LTEE and MDS, yield mutation rate estimates of the same order of magnitude as MA-based rates 18;19. In LTEE, counting neutral mutations actually yields slightly lower mutation rates than MA and MDS 18. In MDS, since mutation rates are estimated based on polymorphism in whole-population samples 19, the resulting per-nucleotide mutation rates are literally per-nucleotide, which translates readily to per-genome rates for any given genome size. However, as we have shown here, these estimated rates must be interpreted carefully as they do not reflect the per-cell mutation rate.

4 Standing genetic variation and rescue

4.1 General model notes
In this section we develop the models we use to investigate the evolutionary consequences of effective polyploidy (main text Fig 7). Recall that in our framework, every cell contains $c$ ($= 1, 2, 4, 8, \ldots$) copies of the locus of interest, which for brevity we refer to as "ploidy", though chromosomes may only be partial. At each generation these chromosomes double and are evenly divided between the two daughter cells, splitting at the deepest point in their genealogy. A cell's type is defined by the number of mutant chromosomes (out of $c$) that it carries: type $j = 0$ is homozygous wild-type; types $j = 2^i$, $i = 0, 1, \ldots, n-1$ are heterozygous; type $j = 2^n = c$ is homozygous mutant.
We model population dynamics in discrete generations. To capture fitness differences for bacteria undergoing binary fission, we suppose a type $j$ cell successfully divides before dying, i.e. produces two offspring rather than zero, with probability $p_j$. These probabilities will differ between the old environment (to calculate mutation-selection balance) and the new environment (to calculate the probability of rescue). In the deterministic calculation of the mutation-selection balance below, only the proportion of offspring of each type matters; thus, the results are not actually particular to organisms reproducing by binary fission, but do depend on the types of offspring that can be produced, determined here by the assumed pattern of chromosome segregation. In the stochastic model for rescue, the entire probabilistic distribution of offspring numbers matters, not only expected values.
Consistent with the population genetics literature, here we define mutation rate (per cell, per generation) as the proportion of offspring that are mutant (in a deterministic model) or the probability that each offspring independently is mutant (in a stochastic model). Again, we suppose mutations arise in double-stranded form (cf. discussion in sections 2.1 and 2.3 above).
For ploidy $c$, a wild-type cell has mutation rate $\tilde{\mu} = c\tilde{\mu}_c = 2^n \tilde{\mu}_c$, where $\tilde{\mu}_c$ is the mutation rate per copy. Although it is possible that more than one copy mutates simultaneously in this model, these second-order events will drop out of the approximations we derive.
We choose our notation ($\tilde{\mu}$) to make a subtle distinction from the mutation rate parameter ($\mu$) used in the fluctuation test model (section 2). Recall that the standard definition of mutation rate in the fluctuation analysis literature is the number of mutants produced per wild-type cell per division; that is, a mutation event, occurring with probability $\mu$, retains one wild-type cell and produces one new mutant 6. In contrast, in the population genetics approach, each offspring has a probability $\tilde{\mu}$ to become a mutant. Assuming two offspring as in binary fission, the probability of producing one mutant upon wild-type cell division is thus, to first order, $2\tilde{\mu}$. Therefore, a bacterial mutation rate estimated by standard methods from a fluctuation test should be divided by two in order to parameterize a typical population genetics model. This discrepancy in the interpretation of mutation rate has not, to our knowledge, been explicitly recognized, as such models are rarely treated side by side. In order to maintain consistency with each existing body of literature, such that we recover standard results in the monoploid case, we use the two different definitions of mutation rate in our respective modeling approaches.

Mutation-selection balance (deterministic)
Here we suppose mutations are costly (where this cost may be masked in heterozygotes, depending on the dominance assumption) and calculate the frequency of the mutant types at equilibrium.

Mathematical derivations
Let $x_j(t)$ denote the frequency (proportion) of type $j$ in the population at generation $t$, and $w_j$ the relative fitness of type $j$, where fitness is given by the expected number of surviving offspring, and relative fitnesses are normalized by the wild-type. (Under our binary fission model where individuals have two offspring with probability $p_j$ or zero offspring with probability $1 - p_j$, we have $w_j = p_j/p_0$.) We denote population mean fitness by $\bar{w}(t) := \sum_j w_j x_j(t)$. A proportion $\tilde{\mu}$ of type 0 individuals' offspring mutate to type 1, while the rest remain type 0. According to our model of chromosome segregation (see main text Materials and Methods), type $2^i$ individuals ($0 \le i \le n-1$) produce half type 0 and half type $2^{i+1}$ offspring, while type $c = 2^n$ individuals produce only type $2^n$ offspring. Then for $c > 1$ the type frequencies can be described by the following recursions (with census after mutation and before selection):
Provided the mutation rate is small compared to fitness costs, mutants are rare and the wild-type dominates the population: $x_0(t) \approx 1$ and $\bar{w}(t) \approx 1$. Then the equilibrium frequencies, $x_j^*$, approximately satisfy:
This yields the general solution for the mutation-selection balance in polyploids under our model of chromosome segregation:
We assume the mutation has fitness cost $s$ in a homozygous mutant relative to a homozygous wild-type; that is, $w_{2^n} = 1 - s$. The condition that mutants are rare amounts to $\tilde{\mu} \ll s \le 1$.
We can then consider various dominance models to determine the fitness of the heterozygotes.
When the mutation is completely recessive, only homozygous mutants bear the cost, while heterozygotes have the same fitness as wild-types: $w_{2^i} = w_0 = 1$ for $0 \le i \le n-1$. Substituting into the general solution yields:
Then the total frequency of the mutant allele, i.e. the proportion of all alleles in the population that are mutant, is:
When the mutation is completely dominant, all heterozygotes have the same fitness as homozygous mutants:
Substituting into the general solution yields:
The total mutant allele frequency is then:
For comparison, taking the same solution approach for monoploids ($c = 1$ and $w_1 = 1 - s$) recovers the standard result for mutant frequency, $x_1^* \approx \tilde{\mu}/s$.
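As a numerical cross-check on the verbal model, the deterministic dynamics can be iterated to equilibrium. The sketch below is only one reading of the model: it assumes that only offspring of type 0 parents mutate, that heterozygotes of type $2^i$ leave half type 0 and half type $2^{i+1}$ offspring, and that frequencies are renormalized by mean fitness each generation. The function names and the indexing (`x[0]` for type 0, `x[k]` for type $2^{k-1}$) are choices made for this illustration.

```python
def step(x, w, mu):
    """Advance type frequencies by one generation.
    x[0] is type 0; x[k] (k >= 1) is type 2**(k-1); ploidy is 2**n."""
    n = len(x) - 2
    wbar = sum(wj * xj for wj, xj in zip(w, x))  # population mean fitness
    new = [0.0] * len(x)
    # Type 0 parents: offspring stay type 0, or become type 1 with prob. mu.
    new[0] += (1.0 - mu) * w[0] * x[0]
    new[1] += mu * w[0] * x[0]
    # Heterozygotes 2**i (i < n): half type 0, half type 2**(i+1) offspring.
    for k in range(1, n + 1):
        new[0] += 0.5 * w[k] * x[k]
        new[k + 1] += 0.5 * w[k] * x[k]
    # Homozygous mutants produce only homozygous mutants.
    new[n + 1] += w[n + 1] * x[n + 1]
    return [v / wbar for v in new]


def equilibrium(n, s, mu_c, recessive, generations=2000):
    """Iterate from an all-wild-type population to (near) equilibrium."""
    mu = (2 ** n) * mu_c                      # per-cell mutation rate
    w_het = 1.0 if recessive else 1.0 - s     # heterozygote fitness
    w = [1.0] + [w_het] * n + [1.0 - s]
    x = [1.0] + [0.0] * (n + 1)
    for _ in range(generations):
        x = step(x, w, mu)
    return x


def mutant_allele_frequency(x):
    """Proportion of all chromosome copies that carry the mutation."""
    n = len(x) - 2
    return sum(x[k] * 2 ** (k - 1) / 2 ** n for k in range(1, n + 2))
```

Under these assumptions, and for a small mutation rate, the iteration gives a homozygous mutant frequency close to the per-copy mutation rate divided by $s$ in the recessive case regardless of ploidy, and a total mutant allele frequency in the dominant case close to the monoploid value, in line with the statements in the Results below.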

Results
The key expressions in the cases of complete dominance or complete recessivity (Equations 7-11) are summarized in Table 1 in the main text, and an example of mutant allele frequency as a function of cost is plotted in S5 Fig. In the completely recessive case, the cost is masked in heterozygotes, and thus the mutant allele is maintained at a higher overall frequency than in a monoploid population. Meanwhile, the frequency of homozygous mutants is independent of ploidy.
In the completely dominant case, both heterozygotes and homozygotes are subject to selection and thus present at a lower frequency than in the recessive case. The total mutant allele frequency turns out to be exactly the same as in the monoploid case. However, since heterozygote cells contribute less than 100% mutant chromosomes, the frequency of cells containing at least one mutant chromosome must necessarily be larger than in the monoploid case, where every mutant cell is weighted by a 100% contribution. Indeed, the total frequency of heterozygous and homozygous mutants among all cells is given by:
The factor in brackets can be shown to be larger than one for any $s \in (0, 1]$ and $n \ge 1$, and increasing with $n$. That is, the total frequency of hetero- and homozygous mutants increases with ploidy ($c = 2^n$). However, a $c$-fold increase in ploidy yields less than a $c$-fold increase in this total mutant cell frequency. This is because mutational influx is proportional to $c$, but half of heterozygous mutants' offspring are wild-type and thus "lost" from this pool.
For comparison, Otto and Whitton [24] gave the equilibrium frequency of a costly mutant allele to be $\tilde{\mu}_c/\sigma_c$, where $\sigma_c$ is "the fitness effects of a single mutation in an organism of ploidy level c" (p. 417, ref. 24), allowing any level of dominance except the completely recessive case. This result is given for monoploids, diploids, or tetraploids, for either asexual or randomly mating sexual reproduction (with particular assumptions on chromosome segregation in tetraploids).
Our result, for asexual reproduction with any ploidy of the form $c = 2^n$, is thus consistent in the special case of complete dominance. Furthermore, these authors found that the mutant allele frequency "increases with ploidy level for deleterious mutations that are partially recessive and masked" (p. 417, ref. 24). Our result in the limiting case of completely recessive mutations is consistent with this trend.
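The ploidy dependence of the total mutant cell frequency in the dominant case can be tabulated with a short Python sketch. It is an illustration under an assumed equilibrium, not a transcription of the equation above: each heterozygous class is taken to be a fraction $(1-s)/2$ of the previous one, with the homozygous class additionally replenished by its own surviving offspring. The function name is hypothetical.

```python
def dominant_mutant_cell_factor(n, s):
    """Total frequency of cells carrying at least one mutant copy at ploidy
    2**n, relative to the monoploid mutant frequency (per-copy rate / s).
    Assumes a completely dominant cost s and a small mutation rate."""
    r = (1.0 - s) / 2.0                    # survival times segregation loss
    c = 2 ** n
    het = sum(r ** i for i in range(n))    # heterozygous types, per unit influx
    hom = r ** n / s                       # homozygous mutants, per unit influx
    return c * s * (het + hom)
```

For example, with $s = 0.1$ this sketch gives a factor of about 1.1 for diploids and about 1.39 for tetraploids: larger than one and increasing with $n$, but well below the $c$-fold increase in mutational influx.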

Rescue (stochastic)
Suppose a population is exposed to a harsh new environment (e.g. antibiotics), such that the basic reproductive number ($R_0$) of phenotypically wild-type ("sensitive") cells does not exceed one, and thus the population will decline to extinction unless it is rescued by phenotypically mutant ("resistant") cells with $R_0$ larger than one. We are interested in the probability that the population is rescued, due to the establishment of at least one resistant lineage, from the standing genetic variation (mutant types that already arose in the old environment) and/or de novo mutations that occur in the new environment.

Mathematical derivations
Recall that in our framework, a type $j$ cell has two daughters with probability $p_j$, and otherwise dies without leaving offspring. Thus $R_{0,j} = 2p_j$ and the rescue scenario amounts to taking $p_j = p_S \le 1/2$ for all types $j$ that are sensitive in the new environment, and $p_j = p_R > 1/2$ for all types $j$ that are resistant. (Recall that $j = 0$ corresponds to the homozygous wild-type; $j = 2^i$ for $0 \le i \le n-1$ correspond to heterozygotes; and $j = 2^n = c$ corresponds to the homozygous mutant.) We now model population dynamics stochastically using a multi-type branching process.
The offspring distributions of the various types are described by the following probability generating functions, where $\mathbf{z} = (z_0, z_1, \ldots, z_{2^n})$ is a dummy variable:
The extinction probabilities $\zeta_j$ starting from a single type $j$ are given by the smallest non-negative fixed point [25], $\mathbf{f}(\boldsymbol{\zeta}) = \boldsymbol{\zeta}$. That is,
The equation for $\zeta_{2^n}$ is decoupled and can readily be solved for the two roots, $\zeta_{2^n} = 1$ and $\zeta_{2^n} = 1/p_{2^n} - 1$. The former solution gives the extinction probability when $p_{2^n} \le 1/2$, while the latter applies when $p_{2^n} > 1/2$, which is the case of interest in the rescue scenario. Thus,
Next, working backwards from $i = n-1$ to $i = 0$ to successively substitute the expression for $\zeta_{2^{i+1}}$ into the equation for $\zeta_{2^i}$, we arrive at the general form,
The terms in this sum reflect all possible paths to extinction, whereby either a cell in the mutant lineage fails to divide at some step before complete segregation, or cell divisions are successful up to complete segregation but the homozygous mutant fails to establish a surviving lineage, while all homozygous wild-type descendants (one produced per cell division until complete segregation) also lead to extinction.
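The smallest non-negative fixed point can also be found numerically by iterating the generating functions from zero. The Python sketch below assumes one particular form for these functions, inferred from the verbal model rather than quoted from it: a dividing type 0 cell leaves two offspring that are each independently type 1 with the per-cell mutation probability; a dividing heterozygote of type $2^i$ leaves exactly one type 0 and one type $2^{i+1}$ daughter; and a dividing homozygous mutant leaves two homozygous mutants. The function name and indexing are illustrative.

```python
def extinction_probs(p, mu, iters=2000):
    """Extinction probabilities by fixed-point iteration from zero.
    p[0] is the division probability of type 0; p[k] (k >= 1) that of
    type 2**(k-1). Returns a list zeta with the same indexing."""
    n = len(p) - 2
    z = [0.0] * len(p)  # starting at 0 converges to the smallest fixed point
    for _ in range(iters):
        new = [0.0] * len(p)
        # Type 0: each of two offspring is type 1 with probability mu.
        new[0] = 1.0 - p[0] + p[0] * ((1.0 - mu) * z[0] + mu * z[1]) ** 2
        # Heterozygotes: one type 0 and one type 2**(i+1) daughter.
        for k in range(1, n + 1):
            new[k] = 1.0 - p[k] + p[k] * z[0] * z[k + 1]
        # Homozygous mutants: two homozygous mutant daughters.
        new[n + 1] = 1.0 - p[n + 1] + p[n + 1] * z[n + 1] ** 2
        z = new
    return z
```

With $p_S = 0.3$, $p_R = 0.8$, tetraploid cells and a recessive mutation, `extinction_probs([0.3, 0.3, 0.3, 0.8], 0.0)` returns 0.25 for the homozygous mutant, matching the root $1/p_{2^n} - 1$ given above. The iteration converges slowly as the sensitive division probability approaches 1/2, so `iters` may need to be increased there.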
These equations could in general be solved numerically. For $\tilde{\mu} \ll 1$, we obtain the analytical approximation:
where the zero-order term in $\zeta_{2^i}$ is as given in (17a). The approximation for $\zeta_{2^i}$ essentially neglects the chance that another mutation occurs and escapes extinction among the wild-type descendants produced before complete segregation. The approximation for $\zeta_0$ (and thus also for all types $2^i$) will break down as $p_0 \to 1/2$, i.e. as the type 0 branching process approaches criticality. On the other hand, the special case $p_0 = 0$ leads to the exact solutions:
In the special case of complete recessivity or dominance, all heterozygotes have the same division probability, $p_{2^i} = p_{\mathrm{het}}$ for $0 \le i \le n-1$, where $p_{\mathrm{het}} = p_S$ for recessivity or $p_{\mathrm{het}} = p_R$ for dominance. Substituting Equation 15 for $\zeta_{2^n}$, the expressions for $\zeta_{2^i}$ can then further be simplified to the exact form:
or, for $\tilde{\mu} \ll 1$, the approximate form:
Rescue from standing genetic variation: We first consider the contribution to rescue from mutant types (both hetero- and homozygous) that are already present when the environment switches. For this purpose, we neglect de novo mutations in the new environment, thus setting $\tilde{\mu} = 0$ in the solutions for the extinction probabilities $\zeta_j$. The probability of rescue from standing genetic variation, denoted $P_{\mathrm{SGV}}$, is calculated as the complement of the probability that all starting types ultimately fail to establish surviving lineages, that is (Equation 20):
where $N_j$ is the number of type $j$ individuals present at the time the environment switches. If we assume the population begins at the deterministic mutation-selection balance calculated in the previous section, with total population size $N$ independent of ploidy (see ref. 24 for a discussion of possible links that we neglect), then $N_j = N x_j^*$ where $x_j^* \propto \tilde{\mu}_c$, with the proportionality constant depending on dominance in the old environment and on ploidy (main text Table 1 and Equation 19a (with $p_{\mathrm{het}} = p_S$ and $\tilde{\mu} = 0$) for $\zeta_{2^i}$ to obtain:
where $m := N\tilde{\mu}_c$ is a ploidy-independent measure of mutational influx in the population, given by the product of initial population size and per-copy mutation rate. In the case of complete dominance in both environments, we substitute Equation 9 for the approximation of $x_{2^i}^*$ and Equation 19a (with $p_{\mathrm{het}} = p_R$ and $\tilde{\mu} = 0$) for $\zeta_{2^i}$ to obtain:
A caveat is that we have taken all types to be at their deterministic frequencies. This assumption will tend to be unrealistic when mutational influx ($m$) is small. Stochastic effects imply that mutant cell types could be absent altogether, and the realized frequencies of various types will be correlated through time. In particular, rescue could fail even if $p_R = 1$ due to the chance that resistant cells are absent at any given time, whereas our current approach yields $P_{\mathrm{SGV}} = 1$ when $p_R = 1$, since the average frequency of resistant cells is non-zero. In general, stochasticity in realized mutant frequencies is expected to reduce $P_{\mathrm{SGV}}$ [26]. Nonetheless, we do not expect changes to the general qualitative effects of ploidy and other model parameters on rescue probability. We thus leave a stochastic treatment of the SGV for future theoretical work.
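The calculation of rescue from standing genetic variation can be sketched in Python as the complement of a product of extinction probabilities, one factor per pre-existing mutant type. The sketch rests on several assumptions that are not taken verbatim from the text: dominance is the same in both environments; equilibrium numbers of each type follow from each heterozygous class being half the fitness-weighted size of the previous one; and a single cell of type $2^i$ establishes with probability $p_{\mathrm{het}}^{\,n-i}(2 - 1/p_R)$ when de novo mutation is neglected. The function name and argument names are illustrative.

```python
def p_sgv(n, s, m, p_S, p_R, recessive):
    """Probability of rescue from standing genetic variation.
    n: ploidy is 2**n; s: cost in the old environment;
    m: population size times per-copy mutation rate;
    p_S, p_R: division probabilities of sensitive and resistant cells
    (p_S <= 1/2 < p_R); recessive: dominance in both environments."""
    pi_R = 2.0 - 1.0 / p_R                  # establishment of a homozygous mutant
    p_het = p_S if recessive else p_R       # heterozygotes in the new environment
    w_het = 1.0 if recessive else 1.0 - s   # heterozygotes in the old environment
    prob_all_fail = 1.0
    for i in range(n + 1):
        # Expected number of type 2**i cells at mutation-selection balance.
        N_i = m * (2 ** n) * (w_het / 2.0) ** i
        if i == n:
            N_i /= s                        # homozygous mutants accumulate
        zeta_i = 1.0 - p_het ** (n - i) * pi_R
        prob_all_fail *= zeta_i ** N_i
    return 1.0 - prob_all_fail
```

In this sketch, setting `p_S = 0` in the recessive case makes the result independent of `n`, and setting `p_R = 1` gives a rescue probability of one, reproducing the two limiting behaviours noted above.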
Rescue from de novo mutations: We now consider the contribution to rescue from mutations that arise in the new environment. For this purpose, we suppose the population is initially composed of $N$ homozygous wild-type (type 0) individuals. The probability of rescue from de novo mutations, denoted $P_{\mathrm{DN}}$, is then calculated as:
Clearly, $P_{\mathrm{DN}}$ can only be non-zero when $p_S > 0$, such that existing wild-type cells can produce offspring. Assuming that $\tilde{\mu} \ll 1$ (such that $1 - \zeta_0 \ll 1$) and that $N$ is large, we can approximate [27]:
where $\alpha$ is the coefficient of $\tilde{\mu}$ in the approximation of $\zeta_0$ (Equation 17b or 19b), and again $m := N\tilde{\mu}_c$. In particular, when the mutation is completely recessive in the new environment, we have
while when the mutation is completely dominant in the new environment, we have
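A first-order version of the de novo rescue probability can be sketched in Python as follows. The expression used here is derived for this illustration from an assumed offspring model (each of two offspring of a dividing wild-type cell mutates independently; a new type 1 cell needs $n$ successful heterozygote divisions followed by establishment of the homozygous mutant) and is not quoted from the equations above. It requires $p_S < 1/2$, and the function name is hypothetical.

```python
import math


def p_dn(n, m, p_S, p_R, recessive):
    """First-order probability of rescue from de novo mutations.
    n: ploidy is 2**n; m: population size times per-copy mutation rate;
    p_S < 1/2 < p_R: division probabilities of sensitive/resistant cells."""
    if p_S == 0.0:
        return 0.0                           # wild-type cells leave no offspring
    pi_R = 2.0 - 1.0 / p_R                   # establishment of a homozygous mutant
    p_het = p_S if recessive else p_R
    establish_type1 = p_het ** n * pi_R      # survive n divisions, then establish
    # Expected mutant offspring per founding wild-type cell, per unit mu:
    # 2 * p_S per generation, summed over a lineage declining by 2 * p_S.
    influx_per_cell = 2.0 * p_S / (1.0 - 2.0 * p_S)
    rate = (2 ** n) * influx_per_cell * establish_type1
    return 1.0 - math.exp(-m * rate)
```

In this form the ploidy dependence enters through $(2p_{\mathrm{het}})^n$, so the sketch decreases with $n$ for a recessive mutation and increases with $n$ for a dominant one; it becomes unreliable as $p_S$ approaches 1/2.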

Results
Here we explore the effects of model parameters, ploidy, and dominance on the probability of rescue. We assume throughout that the mutation rate is small ($\tilde{\mu} = c\tilde{\mu}_c \ll 1$ for reasonable $c$), allowing us to use analytical approximations derived above. For simplicity we will only illustrate the cases of complete recessivity or dominance. Furthermore, in the case of SGV, we suppose here that dominance is the same in both the old and new environments, although there is empirical evidence that dominance can in fact be environment-dependent [28] and this situation could also readily be considered using our general mathematical results.
With these simplifying assumptions, we apply Equations 21 and 22 to calculate $P_{\mathrm{SGV}}$, and Equations 25 and 26 to calculate $P_{\mathrm{DN}}$. Recall that $P_{\mathrm{SGV}}$ covers the contribution of all hetero- and homozygous mutant types existing at the time of the environmental shift, while $P_{\mathrm{DN}}$ covers the contribution of de novo mutations arising in the new environment from homozygous wild-type cells. We also consider the overall probability of rescue from either (not mutually exclusive) source:
The plots in S6 Fig to S9 Fig illustrate the effects of model parameters on the aforementioned probabilities ($P_{\mathrm{SGV}}$, $P_{\mathrm{DN}}$, $P_{\mathrm{tot}}$) as well as the ratio $P_{\mathrm{SGV}}/P_{\mathrm{DN}}$, for various ploidy levels.
Effect of probabilities that sensitive and resistant cells divide before death in the new environment ($p_S$, $p_R$): $P_{\mathrm{SGV}}$ is independent of $p_S$ in the dominant case, since both heterozygous and homozygous mutant types are resistant. In the recessive case, since heterozygotes are sensitive, $P_{\mathrm{SGV}}$ is increasing with $p_S$. In the extreme when $p_S = 0$, rescue can only occur from pre-existing homozygous mutants, whose frequency is independent of ploidy (main text Table 1). $P_{\mathrm{DN}}$ increases with $p_S$ in both the recessive and dominant cases, since increasing the chance that wild-type cells produce surviving offspring increases the opportunities for de novo mutations. However, $P_{\mathrm{DN}}$ is more sensitive to $p_S$ in the recessive case, since changes in $p_S$ affect survival of heterozygotes as well as homozygotes. Taken together, $P_{\mathrm{tot}}$ is clearly increasing with $p_S$. Furthermore, $P_{\mathrm{SGV}}$, $P_{\mathrm{DN}}$ and $P_{\mathrm{tot}}$ are all increasing with $p_R$, since a higher $p_R$ reduces the chance that a resistant lineage goes extinct. The relative contribution of SGV ($P_{\mathrm{SGV}}/P_{\mathrm{DN}}$) is greatest at low $p_S$, where wild-type cells are unlikely to survive to produce new mutants after the environment changes, with this dependency stronger in the recessive case. The dependency on $p_R$ is weak and not in a consistent direction.
Effect of ploidy ($c = 2^n$): As described in the main text, $P_{\mathrm{SGV}}$ increases with ploidy in the recessive case (for $p_S > 0$) and in the dominant case. Meanwhile, $P_{\mathrm{DN}}$ decreases with ploidy in the recessive case and increases in the dominant case. These patterns are robust to variations in parameter values (S6 Fig to S9 Fig).
Our results for rescue from standing genetic variation, if we assume a deterministic mutation-selection balance prior to environmental change, agree qualitatively with, and extend, our results for fluctuation analysis, which models the stochastic accumulation of genetic variation in an exponentially growing population with mutations assumed to be cost-free. In a fluctuation test, obtaining mutant colonies growing on selective plates corresponds to rescue, with the assumption that every phenotypic mutant produces a visible colony ($p_R = 1$) while every phenotypically wild-type cell fails to produce a visible colony ($p_S = 0$).
In the recessive case, both the deterministic frequency of homozygous mutants at mutation-selection balance (section 4.2.2) and the distribution of the number of homozygous mutants in an exponentially growing culture (section 2.2) turn out to be independent of ploidy. If $p_S = 0$, as in fluctuation analysis and a special case of our rescue model, the observed mutant distribution in the former or $P_{\mathrm{SGV,R/R}}$ in the latter thus reflect only the contribution of homozygous mutants that are immediately resistant in the new environment. Thus the result is unaffected by ploidy.
As $p_S$ increases in our more general rescue model, heterozygous mutants can also contribute to rescue. Their frequency in the SGV (main text
Our results for rescue from de novo mutations also reflect the counteracting effects of higher mutational influx but a longer segregation lag as ploidy increases. In the case of a completely recessive mutation, the net effect is a reduction of $P_{\mathrm{DN}}$ with ploidy. Consider the expression from Equation 25:
The factor $2^n$ reflects the gain in mutational influx with ploidy, while the factor $p_S^n$ reflects the number of successful divisions of sensitive cells required for complete segregation of mutant chromosomes. Since $p_S < 1/2$ by assumption, the net effect is that $(2p_S)^n$ and in turn $P_{\mathrm{DN,R}}$ are decreasing with $n$. Intuitively, each doubling of ploidy doubles the mutational influx, but requires one additional cell division, occurring with probability less than one half, to produce a resistant cell.
On the other hand, for a completely dominant mutation, the net effect is an increase of $P_{\mathrm{DN}}$ with ploidy. Consider Equation 26:
The detrimental effect of the segregation lag even on dominant mutants is reflected by $p_R^n$, which decreases with $n$ (unless $p_R = 1$). However, this is now more than compensated by the gain in mutational influx (factor $2^n$). By assumption, $p_R > 1/2$, thus $(2p_R)^n$ and in turn $P_{\mathrm{DN,D}}$ increase with $n$.
Putting together the effects of ploidy on both SGV and de novo sources, we find that the overall probability of rescue, $P_{\mathrm{tot}}$, depends only weakly on ploidy in the recessive case, but is increasingly contributed by SGV at higher ploidy. In the dominant case, $P_{\mathrm{tot}}$ is clearly increasing with ploidy. The relative contribution of SGV ($P_{\mathrm{SGV}}/P_{\mathrm{DN}}$) tends to be decreasing over reasonable ploidy levels, but this trend can reverse at higher ploidy, especially when $s$ and/or $p_S$ are large. We briefly note that if we were to account for stochasticity in the number of mutants in the SGV, just as we already accounted for stochasticity in the number of de novo mutations occurring, we would expect the contribution of SGV to decrease (see derivation of $P_{\mathrm{SGV}}$ above).
It is interesting to compare our results on rescue to previous results on the rate of substitutions or adaptation in a constant-sized population. As a baseline, in the case of neutral mutations, any individual chromosome has equal probability of ultimately fixing in the population. In a population of $N$ individuals of ploidy $c$, where each chromosome mutates at rate $\mu_c$, the overall substitution rate is thus (mutation influx rate) × (fixation probability) = $(c\mu_c N) \times 1/(cN) = \mu_c$, which is independent of ploidy, as in the mutation accumulation assay (section 3). On the other hand, selection either against deleterious mutations (as examined in the mutation-selection balance, section 4.2) or for beneficial mutations (as examined here) introduces dependencies on ploidy. Previously, Otto and Whitton [24] considered how the "rate of adaptation", defined as the rate of fitness increase due to adaptive substitutions, depends on ploidy. For adaptation from a single, previously deleterious allele in the SGV that becomes beneficial, they found that the rate of adaptation increases with ploidy if the selective benefit of the allele when in a single copy is sufficiently large relative to the prior selective disadvantage, and the population size is not too large. In the rescue situation, the population is declining and mutations are strongly favoured. Thus this previous result is in qualitative agreement with our results for rescue, where $P_{\mathrm{SGV}}$ increases with ploidy for both recessive and dominant mutations.
For adaptation from ongoing, beneficial de novo mutations, in an asexual population, Otto and Whitton found that adaptation rate in a small population increases with ploidy for sufficiently dominant mutations, but decreases with ploidy for more recessive mutations. Again, this is in qualitative agreement with our finding that $P_{\mathrm{DN,D}}$ increases but $P_{\mathrm{DN,R}}$ decreases with ploidy.
Effect of dominance: $P_{\mathrm{DN}}$ is always larger for the dominant case than for the recessive case, since in the former, heterozygotes are already capable of rescuing the population, whereas in the latter, only homozygous mutants can rescue. The idea that recessive adaptive mutations are at a disadvantage in establishment dates back to Haldane [29] and is now commonly known as "Haldane's Sieve" [30]. Moreover, in our model, $P_{\mathrm{DN}}$ increases with ploidy for dominant mutations and decreases for recessive mutations; thus the advantage of dominance clearly is greater at higher ploidy. These effects of ploidy, and thus the magnitude of the difference between recessive and dominant mutations, are weakest at high $p_S$ (as this increases the chance that recessive mutations survive until homozygosity) and low $p_R$ (as this reduces the advantage of dominant mutations being expressed in heterozygotes).
In contrast, the effect of dominance on $P_{\mathrm{SGV}}$ can go in either direction. Indeed, Orr and Betancourt [30] have previously pointed out that Haldane's Sieve does not hold for adaptation from standing genetic variation, since more dominant mutations have higher per-copy probability of establishment but are present at lower frequency. Specifically, in our model $P_{\mathrm{SGV,R/R}}$ can exceed $P_{\mathrm{SGV,D/D}}$ when $p_S$ is sufficiently high and $p_R$ sufficiently low (for the same reasons that the advantage of dominance weakens in $P_{\mathrm{DN}}$), or when $s$ is high (so that masking this cost in heterozygotes yields a greater advantage to recessive mutations). Our results are in qualitative agreement with Orr and Betancourt [30], who found (with the implicit assumption that dominance coefficients are not environment-dependent) that completely recessive mutations