Quantifying the Role of Population Subdivision in Evolution on Rugged Fitness Landscapes

Natural selection drives populations towards higher fitness, but crossing fitness valleys or plateaus may facilitate progress up a rugged fitness landscape involving epistasis. We investigate quantitatively the effect of subdividing an asexual population on the time it takes to cross a fitness valley or plateau. We focus on a generic and minimal model that includes only population subdivision into equivalent demes connected by global migration, and does not require significant size changes of the demes, environmental heterogeneity or specific geographic structure. We determine the optimal speedup of valley or plateau crossing that can be gained by subdivision, if the process is driven by the deme that crosses fastest. We show that isolated demes have to be in the sequential fixation regime for subdivision to significantly accelerate crossing. Using Markov chain theory, we obtain analytical expressions for the conditions under which optimal speedup is achieved: valley or plateau crossing by the subdivided population is then as fast as that of its fastest deme. We verify our analytical predictions through stochastic simulations. We demonstrate that subdivision can substantially accelerate the crossing of fitness valleys and plateaus in a wide range of parameters extending beyond the optimal window. We study the effect of varying the degree of subdivision of a population, and investigate the trade-off between the magnitude of the optimal speedup and the width of the parameter range over which it occurs. Our results, obtained for fitness valleys and plateaus, also hold for weakly beneficial intermediate mutations. Finally, we extend our work to the case of a population connected by migration to one or several smaller islands. Our results demonstrate that subdivision with migration alone can significantly accelerate the crossing of fitness valleys and plateaus, and shed light onto the quantitative conditions necessary for this to occur.

Natural selection drives organisms towards higher fitness (i.e.reproductive success), but crossing fitness valleys or plateaus may be necessary to make progress up a fitness landscape.Fitness valleys can arise from interactions between genes, called epistasis: for instance, two mutations together can yield a benefit while each alone is detrimental [1,2].While the high dimensionality of genotype space makes it challenging to probe the structure of fitness landscapes [3,4], evidence has been accumulating for frequent epistasis and landscape ruggedness [1,[4][5][6][7][8].Additionally, fitness valleys often occur in the evolution of antibiotic resistance by bacteria [9,10].
Population structure can play an important role in evolution [11][12][13][14][15][16][17][18][19].The time taken to cross a valley or a plateau depends on population size, since stochastic effects such as genetic drift have an increased importance in small populations, allowing neutral and deleterious mutations to fix with increased probability [20,21].Hence, while beneficial mutations spread faster in larger populations, population subdivision into demes can allow the maintenance of larger genetic diversity and facilitate valley or plateau crossing locally.Migration can then spread beneficial mutations.This idea was first discussed by Wright in his shifting balance theory [22,23].Here, we investigate subdivision with migration alone, without additional effects such as extinction and refounding of demes, which played an important part in Wright's theory, specific geographic structure [11][12][13][14][15], or spatially heterogeneous environments [16][17][18][19], on which much recent work focuses.
Studying quantitatively the effect of subdivision on evolution in a rugged fitness landscape can help in inferring the statistics of epistasis from evolution experiments [24].Work on structured populations has been used as qualitative proofs of landscape ruggedness [11].Current experiments investigating the evolution of subdivided populations at various migration rates have produced mixed results, some finding evidence for faster adaptation of metapopulations [25], and others not [26].
It is therefore practically important to determine conditions under which subdivision is beneficial.Additionally, population subdivision is widespread and can play a crucial role in events such as speciation [27].For instance, evidence has recently been found for compartmentalization and subspecies of HIV in different organs of a single patient [28,29].
We investigate under what conditions demes of identical size coupled through migration cross fitness valleys and plateaus faster than a single population.We show that subdivision significantly accelerates fitness valley or plateau crossing over a wide parameter range, both with respect to a non-subdivided population and to a single deme.We predict the existence of a parameter range where valley or plateau crossing by a metapopulation is as fast as that of its fastest deme, and verify this prediction using stochastic simulations.Similar results hold for slightly beneficial (but effectively neutral) intermediates.Finally, we extend our work to a population connected by migration to one or several smaller population islands.
We consider asexual individuals, characterized by their genotype and associated fitness f , which determines division rate.In our simulations, the carrying capacity K of a deme is fixed (see Supplementary Material), and each individual divides at a rate f (1 − N/K), where N is the number of individuals in the deme to which the individual belongs: this corresponds to logistic growth.Simultaneously, each individual dies at a constant rate d.Hence, N fluctuates weakly around its equilibrium value of K(1 − d/f ).In practice, we choose d = 0.1, and fitnesses of order one, thus the deme size is N ≈ 0.9K.In our analytic estimates, we treat N as constant, and denote the equal death and average division rates by d.We consider an ensemble of Π demes of identical size, and treat migration as a random exchange of two individuals between two different demes, occurring at a rate of 2m per individual (see Fig. 1(a)).There is no geographic structure in our model, i.e. exchange between any two demes is equally likely.We focus on the simplest fitness valley, involving three successive genotypes, denoted by 0, 1 and 2 (see Fig. 1(b)).The initial genotype is taken as reference for fitness: f 0 = 1.The subsequent genotypes have fitness f 1 = 1−δ and f 2 = 1+s, with s > 0; the final mutation is thus beneficial.For a fitness valley, δ > 0, while for a plateau, δ = 0. We also consider the case of a weakly beneficial intermediate (δ < 0) that remains effectively neutral.We only allow forward mutations, and note that including back mutations only changes crossing times by a numerical factor [21].Finally, for the sake of simplicity, we assume that all mutations have probability µ per division, but it is straightforward to generalize our work to different mutation rates.We are interested in the average time τ m required for the whole metapopulation to cross the fitness valley, i.e. to fix mutation '2' in all demes, starting from an initial state where all individuals have genotype '0'.Fig. 1(c) shows an example plot of τ m from simulations, as a function of the migration rate parameter, m (see Supplementary Material for simulation methods).We observe a broad range of migration rates for which the metapopulation crosses the valley faster than both (i) a single nonsubdivided population with the same carrying capacity as the entire metapopulation, and (ii) a single isolated deme.This example shows that subdivision with migration can significantly accelerate fitness valley crossing.
Successful valley or plateau crossing can occur either through sequential fixation of mutations, or by stochastic tunneling, where the beneficial mutation arises in a small fluctuating minority population of first-mutants that never fixes.Sequential fixation dominates for small populations, while tunneling is faster for larger ones [21].In the regime dominated by sequential fixation, valley (resp.plateau) crossing time increases (resp.remains constant) with the number N of individuals in the population, while in the regime dominated by tunneling it decreases as 1/N [21].Here, we assume that isolated demes are in the sequential fixation regime, but do not make any such assumption for the total population.We will show later that if isolated demes are in the tunneling regime, there is no benefit of subdivision.
In the best case, the crossing time of the whole metapopulation is determined by that of the best deme, i.e. the one that crosses the fitness valley or plateau fastest.We now determine the conditions under which this occurs, focusing on migration rates much smaller than division and death rates, 2m d, such that fixation or extinction of a mutant lineage in a deme is not perturbed by migration.First, migration should be rare enough for demes to remain effectively shielded from migration events while they have fixed the intermediate mutation, until the final beneficial mutation arises.Second, migration should also be frequent enough for the spreading time of the final beneficial mutation through the whole metapopulation to be negligible with respect to the crossing time of the best deme.These two criteria provide upper and lower bounds on the ratio of migration rate to mutation rate for which subdivision is most beneficial, which we now express explicitly.
(I) Upper bound on the ratio of migration rate to mutation rate.The average time τ 12 required for a deme of 1-mutants to fix the beneficial mutation '2' is τ −1 12 = N µdp 12 [21], where N µd is the total mutation rate in the deme, and p 12 = (1−e −(s+δ) )/(1−e −N (s+δ) ) denotes the probability of fixation of a single mutant with genotype '2' in a background of 1-mutants [30].τ 12 must be smaller than the average time, t d , for a deme of 1-mutants to be wiped out by migration from the other demes, which still have genotype '0'.The total rate of migration events in the metapopulation is ΠN m, so that t d = n d /(ΠN m), where n d is the average number of migration events required for the 1-mutants to be wiped out by migration.
To determine n d , we study the evolution of the number i ∈ [0, Π] of demes that have fixed genotype '1', while other demes have genotype '0'.The possible outcomes of a migration event and their probabilities depend on the current value of i but not on its history, and i ∈ {0, Π} are absorbing states, so i evolves according to a finite Markov chain.The only migration events that can affect i are those that exchange individuals with different genotypes.As a result of one such relevant migration event, i can increase by one with probability p(1−p ), where p denotes the probability that the '1' migrant fixes in the '0' deme, and p the probability that the '0' migrant fixes in the '1' deme.Both p and p can be expressed in a similar fashion as p 12 above [30].Similarly, i decreases by one with probability p (1−p), and otherwise does not change.These probabilities, multiplied by the probability that a migration event is relevant, 2i(Π − i)/[Π(Π − 1)], yield the transition matrix of our finite Markov chain, which is tri-diagonal (or continuant).Hence, n d , the average number of steps until absorption starting from i = 1, conditional on absorption at i = 0, can be expressed analytically [30].Its full expression is provided in the Supplementary Material, for valleys (δ > 0) and plateaus (δ = 0).The first condition, τ 12 < t d , thus yields (II) Lower bound on the ratio of migration rate to mutation rate.We also require the average spreading time, t s , for the final beneficial mutation to fix in the entire metapopulation after it has fixed in the best deme, to be smaller than the average valley/plateau crossing time, τ b , of the best deme.The above discussion on Markov chains can be adapted to the calculation of t s = n s /(ΠN m), where n s is the average number of migration events required for the 2-mutants to spread by migration.Here, the relevant variable is the number i ∈ [0, Π] of demes that have fixed genotype '2', while other demes still contain genotype '0'.(Some may have genotype '1', but this is rare since fixation of mutation '1' is the slowest step in valley/plateau crossing and, moreover, this would not change the spreading time for a plateau and would even make it shorter for a valley.)The transition matrix of the Markov chain is the same as before, where p now represents the probability of fixation of a genotype '2' migrant in a deme of genotype '0', and p the probability of the opposite process.The full expression of n s , the average number of steps until absorption at i = Π, starting from i = 1, is provided in the Supplementary Material.
We next calculate τ b , the average shortest crossing time of Π independent demes (i.e. the average of the smallest order statistic of the deme crossing time among a sample of size Π [31]).Since demes are assumed to be in the sequential fixation regime, valley/plateau crossing involves two successive steps.The first step, fixation of a '1'-mutant, occurs with rate r 01 = N µdp 01 , where p 01 denotes the probability of fixation of a '1'-mutant in a deme with genotype '0'.If δ > 0, p 01 = (1−e δ )/(1−e N δ ), and if δ = 0, p 01 = 1/N [30].The second step has rate r 12 = τ −1 12 , with τ 12 given above.Moreover, since mutation '1' is deleterious or neutral, while mutation '2' is beneficial, for a broad range of parameter values, p 01 p 12 , and thus the first step dominates.In this limit, the distribution of crossing times is approximately exponential with rate r 01 , and the minimum among Π independent realizations is distributed exponentially with rate Πr 01 (see Supplementary Material for details, and a discussion of the general case).In this case, we simply have τ b ≈ τ id /Π, where τ id ≈ r −1 01 is the average crossing time for an isolated deme.Hence, the best deme crosses the valley Π times faster on average than an isolated deme.The second condition, t s < τ b , finally yields Together, Eqs. 1 and 2 yield the interval of m/µd over which we expect subdivision to be most beneficial.For we expect the valley/plateau crossing time of the whole metapopulation to be dominated by the crossing time of the best deme, so that τ m /τ id ≈ 1/Π.With the parameter values from Fig. 1, the interval of Eq. 3 is 6.4 × 10 −9 m 6.2 × 10 −7 .The broad minimum in Fig. 1 agrees remarkably well with this theoretically predicted optimal interval.Moreover, the simulation results in Fig. 1 yield a minimum τ m = (4.9± 0.8) × 10 7 (for m = 10 −7.5 ), while τ id = (4.8± 0.8) × 10 8 : it satisfies τ m /τ id ≈ 1/10 = 1/Π.
Our above analysis requires isolated demes to be in the sequential fixation regime.If instead they are in the tunneling regime, their crossing time involves a single rate [21].Hence, in the ideal case where crossing by the metapopulation is dominated by that of the best deme, we again expect τ m /τ id ≈ 1/Π.Since the average crossing time via tunneling is proportional to 1/N [21], this would yield τ m /τ ns ≈ 1, where τ ns is the average crossing time of a non-subdivided population (with ΠN individuals).Thus, isolated demes being in the sequential fixation regime is a necessary condition for subdivision to accelerate fitness valley/plateau crossing.for which an optimal effect of subdivision is expected, i.e., τm/τ id → 1/Π = 0.1.Dashed line: value of δ above which a non-subdivided population crosses the valley faster than an isolated deme.Dotted line: value of δ above which an isolated deme is in the tunneling regime.(b) Similar heatmap but for the ratio τm/τns of the average valley crossing time of a metapopulation to that of a non-subdivided population (with K = 500), τns.
Fig. 2 shows heatmaps of the valley crossing time of a metapopulation, compared to an isolated deme and to a non-subdivided population, as a function of the ratio of migration rate to mutation rate, m/(µd), and of the fitness valley depth, δ.Fig. 2(a) shows that the interval of Eq. 3 describes well the region where τ m /τ id is smallest and tends to 1/Π.For lower migration rates, τ m /τ id increases.Indeed, if m = 0, τ m is determined by the valley crossing time of the slowest independent deme.Conversely, for high migration rates, τ m tends to the nonsubdivided case, τ ns (see Fig. 1(c)).As δ increases, the critical population size above which tunneling dominates over sequential fixation decreases [21].Due to this, above a critical value of δ (dashed line in Fig. 2), τ ns becomes smaller than τ id , in which case large values of m give a low τ m /τ id (see Fig. 2(a)).Above another, larger critical value of δ (dotted line in Fig. 2), isolated demes are in the tunneling regime.In Fig. 2(b), which plots the ratio τ m /τ ns , we see that when δ sufficiently exceeds the latter threshold, the metapopulation no longer crosses the valley faster than the non-subdivided one, as predicted above.Fig. 2 shows that, for the parameter values chosen, subdivision accelerates valley crossing over a large range of valley depths and migration rates, and that the metapopulation can cross the valley orders of magnitude faster than a single large population.As mentioned previously, our results hold for fitness plateaus, such that mutation '1' is neutral (δ = 0), of relevance for neutral mutation networks [32].Moreover, for |δ| < max( √ µs, 1/N ), mutation '1' is effectively neutral in a finite population with N individuals [21].Hence, our arguments hold even for weakly beneficial intermediates, as demonstrated in Fig. 3(a).Thus, while sufficient magnitude epistasis is needed for subdivision to be beneficial, there is no stringent requirement for sign epistasis.Thusfar, we focused on demes of equal size for simplicity, but demes of different sizes are relevant in practice.Let us consider a large population connected by migration to S smaller satellite populations of identical size, assumed to be in the sequential fixation regime.Migration only occurs between the large population and one of the smaller islands (chosen randomly at each migration event), and the total migration rate is denoted by M .It is straightforward to adapt our work to this case (see Supplementary Material).We again find an interval of M/(µd) over which the crossing time for the large population is dominated by the crossing time of the best island.This is corroborated by our simulations (see Fig. 3(b)).
For subdivision to accelerate valley or plateau crossing, isolated demes must be small enough to be in the sequential fixation regime.Although this hypothesis looks stringent, the transition between sequential fixation and tunneling occurs at larger population sizes if the mutation probability per division µ and the valley parameters s and δ are smaller [21].For E. Coli, the mutation probability per base pair per division is µ ≈ 8.9×10 −11 [33].Consider a total population size of 10 6 individuals subdivided into 100 demes, which is experimentally reasonable [26,34].For s = 10 −2 and δ = 10 −3 , isolated demes are in the sequential fixation regime.Provided that 2m d (see above), Eq. 3 predicts an optimal effect of subdivision for 2.3 × 10 −3 m/(µd) 5.5, a wide range, where τ m ≈ τ id /100 (= 2.5 × 10 12 ).Comparing this to the valley crossing time of the non-subdivided population τ ns = 1.3 × 10 14 demonstrates a significant benefit of subdivision for valley crossing: τ m /τ ns = 2.0 × 10 −2 .
Our quantitative assessment of the conditions under which subdivision significantly speeds up valley or plateau crossing can aid in inferring characteristics of fitness landscapes from evolution experiments, such as those in Refs.[25,26], and help optimally design future experiments.Further directions include investigating the evolution of a metapopulation with a distribution of deme sizes on a more general rugged landscape, as well as assessing the impact of specific geographic structure.Our work could also be extended to sexual populations, where recombination plays an important role in valley or plateau crossing [35].The interplay between recombination and subdivision, which respectively alleviate and exacerbate clonal interference, will be interesting to study.
We thank Ned S. Wingreen for stimulating discussions.This work was supported in part by National Science Foundation Grant PHY-0957573 and National Institute of Health Grant R01 GM082938.DJS was supported by

I. NUMBER OF MIGRATION EVENTS FOR EXTINCTION OR SPREADING IN A METAPOPULATION
In the main text, we presented upper and lower bounds on the ratio of migration rate to mutation rate, defining an interval over which subdivision significantly reduces valley or plateau crossing time (see Eqs. (1-3)).The upper bound involves n d , the average number of migration events required for the '1'-mutants to be wiped out by migration, starting from a state where one deme has fixed genotype '1', while all other demes have genotype '0'.Similarly, the lower bound involves n s , the average number of migration events required for the '2'-mutants to spread by migration to the whole metapopulation, starting from a state where one deme has fixed genotype '2', while all other demes have genotype '0'.In this section, we provide explicit expressions for n d and n s , both for fitness plateaus and for fitness valleys.Throughout this section, we consider a metapopulation of Π demes composed of N individuals each, and we assume that individual demes are in the sequential fixation regime (see the main text for a discussion of this point).
As mentioned in the main text, n s and n d can be determined from finite Markov chain theory.For this, we study the evolution of the number i ∈ [0, Π] of demes that have fixed the mutant genotype ('1' for the calculation of n d ; '2' for that of n s ), while other demes have genotype '0': i evolves according to a finite Markov chain, each step being a migration event, and i ∈ {0, Π} are the two absorbing states.The transition matrix of this Markov chain is continuant (or tri-diagonal), with for i ∈ [1, Π − 1], and P 0→0 = P Π→Π = 1.Here, P j→k denotes the probability that i varies from j to k as the final outcome of one migration event (i.e., the outcome after fixation or extinction of both migrants' lineages in the demes where they have migrated).In these expressions, 2i(Π − i)/[Π(Π − 1)] is the probability that a migration event exchanges individuals with different genotypes, while p denotes the probability that the mutant migrant fixes in the '0' deme, and p the probability that the '0' migrant fixes in the mutant deme.
Here, we do not account for independent mutations arising and fixing in other demes during the process of spreading (or extinction) of the mutant's lineage in the metapopulation.Indeed, our aim is to compare the timescales of migration and mutation processes, and thus treat them separately.Note that, in practice, this hypothesis is reasonable if mutations that fix are sufficiently rarer than migration events.We also consider that the time between two successive migration events is large enough for fixation to occur in the demes affected by migration before the next migration event occurs, which is true in the low-migration rate regime (2m d, where 2m is the migration rate per individual, while d is the death rate and average division rate per individual) that we study in our work.
n s and n d can be directly expressed as the average number of steps until absorption in a particular absorbing state.Let us present general expressions of these average numbers of steps, before using them to obtain explicit expressions of n s and n d .These expressions of n d and n s will enable us to study the parameter dependence of the bounds of the interval of m/(µd) over which subdivision significantly accelerates valley or plateau crossing (see Eqs. (1-3) of the main text).
A. Some results regarding finite Markov chains with continuant probability matrices [30] We are interested in the average number of steps ν a until the system reaches each of the absorbing states a ∈ {0, Π}, starting from the state i = 1: where s j,a is the average number of steps that the system spends in the state i = j before absorption, given that it starts in the state i = 1 and finally absorbs in state i = a.It can be expressed as [30] s j,a = π j,a π 1,a s j , where s j is the average number of steps the system spends in state i = j before absorption in either of the two absorbing states, given that it started in state i = 1, and π j,a is the probability that the system finally absorbs in state i = a if it starts in state i = j.
Using the explicit expressions given in Ref. [30] for s j and π j,a for the case of a continuant probability matrix, we obtain: where we have introduced B. Explicit expression of n d n d corresponds to ν 0 , where p is the probability that a '1'-mutant fixes in a deme of '0' individuals and p is the probability that a '0'-individual fixes in a deme of '1'-mutants.Hence, it can be expressed explicitly from Eqs. ( 9), (4), and (5).Since the expressions of p and p depend whether mutation '1' is neutral or deleterious, we obtain different expressions for the fitness plateau and for the fitness valley.

Fitness valley
For a fitness valley, p is the probability of fixation of the deleterious '1'-mutant, with fitness 1 − δ, in a deme where all other individuals have genotype '0' and fitness 1.It can readily be shown [21,30] that p takes the form Similarly, Hence, Eqs. ( 11) and (4, 5) yield and Eq. ( 9) gives: C. Explicit expression of ns n s corresponds to ν Π , where p is the probability that a '2'-mutant fixes in a deme of '0' individuals and p is the probability that a '0'-individual fixes in a deme of '2'-mutants.Hence, it can be expressed explicitly from Eqs. ( 10), (4), and (5).For n s , there is no difference between the valley and the plateau, since genotype '1' is not involved.
Since the fitness of '2'-mutants is 1 + s while that of genotype-'0' individuals is 1, we have and As above, Eqs. ( 11), (4), and ( 5) yield ρ k = r k , with r defined in Eq. ( 16).Thus, Eq. (10) gives D. Simplified expressions for upper and lower bounds of m/(µd) In the main text, we have shown that the benefit of subdivision is highest when m/(µd) is situated between a lower bound, and an upper bound, (see Eqs. (1-3) of the main text), where denotes the probability of fixation of a single mutant with genotype '2' in a background of '1'-mutants.Now that we have given general explicit expressions for n s and n d , we present simplifications of these expressions in relevant parameter regimes.This will yield better insight into the dependence of L and U on the various parameters.Throughout this section, we will focus on the regime where N s 1 but s 1, such that mutation '2' is substantially, but not overwhelmingly, beneficial [21].We then have p 20 1 (see Eq. ( 19)).To leading (i.e.zeroth) order in p 20 , we obtain from Eq. (20) that where p 02 is given by Eq. ( 18).

Fitness plateau
For a fitness plateau, combining Eqs. ( 13) and ( 22), the upper bound U reads Additionally, Eqs. ( 24) and ( 21) can be combined to write the lower bound L as again to lowest order in p 20 .Here, we have used p 01 = 1/N , since in the case of the plateau, mutation '1' is neutral.
For N 1 and Π 1, Eqs. ( 25) and (26) give where we have used p 02 = p 12 , since mutation '1' is neutral.The last expression follows from p 02 ≈ s, since we have assumed N s 1 and s 1.This simple expression of the ratio U/L demonstrates that the range over which subdivision significantly accelerates plateau crossing increases as the deme size N becomes larger, and that this range is quite wide as long as the number of demes satisfies Π (N s) 2 , which is realistic (recall that we are in the regime N s 1).

Fitness valley
Let us focus on valleys such that δ 1 but N δ 1. (Note that, in the opposite limit N δ 1, mutation '1' is effectively neutral, and the above discussion on the fitness plateau applies.)Then, p 01 ∼ δe −N δ 1 (see Eq. ( 14)).To lowest (i.e.zeroth) order in p 01 , Eq. ( 17) becomes so that from Eq. ( 22), the upper bound U is where we used the conditions δ 1, N δ 1, s 1 and N s 1 to simplify the expressions of p 12 and p 10 .Meanwhile, from Eq. ( 21) and (24), the lower bound L takes the form where we used the conditions δ 1, N δ 1, s 1 and N s 1 to simplify the expressions of p 01 and p 02 .
Eq. ( 29) shows that U is independent of N and Π in this regime, while Eq.(30) shows that L decreases exponentially with N (this dependence on N comes from that of p 01 ).Hence, for valleys as well, the interval of m/(µd) where subdivision most accelerates crossing becomes wider as N increases.Note that, conversely, L increases when Π increases.
In Fig. 2 of the main text, we have plotted U and L versus δ (solid lines).It can be seen that U decreases weakly when δ increases (upper solid line), while L decreases faster with δ (lower solid line), so that the interval of m/(µd) where subdivision most accelerates crossing becomes wider when δ increases.This behavior is consistent with the δ dependences of Eqs.(29)(30).Note, however, that the conditions δ 1, N δ 1, s 1 and N s 1 are not sufficiently well satisfied in Fig. 2 for Eqs.(29)(30) to be very precise approximations of U and L, and therefore the exact expressions were used to plot the lines in Fig. 2.

II. CROSSING TIME OF THE BEST DEME
In this section, we give more details on the calculation of the average valley or plateau crossing time τ b by the best deme amongst Π independent ones.We show in the main text that the crossing time of the whole metapopulation can be determined by this time.
τ b is the average shortest crossing time of Π independent demes.This minimum crossing time, which we denote by t b , is also called the smallest (or first) order statistic of the deme crossing time among a sample of size Π [31].
Let us denote by p(t) the probability density function of valley or plateau crossing time for a single deme, and let us introduce P(t) = ∞ t p(t )dt (it satisfies P(t) = 1 − C(t) where C(t) is the cumulative distribution function of valley or plateau crossing by a single deme).The probability that t b is larger than t is equal to the probability that the crossing times of each of the Π independent demes are all larger than t: P (t b ≥ t) = [P(t)] Π .By differentiating this expression, one obtains the probability density function p b (t b ) of the crossing time t b by the best deme (see e.g.Ref. [31]): We now express p(t) explicitly.Since demes are assumed to be in the sequential fixation regime, valley or plateau crossing involves two successive steps.The first step, fixation of a '1'-mutant, occurs with rate r 01 , and the second step, fixation of a '2'-mutant, occurs with rate r 12 (see the main text for expressions of these rates).The total crossing time is thus a sum of two independent exponential random variables, with probability density function given by a two-parameter hypoexponential distribution [31]: Combining Eqs.(31) and (32), we obtain with p(t b ) given by Eq. (32).τ b can then be obtained for any value of the parameters by computing the average value of t b over this distribution.
As explained in the main text, r 01 r 12 over a broad range of parameter values.In this limit, we can approximate p(t) with a simple exponential distribution, Eq. ( 31) then yields i.e. t b is distributed exponentially with rate Πr 01 .In this case, we simply have τ b = 1/(Πr 01 ), which can be written as τ b ≈ τ id /Π, where τ id ≈ 1/r 01 is the average crossing time for an isolated deme in the limit where r 01 r 12 .Hence, in this limit, the best deme crosses the valley Π times faster on average than an isolated deme.

III. A LARGE POPULATION CONNECTED BY MIGRATION TO SMALLER POPULATION ISLANDS
Let us consider a population of N individuals connected by migration to S smaller population islands with N < N individuals each.These islands of identical size are assumed to be in the sequential fixation regime.For the sake of simplicity, we consider that migration only occurs between the large population and the islands: a migration step is a random exchange of two individuals between the large population and one of the islands (chosen at random at each migration event), and the total migration rate is denoted by M .Here, we focus on the valley or plateau crossing time of the large population.We demonstrate that the evolution of a large population can be driven by that of satellite islands.
In the optimal case, the crossing time of the large population is determined by that of the best island, i.e., that which crosses the fitness valley or plateau fastest.We now determine the conditions under which this optimum is achieved, focusing on migration rates much smaller than division and death rates, M min(dN S, dN ), such that fixation or extinction of a mutant lineage in either the large population or an island is not significantly perturbed by migration.Again, migration should be rare enough for islands to remain effectively shielded from migration events while they have fixed the intermediate mutation, until the final beneficial mutation arises.Second, migration should also be frequent enough for the spreading time of the final beneficial mutation from the best island to the large population to be negligible with respect to the crossing time of the best island.These two criteria again provide upper and lower bounds on M/(µd).
The average time τ 12 = 1/(N µdp 12 ) (with p 12 from Eq. ( 23)) required for an island of '1'-mutants to fix the beneficial mutation '2' must be smaller than the average time, t d , for an island of '1'-mutants to be wiped out by migration from the large population, which still exhibits genotype '0'.The rate of migration events between the island of '1'-mutants and the large population is M/S.Hence, t d = S/(M p 10 ), where p 10 is the probability of fixation of the lineage of a single migrant with genotype '0' in an island where all other individuals are '1'-mutants: for valleys, it is given by Eq. ( 15), while for plateaus, it is equal to 1/N .The first condition, τ 12 < t d , thus yields The second condition is that the average spreading time, t s , for the final beneficial mutation to fix in a large population after it has fixed in the best island, must be smaller than the average valley/plateau crossing time, τ b , of the best island.Similar to t d previously, we obtain t s = S/(M p l 02 ), where p l 02 = (1 − e −s )/(1 − e −N s ) is the probability of fixation of a migrant with genotype '2' in the large population (assumed to exhibit genotype '0' before migration).τ b is the average of the minimum crossing time among S independent islands.We again focus, for simplicity, on the limit where the first step of valley/plateau crossing, which occurs at rate r 01 , is much longer than the second.Then, we simply have τ b ≈ τ ii /S (see Section II).In this expression, τ ii ≈ r −1 01 = 1/(N µdp 01 ) (with p 01 in Eq. ( 14)) is the average crossing time for an isolated island.Hence, the best island crosses the valley S times faster on average than a single isolated island.The second condition, t s < τ b , finally yields Together, Eqs. ( 36) and (37) yield the interval of M/µd over which we expect subdivision to be most beneficial: In this range, we expect the valley/plateau crossing time τ l of the large population to be dominated by the crossing time of the best island, so that τ l /τ ii ≈ 1/S.This prediction is confirmed by our simulations (see Fig. 3(b) of the main text).

IV. SIMULATION METHODS
Our simulations are based on a Gillespie algorithm [36,37] that we coded in the C language.Here we will describe our algorithm for the case of a metapopulation of Π demes of identical size, which is the primary situation discussed in our work.In our simulations, each deme has a fixed carrying capacity K-we discuss this choice in Sec.B.

A. Algorithm
A number of different events occur in our simulations, each with an independent rate: • Each individual divides at rate f g (1−N i /K), where f g is the fitness associated with the genotype g ∈ {0, 1, 2} of the individual, and N i is the current total number of individuals in the deme i ∈ [1, Π] to which the individual belongs.
• If a dividing cell has g < 2, upon division, its offspring (i.e., one of the two individuals resulting from the division) mutates with probability µ, to have genotype g + 1 instead of g.
• Each individual dies at rate d.
• Migration occurs at total rate m Π i=1 N i .Two different demes are chosen at random, an individual is chosen at random from each of these two demes, and the two individuals are exchanged.There is no geographic structure in our model, i.e. exchange between any two demes is equally likely.
In practice, the number of individuals with each genotype in each deme is stored, as well as the corresponding division rate.This data fully describes the state of the metapopulation, and allows determination of the rates of all events.For each event in the simulation, the following steps are performed: • A timestep dt is drawn from an exponential distribution with rate equal to the total rate R t of events (i.e., the sum of all rates), and time is increased from its previous value, t, to t + dt.In other words, the next event occurs at time t + dt.
• The event that occurs at t + dt is chosen randomly, such that the probability of an event with rate r is equal to r/R t : either a cell divides, or a cell dies, or a migration event occurs.
• The event is performed, and the relevant data is updated.Since we store the number of individuals with each genotype in each deme, only one or two of these numbers need to be updated at each step.In addition, the division rates of the affected deme i must be updated upon division and death because N i is modified.Note, however, that this represents only three numbers at most (one for each genotype).
The advantage of the Gillespie algorithm is that it is exact, and does not involve any artificial discretization of time.

B. Working at fixed carrying capacity
In our simulations, demes have a fixed carrying capacity, and the number of individuals per deme fluctuates weakly around its equilibrium value.This approach, also used in, e.g., Ref. [17] has the advantage of realism.Alternatively, we could impose a constant number of individuals per deme.(i) First, we could choose a dividing individual in the whole metapopulation with probability proportional to its fitness, and simultaneously suppress another individual, chosen at random in the same deme.However, in this case, individuals in demes of higher fitness would exhibit shorter lifespans, which is not realistic and may introduce a bias.(ii) A second possibility would be to choose a dividing individual (according to fitness) in each of the demes, and to simultaneously suppress another individual, chosen at random, in each deme.However, in this case, unless migration events are far less frequent than these collective division-death events (i.e., these Π division-death events), the time interval between them becomes artificially discretized.This introduces biases unless the total migration rate mΠN is much smaller than N d, i.e. unless m d/Π.Consequently, while imposing a constant number of individuals is a good simulation approach for a nonsubdivided population (see, e.g., Ref. [21]), it tends to introduce biases in the study of metapopulations.While we chose to perform simulations with fixed carrying capacities in order to avoid any of these biases, we checked that, for small enough migration rates, our results are completely consistent with simulation scheme (ii) described above.This consistency check also demonstrates that it is legitimate to compare our simulation results obtained with fixed carrying capacities to our analytical work carried out with constant population size per deme.

FIG. 2 .
FIG. 2. (a)Heatmap of the ratio τm/τ id of the average valley crossing time τm of a metapopulation to that τ id of an isolated deme, with Π = 10 and K = 50, as a function of valley depth δ and migration-to-mutation ratio m/(µd).Parameter values same as Fig.1, except that δ and m are now varied.All numerical results averaged over 100 runs, and the heatmap is interpolated.Solid lines: bounds of the interval in Eq. 3, for which an optimal effect of subdivision is expected, i.e., τm/τ id → 1/Π = 0.1.Dashed line: value of δ above which a non-subdivided population crosses the valley faster than an isolated deme.Dotted line: value of δ above which an isolated deme is in the tunneling regime.(b) Similar heatmap but for the ratio τm/τns of the average valley crossing time of a metapopulation to that of a non-subdivided population (with K = 500), τns.