## Abstract

Natural selection drives populations towards higher fitness, but crossing fitness valleys or plateaus may facilitate progress up a rugged fitness landscape involving epistasis. We investigate quantitatively the effect of subdividing an asexual population on the time it takes to cross a fitness valley or plateau. We focus on a generic and minimal model that includes only population subdivision into equivalent demes connected by global migration, and does not require significant size changes of the demes, environmental heterogeneity or specific geographic structure. We determine the optimal speedup of valley or plateau crossing that can be gained by subdivision, if the process is driven by the deme that crosses fastest. We show that isolated demes have to be in the sequential fixation regime for subdivision to significantly accelerate crossing. Using Markov chain theory, we obtain analytical expressions for the conditions under which optimal speedup is achieved: valley or plateau crossing by the subdivided population is then as fast as that of its fastest deme. We verify our analytical predictions through stochastic simulations. We demonstrate that subdivision can substantially accelerate the crossing of fitness valleys and plateaus in a wide range of parameters extending beyond the optimal window. We study the effect of varying the degree of subdivision of a population, and investigate the trade-off between the magnitude of the optimal speedup and the width of the parameter range over which it occurs. Our results, obtained for fitness valleys and plateaus, also hold for weakly beneficial intermediate mutations. Finally, we extend our work to the case of a population connected by migration to one or several smaller islands. Our results demonstrate that subdivision with migration alone can significantly accelerate the crossing of fitness valleys and plateaus, and shed light onto the quantitative conditions necessary for this to occur.

## Author Summary

Experimental evidence has recently been accumulating to suggest that fitness landscape ruggedness is common in a variety of organisms. Rugged landscapes arise from interactions between genetic variants, called epistasis, which can lead to fitness valleys or plateaus. The time needed to cross such fitness valleys or plateaus exhibits a rich dependence on population size, since stochastic effects have higher importance in small populations, increasing the probability of fixation of neutral or deleterious mutants. This may lead to an advantage of population subdivision, a possibility which has been strongly debated for nearly one hundred years. In this work, we quantitatively determine when, and to what extent, population subdivision accelerates valley and plateau crossing. Using the simple model of an asexual population subdivided into identical demes connected by global migration, we derive the conditions under which crossing by a subdivided population is driven by its fastest deme, thus giving rise to the maximal speedup. Our analytical predictions are verified using stochastic simulations. We investigate the effect of varying the degree of subdivision of a population. We generalize our results to weakly beneficial intermediates and to different population structures. We discuss the magnitude and robustness of the effect for realistic parameter values.

**Citation:** Bitbol A-F, Schwab DJ (2014) Quantifying the Role of Population Subdivision in Evolution on Rugged Fitness Landscapes. PLoS Comput Biol 10(8): e1003778. doi:10.1371/journal.pcbi.1003778

**Editor:** Nir Ben-Tal, Tel Aviv University, Israel

**Received:** March 18, 2014; **Accepted:** June 29, 2014; **Published:** August 14, 2014

**Copyright:** © 2014 Bitbol, Schwab. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** The authors confirm that all data underlying the findings are fully available without restriction. All relevant data are within the paper.

**Funding:** AFB acknowledges the support of the Human Frontier Science Program (http://www.hfsp.org/). DJS was supported by National Institutes of Health Grant K25 GM098875 (http://www.nih.gov/). This work was also supported in part by National Science Foundation Grant PHY-0957573 (http://www.nsf.gov/) and National Institutes of Health Grant R01 GM082938 (http://www.nih.gov/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

Natural selection drives populations towards higher fitness (i.e. reproductive success), but crossing fitness valleys or plateaus may facilitate progress up a rugged fitness landscape. Rugged fitness landscapes arise from epistasis, i.e. interactions between genetic variants. For instance, two mutations together can yield a benefit while each of them alone is detrimental: such reciprocal sign epistasis can give rise to a fitness valley [1], [2]. While the high dimensionality of genotype space makes it challenging to probe the structure of fitness landscapes [3], [4], evidence has been accumulating for frequent landscape ruggedness, especially in recent years [1], [2], [4]–[15].

Population structure can play an important role in evolution [16]–[24]. In particular, the time taken to cross a fitness valley or plateau depends on population size since stochastic effects such as genetic drift have an increased importance in small populations, allowing neutral and deleterious mutations to fix with increased probability [25]–[28]. Population subdivision into demes can allow the maintenance of larger genetic diversity due to increased genetic drift as well as to the quasi-independent explorations of the fitness landscape that are run in parallel by each deme. Subdivision may thereby facilitate valley or plateau crossing locally, and subsequent migration can then spread beneficial mutations throughout the entire subdivided population ("metapopulation"). This idea was first discussed by Wright in his shifting balance theory [29]–[32] and the importance of this effect has been the subject of a long debate [33]–[42]. In this work, we investigate the role of subdivision with global migration alone, without additional effects such as strong dependence of deme size on fitness, including extinction and refounding of demes, which played a crucial role in Wright's theory. Our generic and minimal model enables us to quantitatively determine the conditions under which population subdivision accelerates fitness valley or plateau crossing.

Studying quantitatively the effect of subdivision on evolution may help in inferring fitness landscape structure from evolution experiments [43]. Work on structured populations has been used as qualitative proof of landscape ruggedness [16]. Current experiments investigating the evolution of subdivided populations at various migration rates have produced mixed results, some demonstrating faster adaptation of subdivided populations [44], and others not [45]. It is therefore important to determine under what conditions subdivision accelerates fitness valley or plateau crossing. Additionally, population subdivision is extremely common in natural systems. For instance, evidence has recently been found for compartmentalization of HIV in different organs of a single patient [46], [47].

Here we show that subdivision can significantly accelerate fitness valley or plateau crossing over a wide parameter range, both with respect to a non-subdivided population and with respect to a single deme. Intuitively, deleterious or neutral intermediate mutations may fix in individual demes, allowing for the maintenance of a larger proportion of these mutants in a metapopulation than in a well-mixed population. We first determine the optimal speedup of valley or plateau crossing by subdivision, in the best possible scenario where valley or plateau crossing by the metapopulation is driven by that of its fastest deme. This enables us to demonstrate that isolated demes must be in the sequential fixation regime for subdivision to significantly accelerate crossing. We then determine the conditions under which the best possible scenario can be realized. Using Markov chain theory, we obtain analytical expressions for the parameter range where valley or plateau crossing by a metapopulation is as fast as that of its fastest deme. Our analytical predictions are verified using stochastic simulations. Furthermore, we discuss the effect of varying the degree of subdivision of a population, and investigate the trade-off between the magnitude of the optimal speedup and the width of the parameter range over which it occurs. Finally, we extend our work to weakly beneficial mutations and to a population connected to smaller islands, and we discuss the magnitude and robustness of the effect for realistic parameter values.

## Results

Our results are organized as follows. First, we specify our model for the evolutionary dynamics of a subdivided population with migration. Then, we focus on the ‘best possible' scenario where the metapopulation is driven by its fastest deme. We calculate the ratio of the valley-crossing time for the metapopulation to the valley-crossing time for an equally-sized well-mixed population under this strong assumption. This yields the optimal speedup that may be obtained by subdivision, and enables us to demonstrate that sequential fixation in individual demes is necessary to achieve a significant speedup. Then, we determine the range of parameter values for which the best possible scenario is attained, i.e. the valley-crossing time for the metapopulation is indeed dominated by the valley-crossing time of its fastest deme. Qualitatively, migration has to be both rare enough to enable demes to cross the fitness valley or plateau quasi-independently and frequent enough to allow fast spreading of the final beneficial mutation to the whole metapopulation once it has fixed in the fastest deme: these conditions yield an optimal window of migration rates. Finally, we compare our analytical predictions with results from stochastic simulations.

### Model of evolutionary dynamics in a subdivided population

We focus on asexual individuals, characterized by their genotype and associated fitness $f$. Each individual has a division rate proportional to $f$, and a death rate $d$, which is the same for all. We consider an ensemble of $D$ identical demes, each with a constant number $N$ of individuals. The division rate averaged over the individuals of a deme is thus equal to the death rate $d$. We treat migration as a random exchange of two individuals between two different demes, occurring at rate $m$ per individual. In our model, exchange between any two demes is equally likely, as in Wright's "island model" [29]. This constitutes a generic and minimal model of subdivision with migration, without any dependence of migration rate on the average fitness of a deme (in contrast with models where demes containing beneficial mutants increase significantly in size and migrate more rapidly [30], [33]), or additional effects of extinction and re-founding of demes [30], [32], [33], specific geographic structure [16], [17], [19]–[21], or spatially heterogeneous environments [18], [22]–[24], on which previous studies focused.

We consider the simplest fitness valley or plateau, involving three successive genotypes denoted by ‘0’, ‘1’ and ‘2’ (see Fig. 1A). The initial genotype is taken as reference for fitness: $f_0 = 1$. We denote the fitnesses of the subsequent genotypes by $f_1 = 1 - \delta$ and $f_2 = 1 + s$. The first mutation is assumed to be either neutral ($\delta = 0$), which yields a fitness plateau, or deleterious ($\delta > 0$), which corresponds to a fitness valley, while the second mutation is assumed to be beneficial ($s > 0$). We focus on first mutations that are not too strongly deleterious: $\delta \ll 1$. We only allow forward mutations, and note that including back mutations does not qualitatively affect crossing times [28]. Finally, we assume that all mutations have probability $\mu$ per division, but generalization to different mutation probabilities is straightforward.

**A.** Fitness valley: fitness versus genotype . **B.** Schematic representation of the best possible scenario, for a metapopulation with demes. Each square represents a deme of identical size, and a row represents the metapopulation. Colors represent genotypes, with the color-code defined in A. Initially (top row), all demes have genotype '0'. The demes explore the fitness landscape described in A quasi-independently, and one of them, the "champion'' deme (second from the left here), crosses the fitness valley first (second and third row). Individual demes are assumed to be in the sequential fixation regime, so this deme fixes first mutation '1' and then mutation '2'. The beneficial mutation '2' then spreads by migration, which is modeled by random exchange of individuals between demes (arrow on the fourth row), leading to fixation of mutation '2' in the whole metapopulation (fifth row). **C.** Average valley crossing time of a non-structured population, as a function of its carrying capacity , in logarithmic scale. Dots are simulation results, averaged over 1000 runs for each value of ; error bars represent 95% confidence intervals (CI). Theoretical predictions from Ref. [28] are plotted for the sequential fixation regime (blue line) and for the tunneling regime (red line), using (see text) to make the correspondence. The transition between these two regimes is indicated by a dotted line. The carrying capacities at stake in D are highlighted in green (: isolated deme; : non-subdivided population). Parameter values: , , and . **D.** Average valley crossing time of a metapopulation composed of demes each with carrying capacity (total carrying capacity: ), plotted versus the migration-to-mutation rate ratio , in logarithmic scale. Parameter values are the same as in C, and only the migration rate is varied. Dots represent simulation results averaged over 1000 runs for each value of , and error bars are 95% CI. 
Black vertical lines represent the limits of the interval of in Eq. 14. Blue (resp. red) line: valley crossing time for an isolated deme () with (resp. a non-subdivided population () with ) for the same parameter values, averaged over 1000 runs; shaded regions: 95% CI. Dashed blue (resp. red) lines: corresponding theoretical predictions from Ref. [28] (see C).

In this paper, we focus on the average time required for the whole metapopulation to cross the fitness valley or plateau, i.e. to fix mutation ‘2' in all demes, starting from an initial state where all individuals have genotype ‘0’.

### The best possible scenario

For small enough migration rates, each deme in the metapopulation performs a quasi-independent trial at crossing the valley or plateau. At best, the valley or plateau crossing time of the whole metapopulation is dominated by that, , of the “champion'' deme in the metapopulation, i.e. the deme that crosses the fitness valley or plateau fastest.

We now focus on this best possible scenario, which is illustrated schematically in Fig. 1B: first, the champion deme crosses the valley or plateau by sequential fixation, and then the beneficial mutation rapidly spreads by migration through the whole metapopulation. Once this best possible scenario is characterized, the crucial question will be whether, and under what conditions, it can be attained: this point will be addressed in the following section.

#### Determination of the crossing time of the champion deme.

Valley or plateau crossing by a non-structured, well-mixed population can occur by two different mechanisms: sequential fixation and tunneling. The former corresponds to fixation of mutation ‘1' in the whole population, and to subsequent fixation of the beneficial mutation ‘2'. Conversely, the latter occurs when the beneficial mutation arises in a small fluctuating minority of first-mutants, and fixes directly: tunneling does not involve fixation of the intermediate mutation ‘1' [28]. For given values of the parameters , , and , sequential fixation is the fastest process for small populations, where genetic drift plays an important part. Tunneling becomes the dominant process of valley or plateau crossing when the number of individuals per deme exceeds a threshold value , which depends on , , and (see Ref. [28] for a full discussion of this threshold value). Fig. 1C shows simulation results for the valley crossing time of a non-subdivided population versus its size, and illustrates these two different regimes and the transition between them. Note that in our simulations (described in Methods, Sec. 1), we hold fixed the carrying capacity of populations (or demes) instead of the number of individuals . This softer constraint is more realistic and avoids some possible biases in the metapopulation case (see Methods, Sec. 1.2). In practice, each individual divides at a rate and dies at a constant rate : hence, at steady-state, . We choose , and fitnesses of order one, thus .
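The steady-state relation between the number of individuals and the carrying capacity can be written out under one assumption that is not stated explicitly in this passage: that the per-individual division rate takes the logistic form $f(1-N/K)$. Here $f$ is the fitness, $N$ the number of individuals, $K$ the carrying capacity and $d$ the death rate; these symbols are illustrative notation. Balancing division against death would then give:

```latex
f\left(1-\frac{N}{K}\right) = d
\quad\Longrightarrow\quad
N = K\left(1-\frac{d}{f}\right).
```

Under this assumption, fitnesses of order one together with a death rate that is small compared with the division rate keep $N$ slightly below $K$.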

We now consider independent demes with no migration, and we determine the crossing time of the fastest of these demes, both for demes in the sequential fixation regime and for demes in the tunneling regime.

#### Demes in the sequential fixation regime.

Let

$$p_{ij} = \frac{1 - f_i/f_j}{1 - (f_i/f_j)^N} \tag{1}$$

denote the probability of fixation of genotype ‘$j$’, with fitness $f_j$, starting from a single individual with genotype ‘$j$’ in a deme of $N$ individuals where all other individuals initially have genotype ‘$i$’ and fitness $f_i$ [25], [28]. If $f_i = f_j$, the probability of fixation of genotype ‘$j$’ reads $p_{ij} = 1/N$. Valley or plateau crossing by sequential fixation involves two successive steps. The first step, fixation of the intermediate mutation ‘1’, occurs with rate $r_{01} = N\mu d\,p_{01}$, where $N\mu d$ is the total mutation rate in the deme, with $\mu$ the mutation probability per division. (Recall that the deme size $N$ is fixed, and that $d$ represents the birth/death rate. Note that the correspondence with Ref. [28] is obtained by multiplying by $1/d$ all the timescales in this reference, which are expressed in numbers of generations.) Similarly, the second step, fixation of the final beneficial mutation ‘2’, has rate $r_{12} = N\mu d\,p_{12}$. The first step is longer than the second one since mutation ‘1’ is neutral or deleterious, while mutation ‘2’ is beneficial. If the first step dominates, the distribution of crossing times is approximately exponential with rate $r_{01}$. The shortest crossing time among $D$ independent demes is then distributed exponentially with rate $D\,r_{01}$ (see Methods, Sec. 2). Thus, the average crossing time of the champion deme reads $\tau_c \approx 1/(D\,r_{01})$. Denoting by $\tau_{id} \approx 1/r_{01}$ the average crossing time for an isolated deme, we obtain

$$\tau_c \approx \frac{\tau_{id}}{D}. \tag{2}$$

Hence, the champion deme crosses the valley $D$ times faster on average than a single deme. This simple result holds for $r_{01} \ll r_{12}$. For simplicity, we restrict ourselves to this regime in the main text, but we provide the general method for calculating $\tau_c$ in Methods, Sec. 2. We use this general method to calculate numerically the exact value of $\tau_c$ in our examples below.
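As a minimal sketch of the two ingredients used above, the Python snippet below evaluates a Moran-type fixation probability and checks by sampling that the fastest of several independent exponential waiting times is shorter, on average, by a factor equal to their number. The function names, the argument names and the use of the standard Moran formula for Eq. 1 are assumptions made for illustration; this is not the simulation code of the paper, which is described in Methods, Sec. 1.

```python
import random


def fixation_probability(f_resident, f_mutant, n):
    """Moran-type fixation probability of a single mutant among n individuals.

    Assumed form: (1 - r) / (1 - r**n) with r = f_resident / f_mutant,
    and 1/n in the neutral limit. Intended for moderate n (r**n must not
    overflow a float).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    ratio = f_resident / f_mutant
    if abs(ratio - 1.0) < 1e-12:
        return 1.0 / n  # neutral mutant
    return (1.0 - ratio) / (1.0 - ratio ** n)


def mean_champion_time(rate, n_demes, n_runs, seed=0):
    """Sample mean of the shortest of n_demes exponential crossing times.

    Each deme's crossing time is drawn independently from an exponential
    distribution with the given rate (single dominant step assumed).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        total += min(rng.expovariate(rate) for _ in range(n_demes))
    return total / n_runs


if __name__ == "__main__":
    # Hypothetical illustration values, not the parameters of the paper.
    print(fixation_probability(1.0, 0.99, 50))   # deleterious mutant
    print(fixation_probability(1.0, 1.0, 50))    # neutral mutant
    print(7 * mean_champion_time(1.0, 7, 20000)) # close to 1 / rate
```

Multiplying the sampled mean by the number of demes recovers approximately the mean crossing time of one deme, which is the content of Eq. 2 in the regime where a single step dominates.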

#### Demes in the tunneling regime.

Assuming that , so that there is no competition between different mutant lineages, valley or plateau crossing by tunneling involves a single event with constant rate, namely the appearance of a “successful'' ‘1'-mutant, whose lineage includes a ‘2'-mutant that fixes [28]. Crossing time is thus exponentially distributed. Therefore, in this case too, the crossing time of the champion deme among isolated demes is times smaller than that of an average isolated deme (see Methods, Sec. 2): Eq. 2 is valid in the tunneling regime too.

#### Sequential fixation in individual demes is necessary for significant speedups.

In the best possible scenario, where crossing by the metapopulation is dominated by that of the champion deme, i.e. $\tau_m \approx \tau_c$ (with $\tau_m$ the average crossing time of the metapopulation and $\tau_c$ that of the champion deme), the previous paragraph shows that $\tau_c \approx \tau_{id}/D$, where $\tau_{id}$ is the average crossing time of an isolated deme and $D$ the number of demes, both when isolated demes are in the sequential fixation regime and when they are in the tunneling regime. Hence, it is necessary to have

$$\tau_{ns} > \frac{\tau_{id}}{D}, \tag{3}$$

where $\tau_{ns}$ is the average crossing time of the non-subdivided population, for subdivision to speed up valley or plateau crossing in the best scenario (i.e. for $\tau_m$ to be smaller than $\tau_{ns}$). This necessary condition is general since it holds *a fortiori* beyond the best scenario. Graphically, in Fig. 1C, which is a logarithmic plot of crossing time versus population size for a non-structured population, the slope of the line joining the isolated deme to the non-subdivided population has to be less negative than -1 in order for speedups to be possible. Recall indeed that the non-subdivided population is $D$ times larger than an isolated deme. The necessary condition in Eq. 3 leaves the possibility of significant speedups in the non-trivial case where a single isolated deme crosses slower than a non-subdivided population ($\tau_{id} > \tau_{ns}$). Fig. 1D demonstrates a significant speedup by subdivision obtained in this regime where $\tau_{id}/D < \tau_{ns} < \tau_{id}$.
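The graphical statement about the slope can be checked in one line. Writing $\tau_{id}$ for the average crossing time of an isolated deme of size $N$, $\tau_{ns}$ for that of the non-subdivided population of size $DN$, and $D > 1$ for the number of demes (illustrative notation), the slope of the line joining the two points on a log-log plot is:

```latex
\text{slope}
= \frac{\log\tau_{ns}-\log\tau_{id}}{\log(DN)-\log N}
= \frac{\log\left(\tau_{ns}/\tau_{id}\right)}{\log D}
> -1
\quad\Longleftrightarrow\quad
\tau_{ns} > \frac{\tau_{id}}{D}.
```

The equivalence uses $\log D > 0$, so a slope less negative than $-1$ is the same requirement as the necessary condition of Eq. 3.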

Let us consider a metapopulation such that isolated demes are in the tunneling regime. Then, the larger non-subdivided population with individuals is also in the tunneling regime [28]. Assuming that , valley or plateau crossing by this non-subdivided population follows the same laws as crossing by the demes. Since the average crossing time by tunneling is inversely proportional to population size (see Ref. [28] and Fig. 1C), we obtain , in contradiction with Eq. 3. This implies that, even in the best possible scenario, subdivision cannot accelerate crossing if isolated demes are in the tunneling regime (since here, ). Thus, having isolated demes in the sequential fixation regime is a necessary condition for subdivision to accelerate crossing. Importantly, however, the non-subdivided population is not required to be in the sequential fixation regime. For instance, in Fig. 1D, the non-subdivided population is in the tunneling regime. Note that when , the population enters the semi-deterministic regime [28] and the average crossing time need not be proportional to . Minor speedups may exist in this regime, but such effects are beyond the scope of this work. In all the following, we will focus on the regime .

#### Maximal possible speedup by subdivision.

The speedup gained by subdividing a population of a given total size is directly described by the ratio of the valley crossing time of a metapopulation to that of a non-subdivided population. Here, we discuss the values this ratio can take in the best possible scenario, where valley crossing by the metapopulation is dominated by that of the champion deme, and we determine the valley depth for which the highest speedups are obtained (i.e. for which this ratio is smallest).

Let us first focus on the case where both the non-subdivided population and the isolated deme are in the sequential fixation regime. The average valley crossing time by the champion deme reads (see our calculation of above). In the best possible scenario, . The average valley crossing time by the non-subdivided population is , where is the fixation probability of an individual with genotype ‘1' in a population of individuals where all the others initially have genotype ‘0' (see Eq. 1). Hence, we obtain(4)

In the case of a plateau, this reduces to . These results demonstrate that if both the non-subdivided population and the isolated deme are in the sequential fixation regime, then subdivision significantly accelerates crossing in the best scenario. The speedup by subdivision becomes larger (i.e. becomes smaller) when the number of demes is increased at fixed valley depth and fixed deme size (or fixed total population size ). Besides, for , the ratio in Eq. 4 decreases when is increased at fixed and : the highest speedups are obtained for the deepest valleys. However, as is increased, the non-subdivided population will eventually enter the tunneling regime (see Fig. 1C).

Let us now consider the alternative case, where the non-subdivided population is in the tunneling regime, while the isolated demes are in the sequential fixation regime. In this case, , where is the fixation probability of a ‘1'-mutant in an isolated deme (see Eq. 1), while , where is the probability that a ‘1'-mutant is “successful'' in the tunneling process, i.e. that its lineage includes a ‘2'-mutant that fixes in the non-subdivided population [28]. Hence, in the best scenario, where , we obtain(5)

Since is independent from population size [28], it also represents the probability of successful tunneling *in an isolated deme*. For isolated demes in the sequential fixation regime, by definition [28]. Hence, Eq. 5 entails . Thus, speedups always exist in the best scenario, provided that the necessary condition that isolated demes cross the plateau by sequential fixation is satisfied. In the case of a fitness plateau, [28], while . Hence, Eq. 5 yields(6)

In the other extreme case of a sufficiently deep valley that satisfies , we have [28]. Using the condition , Eq. 1 yields . Hence, Eq. 5 gives(7)

Interestingly, these expressions of are independent of at fixed . This stands into contrast with the regime discussed above where the non-subdivided population is in the sequential fixation regime. At fixed , the ratio expressed in Eq. 7 is minimal for(8)

The minimum of , corresponding to the largest speedup by subdivision, is obtained for this value of :(9)

The small values of mutation probabilities in nature ensure that the values of in Eqs. 6 and 9 can be very small.

### Conditions for subdivision to maximally accelerate valley or plateau crossing

The previous section was dedicated to the study of the best possible scenario, where the valley or plateau crossing time of the whole metapopulation is dominated by that, , of the champion deme in the metapopulation (i.e. the one that crosses fastest). We now determine analytically the conditions under which this best possible scenario is attained. For this, we focus on migration rates much smaller than division/death rates, , such that fixation or extinction of a mutant lineage in a deme is not perturbed by migration. In addition, we assume that isolated demes are in the sequential fixation regime, since we showed above that it is a necessary condition for subdivision to significantly accelerate crossing, and that it is a sufficient condition for subdivision to accelerate crossing in the best scenario.

In a nutshell, migration must be rare enough for demes to evolve quasi-independently, but frequent enough to spread the beneficial mutation rapidly. The analytical results below allow for predicting the range of migration rates such that subdivision maximally accelerates valley or plateau crossing.

#### First condition: Quasi-independence.

Migration must be rare enough for demes to remain shielded from migration while they harbor the intermediate mutation. Hence, the average time for a deme of ‘1'-mutants to fix the beneficial mutation ‘2', which reads , must be smaller than the average extinction time, , for a deme of ‘1'-mutants to be wiped out by migration from other demes with genotype ‘0'. The total rate of migration events in the metapopulation is , so , where is the average total number of migration events required for the ‘1'-mutants to go extinct. The first condition, , thus yields(10)

Let us now estimate . If one deme has fixed genotype ‘1' while all the others still have genotype ‘0', the probability that a migration event involves the mutant deme is . Following such a “relevant” migration event, extinction of the mutant (‘1') lineage occurs if the ‘0' migrant fixes in the ‘1' deme while the ‘1' migrant does not fix in the ‘0' deme: this occurs with probability . Conversely, the number of mutant demes increases to two with probability , and otherwise remains constant. For , using also , we have (see Eq. 1). Hence, migration-induced increases in the number of mutant demes can be neglected, and we obtain(11)
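The estimate above rests on a geometric-waiting argument, which can be sketched as follows. Let $q$ be the probability that a given migration event involves the mutant deme, and $p_{\mathrm{ext}}$ the probability that such a relevant event wipes out the ‘1’ lineage; both symbols, and the name $n_e$ for the mean number of migration events before extinction, are illustrative notation. If increases in the number of mutant demes are neglected, as justified above, each migration event independently causes extinction with probability $q\,p_{\mathrm{ext}}$, so that:

```latex
n_e \approx \sum_{k=1}^{\infty} k\,\left(q\,p_{\mathrm{ext}}\right)\left(1-q\,p_{\mathrm{ext}}\right)^{k-1}
= \frac{1}{q\,p_{\mathrm{ext}}}.
```

If the two demes taking part in an exchange are drawn uniformly among $D$ demes, a given deme is involved with probability $q = 2/D$.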

In Methods, Sec. 3, we derive the general expression of , which does not require , using finite Markov chain theory [25]. Note that this general expression is important because subdivision generically most accelerates valley crossing for (see Eq. 8).

#### Second condition: Rapid spreading.

Migration must be frequent enough for the average spreading time of the final mutation through the whole metapopulation to be shorter than the valley or plateau crossing time by the champion deme. Let be the average number of migration events required for the final beneficial mutants (with genotype ‘2') to spread by migration, once the champion deme has fixed genotype ‘2'. Then, we can write , and the second condition reads(12)

Let us now estimate , starting from a state where the champion deme has fixed genotype ‘2', while all others still contain genotype ‘0'. (Note that some demes may have genotype ‘1', but this is rare since fixation of mutation ‘1' is the slowest step. Moreover, this would not change the spreading time for a plateau and would shorten it for a valley.) Let us focus on the regime where but , such that mutation ‘2' is substantially, but not overwhelmingly, beneficial [28]. As in the above discussion about , we then obtain . Thus, it is possible to neglect any migration-induced decrease in the number of demes with genotype ‘2', which we denote by . The probability that a migration step exchanges individuals with different genotypes is , and the probability that such a relevant migration step increases by one is . Hence, we obtain(13)where the last expression is obtained for . In Methods, Sec. 3, we use finite Markov chain theory to derive the general analytical expression for , which does not require .
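The spreading estimate can be illustrated with a short calculation. The sketch below treats the number $i$ of demes that have fixed genotype ‘2’ as a chain that only moves upward, assumes that the two demes involved in a migration event are drawn uniformly among all pairs, and sums the mean waiting times (in migration events) between successive increases. The function name and the parameter `p_gain`, standing for the probability that a relevant migration step increases $i$ by one, are hypothetical and introduced only for illustration; the general expression is the one derived in Methods, Sec. 3.

```python
def expected_spreading_events(n_demes, p_gain):
    """Mean number of migration events needed to go from 1 to n_demes mutant demes.

    Assumptions (illustrative): the number i of mutant demes never decreases,
    the two demes of each migration event are a uniformly chosen pair, and a
    migration event between a mutant and a non-mutant deme converts the
    non-mutant deme with probability p_gain.
    """
    if n_demes < 2:
        raise ValueError("need at least two demes")
    if not 0.0 < p_gain <= 1.0:
        raise ValueError("p_gain must lie in (0, 1]")
    total = 0.0
    for i in range(1, n_demes):
        # Probability that the chosen pair contains one mutant and one non-mutant deme.
        p_mixed = 2.0 * i * (n_demes - i) / (n_demes * (n_demes - 1))
        # Geometric waiting time until the next increase of i.
        total += 1.0 / (p_mixed * p_gain)
    return total


def harmonic(n):
    """n-th harmonic number, used for the closed form below."""
    return sum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    d, p = 7, 0.3  # hypothetical illustration values
    print(expected_spreading_events(d, p))
    # Closed form of the same sum: (D - 1) * H_{D-1} / p_gain.
    print((d - 1) * harmonic(d - 1) / p)
```

Under these assumptions the sum has the closed form $(D-1)\,H_{D-1}/p_{\mathrm{gain}}$, where $H_{D-1}$ is a harmonic number, so the number of migration events needed grows roughly as $D \log D$ for many demes.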

#### Combination of the two conditions.

Together, Eqs. 10 and 12 yield the interval of over which subdivision maximally accelerates valley or plateau crossing. For(14)we expect the valley or plateau crossing time of the whole metapopulation to be dominated by that of the champion deme: , where is the average crossing time for an isolated deme, and in the best scenario, .

In the regime where and , we can use the simple expressions of and given in Eqs. 11 and 13, which yields(15)

The ratio, , of the upper to lower bound in Eq. 15 reads(16)

This ratio increases exponentially with (this dependence on comes from that of ). This entails that, in this regime, the interval of where subdivision most accelerates crossing becomes wider as increases. However, the width of this interval is limited by the fact that isolated demes have to be in the sequential fixation regime (see Discussion). While the expressions of the interval bounds in Eq. 15 are more illuminating and easier to derive than the general ones, the latter, given in Methods, Sec. 3, actually play important roles since the highest speedups of valley crossing gained by subdivision are generically obtained for (see Eq. 8).

#### Case of the fitness plateau.

We have obtained an explicit expression of the interval of over which subdivision maximally accelerates valley crossing in the case of a relatively deep fitness valley where while . In the opposite limit of a fitness plateau (), retaining the assumptions and , Eq. 14 can also be simplified. For this, we use the expression of obtained in Eq. 35 of Methods, Sec. 3, and the expression of in Eq. 13, and we note that, since mutation ‘1' is neutral, and . Eq. 14 then becomes:(17)where we have used and . The ratio, , of the upper to lower bound in Eq. 17 reads

(18)This simple expression of demonstrates that the range of over which subdivision maximally accelerates plateau crossing increases as the deme size becomes larger, and that this range is quite wide as long as the number of demes satisfies , which is a realistic condition (recall that we are in the regime ).

### Simulation results

We now present numerical simulations of the evolutionary dynamics described above, which enable us to test our analytical predictions, and to gain additional insight into the process beyond the optimal scenario. Our simulations are based on a Gillespie algorithm [48], [49], and are described in detail in Methods, Sec. 1.
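For readers unfamiliar with the Gillespie algorithm, the sketch below shows its elementary step in Python: given the current rates of all possible events (for example divisions, deaths and migrations), it draws the waiting time to the next event and selects which event occurs. The function name and interface are hypothetical, and this is a generic textbook step rather than the authors' implementation, which is specified in Methods, Sec. 1.

```python
import math
import random


def gillespie_step(rates, rng):
    """Perform one step of the Gillespie stochastic simulation algorithm.

    rates: sequence of non-negative event rates.
    rng:   a random.Random instance.
    Returns (dt, index): time until the next event and the index of that event.
    """
    total = sum(rates)
    if total <= 0.0:
        raise ValueError("at least one rate must be positive")
    # The waiting time is exponentially distributed with the total rate.
    dt = -math.log(1.0 - rng.random()) / total
    # The event is chosen with probability proportional to its rate.
    threshold = rng.random() * total
    cumulative = 0.0
    for index, rate in enumerate(rates):
        cumulative += rate
        if threshold < cumulative:
            return dt, index
    # Floating-point round-off guard: fall back to the last possible event.
    last = max(i for i, r in enumerate(rates) if r > 0.0)
    return dt, last


if __name__ == "__main__":
    rng = random.Random(1)
    time = 0.0
    counts = [0, 0]
    for _ in range(1000):
        dt, event = gillespie_step([1.0, 3.0], rng)
        time += dt
        counts[event] += 1
    print(time, counts)
```

In a full simulation, the rates would be recomputed from the current state of every deme after each event, and the loop would stop once the final mutation has fixed in all demes.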

Let us first focus on the example presented in Fig. 1D, which shows an example plot of as a function of the ratio of migration to mutation rates, , obtained through our simulations when varying only the migration rate. With the parameter values used in this figure, the interval of Eq. 14 is . Note that here, and in the following examples, we use the general expressions of and given in Methods, Sec. 3, to compute the interval of Eq. 14. Fig. 1D features a minimum right at the center of this theoretically predicted optimal interval. Moreover, this minimum corresponds to , while : hence, the metapopulation crosses the valley on average 6.54 times faster than an isolated deme. This is very close to the limit of the best possible scenario, where the metapopulation would cross 7 times faster than an isolated deme (since here). This example illustrates that speedups tend towards those predicted in the best scenario, when the interval in Eq. 14 is sufficiently wide (here the ratio between its upper and its lower bound is 359). Besides, here: comparing it to the above-mentioned value of yields a 3.47-fold speedup of valley crossing by subdivision. The simulation results in Fig. 1D also show that significant (albeit smaller) speedups exist beyond the optimal parameter window.

Fig. 2 shows heatmaps of the valley crossing time of a metapopulation as a function of the migration-to-mutation rate ratio, (varied by varying ), and of the fitness valley depth, . Fig. 2A shows that the optimal interval of Eq. 14 (solid lines) describes well the region where the ratio of the crossing time of the metapopulation to that of an isolated deme is smallest and tends to the best-scenario limit . For migration rates lower than those in this interval, the ratio increases when decreases. This can be understood qualitatively by noting that if , is determined by the valley crossing time of the slowest among the independent demes. In the opposite case of migration rates larger than those in the optimal interval, increases with , and it tends to the non-subdivided case, , at high values of , as expected. Above a threshold value of (dashed line), becomes smaller than , in which case large values of , such that tends to , give a low (see Fig. 2A).

**A.** Heatmap of the ratio of the average valley crossing time of a metapopulation with and to that of an isolated deme with , as a function of valley depth and migration-to-mutation rate ratio , in logarithmic scale. All numerical results are averaged over 100 simulation runs, and the heatmap is interpolated. Solid lines: bounds of the interval in Eq. 14. Dashed line: value of above which a non-subdivided population crosses the valley faster than an isolated deme. Dotted line: value of above which an isolated deme is in the tunneling regime. Dash-dotted line: value of above which the non-subdivided population is in the tunneling regime. Parameter values: , , ; and are varied. **B.** Similar heatmap for the ratio of the average valley crossing time of a metapopulation to that of a non-subdivided population (with ). Solid line: predicted value of , from Eq. 8, for which the largest speedup by subdivision is expected.

Fig. 2B plots the ratio of the crossing time of the metapopulation to that of the non-subdivided population, which directly yields the speedup obtained by subdividing a population. It shows that, for the parameter values chosen, subdivision accelerates valley crossing over a large range of valley depths and migration rates, extending far beyond the optimal range given by Eq. 14, and that the metapopulation can cross valleys orders of magnitude faster than a single large population. In addition, above a second, larger threshold value of (dotted line in Fig. 2), isolated demes enter the tunneling regime [28]: Fig. 2B shows that sufficiently above this threshold, the metapopulation no longer crosses the valley faster than the non-subdivided population, as predicted above. While having isolated demes in the sequential fixation regime is a necessary condition to obtain significant speedups by subdivision, the non-subdivided population is not required to be in the sequential fixation regime (see above, and Fig. 1C–D). The value of above which the non-subdivided population enters the tunneling regime is indicated by a dash-dotted line in Fig. 2: significant speedups are obtained both below and above this line. The highest speedups are actually obtained above it, i.e. when the non-subdivided population is in the tunneling regime. With the parameter values used, Eq. 8 predicts a minimum of for (solid line in Fig. 2B), which agrees very well with the results of our numerical simulations. (Note that this value of satisfies , and is such that the non-subdivided population is in the tunneling regime. These conditions were used in our derivation of Eq. 8.)

## Discussion

### Limits on the parameter range where subdivision maximally accelerates crossing

In the Results section, we have shown that having isolated demes in the sequential fixation regime is a necessary condition for subdivision to significantly accelerate crossing. This requirement limits the interval of the ratio over which the highest speedups by subdivision are obtained. The extent of this interval can be characterized by the ratio, , of the upper to lower bound in Eq. 14. Let us express the bound on imposed by the requirement of sequential fixation in isolated demes.

If , the threshold value below which an isolated deme is in the sequential fixation regime satisfies [28]. Let us also assume that , and that while , to be in the domain of validity of Eqs. 15 and 16. Combining the condition with the expression of in Eq. 16 yields (19)

For plateaus, isolated demes are in the sequential fixation regime if their size is smaller than [28]. In the regime of validity of Eqs. 17 and 18 ( while , and , ), this condition can be combined with Eq. 18, which yields (20)

Both Eq. 19 and Eq. 20 show that increasing the number of demes decreases the range where the highest speedup by subdivision is reached. This is because having more subpopulations makes the spreading of the beneficial mutation slower. In addition, we find that the bound on is proportional to . Hence, despite this bound, the interval where subdivision most accelerates plateau crossing can span several orders of magnitude, given the small values of the actual mutation probabilities in nature.

### Effect of varying the degree of subdivision of a metapopulation

An interesting question raised by our results regards the optimal degree of subdivision. Given a certain total metapopulation size, into how many demes should it be subdivided in order to obtain the highest speedup possible? We first attack this question using our analytical results, and then we present simulation results, which allow for going beyond the best scenario and its associated parameter window.

Let us consider a metapopulation of given total size . Our analytical results show that increasing subdivision, i.e. increasing the number of subpopulations at constant , leads to stronger speedups of valley crossing (see Eqs. 4 and 7, with ). However, Eqs. 16 and 18, and the previous paragraph, show that when is increased, the parameter range where the speedup by subdivision tends to the best-scenario value becomes smaller and smaller. Eventually, this parameter range ceases to exist altogether: this occurs when becomes of order 1 and below. This sheds light on an interesting trade-off in the degree of subdivision , between the magnitude of the optimal speedup gained by subdivision and the width of the parameter range over which the actual speedup is close to this optimal value. This effect can be observed qualitatively in Fig. 3A, where the valley crossing time of a metapopulation with fixed total size is shown versus the migration-to-mutation rate ratio, , for different values of : when is increased, the minimum becomes deeper but less broad.

**A.** Valley crossing time of a metapopulation with total carrying capacity , versus migration-to-mutation rate ratio , for four different numbers of demes. Dots are simulation results, averaged over 1000 runs for each value of (500 runs for a few points far from the minima); error bars represent 95% CI. Vertical lines represent the limits of the interval of in Eq. 14 in each case, except for , where this interval does not exist. Black horizontal line: plateau crossing time for a non-subdivided population with for the same parameter values, averaged over 1000 runs; shaded regions: 95% CI. Dashed line: corresponding theoretical prediction from Ref. [28]. Parameter values: , , and (same as in Fig. 1C–D); is varied. **B.** Valley crossing time of a metapopulation with total carrying capacity , versus the number of demes, for (i.e. ). Dots are simulation results, averaged over 1000 runs for each value of ; error bars represent 95% CI. Parameter values: same as in A. **C.** Valley crossing time , minimized over for each value of , of a metapopulation with total carrying capacity , versus the number of demes. For each value of , the valley crossing time of the metapopulation was computed for several values of , different by factors of or in the vicinity of the minimum (see A): corresponds to the smallest value obtained in this process. Results obtained for the actual metapopulation (blue) are compared to the best-scenario limit (red) where , calculated using the value of obtained from our simulations. Dots are simulation results, averaged over 1000 runs for each value of ; error bars represent 95% CI. Dashed line: value of such that . Dotted line: value of above which the deleterious mutation is effectively neutral in the isolated demes. Solid line: value of such that . Parameter values: same as in A and B.

In addition, Eqs. 15 and 17 show that when is increased, the lower bound of the interval where the speedup by subdivision tends to the best-scenario value increases, as for plateaus (Eq. 17) and even more rapidly for deep valleys (Eq. 15). Qualitatively, this is because spreading of the beneficial mutation takes longer when increases. Conversely, the upper bound of this parameter range is independent of for deep valleys (Eq. 15), and grows only logarithmically with for plateaus (Eq. 17). Hence, when is increased, the center of the interval where the actual speedup is close to the optimal value shifts towards higher migration rates. This effect, which can be observed in Fig. 3A, is studied more precisely in Fig. 3B: at fixed migration rate , the crossing time of a metapopulation exhibits a minimum at an intermediate value of . Indeed, the crossing time of the metapopulation first decreases when is increased because the minimum crossing time then decreases. But beyond a certain value of , the migration rate that yields the highest speedup becomes larger than the fixed migration rate , so increases when is increased further.

Next, we study the dependence on of the valley crossing time minimized over for each , again for a metapopulation with fixed total size . For values of small enough for the interval in Eq. 14 to be broad, we expect to be close to the optimal scenario value . But, as discussed above, as increases, this interval will become smaller and then vanish. In such a regime, our analytical results are no longer sufficient to predict the dependence of on , but our simulations can provide additional insight. Fig. 3C shows that, while (left of the dashed line), is close to the best-scenario value. When is increased beyond this point, decreases more slowly than the best-scenario value. Indeed, the interval in Eq. 14 is no longer wide enough for the best-scenario limit to be approached. Note also that when demes become small enough, satisfying (right of the dotted line in Fig. 3C), mutation ‘1’ becomes effectively neutral in individual demes, as tends to (see Eq. 1). For even higher values of , is observed to saturate rather than exhibiting a unique minimum. Interestingly, this occurs for such that the interval in Eq. 14 fully vanishes (i.e. when passes below 1, right of the solid line in Fig. 3C). While we do not have a rigorous proof of the generic existence of this saturation, we have explored this point for other parameters, and found similar behavior (data not shown). Importantly, this indicates that there is a whole class of nearly optimal population structures.

### Extension to weakly beneficial intermediates

Our work has focused on fitness valleys (), such that mutation ‘1’ is deleterious, and on fitness plateaus (), such that mutation ‘1’ is neutral. For , mutation ‘1’ is effectively neutral, as far as valley crossing is concerned, in a population with individuals [28]. (This condition holds both in the sequential fixation regime and in the tunneling regime.) This implies that our arguments and our results obtained in the case of the fitness plateau also hold for weakly beneficial intermediates. This point is illustrated in Fig. 4A.

**A.** Valley crossing time of a metapopulation composed of demes with , versus migration-to-mutation rate ratio , in logarithmic scale, for three values of in the effectively neutral regime. Dots are simulation results, averaged over 100 runs for each value of ; error bars represent 95% CI. Black vertical lines represent the limits of the interval of in Eq. 14. Black horizontal line: plateau crossing time for an isolated deme with for the same parameter values, averaged over 100 runs; shaded regions: 95% CI. Dashed line: corresponding theoretical prediction from Ref. [28]. Note that the plateau crossing time of the non-subdivided population with is indistinguishable from that of the isolated deme here (as both are in the sequential fixation regime). Parameter values: , , ; is varied. **B.** Valley crossing time of a large population with connected to smaller islands with , versus migration-to-mutation rate ratio , in logarithmic scale. Dots represent simulation results averaged over 100 runs for each value of , and error bars are 95% CI. Vertical lines represent the limits of the interval of in Eq. 49. Blue (resp. red) line: valley crossing time for an isolated population with (resp. ) for the same parameter values, averaged over 100 runs; shaded regions: 95% CI. Dashed blue (resp. red) lines: corresponding theoretical predictions from Ref. [28]. For , the observed minimum satisfies , where is the average valley crossing time of an isolated island. For , the observed minimum satisfies . Parameter values: , , and ; is varied.

### Extension to a population coupled to small island populations

Thus far, we have focused on demes of equal size for simplicity, but demes of different sizes are relevant in practice. As a step toward more general population structures, we now consider a population connected by migration to smaller satellite populations of identical size, assumed to be in the sequential fixation regime. We only allow migration between the large population and each of the smaller islands, and the total migration rate is denoted by . The small island affected by migration is chosen randomly at each migration event. It is straightforward to adapt our work to this case (see Methods, Sec. 4). We obtain an interval of over which the crossing time for the large population is dominated by the crossing time of the champion island. This is corroborated by our simulations (see Fig. 4B).

### Realistic parameter values

Let us consider the example of *Escherichia coli*, for which the mutation probability per base pair per division is [50]. In order to gain a speedup of crossing by subdivision, we require demes to be in the sequential fixation regime. For plateaus, this condition reads . Let us consider deme sizes such that this condition is satisfied.

First, let us choose , which is within the smallest range of sizes used in current evolution experiments. For instance, it is the number of bacteria transferred at each dilution step for small populations in [27]. For this value of , all plateaus with are in the sequential fixation regime (from the condition ). Let us also consider , since 96-well plates are often used in these experiments [27], [51]. This yields a total population size of individuals, which is in the tunneling regime for all plateaus with . For , isolated demes are in the sequential fixation regime for . (Subdivision cannot significantly accelerate crossing for deeper valleys since isolated demes are then in tunneling, but those valleys take longer to cross than shallow ones and are thus probably less often crossed in practice.) The ratio of the bounds of the interval in Eq. 14 satisfies throughout this range of valleys, with for the plateau and for the deepest valleys in the range. Thus, actual speedups will approach the best-scenario one, and significant speedups will exist in a wide parameter window. Eq. 8 predicts that the highest speedup is obtained for , and Eq. 9 then yields a speedup factor by subdivision of . (Using instead the full expression of obtained from Eq. 23 (see Methods, Sec. 2) yields , i.e. a correction of 7%.) Moreover, for all valleys with , the best-scenario speedup ranges from 18 to . Thus, subdivision significantly accelerates crossing for this entire class of valleys.

It should be noted that the timescales obtained in this example are long compared to experimental ones. For instance, for the plateau, corresponds to divisions while is divisions. However, can become smaller if the number of subpopulations is increased, as discussed in our previous section. Besides, we have chosen to focus on standard *Escherichia coli* for simplicity. Organisms with a higher mutation rate, e.g. viruses such as HIV, or mutator strains, would have much shorter timescales, but smaller subpopulations would then be required for demes to be in the sequential fixation regime.

Our example thus far focused on a small but realistic deme size, . Experimentally more frequent values of are in the range – [27], [51]. Increasing at fixed decreases the range of for which demes are in the sequential fixation regime. For a plateau, this condition reads . For , this yields , and for , this yields . Hence, the range of plateaus (and similarly, of valleys) for which subdivision accelerates crossing becomes more restricted when is increased. Nevertheless, if these increasingly stringent conditions on are satisfied, significant speedups by subdivision are still expected. Indeed, Eq. 9 shows that the smallest value of the ratio is proportional to , so if one increases while decreasing as , the maximal speedup by subdivision will remain unchanged.

In this work, we have considered the crossing of one particular valley or plateau corresponding to a specific pair of mutations. Given the complexity and high dimensionality of actual fitness landscapes, there may be a large number of parallel valleys or plateaus, so that one of these could be crossed quite frequently even though the crossing time for a single valley or plateau remains large. Our work shows that, under specific conditions, subdivision can significantly accelerate crossing for whole classes of valleys and plateaus. Furthermore, in a generic, high-dimensional fitness landscape that contains valleys and/or plateaus as well as uphill paths, subdivision can provide an additional effect: it “shields” some demes in the metapopulation from adaptation via the uphill paths, leaving them time to explore valley-crossing paths that may be better in the longer term. While this effect is outside the scope of the present paper, it could lead to additional advantages of subdivision in evolution on rugged fitness landscapes.

### Conclusion

Our study of a generic and minimal model of population subdivision with migration demonstrates that subdividing a population into demes connected by migration can significantly accelerate the crossing of fitness plateaus and valleys, without the need for additional ingredients. We have derived quantitative conditions on the various parameters for subdivision to accelerate crossing, and for the resulting speedup to be maximal. In particular, isolated demes have to be in the sequential fixation regime for a significant speedup to occur. This condition is quite strong, but provided that it is met, significant speedups can be obtained in a wide range of migration rates, with the fastest deme driving the crossing of the whole metapopulation in the best scenario. We have derived the interval of migration rates for which this best scenario is reached. In addition, we have shown that increasing the degree of subdivision of a population enables higher speedups to be reached, but that this effect can saturate.

Our quantitative assessment of the conditions under which subdivision significantly speeds up valley or plateau crossing can aid in optimally designing future experiments, enabling one to choose the sizes and the number of demes, as well as the migration rates, such that subdivision can accelerate valley and plateau crossing.

Further directions include investigating the evolution of a metapopulation with a distribution of deme sizes on a more general rugged landscape, as well as assessing the impact of specific geographic structure. Our work could also be extended to sexual populations, where recombination plays an important role in valley or plateau crossing [52]. The interplay between recombination and subdivision, which respectively alleviate and exacerbate clonal interference, would be interesting to study.

## Methods

### 1 Simulation methods

Our simulations are based on a Gillespie algorithm [48], [49] that we coded in the C language. Here we will describe our algorithm for the case of a metapopulation of demes of identical size, which is the primary situation discussed in our work. In our simulations, each deme has a fixed carrying capacity (a choice discussed in Sec. 1.2 below).

#### 1.1 Algorithm.

A number of different events occur in our simulations, each with an independent rate:

- Each individual divides at rate , where is the fitness associated with the genotype of the individual, and is the current total number of individuals in the deme to which the individual belongs. This corresponds to logistic growth.
- If a dividing cell has , upon division, its offspring (i.e., one of the two individuals resulting from the division) mutates with probability , to have genotype instead of .
- Each individual dies at rate . Hence, at steady-state, , where is the average fitness of deme . In practice, we choose , and fitnesses of order one, thus .
- Migration occurs at total rate . Two different demes are chosen at random, an individual is chosen at random from each of these two demes, and the two individuals are exchanged. There is no geographic structure in our model, i.e. exchange between any two demes is equally likely.

In practice, the number of individuals with each genotype in each deme is stored, as well as the corresponding division rate. This data fully describes the state of the metapopulation, and allows determination of the rates of all events. For each event in the simulation, the following steps are performed:

- A timestep is drawn from an exponential distribution with rate equal to the total rate of events (i.e., the sum of all rates), and time is increased from its previous value, , to . In other words, the next event occurs at time .
- The event that occurs at is chosen randomly, in such a way that the probability of an event with rate is equal to : either a cell divides, or a cell dies, or a migration event occurs.
- The event is performed, and the relevant data is updated. Since we store the number of individuals with each genotype in each deme, only one or two of these numbers need to be updated at each step. In addition, the division rates of the affected deme must be updated upon division and death because is modified. Note, however, that this represents only three numbers at most (one for each genotype).

The advantage of the Gillespie algorithm is that it is exact, and does not involve any artificial discretization of time.
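The event loop described above can be illustrated with a minimal Python sketch. It is not the C code used for the simulations: the function name `gillespie_step`, its argument names, and the numerical values in the usage note are assumptions chosen for illustration. Migration is implemented as an exchange of one random individual between two distinct random demes, and the exchange is skipped if either deme is empty (also an assumption).

```python
import random

def gillespie_step(counts, fitness, K, death_rate, mu, mig_rate, rng):
    """Apply one event in place and return the waiting time that preceded it.

    counts[d][g] is the number of individuals with genotype g in deme d.
    All parameter names are illustrative, not taken from the original code.
    """
    n_demes, n_geno = len(counts), len(fitness)
    events = []  # (rate, kind, deme, genotype); zero-rate events are omitted
    for d in range(n_demes):
        crowding = max(0.0, 1.0 - sum(counts[d]) / K)  # logistic growth factor
        for g in range(n_geno):
            n = counts[d][g]
            if n * fitness[g] * crowding > 0:
                events.append((n * fitness[g] * crowding, "division", d, g))
            if n * death_rate > 0:
                events.append((n * death_rate, "death", d, g))
    if n_demes > 1 and mig_rate > 0:
        events.append((mig_rate, "migration", -1, -1))
    total = sum(e[0] for e in events)
    if total == 0:
        return float("inf")  # nothing can happen any more
    dt = rng.expovariate(total)  # exponential waiting time, rate = total rate
    # Choose the event with probability proportional to its rate.
    x = rng.random() * total
    chosen = events[-1]
    for e in events:
        x -= e[0]
        if x < 0:
            chosen = e
            break
    _, kind, d, g = chosen
    if kind == "division":
        # The offspring mutates to the next genotype with probability mu.
        mutates = g < n_geno - 1 and rng.random() < mu
        counts[d][g + 1 if mutates else g] += 1
    elif kind == "death":
        counts[d][g] -= 1
    else:
        # Exchange one random individual between two distinct random demes.
        a, b = rng.sample(range(n_demes), 2)
        if sum(counts[a]) > 0 and sum(counts[b]) > 0:
            ga = rng.choices(range(n_geno), weights=counts[a])[0]
            gb = rng.choices(range(n_geno), weights=counts[b])[0]
            counts[a][ga] -= 1
            counts[b][ga] += 1
            counts[b][gb] -= 1
            counts[a][gb] += 1
    return dt
```

A run would repeatedly call `gillespie_step` on, e.g., `counts = [[50, 0, 0] for _ in range(3)]` with `rng = random.Random(0)`, accumulating the returned waiting times until the final genotype has fixed. Rebuilding the full event list at each step keeps the sketch short; storing and updating only the affected rates, as described above, is the more efficient choice.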

#### 1.2 Working at fixed carrying capacity.

In our simulations, demes have a fixed carrying capacity, and the number of individuals per deme fluctuates weakly around its equilibrium value. This approach, also used in e.g. [23], has the advantage of realism. Alternatively, we could impose a constant number of individuals per deme.

- (i) First, we could choose a dividing individual in the whole metapopulation with probability proportional to its fitness, and simultaneously suppress another individual, chosen at random in the same deme. However, in this case, individuals in demes of higher fitness would exhibit shorter lifespans, which is not realistic and may introduce a bias.
- (ii) A second possibility would be to choose a dividing individual (according to fitness) in each of the demes, and to simultaneously suppress another individual, chosen at random, in each deme. However, in this case, unless migration events are far less frequent than these collective division-death events (one per deme), the time interval between them becomes artificially discretized. This introduces biases unless the total migration rate is much smaller than , i.e. unless .

Consequently, while imposing a constant number of individuals is a good simulation approach for a non-subdivided population (see e.g. [28]), it tends to introduce biases in the study of metapopulations. While we chose to perform simulations with fixed carrying capacities in order to avoid any of these biases, we checked that, for small enough migration rates, our results are completely consistent with simulation scheme (ii) described above. This consistency check also demonstrates that it is legitimate to compare our simulation results obtained with fixed carrying capacities to our analytical work carried out with constant population size per deme.

### 2 Crossing time of the champion deme

In this section, we give more details on the calculation of the average valley or plateau crossing time by the champion deme amongst independent ones. We show in the Results section that, in the best scenario, the crossing time of the whole metapopulation is determined by this time.

is the average shortest crossing time of independent demes. This minimum crossing time, which we denote by , is also called the smallest (or first) order statistic of the deme crossing time amongst a sample of size [53].

Let us denote by the probability density function of valley or plateau crossing time for a single deme, and let us introduce (it satisfies where is the cumulative distribution function of valley or plateau crossing by a single deme). The probability that is larger than is equal to the probability that the crossing times of each of the independent demes are all larger than . By differentiating this expression, one obtains the probability density function of the crossing time by the champion deme (see e.g. [53]): (21)

We now express explicitly. Since demes are assumed to be in the sequential fixation regime, valley or plateau crossing involves two successive steps. The first step, fixation of a ‘1’-mutant, occurs with rate , and the second step, fixation of a ‘2’-mutant, occurs with rate (see the Results section for expressions of these rates). The total crossing time is thus a sum of two independent exponential random variables, with probability density function given by a two-parameter hypoexponential distribution [53]: (22)

Combining Eqs. 21 and 22, we obtain (23) with given by Eq. 22. can then be determined for any value of the parameters by computing the average value of over this distribution.

Since mutation ‘1’ is deleterious or neutral while mutation ‘2’ is beneficial, the first step of valley crossing is much longer than the second one over a broad range of parameter values. In this case, we can approximate with a simple exponential distribution, (24)

Eq. 21 then yields (25) i.e. is distributed exponentially with rate . In this case, we simply have , which can be written as , where is the average crossing time for an isolated deme. Hence, in this case, on which our analytical discussion focuses, the champion deme crosses the valley times faster on average than an isolated deme.
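For reference, the chain of steps leading from the first order statistic to the exponential limit can be written out as follows. The notation is chosen here for illustration and is an assumption: $D$ is the number of independent demes, $p(t)$ and $F(t)$ are the single-deme density and cumulative distribution function, $S(t) = 1 - F(t)$, $r_1$ and $r_2$ are the rates of the two successive fixation steps, and $p_c(t)$ is the density for the champion deme.

```latex
\begin{align}
  % Survival of the minimum of D independent crossing times
  P(t_{\min} > t) &= \left[ S(t) \right]^{D}, \qquad S(t) = 1 - F(t), \\
  % Differentiating gives the density of the first order statistic
  p_c(t) &= -\frac{\mathrm{d}}{\mathrm{d}t} \left[ S(t) \right]^{D}
          = D \, p(t) \left[ S(t) \right]^{D-1}, \\
  % Two successive exponential steps: hypoexponential density (r_1 \neq r_2)
  p(t) &= \frac{r_1 r_2}{r_2 - r_1} \left( e^{-r_1 t} - e^{-r_2 t} \right), \\
  % Limit where the first step dominates
  p(t) &\approx r_1 e^{-r_1 t}
  \;\Longrightarrow\;
  p_c(t) \approx D r_1 \, e^{-D r_1 t}, \qquad
  \langle t_{\min} \rangle \approx \frac{1}{D r_1}.
\end{align}
```

The last line is the statement that, when the first step dominates, the champion deme crosses on average $D$ times faster than an isolated deme, whose mean crossing time is then approximately $1/r_1$.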

For this approximation to be valid, the second step of valley crossing must be negligible even for the champion deme, i.e., . For very large , the value of will not be as small as , since the second step will no longer be negligible (see [52] for a discussion of similar issues). The crossover to this regime can be determined by computing the average of the distribution in Eq. 23 and comparing it to .
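The comparison mentioned above can be sketched numerically. The following Python snippet is a hedged illustration, not the authors' procedure: it uses the identity that the mean of a non-negative random variable equals the integral of its survival function, applied to the minimum of independent two-step crossing times. The names `survival_two_step` and `mean_champion_time`, the trapezoidal integration, and the truncation threshold are assumptions.

```python
import math

def survival_two_step(t, r1, r2):
    """P(crossing time > t) for two successive exponential steps (rates r1, r2)."""
    if math.isclose(r1, r2):
        return (1.0 + r1 * t) * math.exp(-r1 * t)
    return (r2 * math.exp(-r1 * t) - r1 * math.exp(-r2 * t)) / (r2 - r1)

def mean_champion_time(n_demes, r1, r2, n_steps=20000):
    """Mean of the smallest of n_demes independent two-step crossing times.

    Integrates S(t)**n_demes with the trapezoidal rule, where S is the
    single-deme survival function.
    """
    def s_pow(t):
        return max(0.0, survival_two_step(t, r1, r2)) ** n_demes

    # Extend the integration range until the integrand is negligible.
    t_max = 1.0 / (n_demes * min(r1, r2))
    while s_pow(t_max) > 1e-12:
        t_max *= 2.0
    h = t_max / n_steps
    total = 0.5 * (s_pow(0.0) + s_pow(t_max))
    total += sum(s_pow(k * h) for k in range(1, n_steps))
    return h * total
```

For example, with illustrative rates `r1 = 1e-3` and `r2 = 10.0`, `mean_champion_time(7, r1, r2)` is close to `1 / (7 * r1)`, whereas for comparable rates the result is noticeably larger than `1 / (n_demes * r1)`, which signals that the second step is no longer negligible.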

### 3 Number of migration events for extinction or spreading in a metapopulation

In our Results section, we have derived an interval of the ratio of migration rate to mutation rate over which subdivision most reduces valley or plateau crossing time (see Eq. 14). The upper bound involves , the average number of migration events required for the ‘1’-mutants to be wiped out by migration, starting from a state where one deme has fixed genotype ‘1’, while all other demes have genotype ‘0’. Similarly, the lower bound involves , the average number of migration events required for the ‘2’-mutants to spread by migration to the whole metapopulation, starting from a state where one deme has fixed genotype ‘2’, while all other demes have genotype ‘0’. In our Results section, we have provided intuitive derivations of the simple expressions of and , valid for and , and (see Eq. 15). However, it is important to derive more general expressions, especially since subdivision generically most accelerates valley crossing in the intermediate regime where (see Results, Eq. 8).

Here, we derive general analytical expressions for and , both for fitness plateaus and for fitness valleys. These more general expressions are those used for numerical calculations of the bounds in our examples. Throughout this section, we consider a metapopulation of demes composed of individuals each, and we assume that individual demes are in the sequential fixation regime (see Results).

#### 3.1 A finite Markov chain.

In order to determine and , we study the evolution of the number of demes that have fixed the mutant genotype (‘1’ for the calculation of ; ‘2’ for that of ), while other demes have genotype ‘0’. Given that the value of just before a migration step fully determines the probabilities of the outcomes of this migration step, and given that and are absorbing states, the number evolves according to a finite Markov chain, each step being a migration event. We next express the transition matrix of this Markov chain.

The only migration events that can affect are those that exchange individuals from two demes with different genotypes. Let us call these migration events “relevant”. The probability of a migration event being relevant corresponds to the probability that this migration affects one of the mutant populations and one of the ‘0’ populations: . We only focus on the final outcome of a migration event, after fixation or extinction of each of the two migrants' lineages has occurred. Let denote the probability that the mutant migrant fixes in the ‘0’ deme, and the probability that the ‘0’ migrant fixes in the mutant deme. As a result of one such relevant migration event:

- increases by one with probability , if the mutant migrant fixes in the ‘0’ deme while the ‘0’ migrant does not fix in the mutant deme.
- decreases by one with probability , in the opposite case.
- Otherwise, does not change. This happens either if both migrants fix (with probability ) or if no migrant fixes (with probability ).

These probabilities, multiplied by the probability that a migration event is relevant, yield the transition matrix of our finite Markov chain, which is tri-diagonal (or continuant) since each migration step can either leave constant, or increase or decrease it by one:(26)

(28)for , and . We have denoted by the probability that varies from to as the final outcome of one migration event.
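The construction of this transition matrix can be sketched in a few lines of Python. The notation is an assumption made for illustration: `n_demes` is the number of demes, `i` the number of mutant demes, `p` the probability that the mutant migrant fixes in the ‘0' deme, and `q` the probability that the ‘0' migrant fixes in the mutant deme. The sketch further assumes that the two demes involved in a migration event are drawn uniformly among all pairs of distinct demes, which is one natural reading of global migration between equivalent demes.

```python
def relevant_probability(i, n_demes):
    """Probability that a migration event pairs one mutant deme with one '0' deme.

    Assumption: the two demes are drawn uniformly among all pairs of
    distinct demes, so i * (n_demes - i) of the n_demes * (n_demes - 1) / 2
    possible pairs are relevant.
    """
    return 2.0 * i * (n_demes - i) / (n_demes * (n_demes - 1))


def transition_matrix(n_demes, p, q):
    """Tri-diagonal transition matrix over i = 0..n_demes mutant demes."""
    size = n_demes + 1
    P = [[0.0] * size for _ in range(size)]
    for i in range(size):
        r = relevant_probability(i, n_demes)
        up = r * p * (1.0 - q)     # mutant migrant fixes, '0' migrant does not
        down = r * q * (1.0 - p)   # '0' migrant fixes, mutant migrant does not
        if i < n_demes:
            P[i][i + 1] = up
        if i > 0:
            P[i][i - 1] = down
        # Both migrants fix, neither fixes, or the event is not relevant.
        P[i][i] = 1.0 - up - down
    return P
```

With this construction, the states with zero mutant demes and with only mutant demes are absorbing, because no migration event is relevant there.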

Here, we do not account for independent mutations arising and fixing in other demes while the mutant's lineage spreads (or goes extinct) in the metapopulation. Indeed, our aim is to compare the timescales of migration and mutation processes, so we treat them separately. In practice, this hypothesis is reasonable if mutations that fix are much rarer than migration events. We also assume that the time between two successive migration events is long enough for fixation to occur in the demes affected by migration before the next migration event occurs, which is true in the low-migration-rate regime that we study in our work (, where is the migration rate per individual, while is the death and division rate per individual).

and can be directly expressed as the average number of steps of the Markov chain necessary to go from the initial state to absorption in a particular absorbing state, either or . Let us present general expressions of these average numbers of steps, before using them to obtain explicit expressions of and .

#### 3.2 Some results regarding finite Markov chains with tri-diagonal probability matrices.

We are interested in the average number of steps until the system reaches each of the absorbing states , starting from the state :(29)where is the average number of steps that the system spends in the state before absorption, given that it starts in the state and finally absorbs in state . It can be expressed as [25](30)where is the average number of steps the system spends in state before absorption in either of the two absorbing states, given that it started in state , and is the probability that the system finally absorbs in state if it starts in state .

Using the explicit expressions given in [25] for and in the case of a tri-diagonal probability matrix, we obtain:(31)
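As a numerical cross-check of such closed-form results, the conditional mean number of steps can also be computed directly from the definition above: the mean number of visits to each transient state, reweighted by the ratio of absorption probabilities, then summed. The sketch below is not the closed-form expression of [25]; it solves the corresponding linear systems by plain Gaussian elimination. The function names and the `up`/`down` list layout are hypothetical choices made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def conditional_absorption_steps(up, down, start, target):
    """Return (mean steps to absorption in `target` given that it occurs,
    probability of absorbing in `target`), starting from `start`.

    up[i], down[i]: per-step probabilities of i -> i+1 and i -> i-1 for
    i = 0..D, where states 0 and D are absorbing. `target` is 0 or D.
    """
    D = len(up) - 1
    transient = list(range(1, D))
    # Matrix I - Q restricted to the transient states 1..D-1.
    A = [[0.0] * (D - 1) for _ in transient]
    for k, i in enumerate(transient):
        A[k][k] = up[i] + down[i]          # equals 1 - P[i][i]
        if i + 1 < D:
            A[k][k + 1] = -up[i]
        if i - 1 > 0:
            A[k][k - 1] = -down[i]
    # One-step probability of jumping from each transient state into target.
    r = [up[i] if (target == D and i == D - 1)
         else down[i] if (target == 0 and i == 1)
         else 0.0
         for i in transient]
    pi = solve(A, r)    # absorption probabilities in target
    y = solve(A, pi)    # sum over j of (mean visits to j) * pi[j]
    k = start - 1
    return y[k] / pi[k], pi[k]
```

For a symmetric walk on four states with `up = down = [0, 0.5, 0.5, 0]`, starting from state 1 and conditioning on absorption in state 3, this returns a mean of 8/3 steps and an absorption probability of 1/3, which can be verified by summing over paths by hand.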

#### 3.3 Explicit expression of .

, in fact, corresponds to , where is the probability that a ‘1'-mutant fixes in a deme of ‘0' individuals (i.e. ) and is the probability that a ‘0'-individual fixes in a deme of ‘1'-mutants (i.e. ). Hence, it can be expressed explicitly from Eqs. 31, 26, and 27. Since the expressions of and depend on whether mutation ‘1' is neutral or deleterious, we obtain different expressions for the fitness plateau and for the fitness valley.

**Fitness plateau.** For a fitness plateau (i.e. a neutral intermediate ‘1'), , where is the number of individuals per deme. Hence,(34)which implies that for all (see Eq. 33). Thus, Eq. 31 yields(35)where the last expression holds for and .

**Fitness valley.** Eqs. 33 and 26, 27 yield , with(36)and Eq. 31 gives:

(37)In these expressions, is the probability of fixation of a deleterious ‘1'-mutant, with fitness , in a deme where all other individuals have genotype ‘0' and fitness . It can be obtained from Eq. 1, as well as the probability of the opposite process.
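For numerical work, the fixation probabilities entering these expressions can be evaluated with a short helper. Eq. 1 is not reproduced in this section, so the sketch below assumes that it is the standard fixation probability of a single mutant in a Moran process of fixed size; if Eq. 1 has a different form, the helper should be replaced accordingly. The function and argument names are hypothetical.

```python
def fixation_probability(f_mutant, f_resident, n):
    """Fixation probability of one mutant among n individuals (Moran process).

    Assumption: Eq. 1 of the paper is the standard Moran formula
    (1 - x) / (1 - x**n), with x = f_resident / f_mutant.
    """
    ratio = f_resident / f_mutant
    if abs(ratio - 1.0) < 1e-12:
        return 1.0 / n   # neutral limit, relevant for the fitness plateau
    return (1.0 - ratio) / (1.0 - ratio ** n)
```

The neutral branch reproduces the plateau case, in which a single mutant fixes with a probability equal to the inverse of the deme size, while a deleterious mutant fixes with a smaller probability and a beneficial one with a larger probability.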

#### 3.4 Explicit expression of .

corresponds to , where is the probability that a ‘2'-mutant (with fitness ) fixes in a deme of ‘0' individuals (with fitness 1), and is the probability that a ‘0'-individual fixes in a deme of ‘2'-mutants. Hence, it can be expressed explicitly from Eqs. 32, 26, and 27, using Eq. 1 to express the fixation probabilities. For , there is no difference between the valley and the plateau, since genotype ‘1' is not involved.

As above, Eqs. 26, 27 and 33 yield , with defined in Eq. 36. Thus, Eq. 32 gives(38)

#### 3.5 Simplified expressions for deep valleys and for plateaus.

In our Results section, we have shown that the benefit of subdivision is highest when is situated between a lower bound,(39)and an upper bound,(40)(see Eq. 14), where denotes the probability of fixation of a single mutant with genotype ‘2' in a background of ‘1'-mutants. Here we present simplified expressions for and , and hence of and , in particular parameter regimes.

Throughout this section, we focus on the regime where but , such that mutation ‘2' is substantially, but not overwhelmingly, beneficial [28]. We then have and (see Eq. 1). To leading (i.e. zeroth) order in , we obtain from Eq. 38 that(41)where the last expression holds for . This expression of is identical to Eq. 13, which was demonstrated more intuitively in the Results section by directly assuming and .

We now consider the case of a plateau () and the case of a valley such that but . We demonstrate that the latter case is consistent with the simplified derivations in our Results section.

**Fitness plateau.** For a fitness plateau, combining Eqs. 35 and 40, the upper bound reads(42)where we have used , since mutation ‘1' is neutral, and assumed and .

Additionally, Eqs. 41 and 39 can be combined to write the lower bound as(43)again to lowest order in . Here, we have used , since in the case of the plateau, mutation ‘1' is neutral. This expression too holds for and .

Combining Eqs. 42 and 43 yields Eq. 17.

**Fitness valley.** Next we focus on valleys such that but . (Note that, in the opposite limit , mutation ‘1' is effectively neutral, and the above discussion regarding the fitness plateau applies.) Then, (see Eq. 1). To lowest (i.e. zeroth) order in , Eq. 37 becomes(44)where we have used the approximation , which holds for and . This expression of coincides with Eq. 11, which is obtained in the Results section through a more intuitive argument that directly assumes and . Hence, from Eq. 40, the upper bound is(45)where we used the conditions , , and to simplify the expression of .

Meanwhile, from Eqs. 39 and 41, the lower bound takes the form(46)where, again, we used the conditions , , and to simplify the expressions of and .

Combining Eqs. 46 and 45 yields Eq. 15.

### 4 A population connected by migration to smaller population islands

Let us consider a population of individuals connected by migration to smaller population islands with individuals each. These islands of identical size are assumed to be in the sequential fixation regime. For the sake of simplicity, we consider that migration only occurs between the large population and the islands: a migration step is a random exchange of two individuals between the large population and one of the islands (chosen at random at each migration event), and the total migration rate is denoted by . Here, we focus on the valley or plateau crossing time of the large population. We demonstrate that the evolution of a large population can be driven by that of satellite islands.

In the optimal case, the crossing time of the large population is determined by that of the champion island, i.e., the island that crosses the fitness valley or plateau fastest. We now determine the conditions under which this optimum is achieved, focusing on migration rates much smaller than division/death rates, , such that fixation or extinction of a mutant lineage in either the large population or an island is not significantly perturbed by migration. First, as in the case of equivalent demes, migration should be rare enough for islands that have fixed the intermediate mutation to remain effectively shielded from migration events until the final beneficial mutation arises. Second, migration should be frequent enough for the spreading time of the final beneficial mutation from the champion island to the large population to be negligible with respect to the crossing time of the champion island. These two criteria again provide upper and lower bounds on .

The average time (with from Eq. 1) required for an island of ‘1'-mutants to fix the beneficial mutation ‘2' must be smaller than the average time, , for an island of ‘1'-mutants to be wiped out by migration from the large population, which still exhibits genotype ‘0'. The rate of migration events between the island of ‘1'-mutants and the large population is . Hence, , where is the probability of fixation of the lineage of a single migrant with genotype ‘0' in an island where all other individuals are ‘1'-mutants: for valleys, it is given by Eq. 1, while for plateaus, it is equal to . The first condition, , thus yields(47)

The second condition is that the average spreading time, , for the final beneficial mutation to fix in the large population after it has fixed in the champion island, must be smaller than the average valley or plateau crossing time, , of the champion island. As before, we obtain , where is the probability of fixation of a migrant with genotype ‘2' in the large population, which is assumed to exhibit genotype ‘0' before migration (see Eq. 1). is the average of the minimum crossing time among independent islands. We again focus, for simplicity, on the limit where the first step of valley or plateau crossing, which occurs at rate , is much longer than the second. Then, we simply have (see Results). In this expression, (with obtained from Eq. 1) is the average crossing time for an isolated island. Hence, the champion island crosses the valley times faster on average than a single isolated island. The second condition, , finally yields(48)
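The speedup of the champion island can be illustrated with a small Monte Carlo sketch. It assumes that each island's crossing time is exponentially distributed, an idealization of the limit where a single rate-limiting first step dominates, and that islands are independent. Under these assumptions, the mean of the minimum over the islands equals the single-island mean divided by the number of islands. The function name, parameter names, and default values are hypothetical.

```python
import random


def mean_champion_time(n_islands, mean_single, n_samples=20000, seed=0):
    """Estimate the mean of the fastest crossing time among independent islands.

    Assumption: each island's crossing time is exponential with mean
    `mean_single` (one rate-limiting step), so the exact answer is
    mean_single / n_islands.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_single
    total = 0.0
    for _ in range(n_samples):
        total += min(rng.expovariate(rate) for _ in range(n_islands))
    return total / n_samples
```

For example, `mean_champion_time(5, 1.0)` should return a value close to 0.2, up to sampling noise.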

Together, Eqs. 47 and 48 yield the interval of over which we expect subdivision to maximally accelerate crossing:(49)

In this range, we expect the valley or plateau crossing time of the large population to be dominated by the crossing time of the champion island, so that . This prediction is confirmed by our simulations (see Fig. 4B).

## Author Contributions

Conceived and designed the experiments: AFB DJS. Performed the experiments: AFB DJS. Analyzed the data: AFB DJS. Contributed reagents/materials/analysis tools: AFB DJS. Contributed to the writing of the manuscript: AFB DJS.

## References

- 1. Dawid A, Kiviet DJ, Kogenaru M, de Vos M, Tans SJ (2010) Multiple peaks and reciprocal sign epistasis in an empirically determined genotype-phenotype landscape. Chaos 20: 026105.
- 2. Draghi JA, Plotkin JB (2013) Selection biases the prevalence and type of epistasis along adaptive trajectories. Evolution 67: 3120–3131. doi: 10.1111/evo.12192
- 3. Franke J, Klözer A, de Visser JAGM, Krug J (2011) Evolutionary accessibility of mutational pathways. PLoS Comput Biol 7: e1002134. doi: 10.1371/journal.pcbi.1002134
- 4. Szendro IG, Schenk MF, Franke J, Krug J, de Visser JAGM (2013) Quantitative analyses of empirical fitness landscapes. J Stat Mech Theor Exp: P01005.
- 5. Whitlock MC, Phillips PC, Moore FBG, Tonsor SJ (1995) Multiple fitness peaks and epistasis. Annual Review of Ecology and Systematics 26: 601–629. doi: 10.1146/annurev.es.26.110195.003125
- 6. Schrag SJ, Perrot V, Levin BR (1997) Adaptation to the fitness cost of antibiotic resistance in *E. coli*. Proc R Soc Lond B 264: 1287–1291. doi: 10.1098/rspb.1997.0178
- 7. Beerenwinkel N, Pachter L, Sturmfels B, Elena SF, Lenski RE (2007) Analysis of epistatic interactions and fitness landscapes using a new geometric approach. BMC Evolutionary Biology 7: 60. doi: 10.1186/1471-2148-7-60
- 8. Trindade S, Sousa A, Xavier K, Dionisio F, Ferreira M, et al. (2009) Positive epistasis drives the acquisition of multidrug resistance. PLoS Genetics 5: e1000578. doi: 10.1371/journal.pgen.1000578
- 9. Andersson DI, Hughes D (2010) Antibiotic resistance and its cost: is it possible to reverse resistance? Nat Rev Microbiol 8: 260–271. doi: 10.1038/nrmicro2319
- 10. Bloom JD, Gong LI, Baltimore D (2010) Permissive secondary mutations enable the evolution of influenza oseltamivir resistance. Science 328: 1272–1275. doi: 10.1126/science.1187816
- 11. Kryazhimskiy S, Dushoff J, Bazykin G, Plotkin J (2011) Prevalence of epistasis in the evolution of influenza A surface proteins. PLoS Genetics 7: e1001301. doi: 10.1371/journal.pgen.1001301
- 12. Breen M, Kemena C, Vlasov P, Notredame C, Kondrashov F (2012) Epistasis as the primary factor in molecular evolution. Nature 490: 535–538.
- 13. Gong LI, Suchard MA, Bloom JD (2013) Stability-mediated epistasis constrains the evolution of an influenza protein. eLife 2: e00631.
- 14. Covert AW, Lenski RE, Wilke CO, Ofria C (2013) Experiments on the role of deleterious mutations as stepping stones in adaptive evolution. Proc Natl Acad Sci USA 110: E3171–E3178. doi: 10.1073/pnas.1313424110
- 15. Østman B, Adami C (2014) Predicting Evolution and Visualizing High-Dimensional Fitness Landscapes. In: Richter H, Engelbrecht A, editors, Recent Advances in the Theory and Application of Fitness Landscapes, Springer, volume 6 of *Emergence, Complexity and Computation*. pp. 509–526.
- 16. Korona R, Nakatsu CH, Forney LJ, Lenski RE (1994) Evidence for multiple adaptive peaks from populations of bacteria evolving in a structured habitat. Proc Natl Acad Sci USA 91: 9037–9041. doi: 10.1073/pnas.91.19.9037
- 17. Hallatschek O, Hersen P, Ramanathan S, Nelson DR (2007) Genetic drift at expanding frontiers promotes gene segregation. Proc Natl Acad Sci USA 104: 19926–19930. doi: 10.1073/pnas.0710150104
- 18. Waclaw B, Allen RJ, Evans MR (2010) Dynamical phase transition in a model for evolution with migration. Phys Rev Lett 105: 268101. doi: 10.1103/physrevlett.105.268101
- 19. Martens E, Hallatschek O (2011) Interfering waves of adaptation promote spatial mixing. Genetics 189: 1045–1060. doi: 10.1534/genetics.111.130112
- 20. Martens E, Kostadinov R, Maley C, Hallatschek O (2011) Spatial structure increases the waiting time for cancer. New J Phys 13: 115014. doi: 10.1088/1367-2630/13/11/115014
- 21. Otwinowski J, Boettcher S (2011) Accumulation of beneficial mutations in one dimension. Phys Rev E 84: 011925. doi: 10.1103/physreve.84.011925
- 22. Zhang Q, Lambert G, Liao D, Kim H, Robin K, et al. (2011) Acceleration of emergence of bacterial antibiotic resistance in connected microenvironments. Science 333: 1764–1767. doi: 10.1126/science.1208747
- 23. Greulich P, Waclaw B, Allen RJ (2012) Mutational pathway determines whether drug gradients accelerate evolution of drug-resistant cells. Phys Rev Lett 109: 088101. doi: 10.1103/physrevlett.109.088101
- 24. Hermsen R, Deris JB, Hwa T (2012) On the rapidity of antibiotic resistance evolution facilitated by a concentration gradient. Proc Natl Acad Sci USA 109: 10775–10780. doi: 10.1073/pnas.1117716109
- 25. Ewens WJ (1979) Mathematical Population Genetics. Springer-Verlag.
- 26. Weinreich DM, Chao L (2005) Rapid evolutionary escape in large populations from local peaks on the Wrightian fitness landscape. Evolution 59: 1175–1182. doi: 10.1554/04-392
- 27. Rozen DE, Habets MG, Handel A, de Visser JAGM (2008) Heterogeneous adaptive trajectories of small populations on complex fitness landscapes. PLoS ONE 3: e1715. doi: 10.1371/journal.pone.0001715
- 28. Weissman DB, Desai MM, Fisher DS, Feldman MW (2009) The rate at which asexual populations cross fitness valleys. Theor Pop Biol 75: 286–300. doi: 10.1016/j.tpb.2009.02.006
- 29. Wright S (1931) Evolution in Mendelian populations. Genetics 16: 97–159.
- 30. Wright S (1932) The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proc 6th Int Congress of Genetics 1: 356–366.
- 31. Wright S (1940) Breeding Structure of Populations in Relation to Speciation. The American Naturalist 74: 232–248. doi: 10.1086/280891
- 32. Wright S (1982) The shifting balance theory and macroevolution. Ann Rev Genet 16: 1–19. doi: 10.1146/annurev.ge.16.120182.000245
- 33. Lande R (1985) The fixation of chromosomal rearrangements in a subdivided population with local extinction and colonization. Heredity 54: 323–332. doi: 10.1038/hdy.1985.43
- 34. Slatkin M (1989) Population structure and evolutionary progress. Genome 31: 196–202. doi: 10.1139/g89-034
- 35. Wade MJ, Goodnight CJ (1991) Wright shifting balance theory - an experimental study. Science 253: 1015–1018. doi: 10.1126/science.1887214
- 36. Barton NH, Rouhani S (1993) Adaptation and the shifting balance. Genetics Research 61: 57–74. doi: 10.1017/s0016672300031098
- 37. Coyne JA, Barton NH, Turelli M (1997) A critique of Sewall Wright's shifting balance theory of evolution. Evolution 51: 643–671. doi: 10.2307/2411143
- 38. Gavrilets S (1997) Evolution and speciation on holey adaptive landscapes. Trends in Ecology & Evolution 12: 307–312. doi: 10.1016/s0169-5347(97)01098-7
- 39. Wade MJ, Goodnight CJ (1998) Perspective: The theories of Fisher and Wright in the context of metapopulations: When nature does many small experiments. Evolution 52: 1537–1553. doi: 10.2307/2411328
- 40. Coyne J, Barton N, Turelli M (2000) Is Wright's shifting balance process important in evolution? Evolution 54: 306–317. doi: 10.1111/j.0014-3820.2000.tb00033.x
- 41. Crow JF (2008) Mid-Century Controversies in Population Genetics. Annual Review of Genetics 42: 1–16. doi: 10.1146/annurev.genet.42.110807.091612
- 42. Wade MJ (2013) Phase III of Wright's shifting balance process and the variance among demes in migration rate. Evolution 67: 1591–1597. doi: 10.1111/evo.12088
- 43. Desai MM (2013) Statistical questions in experimental evolution. J Stat Mech Theor Exp: P01003.
- 44. Kerr B (2013) QCB Seminar at Princeton University, and private communication.
- 45. Kryazhimskiy S, Rice DP, Desai MM (2011) Population subdivision and adaptation in asexual populations of *Saccharomyces cerevisiae*. Evolution 66: 1931–1941. doi: 10.1111/j.1558-5646.2011.01569.x
- 46. van Marle G, Gill MJ, Kolodka D, McManus L, Grant T, et al. (2007) Compartmentalization of the gut viral reservoir in HIV-1 infected patients. Retrovirology 4: 87. doi: 10.1186/1742-4690-4-87
- 47. Schnell G, Price RW, Swanstrom R, Spudich S (2010) Compartmentalization and clonal amplification of HIV-1 variants in the cerebrospinal fluid during primary infection. J Virol 84: 2395. doi: 10.1128/jvi.01863-09
- 48. Gillespie DT (1976) A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J Comput Phys 22: 403–434. doi: 10.1016/0021-9991(76)90041-3
- 49. Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81: 2340–2361. doi: 10.1021/j100540a008
- 50. Wielgoss S, Barrick JE, Tenaillon O, Cruveiller S, Chane-Woon-Ming B, et al. (2011) Mutation rate dynamics in a bacterial population reflect tension between adaptation and genetic load. G3 1: 183–186.
- 51. Kerr B, Neuhauser C, Bohannan BJM, Dean AM (2006) Local migration promotes competitive restraint in a host-pathogen 'tragedy of the commons'. Nature 442: 75–78. doi: 10.1038/nature04864
- 52. Weissman DB, Feldman MW, Fisher DS (2010) The rate of fitness-valley crossing in sexual populations. Genetics 186: 1389–1410. doi: 10.1534/genetics.110.123240
- 53. Bolch G, Greiner S, de Meer H, Trivedi KS (2006) Queuing networks and Markov chains (2nd edition). Wiley.