The Arrival of the Frequent: How Bias in Genotype-Phenotype Maps Can Steer Populations to Local Optima

Genotype-phenotype (GP) maps specify how the random mutations that change genotypes generate variation by altering phenotypes, which, in turn, can trigger selection. Many GP maps share the following general properties: 1) The total number of genotypes is much larger than the number of selectable phenotypes; 2) Neutral exploration changes the variation that is accessible to the population; 3) The distribution of phenotype frequencies , with the number of genotypes mapping onto phenotype , is highly biased: the majority of genotypes map to only a small minority of the phenotypes. Here we explore how these properties affect the evolutionary dynamics of haploid Wright-Fisher models that are coupled to a random GP map or to a more complex RNA sequence to secondary structure map. For both maps the probability of a mutation leading to a phenotype scales to first order as , although for the RNA map there are further correlations as well. By using mean-field theory, supported by computer simulations, we show that the discovery time of a phenotype similarly scales to first order as for a wide range of population sizes and mutation rates in both the monomorphic and polymorphic regimes. These differences in the rate at which variation arises can vary over many orders of magnitude. Phenotypic variation with a larger is therefore be much more likely to arise than variation with a small . We show, using the RNA model, that frequent phenotypes (with larger ) can fix in a population even when alternative, but less frequent, phenotypes with much higher fitness are potentially accessible. In other words, if the fittest never ‘arrive’ on the timescales of evolutionary change, then they can't fix. We call this highly non-ergodic effect the ‘arrival of the frequent’.


Introduction
Darwin's account of biological evolution [1] stressed the importance of natural selection: If some individuals are better adapted to their environment than their competitors, their offspring will come to dominate the population. The fittest survive and the less fit go extinct. Yet selection alone is not sufficient to drive evolution because natural selection reduces the very variation that it requires to operate. It was only recognised well after Darwin's day [2], in part through the success of the Modern Synthesis, that the fuel for selection is provided by mutations that make offspring genetically different from their parents. Crucially, mutations change genetically stored information (the genotype) while selection operates on the physical expression of this information (the phenotype). Understanding the relation between genotypes and phenotypes -the GP map -is therefore crucial to understanding evolutionary dynamics [3].
GP mappings have been studied at different levels of abstraction [4] The most basic systems are concerned with the sequence-tostructure(-to-function) relation of single molecules such as RNA [5] or proteins [6][7][8], but higher-level systems such as protein complexes [9], gene-regulatory networks [10] and developmental networks [11] have also been studied. Even though these GP maps arise in quite different contexts, they share several interesting properties: 1) Most basically, the number of possible genotypes N G is typically much greater than the number of possible phenotypes N P , so the map is many-to-one. As a consequence, many mutations may conserve the phenotype, leading to mutational robustness. Important prior work has linked such robustness to the concept of neutral spaces, namely the set of all genotypes that map to a particular phenotype, with the additional property that they be linked by neutral mutations [4,5,12]. 2) Even though N P %N G , the accessible genetic neighbourhood of a single genotype g that generates a given phenotype p may include significantly fewer alternative phenotypes (potential variation) than is found in the neighbourhood of the (neutral) set N p of all N p~D N p D genotypes that map onto phenotype p.
Exploration of a neutral space can therefore increase the variety of phenotypes discovered by a population [13,14]. 3) Perhaps the most striking commonality of these GP maps is a strong bias in assignment of genotypes to phenotypes: Most phenotypes are realised by a tiny proportion of all genotypes, while most genotypes map into a small fraction of all phenotypes. This property is shared by all the GP maps we noted before. Typically the number N p of genotypes per phenotype p and the related phenotype frequencies F p~Np =N G can vary over many orders of magnitude. Such huge variations are likely to have an effect on the course of evolution.
In this paper we study the evolutionary dynamics of a classical Wright-Fisher model, but with explicit microscopic GP maps that capture the three generic properties of such maps introduced above. Motivated by the strong bias in the distribution of the F p observed for many GP maps, we derive a mean-field like approximation for the average probability w pq that a mutation will change a genotype that generates phenotype q into one that generates phenotype p. This approximation greatly simplifies the dynamics, allowing us to calculate analytic expressions for quantities such as the median time T p for phenotype p to first appear in the population as a function of population size N, the point mutation rate m, genome length L and the mutation probabilities w pq .
These approximations are then tested against extensive simulations of two models: firstly, a simple GP map where the genotypes are randomly assigned to phenotypes according to a pre-determined distribution for the frequencies F p and secondly, the well-known mapping of RNA sequence to secondary structure [4,5,15], which is more complex, but also more biologically realistic. We focus on the case where a population of N individuals has initially equilibrated at a fitness maximum given by phenotype q, and then measure the median time T p for alternative phenotype p to first arise in the population.
Our analytic expressions agree quantitatively with the simulations in the polymorphic limit where NLm&1, and also in the opposite monomorphic limit NLm%1. In between these regimes a single scaling factor must be included. In all regimes the median discovery time T p !1=w pq . For the random model w pq &F p ; this scaling also holds for the more complex RNA mapping, although there is significantly more scatter due to local correlations within the neutral spaces and for some phenotypes we find w pq~0 even though F p is large (this can be due to biophysical constraints explained for example in ref. [16]). Despite such higher order effects, the variation of the F p over many orders translates directly into the T p . More frequent (higher F p ) phenotypes are therefore discovered more rapidly and more often along evolutionary trajectories. In this way the structure of the GP map can play a key role in determining evolutionary outcomes.
Finally, we employ the RNA GP map to study the case where two phenotypes p1 and p2 are both more fit than the source phenotype q, but where F p1 &F p2 (or more accurately w p1q &w p2q ). Direct simulations show that phenotype p1, which is more frequent, is much more likely to fix in the population, even if its fitness is much lower than that of p2, an effect we call 'the arrival of the frequent'.

Theoretical framework
We study the evolution of a population of N asexual haploid individuals. Each individual i carries a genotype g i of L letters taken from an alphabet of size K. The individual's phenotype p i is determined from g i via the GP map. The population evolves in in discrete, non-overlapping generations according to the classical Wright-Fisher model for haploid individuals: At each generation T, N parents are drawn with replacement with probability proportional to their fitness 1zs i with the constraint that the population size (or carrying capacity) N is fixed. Each parent gives rise to one offspring, and the offspring make up the population for the next generation. During reproduction, each base in the genotype of length L mutates to a random alternative base with probability m. The number of mutations (that is, the Hamming distance) d between parent and offspring is thus distributed binomially according to h(d)~L d m d (1{m) L{d . In this way the set fg i g of N genotypes changes at each generation. The expected number of individuals with phenotype p that arises at generation t can be written as: where W p (g i ,s i ,d) is the probability that a d fold mutation of genotype g i (selected for reproduction according to fitness 1zs i ) generates an individual with phenotype p. It takes into account the mutational connections between the N G~K L genotypes that make up the GP map. The probability of not finding p is approximately given by the Poisson distribution as exp({m p (t)).
While exact, these dynamic expressions depend implicitly on time through stochastic changes in the set fg i g, and are typically very hard to solve. In order to gain intuitive insight, we employ a number of simplifications and approximations, motivated in part by the general properties of GP maps discussed in the introduction. First, we assume that Lm%1 so that for dw1, h(d)%h(1)&Lm, which means that we can ignore higher order mutations (terms with dw1 in Eq. (1)). For a given source phenotype q (where the fitnesses of all genotypes mapping into q are equal, and so we take 1zs q~1 for simplicity) we can then calculate the mean probability w pq that a single point mutation will generate another phenotype p: where the sum is over the set N q of all N q genotypes that generate phenotype q (see also Figure 1). It is convenient to introduce the robustness of phenotype q as the average probability over all N q of neutral mutations: r~w qq . If we consider the case where at generation t{1 the whole population is on N q , then Eq. (1) simplifies in this mean-field (or pre-averaged) approximation to: The polymorphic limit. If NLm&1then the population naturally Consider the case where1zs p~dqp so that the population remains on N q , which is one way to model neutral exploration. In the mean-field approximation the expected number of individuals with phenotype p produced per generation is now independent of time, and given by Eq.(3), as long as double mutations can be ignored. The time T p (a) when on average the probability of having discovered p is a (so that the median discovery time of p is T p (1=2)) is then given by: Eqns. (3)(4) should provide a good approximation of the full dynamics in the limit that N is large enough that variations between individual genotypes g i [N q are averaged out, in other words, for the case where the 1-mutant neighbourhood of the population is similar to that of the whole neutral space.
Neutral spaces can be astronomically lathan even the largest viral or bacterial populations. In that case, the local neighborhood of the population may not be fully representative of the neighborhood of the entire space. This scenario can most easily be understood in the monomorphic limit where mutants are rare, NLm%1, and exploration is dominated by genetic drift. Every neutral mutant has a probability of 1=N to go to fixation, allowing the population to move to a new genotype. Thus the timescale of fixations is Kimura's famous result [18] t f~1 =(Lmr), where the robustness r is the probability that a mutation is neutral, so that Lmr is the rate of neutral mutations.
Between fixations, the population undergoes periods of genotypic stasis in which only the 1-mutant neighborhood of the current genotype g is explored by (rare) mutations. As there are (K{1)L adjacent genotypes, the timescale of this exploration is t e~( K{1)L=(NLm)~(K{1)=(Nm).
It is instructive to compare the ratio j of these two time-scales, defined via We can use this dimensionless ratio to distinguish between different dynamic regimes. If j&1, fixation takes much longer than exploration. If we define n g p as the number of local neighbours of the genotype g mapping to phenotype p for the current population, then in this limit, phenotypes with n g p w0 are produced continuously (on a time-scale given by t e ) until the population moves to a different genotype. The dynamics under strong genetic drift therefore induce short-term correlations in the mutant phenotypes. Since j&N=L, we call this regime the large population limit.
In the opposite extreme j%1, which we call the large genome limit, the population typically moves to a different genotype before all accessible mutants have been explored. In this regime, we do not expect short-term correlations in the mutant phenotypes, simply because every mutant occurs only very rarely.
Actual discovery and neutral fixation times can show strong fluctuations. As our evolutionary process is a Markov process -the next set of mutants depends only on the parents, not on earlier mutants -the first discovery time of a neighbour genotype as well as the arrival time of the neutral mutant ''destined'' to be fixed, are distributed geometrically (or exponentially in a model with continuous time). Thus the mean of t e or t f is equal to the respective standard deviation, and any particular evolutionary trajectory can be very different from the average behaviour.
Let t be the actual time the population stays at the current genotype. In the continuous time approximation, t is distributed exponentially with mean t f . If the genotype g has n g p mutations leading to p then the probability that p is found during this time is 1{exp({n g p t=t e ). Integrating over the distribution of t, we have the probability P(n g p ) that phenotype p is discovered before the next neutral fixation: If fixations are the rate-limiting step (ie. j&1), P?1 if n g p =0, as each neighborhood is searched exhaustively before the population moves on. On the other hand, if fixation is faster than exploration (j%1), the introduction of alternative phenotypes is determined by random fluctuations, as most available mutants are not produced. To leading order, we find P(n g p )&n g p j~Nn g p =((K{1)Lr). We . The uniform shading of the blue neutral space implies that in the meanfield approximation, the population is assumed to continually explore the neighbourhood of its entire neutral space. Mutational outcomes are thus determined from the local frequencies of phenotypes around the neutral space, as measured by the w pq coefficients. This mean field approximation allows us to derive analytic forms that can be compared to simulations of the full GP map. rge [17],much bigger The monomorphic limit.
note that the inverse dependence on r arises from t f : More robust neutral spaces are explored faster, but therefore less thoroughly. The dynamics in the monomorphic regime are thus relatively straightforward. But whether some new phenotype p is discovered still depends on the structure of the neutral space which in turn determines how the available phenotypes change upon a neutral fixation. To describe this structure, we turn again to a mean-field approximation: The mutational neighborhood of each particular genotype g[N q resembles the average over N q . As the mean number of mutations per genotype leading to p is given by n n pq~( K{1)Lw pq , the probability that p is accessible after a neutral fixation is 1{exp({ n n pq )& n n pq (the approximation is valid provided n pq %1, that is p is not accessible from every genotype in the source neutral space; of course, this is just the condition we are interested in, as otherwise neutral exploration would not typically be necessary for phenotype p to arise).
Over a large number of generations (t&t f ), a monomorphic population explores its neutral space uniformly [19]. Assuming that n g p w1 can be ignored in practice, we have T p (a)~{t f log(1{a)=(n pq P(1)). The first discovery time in the large population limit becomes: whereas in the large genome limit we obtain which has the same form as the polymorphic limit, Eq. (4): When the population is too small (compared to the genome length), the exploration of each genotype's mutational neighborhood is typically incomplete. Then, just as in the polymorphic limit, only random fluctuations determine which accessible genotypes are actually realized by the population. Finally, let us compare our results for large populations in the monomorphic and polymorphic limits. Most importantly, in both cases T p is inversely proportional to w pq : Rare phenotypes are hard to find. Comparing Equations (4) and (7), the only difference is that N in the polymorphic regime is replaced by L(K{1)r in the monomorphic limit. This difference is intuitive: When the population is diverse, every new individual helps exploration and reduces discovery times. But if all individuals have the same genotype, simply having ''more of the same'' does not make neutral exploration faster. However, repeated mutants may influence the fixation of adaptive phenotypes.
These results suggest that for intermediate NLm there should be a smooth transition between these two regimes. To quantify the crossover we introduce a factor c that multiplies N in Eq.(4); we expect that c?1 as either NLm becomes very large (the polymorphic limit) or N%L (the large genome limit), and that c?(K{1)Lr=N as NLm%1 and N&L (the large population monomorphic limit).

Simulations in model GP maps
In order to test our mean-field theory we study two kinds of GP maps that both include the generic properties of GP maps that we introduced earlier.
Random GP map. In the random GP map, the total number of phenotypes N P and the frequencies fF p g can be set arbitrarily (subject to the normalization constraint P NP p~1 F p~1 ). The K L |F p genotypes mapping into phenotype p are distributed randomly in genotype space. The statistical properties of the map are thus determined by the parameters L, K, and the set fF p g.
Studying this map has two motivations: First, ignoring some biophysical detail may help illuminate generic features shared by the systems described in the introduction. Second, a simple model may clarify which deviations from our theory arise from population dynamic effects rather than from detailed (and system-specific) structure in the GP map.
In this simple model, correlations between genotypes are absent, facilitating analysis of the resulting neutral spaces. For example, w pq~Fp is a good approximation as long as N P %N G and N q ,N p &1. Also, there is a percolation threshold l(K)~1{K {1=(K{1) : thus only phenotypes with F q wl(K) have completely connected neutral spaces [20].
Here we study a particular random GP map with L~12, and K~4 (as in DNA and RNA) so that there are N G~4 12 &1:68|10 7 genotypes. These map onto N P~5 8 phenotypes distributed with frequencies F p !1:2 {p . The F p vary over about 5 orders of magnitude, a range similar to the F p of L~12 RNA (see also Figure S1). To make sure that the largest neutral space percolates, its frequency is set separately as F 1~0 :5wl(4)~0:37. For several values of m, we simulated N~1000 individuals for up to 7|10 10 generations. The fitness was set as 1zs p~dp,1 so that we are effectively modelling neutral exploration on the space 1 , which is convenient for measuring all T p . We measured first discovery times for the 57 alternative phenotypes over 100 independent simulations to obtain the median time T p . Figure 2A depicts these median discovery times T p for simulations ranging from the polymorphic regime NLm&1 to the monomorphic limit NLm%1 (see also Figure S2). We note the following: 1) For all regimes the T p vary over many orders of magnitude, but they are found in fewer generations for larger m. 2) Locally frequent phenotypes (i.e. those with high w pq ) are much easier to discover. The inset of Figure 2A shows that w pq &F p , so this conclusion carries over to frequent phenotypes with large F p . 3) A subset of the phenotypes with w pq ww L :1=(K{1) L&0:028 are likely to be in the one-mutation neighbourhood of any genotype. In the monomorphic regime these are are then found by exploration of a genome so that T p is given by Eq. (8), which has the same form as the polymorphic limit, Eq. (4), as can be seen in Figure 2A. Discovery times cross over to the regime where neutral exploration is required when w pq %w L . Such behaviour can be viewed as a finite size effect: N P typically increases with L. Therefore the largest F p will likely decrease for larger systems, so that a smaller fraction of phenotypes can be found without neutral exploration. 4) In the fully polymorphic regime where each individual essentially explores independently, any phenotype with w pq w1=(NLm) is likely to be part of immediately accessible standing variation [21] in the initial population, and is therefore found quickly. Indeed, in Figure 2A for m~10 {2 , where NLm~120, these phenotypes are typically found in one or two generations on average. However, for rarer phenotypes, where neutral exploration is important, the T p are well approximated by Eq. (4). Again, the fraction of phenotypes that are immediately accessible should decrease for larger L.  (7) suffices. Instead, we use the previously introduced factor c that multiplies N in Eq. (4) to achieve quantitative accuracy. In Appendix S1 and in Figures S3 and  S4, we explore the scaling of c with the parameters N,L,m, for different dynamic regimes, and for a range of j.
In summary then, our theory derived in the previous section accurately describes the median discovery time T p of this simple random GP map as a function of the parameters N,m,w pq . We find that w pq &F p , and thus T p *1=F p in all regimes studied. The more frequent the phenotype, the earlier (and more often, see Figure S2) it appears as potentially selectable variation in an evolving population. Given the success of our theory for the random model, we now will test our theory and conclusions for a more complex GP map.
RNA secondary structure mapping. One of the best studied GP mappings has RNA genotypes of length L made up of nucleotides G, C, U and A. The phenotypes are the minimum free-energy secondary structures for the sequences, which can be efficiently calculated [15]. The number of genotypes grows as 4 L , while the number of phenotypes is thought to grow roughly as N P *1:8 L [4] so that N P %N G . Moreover, sampling and exact enumerations [5,16,22] have shown that the distribution of phenotype frequencies F p is highly biased, with a small fraction of phenotypes taking up the majority of genotypes. The neutral spaces N q are typically broken up into a number of large components that are connected by single point mutations that allow neutral exploration [16,22]. By exhaustive enumeration of the L~20 RNA mapping (see also Figure S5) we calculate the w pq between several neutral components of the 11,219 distinct secondary structures that the N G~4 20 &1:1|10 12 genotypes map to. Figure 2b shows the w pq for the largest component of the phenotype q drawn in the figure. This phenotype is ranked as the 3rd most frequent for L~20 and exhibits behaviour typical of this system. First, the w pq vary over many orders of magnitude. Second, as shown in the inset if w pq =0, then the local w pq are, to first order, proportional to the global F p . Finally, this neutral space connects to just over 75% of the total N P~1 1,219 phenotypes in this particular map: Some w pq are zero even though F p can be quite large. Generally, the number of phenotypes that can be reached from N q increases with F q [13,16].
We performed extensive simulations of the L~20 RNA system. Typical results are shown in Figure 2B. First, we note that the median discovery times vary over many orders of magnitude. The most frequent are found in a median time of T p &10 3 generations while after the maximum measured time of 2|10 9 generations, over 42% of the directly accessible phenotypes (with w pq =0) have still not been found. We estimate that over 10 13 generations would be needed to discover all accessible phenotypes, giving a ten order of magnitude range in the T p . Second, the local frequency w pq is a good predictor for ranking T p (see Figure S6 for a comparison of T p and global frequency F p ). Further, the criterion w pq~0 accurately predicts that phenotypes are not discovered (see also Figure S6). However, in contrast to the random GP map, the T p are discovered at a slower rate than predicted by Eq. (7). Instead, we use a single cv3Lr=N to renormalise N in Eq. (4). This slower Figure 2. Test of the meanfield model. A) Median discovery times T p for the random GP map averaged over 100 simulations with N~1000 and varying mutation rates. Note that the y-axis is scaled with m. In the the polymorphic limit (m~10 {2 ), Eq. (4) (dashed line) describes discovery times well for w pq v1=(NLm). Phenotypes with larger w pq are part of the standing variation typically found in the first generation (yellow dash-dotted line). In the monomorphic limit (m~10 {6 ), Eq. (7) (dotted line) quantitatively describes T p for w pq %w L , whereas Eq. (4) tracks the simulation data with just one fit parameter c~0:099 multiplying N for the intermediate regime with m~10 {4 (solid line). For w pq * > w L the curves follow Eq. (4), for reasons described in the text. Inset: For the random GP map the local phenotype frequency w pq correlates very well with the global frequency F p . B) Local frequency w pq ranked for the 8639 phenotypes that link with single point mutations from the DN q D~460,557,583 genotypes that map to this RNA structure; an example sequence from N q is shown in the figure. Inset: The local connections w pq are roughly proportional to the global frequency F p , but there is significant scatter due to the internal correlations of the RNA neutral spaces. Organge points depict the 2580 phenotypes for which w pq~0 . Light blue points depict the 4933 phenotypes that are discovered in our simulations, and the dark blue points depict the 3705 accessible phenotypes that are not found (q itself is shown in green). C) Simulations of T p (blue dots) versus w pq for the RNA phenotype shown in B), compared to Eq. (4) (solid line) with a factor c~0:070 multiplying N. Here N~100, m~10 {5 and the simulations were run for 2|10 9 generations. Also shown are the purely polymorphic (dashed) and monomorphic (dotted) predictions. Dark blue dots above 2|10 9 (dot-dashed line) depict some of the 3705 accessible phenotypes that are not found (as can be seen in see the inset of B). We estimate that about 10 13 generations would be needed to find the phenotypes with the smallest w pq =0. doi:10.1371/journal.pone.0086635.g002 discovery rate reflects the internal structure of the RNA: similar genotypes typically have similar mutational neighbourhoods [23], and so the population needs to neutrally explore longer in order to find novelty. Nevertheless, a single c factor yields a remarkably good fit for all the different phenotypes p (something we find for all source phenotypes q we have so far studied). Finally, we note that the three most frequent phenotypes are found relatively faster because they satisfy w pq * > w L . As expected, for this larger system the fraction of phenotypes for which this holds is lower than for the random GP map with smaller L.
Overall, the evolutionary dynamics of this rather complex RNA system resembles that of the much simpler random GP map. Most importantly, the discovery times vary over many orders of magnitude. More precisely, as long as w pq =0, T p !1=w pq for both the monomorphic and polymorphic regimes: Phenotypic bias leads to a simple, systematic ordering in the discovery of novel phenotypes.

The arrival of the frequent
The many orders of magnitude difference in the arrival rate of variation between phenotypes should have many important implications for evolutionary dynamics. Consider for example the situation where the population has equilibrated to a phenotype q, which was the fitness peak, when subsequently the environment changes so that a different phenotype p has a higher fitness 1zs. In order to fix, the alternative phenotype must first be found. If the time-scale T E on which the environment changes again is much longer than T p then it likely that the population will discover and fix p. However, if T E %T p , then a new phenotype p' may become more fit before p has time to fix. T p can vary over many orders of magnitude, so many potentially highly adaptive phenotypes may satisfy T p wT E and thus never be found.
Consider also the situation where two phenotypes p1 and p2 are both more fit than q after an environmental change. If s 2 ws 1 * > 1=2N, then in a standard population genetics picture, we would expect p2 to fix rather than p1 as long as T p2 T E . However, this argument ignores the rate at which variation arises. If, for example, w p1q &w p2q , then p1 may fix well before p2 is discovered and fixes.
To illustrate this effect, we study the L~12 RNA system depicted in Figure 3, where the source neutral space has N q~1 932 genotypes, while the two target phenotypes have w p1q~0 :067 and w p2q~0 :0015, so w p2q =w p1q &0:022, a relatively modest ratio compared to the what could be found from e.g. Fig. 2. For this particular system w p1p2~0 : there are no direct single mutation connections between the two target phenotypes -p1 and p2 are distinct peaks of the fitness landscape.
We simulated a population of N~1000 individuals with fixed s 1~0 :002w1=2N, but with varying ratios s 2 =s 1 §1. The population begins on phenotype q and evolves until p1 or p2 fixes.
Results are shown in Figure 4. As the mutation rate increases and the system moves from the monomorphic to polymorphic regime, the probability that p2 is discovered at least once increases (and is largely independent of fitness). Nevertheless, phenotype p1 is discovered much earlier and also much more often because w p1q &w p2q . Furthermore, in the monomorphic regime where j&1 the population remains on a single genotype g much longer than it takes to explore all the neighbours. Thus if p1 is accessible from g, then p1 is likely arise repeatedly in relatively quick succession (in ''bursts''). This effect, which arises naturally in our microscopic model [24], can significantly enhance the probability of fixation over that predicted by origin-fixation models [25] which ignore the discreteness of the source neutral space.
Overall, our simulations show how the more frequent phenotype p1 can fix at the expense of the more fit phenotype p2. Given the many orders of magnitude difference possible between the T p , such an ''arrival of the frequent'' effect may prevent the arrival of the fittest: If a highly beneficial phenotype is never discovered, a much less adaptive but easily accessible phenotype may go to fixation instead.
Finally, phenotype p2 is significantly less mutationally robust than p1 (more frequent phenotypes are typically more robust [13,16]), and so once discovered, produces deleterious mutants at a higher rate, making it harder for p2 to fix at higher mutations rates, a phenomenon known as ''survival of the flattest'' [26], observed here for the lower ratios s 2 =s 1 at higher m. Thus both the ''arrival of the frequent'' and the ''survival of the flattest'' mitigate against the fixation of phenotypes with lower frequency F p , even if their fitness is much higher.
We note that differences in neutral network size have traditionally also been taken into account in terms of free fitness [27], which -in analogy with free energy in statistical physics [28] -incorporates an entropy-like component to account for mutational effects such as genetic drift and mutational robustness. This picture provides a theoretical foundation for the ''survival of the flattest'' [26] effect we observe at high mutation rates in Figure 4. However, the ''arrival of the frequent'' effect is fundamentally different because it does not rely on mutationselection balance and quasi-equilibrium or steady-state assumptions like free-fitness theory does. Rather, it reflects the strongly non-equilibrium effect that p 2 is rarely or never found. In the example above, the difference in discovery times between p 1 and p 2 is rather modest, and so at large enough mutation rates p 2 is found fairly regularly and free-fitness could be used to analyse results in that regime. But as can be seen for instance in Figure 2 for L~20 RNA, differences in discovery times can vary over many more orders of magnitude than is the case for our particular example, so that in practice highly adaptive yet rare phenotypes may not be discovered at all, even on very long timescales.

Discussion
Mutations provide the fuel for natural selection. Based on this principle, we have presented a detailed model of evolutionary dynamics that focuses on a microscopic description of the outcome of mutations. The phenotypic effect of mutations is mediated by the genotype-phenotype (GP) map which is therefore a crucial ingredient. As outlined in the introduction, several generic features are shared by many different example maps, independent of model details. Here we mainly focussed on the fact that these mapping are highly biased: Some phenotypes are realised by orders of magnitude more genotypes than most other phenotypes.
Our calculations for a simplified random mapping and for the more complex RNA secondary structure model predict that the large bias observed in the GP maps translates into a similar order of magnitude variation in the median discovery times T p for a range of population genetic parameters. For both maps the local frequencies w pq (which predict discovery times) are a good predictor for the discovery times T p . For the random GP map w pq &F p . For RNA this relationship provides a rough first order estimate, but the local frequencies can also deviate strongly, especially when w pq~0 , which can occur even when the global frequency F p is large. For both maps the strong bias in the GP map leads to a systematic ordering of the median discovery times of alternative phenotypes, an effect that we postulate may hold for other GP maps as well.
In light of the simplicity of our mean-field approximation, its success in predicting the first-discovery time T p (cf. Figure 2) is rather striking. In the random GP map, the excellent agreement probably arises because all genotypes in the source neutral space are similar in the sense that they have the same probability distribution to have a certain mutational neighbourhood. There are static fluctuations because the number of neighbours is less than the number of states with w pq =0. But while these fluctuations have an effect on processes like fixation, they average out over the many runs used to find the mean or median T p . By contrast, in the RNA GP map mutational neighbourhoods of adjacent genotypes are often correlated [13,23] so that a single neutral mutation does not completely re-shuffle the accessible phenotypes (as the meanfield assumption would assume). This effect explains why the value of the exploration parameter c we obtain by fitting is below the value suggested by our mean-field model, and also why we still observe around 1 order of magnitude variation in T p for very similar values of w pq (see Figure 2). Despite such correlations (which we postulate may occur in other realistic GP maps), rare phenotypes (low w pq ) remain hard to find; the strong phenotypic bias in the RNA GP map provides a good a posteriori justification for our mean-field calculations: The many orders of magnitude range in w pq dominates the scale of the phenotype discovery times.
The large differences we observe in the rate with which potential variation appears should have many consequences for evolutionary dynamics. There is of course a long history of invoking processes that impose directionality on the pathways available for evolutionary exploration (see ref. [29] for a recent discussion). Here, by solving microscopic population genetic models, we show in detail just how strong these orienting processes can be. Other authors have also pointed out how evolution may favour phenotypes with large neutral networks for RNA, see e.g. refs. [5,22]. Similar points have been made for protein models [12]. Consider, for example, our L~20 RNA system. Despite its rather modest size, we find 10 orders of magnitude difference between the discovery times of frequent and rare phenotypes. These differences should be even more pronounced for larger L. In nature, selectable RNA phenotypes are of course characterised by more than just their secondary structure, and evolutionary processes don't always work at constant L. Nevertheless, it is hard to see how such enormous variations in T p would not persist in some form in much more sophisticated treatments of biological RNA. Similar arguments can be made for the other GP maps we listed above. More generally we emphasise that including the GP map in population genetic calculations may be of importance to a wide range of evolutionary questions.
We explicitly showed how phenotypes with a high local frequency can fix at the expense of locally rare phenotypes, even if the latter have much higher fitness. Taken together, these arguments suggest that the vast majority of possible phenotypes may never be found, and thus never fix, even though they may globally be the most fit: Evolutionary search is deeply non-ergodic. When Hugo de Vries was advocating for the importance of mutations in evolution, he famously said ''Natural selection may explain the survival of the fittest, but it cannot explain the arrival of the fittest'' [2]. Here we argue that the fittest may never arrive. Instead evolutionary dynamics can be dominated by the ''arrival of the frequent''. . Lines depict single mutations to itself, or to two alternative phenotypes p1 (grey) and p2 (red). The genotypes were ordered using the Fruchterman-Reingold algorithm [30]. B) Illustration of the fitness landscape. doi:10.1371/journal.pone.0086635.g003 Figure 4. The arrival of the frequent. Probability that phenotype p2 is discovered (dotted lines) or is fixed (dashed lines) as a function of mutation rate m for different relative selection coefficients s 2 =s 1 for Ns 1~2 . The probability that p2 is discovered is independent of relative fitness (within statistical simulation errors). Phenotype p1 is much more likely to fix than phenotype p2, even when the latter is much more fit, due to an ''arrival of the frequent'' phenomenon. doi:10.1371/journal.pone.0086635.g004 In the dynamic simulations, all N individuals of the population are initially assigned to a single random genotype in the source neutral space. Then the population evolves for 10N generations to reach a steady-state dispersal on the neutral space before measurements are started.

RNA
Secondary structures for RNA were predicted from sequence using the Vienna package [15], version 1.8.5 with all parameters set to their default values.

Supporting Information
Appendix S1 (PDF) Figure S1 Static properties of the random GP map. A) Global phenotypes frequencies. In addition to the distribution of frequencies F p used in our simulations (orange), the diagram also shows the frequencies of RNA secondary structures at L~12, obtained by exhaustive enumeration using the Vienna package, Version 1.8.5 with all parameters set to their default values [15]. B) Comparison of global frequencies F p and local frequencies w pq for the source neutral space q with rank 1. The robustness of phenotype q (r:w qq ) is marked in green; alternative phenotypes (p=q) are shown in light blue. The dashed line marks the equality of global and local frequency F p~wpq . The relative size of deviations becomes more severe as F p becomes small: The less genotypes map into p, the less will frozen fluctuations in the GP map average out. (TIF) Figure S2 Total number of mutants per phenotype in different dynamic settings. The diagram shows the total number of mutants M p~P T t~1 m p (t) carrying phenotype p that were produced during a total of T~10 4 =(Nm) generations of simulation under the random GP map. Dots show the average over 100 simulations, error bars show the standard deviation. The dashed lines correspond to the mean-field theory M p~N Lmw pq T that follows directly from Eq. (3). In panels B and D, the populations are in the highly polymorphic regime (NLm&1) and hence evolve towards greater robustness [19] so that the total number of non-neutral mutants is reduced. (TIF) Figure S3 Scaling of Nc with population dynamic parameters. The diagram shows the dependence of c on: A) mutation rate m, B) population size N and C) number of mutants per generation NLm. Note that the y-axis has been scaled by population size N. (TIF) Figure S4 Scaling of c with population dynamic parameters. The diagram shows the dependence of c on: A) mutation rate m, B) population size N and C) number of mutants per generation NLm. In contrast to Figure S3, the y-axis shows c without any scaling factors. (TIF) Figure S5 Phenotypic bias for RNA secondary structures of length L~20. A) Global phenotype frequencies F p for all N P~1 1,219 secondary structures. It required about 1 CPUyear on typical present-day hardware to fold all 4 20 &10 12 sequences once using the fold-routine of the Vienna package [15], version 1.8.5 with all default parameters. B-D) Local phenotype frequencies w pq around 3 neutral spaces. An example sequence and its secondary structure is given in each panel; starting from this sequence, the w pq can be obtained exactly by tracing out all possible neutral mutations and counting how often each phenotype is produced. Insets: Comparison of global and local frequencies. Accessible phenotypes (w pq w0) are drawn in blue, inaccessible phenotypes (w pq~0 ) are shown in orange and the phenotype corresponding to the neutral space itself is shown in green (w qq :r). The dashed line marks the equality of local and global frequencies F p~wpq and the dotted line indicates the minimal (non-zero) local frequency w min,q~1 =(3LN q ), corresponding to only a single mutation away from one of the N q genotypes in the neutral space. Inaccessible phenotypes with very small global frequencies are omitted for clarity. Note that all these phenotypes are relatively rare ones when compared to Fig. 2b. (TIF) Figure S6 Predictions based on global frequency. The diagram shows the same median discovery times of alternative RNA secondary structures that are displayed in Figure 2c, but here as a function of the phenotypes' global frequencies F p rather than their local frequencies w pq . The different colors indicate: Accessible phenotypes that are typically discovered within the simulation time (T p (1=2)ƒ2|10 9 ), w pq w0, light blue); accessible phenotypes that are typically not discovered (T p (1=2)w2|10 9 , w pq w0, dark blue); inaccessible phenotypes that are typically discovered (T p (1=2)ƒ2|10 9 , w pq~0 , orange); inaccessible phenotypes that are typically not discovered (T p (1=2)w2|10 9 , w pq~0 , red). The lines correspond to the prediction for T p based on global rather than local frequencies: T p (1=2)~log 2=(NLmF p ) (cf. Eq. (4)), dashed) and T p (1=2)~log 2=(3L 2 mrF p ) (cf. Eq. (7)). In contrast to the predictions based on the local frequencies w pq in Figure 2c, we note the following: 1) Several phenotypes arise even earlier than predicted by the analogue of the polymorphic limit (points below dashed line). 2) Many phenotypes are not discovered even though other phenotypes of comparable (and even much lower) frequency do arise during the simulation. 3) 4 of the most frequent, but locally inaccessible phenotypes are discovered on a time-scale when double mutations become relevant (orange dots; since N~100 and m~10 {5 , double mutants occur on the timescale t 2 &1=(N(Lm) 2~2 :5|10 5 , so if double mutations were to lead to globally random phenotypes, we expect phenotypes with w pq~0 to be discovered around T p &t 2 log 2=F p .) (TIF)

Author Contributions
Conceived and designed the experiments: AAL SS. Performed the experiments: AAL SS. Analyzed the data: AAL SS. Contributed reagents/materials/analysis tools: AAL SS. Wrote the paper: AAL SS.