Biophysics and population size constrains speciation in an evolutionary model of developmental system drift

Developmental system drift is a likely mechanism for the origin of hybrid incompatibilities between closely related species. We examine here the detailed mechanistic basis of hybrid incompatibilities between two allopatric lineages, for a genotype-phenotype map of developmental system drift under stabilising selection, where an organismal phenotype is conserved, but the underlying molecular phenotypes and genotype can drift. This leads to number of emergent phenomenon not obtainable by modelling genotype or phenotype alone. Our results show that: 1) speciation is more rapid at smaller population sizes with a characteristic, Orr-like, power law, but at large population sizes slow, characterised by a sub-diffusive growth law; 2) the molecular phenotypes under weakest selection contribute to the earliest incompatibilities; and 3) pair-wise incompatibilities dominate over higher order, contrary to previous predictions that the latter should dominate. The population size effect we find is consistent with previous results on allopatric divergence of transcription factor-DNA binding, where smaller populations have common ancestors with a larger drift load because genetic drift favours phenotypes which have a larger number of genotypes (higher sequence entropy) over more fit phenotypes which have far fewer genotypes; this means less substitutions are required in either lineage before incompatibilities arise. Overall, our results indicate that biophysics and population size provide a much stronger constraint to speciation than suggested by previous models, and point to a general mechanistic principle of how incompatibilities arise the under stabilising selection for an organismal phenotype.

Typical spatial patterning profile of the output transcription factor [T F ](x) over the course of the evolutionary simulations, where early times are indicated by red, becoming progressively yellow at late times (as indicated by the colour bars); a) large population size 2N κ F = 50 and b) smaller population size 2N κ F = 1. the results show evolution gives solutions which are less well adapted. on these phenotypes is relatively weak. Finally, an emergent property of the solutions is that there is a division of genotype space into a high sequence entropy region, where there are many sequences that code for slightly sub-optimal solutions corresponding to a small value of α, and a low sequence entropy, higher fitness region, which corresponds to a large value of α -this is what gives rise to the bistability visible in α for 2N κ F = 50 (see [S1] for more details). On long timescales, we see transitions between these regions, whilst on short timescales we mainly see exploration of each of these basins of attraction.

Different hybrid genotypes have different growth rates of DMIs
To examine in more detail the overall trends in the number of incompatibilities for each hybrid type, we plot the average number of DMIs as a function of divergence time µt in Fig.S3, where we have averaged over pairs of complementary genotypes (e.g. RmcA and rMCa), which have identical statistical properties. In Fig.S3, we first note that each hybrid-genotype behaves in a different, population size dependent manner. In general, there is an initial growth in the average number of DMIs as the divergence time increases, followed by a plateau or a slowing down of the growth. As denoted in the figure, we distinguish hybrid-genotypes based on the types of potential mismatch: R-type is characterised by a mismatch of the R locus with the M and C loci, M -type is a mismatch of M with the R and C loci (e.g., RmCa), C-type is a mismatch of C with R and M (e.g. rmCa) and α-type is a mismatch of the α locus with the rest of the loci. For each of the R-, M -and C-types, we assume the interaction with the α locus is relatively weak, so or example, the hybrid genotype Rmca or RmcA are both R-type. Equilibrium properties of phenotypes and fitness of genotype-phenotype map at the population sizes of 2N κ F = 50 (purple) and 2N κ F = 1 (yellow); a) Malthusian fitness (defined by Eqn.1&2 in the main text) over time, where we see whereas at large population sizes fitness is near the optimum, at low population sizes the fitness of solutions are sub-optimal and very commonly near the inviability boundary for functional vs non-functional spatial patterning; b) the morphogen steepness trait α where we see there is generally a broad variation at small population sizes, against more constrained variation at the larger population size; histogram of c) protein-DNA binding energy traits and d) of protein-protein binding energy traits, where we see that different binding energies are constrained to greater or lesser extent (indicating their selective or functional importance), which are further relaxed at smaller population sizes, indicating the increasing dominance of drift. In particular, as binding energies are determined by a sum of mismatches, the neutral distribution is a binomial distribution centred on pd pd /2 or pp ( pp + 1)/2, so that deviations from this distribution are a measure of the selective constraint.
In addition to the general trends pointed out in the main text, we see that the growth of DMIs is different for different hybrids; for small population sizes (2N κ F = 0.1 & 2N κ F = 1), M -type, C-type and R-type dominate the growth of DMIs, in this order and with only a small difference between them, whilst α-type arise far more slowly; we might expect this since substitutions in α only tend to shift the pattern away from the mid-point of the embryo, which with the model of fitness defined in Eqn.7 (main text) only moderately affects fitness. As the population size increases, and at small times, we see that initially M -type DMIs arise more slowly relative to C-types and R-types, but at longer times there is a cross-over where M -type DMIs dominate R-type, which moves to longer times at increasing population sizes.
How can we understand this general behaviour? It is clear from Eqns. 7 & 8 (main text) that we need co-evolution of the relevant sequences to maintain these the important binding energies within certain constraints; e.g. the binding energy of R to the promoter P , E RP mustn't be too strong and so on each line the sequences will co-evolve to maintain this constraint. From previous work on transcription factor DNA binding [S2, S3] we expect that the rate at which incompatibilities arise due a particular transcription factor binding site interaction j, is controlled by the product of the population size and the relative selection coefficient κ j (which should be distinguished from the overall selection coefficient for the patterning phenotype κ F ); so the binding energy E M B is under very strong selection to give high binding affinity of the morphogen to the 1st binding site, whilst for example, the binding energy E RP is under weaker selection and so as κ RP > κ M B , we would expect incompatibilities due to this

Null spectrum of DMIs
To test whether the parsimony decomposition of DMIs observed in Fig.5 of the main text, where 2-way DMIs are dominant, is not due to an inherent bias to detect 2-way DMIs of the method, we applied it to the case of randomly assigning incompatibilities to the different hybrid genotypes. This is easily accomplished by drawing random binary numbers from the Bernoulli distribution with a probability p that any hybrid is incompatible. For each random collection of incompatible hybrid genotypes the parsimony method is applied and the number of each type of n-way DMI is counted (as detailed in the Material and Methods of the main text); this is repeated 1000 times to calculate the average number of n-way DMIs shown below. To allow some comparison to the results of Fig. 5 of the main text, the value of p is varied according to the average fraction of incompatible hybrids found at different time-points from the simulations in the main text. Hence, the panels of Fig.S4 correspond to the number of n-way DMIs expected at different time-points for different scaled population sizes, where p is mapped into time using the sum of all curves in Fig.4 of the main text to calculate the fraction of incompatible hybrids at each time point. This null spectrum shows the opposite behaviour to that found in the simulations in the main text demonstrating that the results observed are not due to a random bias to find 2-way DMIs using the parsimony method.

Spectrum of 2-way DMIs
In Fig.S5 we have plotted the number of 2-way DMIs of each type, where for example, I mc is a 2-way DMI caused by an incompatibility between the M locus and C locus.  First, we note that at small population sizes the rate of increase of the different 2-way incompatibilities seem to cluster into two types; those that involve α and those that do not, where the latter arise more rapidly; this suggests that sequence entropy constraints are dominating for small populations, particularly at short times, though for the latter group there are some differences as discussed below. As the population size increases, we see that differences arise in the rate of growth between these different types of 2-way incompatibilities. Below we discuss these properties.
For each type of 2-way incompatibility there are 2 binding energy traits that could contribute. So increases in I mc could be due to an incompatibility in the hybrid that can be traced to E M B or E M P ; in this case, as the binding energy E M P is almost neutral [S1], we would expect incompatibilities to arise predominantly from E M B . Similarly, we expect I rc to be dominated by E RP and not E RB and I rm dominated bỹ E RM and notẼ M M orẼ RR , as E RP is under strongest selection of the traits coded by the R & M loci. When it comes to incompatibilities involving the α-locus, there is no clear phenotypic trait that can be identified and as we will see the analysis of these DMIs will not be so clear.
Examining Fig.S5, we see that as the scaled population size increases, the time for I mc incompatibilities to arise sharply increases, while the time for I rm increases less rapidly and I rc even less rapidly. This corroborates a key prediction of a simple model of transcription factor DNA binding described in [S3], where traits under greatest selection should generate incompatibilities less rapidly, since the distribution of their trait values are further away from inviability in the common ancestor. As observed above, E M B , which contributes most to I mc , is under the greatest selection pressure and so as the population size changes these should change most rapidly. So we see that for large population sizes, it is not the phenotypic traits under the strongest selection that give rise to significant DMIs at short times, but those under a weaker selective constraint; traits under weaker selection will be affected more by the sequence entropic pressure for poorer binding affinities and so the common ancestor is more likely to be closer to the inviability boundary 2 .
However, at small population sizes, 2N κ F ≤ 1, all DMIs grow approximately quadratically at short times with a saturating form at long times, as also seen in Fig.5 of the main text. For small population sizes, the simple model of TF-DNA binding [S3] predicts that for the same sequence length and assuming the same threshold of inviability that the rate that incompatibilities arise should be approximately independent of selection and dominated by sequence entropy; if the sequence length of the trait increases, or the effective region of incompatibility is smaller, then we would expect incompatibilities to arise more quickly. As we see in Fig.S5, for small scaled population sizes and short times, the incompatibilities not involving α have similar behaviour except I mc , which arrive more quickly; this is as expected as the effective mutation rate for I mc to arise, will be double that for I rm , since the binding of the morphogen to DNA involves 10 sites (nucleotides), vs 5 sites for protein-protein interaction. On the other hand I rc also has 10 nucleotides, but arises more slowly. We suggest this could be due to a different effective region of incompatibility, which could confound simple expectations, it is not clear how a single inviability threshold F * effectively maps to these pair-wise incompatibilities, complicating the picture further.
The 2-way incompatibilities involving the α locus are more difficult to interpret, since there is no clear trait in the patterning model associated with an interaction solely between the α locus and R, M , or C loci. As 2-way DMIs involving α are typically an order of magnitude smaller than the other DMIs they do not have a large impact on the number of DMIs.

Spectrum of 3-way & 4-way DMIs
In Fig.S6, we have plotted the 3-way DMIs as a function of divergence time µt, where the panels from left to right represent increasing scaled population size. We see that for small population sizes, 3-way DMIs between the R, M and C loci dominate at all times and in particular that the different types of DMIs of this type are all roughly equal, I Rmc ≈ I rMc ≈ I rmC . In addition, we see that all other DMIs arise more slowly and each of the 9 other types of 3-way DMIs are all again approximately equal. However, at larger population sizes this degeneracy is lifted amongst the different types of DMIs and different 3-way DMIs grow at different rates. How can we understand this general behaviour?
The patterning solution found in these evolutionary simulations involves the morphogen binding strongly to the first binding site recruiting the polymerase to bind to the promotor to turn on transcription, through a high affinity interaction between the morphogen and the polymerase; the spatial position along the length of the embryo where the transcription switches from on to off is controlled by an interaction with the steepness of the morphogen gradient α. Given this, incompatibilities between R, M and C loci could arise through a 3-way interaction where the R locus interacts with the parts of the C and M loci coding for E RP andẼ RM , or where the M locus interacts with the parts of the C and R loci coding for E M B andẼ RM . So in analogy to 2-way DMIs, where a pair of loci give rise a single phenotypic binding energy trait, whose value contributes to fitness, here the triplet of loci, R, M and C, give rise to two binding energy traits, which together contribute to fitness. These two traits will co-evolve to maintain good fitness, balanced by the constraints of sequence entropy on the underlying 3 loci; at large population sizes, the effects of sequence entropy will diminish. On the other hand 3-way incompatibilities between, for example, M , C and α could arise due to an interaction of the E M B binding energy trait with α; in this model this is subject to a sequence entropy constraint between only two loci. This is true for all the 3-way interactions that involve the α locus. Qualitatively, this then explains the behaviour at low population sizes, as sequence entropy dominates fitness, meaning that the behaviour of the different 3-way DMIs will be dominated by their underlying sequence entropy constraints.
The sequence entropy constraints for the 3-way interactions involving the α locus is straightforward and given by a binomial degeneracy function Ω(E) = E/ , so that the sequence entropy function S(E) = ln(Ω) is approximately quadratic in E, where here E represents one of the binding energies that interacts with α [S2]. However, for the other 3-way interactions that don't involve α, but involve the R, M and C loci, the sequence entropy constraint will be related to a degeneracy function Ω(E,Ẽ) = Ω(E)Ω(Ẽ), where the joint number of sequences that give E andẼ is a product, since these energy traits are coded by different sequences, even though they come from the same loci (the joint number of sequences Ω(E M B , E M P ) = Ω(E M B )Ω(E M P ) since the protein binding sequence of the morphogen that determines E M B and E M P is the same in this case). Given that the joint number of sequences that give E andẼ is a product of two binomial coefficients, the sequence entropy function will approximately be a sum of two quadratic terms S(E,Ẽ) ≈ − 2 pd (E/ pd − pd /2) 2 − 2 pp (Ẽ/ pp − pp /2) 2 . At small population sizes, where genetic drift dominates selection, we expect the distribution of common ancestors to be such that they are poised at the incompatibility boundary for E andẼ; incompatibilities then arise when substitutions arise that take hybrids across the boundary. Given that a 3-way DMI between the R, M and C genetic loci corresponds to co-evolution of a pair of binding energy traits, instead of a single binding energy trait for 2-way and 3-way DMIs that involve α, means the fraction of substitutions that lead to incompatibilities versus those that keep the hybrids compatible/fit becomes larger when going from one to two dimensions. This then explains why 3-way DMIs between the R, M and C loci gives rise to incompatibilities more quickly than those involving the α locus, as seen in Fig.S6.
4-way DMIs correspond to an interaction where all four loci require a particular combination of alleles for good patterning. As previously noted they are of much smaller number compared to 2-and 3-way DMIs, so here, we do not examine these DMIs in detail. However, we note that the 4-way DMIs shown in Fig.S7, show a similar pattern as found with 3-way DMIs, where for small scaled population sizes the DMIs tend to cluster, which suggests, as found for 2-and 3-way DMIs, this is due to sequence entropy constraints dominating the growth of DMIs; on the other hand, at large scaled population sizes this degeneracy is lifted and each hybrid has a different growth rate of DMIs, depending on their particular contribution to fitness and how that balances against the constraints of sequence entropy.