Muller’s Ratchet and the Long-Term Fate of Chromosomal Inversions

Chromosomal inversions contribute widely to adaptation and speciation, yet they present a unique evolutionary puzzle as both their allelic content and frequency evolve in a feedback loop. In this simulation study, we quantified the role of the allelic content in determining the long-term fate of the inversion. Recessive deleterious mutations accumulated rapidly on both arrangements with most of them being private to a given arrangement. The emerging overdominance led to maintenance of the inversion polymorphism and strong non-adaptive divergence between arrangements. The accumulation of mutations was mitigated by gene conversion but nevertheless led to the fitness decline of at least one homokaryotype. Surprisingly, this fitness degradation could be permanently halted by the branching of an arrangement into multiple highly divergent haplotypes. Our results highlight the dynamic features of inversions by showing how the non-adaptive evolution of allelic content can play a major role in the fate of the inversion.


Introduction
Chromosomal inversions are large-scale structural mutations that may encompass millions of nucleotides but segregate together as a single unit due to repressed recombination. A surge of interest in inversions over the last 20 years has shown that inversions occur in a wide variety of taxa [1][2][3], are often found to have facilitated evolutionary processes such as adaptation and speciation [3][4][5][6][7], and are frequently under balancing selection [7]. However, we lack a solid understanding of how inversions themselves evolve and which factors determine their fate. Critically, inversions are dynamic and behave in qualitatively different ways from single-nucleotide polymorphisms (SNPs), since both their allelic content and their frequency can change over time. Incorporating this concept better into evolutionary theory will improve our ability to explain and predict the evolution of inversions in natural populations ( [8], but see [9][10][11]).
A key feature governing the evolution of inversions is the reduction in effective recombination between the standard (S) and inverted (I) arrangements. Recombination proceeds normally in both homokaryotypes (II and SS). However, in heterokaryotypes (IS), single crossovers can lead to unbalanced chromosomes and therefore inviable gametes (but see [12] for other mechanisms of recombination repression). Thus, only gene conversion and double crossovers contribute to gene flux (i.e. genetic exchange between arrangements [13]), although recent studies have demonstrated that gene conversion occurs at normal or higher rates in inverted regions [14,15]. Due to the partial repression of recombination, the arrangements behave like independent populations that exchange migrants. Thus, the arrangements essentially suffer a reduced population size when compared to the rest of the genome; within each arrangement, selection is less effective and genetic drift stronger. This pseudo population-substructure affects both standing genetic variation and the fate of new mutations. The magnitude of this effect is governed by the frequencies of the different karyotypes (II, IS, and SS). In turn, the allelic content of the inverted and standard arrangements determines their marginal fitness and therefore the frequencies of the different karyotypes. This creates a dynamic feedback loop between the frequency and the allelic content of the arrangements, which has to date received little attention in the literature.
Here we close this gap by modelling how the allelic content of an inversion evolves during its lifetime and how this evolution affects its long-term fate. Using SLiM v2.6 [16], a forward simulation program, we quantify changes in the allelic content of the inverted region over time and elucidate the role of gene conversion in preventing the accumulation of recessive deleterious mutations. We find that the minority arrangement, which experiences the stronger decrease in population size, accumulates mutations rapidly, leading to a swift decline in the fitness of the corresponding homokaryotype. In smaller populations, this process also occurs in the majority arrangement, resulting in a balanced lethal system. We identify a mechanism that can stop the fitness degradation of homokaryotypes, which we term 'haplotype structuring'. We discuss how our theoretical predictions can be validated empirically, and highlight the relevance of our results to other scenarios of low recombination.

Simulations
We modeled an isolated population of diploid individuals at initial mutation-selection balance using SLiM v2.6 [16]. We simulated a population of N=25,000 (with a subset of simulations run for N=5,000) diploid individuals. The genome consisted of three chromosomes of 1Mb, with 300 kb of exons for which allelic content was simulated. The allelic content of the rest of the chromosome was not simulated to alleviate the computational load, although recombination could occur anywhere. Exons were modelled as 50 kb segments, separated from each other by 100 kb introns.
To calibrate our model, we chose parameter estimates inspired by Drosophila melanogaster [17][18][19]. In our model, mutations happened at a rate of $\mu = 8.4 \times 10^{-9}$ per bp per generation [20]. All simulated mutations were deleterious ($s < 0$), recessive, and only occurred in exons. The fitness effects of deleterious mutations were drawn from a Gamma distribution $\Gamma(\alpha = 0.05, \beta = 10)$. The overall recombination rate was defined as the sum of the rate of single crossovers (CO, $\rho = 3.0 \times 10^{-8}$ per base pair per meiosis [17,18]) and the rate of initiation of gene conversion events (GC, $\gamma = 1.8 \times 10^{-8}$ per base pair per meiosis [19]), and corresponded to the rate of initiation of a recombination event. This overall rate was constant along the genome and for all karyotypes. However, the success of recombination initiation differed between genomic regions and karyotypes. We use the term effective recombination rate to describe the difference in realized events between karyotypes due to crossover suppression in the inverted region in heterokaryotypes. It should be noted that SLiM (in its 2.6 version) did not allow for the possibility of double crossover events. Gene conversion tract length followed a Poisson distribution with parameter $\lambda = 500$ bp [19]. As recombination is generally restricted to females in D. melanogaster but occurs in all individuals in our simulation, we divided the overall recombination rate by 2 (and therefore $r = (\rho + \gamma)/2$), resulting in $r = 2.4 \times 10^{-8}$ per base pair per meiosis.
Simulation with these parameters was not feasible because of the extremely large computational burden. To reduce computation time while maintaining the same evolutionary scenario, we used the common practice of rescaling parameters so that evolutionary processes happened at an accelerated rate (see for example [48]). A recent paper showed that such rescaling may fail to accurately represent the original population genetics when the product $2Ns$ is very large [21]. However, this should not be an issue in our simulations, as we remain in the parameter space where using rescaled parameters should not significantly affect the genetic diversity of the population. We thus downscaled both population size and genome length by a factor of 10 and upscaled the remaining parameters so that $2N\mu L$, $2Ns$, $2NrL$, and $\lambda/L$ (with $L$ the length of the genome) remained constant.
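The rescaling described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' script: the `Params` container, the function names, and the choice to treat the values quoted in the text as the unscaled ones are assumptions made here for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Params:
    N: float    # number of diploid individuals
    L: float    # genome length in bp
    mu: float   # mutation rate per bp per generation
    r: float    # recombination initiation rate per bp per meiosis
    s: float    # a representative selection coefficient
    lam: float  # mean gene conversion tract length in bp


def rescale(p: Params, k: float) -> Params:
    """Shrink N and L by a factor k and scale the remaining parameters so that
    2*N*mu*L, 2*N*s, 2*N*r*L and lam/L are unchanged."""
    return Params(N=p.N / k, L=p.L / k, mu=p.mu * k**2,
                  r=p.r * k**2, s=p.s * k, lam=p.lam / k)


def invariants(p: Params):
    """Compound parameters that the rescaling is meant to preserve."""
    return (2 * p.N * p.mu * p.L, 2 * p.N * p.s, 2 * p.N * p.r * p.L, p.lam / p.L)


def same_invariants(a: Params, b: Params) -> bool:
    return all(math.isclose(x, y, rel_tol=1e-9)
               for x, y in zip(invariants(a), invariants(b)))


# Assumed input: values quoted in the text, with s set to the heterozygote advantage.
original = Params(N=25_000, L=3e6, mu=8.4e-9, r=2.4e-8, s=0.003, lam=500)
scaled = rescale(original, k=10)
```

Because $N$ and $L$ both shrink by the same factor, the per-bp rates have to grow by its square while selection coefficients grow only linearly, which is what `rescale` encodes.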
Following a burn-in of 200N generations to ensure that mutation-selection-drift equilibrium was attained, we assumed that an inversion occurred in a random haplotype. The inversion occurred between two given loci on chromosome one and encompassed 30% of the chromosome and 10% of the genome. We assumed that the inversion provided a small heterozygote advantage, $s_{HET} = 0.003$ (i.e. $2Ns_{HET} = 150$). We followed the fate of the newly introduced inversion over the next 200N generations or until the loss of the inversion polymorphism. We recorded the fitness distribution of the various karyotypes and the inversion frequency over time. For a given haplotype, 100 replicates were used to estimate the invasion probability, both with and without gene conversion. We performed the same analysis for 200 haplotypes from 100 random individuals. In addition to the 200 randomly chosen haplotypes, we also considered the fate of the four fittest and four least fit haplotypes (see Figure S1 for how this choice affected the mutational load of the inversion haplotype). All SLiM scripts, analysis scripts, and the seeds used to run the simulations are available at https://gitlab.com/evoldyn/inversion/wikis/home.

The Fate of the Inversion
Gene conversion had little to no effect on the short-term fate (Figure 1a) of the inverted arrangement but increased the probability that the inversion was fixed or lost in the long term (Figure 1b). Without GC, the fate of the inversion (i.e. whether it was fixed, lost, or maintained as polymorphic over > 500,000 generations) was decided within the initial ~60,000 generations after appearance of the inversion (Figure 1f; no losses were observed after generation 58,620). At high GC rates, this was no longer true: even if the inverted arrangement successfully invaded, a risk of losing the polymorphism through genetic drift remained (Figure 1e). This occurs when the GC rate is high enough to partly compensate for the lack of crossing over in heterokaryotypes, which partially erases the pseudo population-substructure created by the inversion. Here, the mutational load of the majority arrangement, usually the standard, remains low through two processes. First, purifying selection remains effective in the majority arrangement due to its high frequency. Second, mutations spread between arrangements and thus neither contribute to fitness differences between the karyotypes nor impact the fate of the inversion. Under soft selection, i.e., when there are always enough offspring produced to reach carrying capacity, fitness is relative. Therefore, the fixation of deleterious mutations in the whole population does not count towards the mutational load. The resulting high fitness of the majority arrangement allows for its potential fixation through genetic drift, which can result in the loss of the inversion polymorphism.
Nei and colleagues postulated that an inverted arrangement should be able to spread in a population without additional selective advantage only if it captures a haplotype with low mutational load compared to the rest of the population [22]. This is because inversions originate in a single haplotype; therefore, any inversion homokaryotype (II) will be homozygous for all deleterious recessive mutations present in the original haplotype. Standard homokaryotypes (SS) do not suffer from their mutational load because on average they are homozygous for very few deleterious recessive mutations. Thus, only a few inversion homokaryotypes (II) have a fitness equal to or higher than the mean fitness of the standard homokaryotypes (SS) (Figure S2). In agreement with Nei's analytical results, we only observed fixation of the inverted arrangement when the inversion occurred in a haplotype with a low mutational load (Figure 1c). However, fixation of the inverted arrangement only occurred in the presence of gene conversion and at large enough population size (N=25,000). This is because fixation is only possible if the fitness of the inverted homokaryotype remains similar to the fitness of the heterokaryotypes, requiring a low mutational load of the inverted arrangement.

[Figure axis labels: probability of fixation of the inversion; initial mutational load.]

Muller's Ratchet Occurs Inside Chromosomal Inversions
Our results reveal that the content of both the inverted and standard arrangements can change dramatically through the accumulation of recessive deleterious mutations (Figure 2). Generally, the fitness dropped more steeply in the inverted arrangement, but this pattern was reversed when the inversion occurred in a high-fitness haplotype and the inverted arrangement became the majority arrangement. Importantly, whenever the inversion invaded, both arrangements suffered a decrease in both effective population size and effective recombination rate. This had two important consequences. First, most new mutations remained private to the arrangement they occurred in. Second, recessive deleterious mutations accumulated on the arrangements (Figure 2b,d,f). Accordingly, each arrangement experienced a process similar to Muller's ratchet, which is the step-wise stochastic loss of haplotypes with the lowest mutational load [23][24][25][26]. Despite the accumulation of deleterious mutations, the inversion remained in the population due to heterokaryotype advantage. This is sometimes referred to as associative overdominance, which is caused by linkage disequilibrium between the inversion and alleles within it that confer heterozygote advantage. Both overdominant and recessive deleterious alleles may contribute to this phenomenon [8,27,28]. In our model, overdominance of the inversion is generated by genic selection where inversions act as neutral vehicles of the individual alleles, sensu Wasserman [29,30]. However, high GC rates did not affect the fitness of the two arrangements equally: mutation accumulation was stopped in the majority arrangement. Thus, the fitness of the majority homokaryotype was scarcely affected by mutation accumulation (because a small decrease in population size means a slightly larger mutational load), whereas the fitness of the minority homokaryotype decreased to ~0 ($< 10^{-3}$).
Non-zero GC rates allowed both mutations and ancestral alleles to "jump" between arrangements and fix in the whole population, which reduced divergence between arrangements (see below) and aided the purging of deleterious mutations. At low GC rates, the global fixation rate of mutations within the inverted region (i.e. mutations that spread across arrangements) was reduced (see turquoise line, Figure 2b,d). However, at sufficiently high GC rates, mutations could spread across arrangements and fix in the whole population at a similar rate to the collinear genomic regions (see turquoise line, Figure 2f). Thus, the mutational load of the individual arrangements remains lower, but ancestral alleles can be irreversibly lost from the whole population.  The population size also has a strong impact on the long-term fate of the inversion. In larger populations,

Mutation accumulation causes strong divergence between arrangements
Whenever the inverted arrangement invaded, mutation accumulation within each arrangement resulted in fixed differences between the inverted and standard arrangement (Figure 3a,b). Unsurprisingly, more fixed differences accumulated in the absence of gene conversion (average number of fixed mutations without GC: 4,609 ± 7) than in its presence (average number of fixed mutations with GC: 182 ± 2). This strong between-arrangement divergence was reflected in high overall $F_{ST}$ values between arrangements within the inverted region, compared with little divergence across the rest of the chromosome (Figure 3).
Notably, no beneficial mutations are necessary for the buildup of the between-arrangement divergence.
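To make the link between fixed differences and high $F_{ST}$ concrete, the following sketch computes a per-site $F_{ST}$ while treating the two arrangements as subpopulations weighted by their frequency. The estimator ($F_{ST} = (H_T - H_S)/H_T$) and the function name are assumptions chosen for illustration; the text does not state which estimator was used.

```python
def fst_biallelic(p_inv: float, p_std: float, freq_inv: float) -> float:
    """Per-site F_ST = (H_T - H_S) / H_T for one biallelic site.

    p_inv, p_std: derived-allele frequency within the inverted and standard arrangement.
    freq_inv: frequency of the inverted arrangement, used as the subpopulation weight.
    """
    w_inv, w_std = freq_inv, 1.0 - freq_inv
    p_bar = w_inv * p_inv + w_std * p_std
    h_total = 2.0 * p_bar * (1.0 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0                                  # monomorphic site carries no signal
    h_within = (w_inv * 2.0 * p_inv * (1.0 - p_inv)
                + w_std * 2.0 * p_std * (1.0 - p_std))
    return (h_total - h_within) / h_total


fixed_difference = fst_biallelic(p_inv=1.0, p_std=0.0, freq_inv=0.3)
shared_polymorphism = fst_biallelic(p_inv=0.5, p_std=0.5, freq_inv=0.3)
```

A mutation fixed in one arrangement and absent from the other gives the maximal value regardless of how it arose, whereas a polymorphism shared equally by both arrangements gives a value of zero.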
To better understand the role of purifying selection, we can separate the deleterious mutations into two categories: effectively neutral mutations (i.e. $|s| < 1/(2N)$) and deleterious mutations. In our simulations (see Methods), about 5% of new deleterious mutations are effectively neutral. If purifying selection is a potent force, we expect a greater proportion of fixed mutations to be effectively neutral. We find that purifying selection in large populations was relatively effective in collinear regions, as ~50% of the fixed mutations were effectively neutral (Figure S3). However, within the two arrangements, the effectiveness of purifying selection was strongly decreased, particularly in the minor arrangement. This is evidenced by the proportion of effectively neutral fixed mutations in simulations without GC (majority arrangement: 46.1% ± 0.1%; minority arrangement: 5.2% ± 0.03%). The addition of GC changed the number of fixed mutations within arrangements (see above) but barely affected the proportion of effectively neutral fixed mutations (majority arrangement: 43.6% ± 0.9%; minority arrangement: 5.4% ± 0.1%). Surprisingly, some fixed mutations were very strongly deleterious (Figure S4). Both the strong within-arrangement divergence and the observation of less effective purifying selection support the interpretation of an inversion as a genomic region in which the population experiences a pseudo-substructure.
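The classification used above can be written as a short helper. This is a minimal sketch: the function name and the example selection coefficients are hypothetical and are not values from the simulations.

```python
def effectively_neutral_fraction(sel_coeffs, N):
    """Fraction of mutations with |s| < 1/(2N), the threshold used in the text."""
    if not sel_coeffs:
        return float("nan")
    threshold = 1.0 / (2.0 * N)
    n_neutral = sum(1 for s in sel_coeffs if abs(s) < threshold)
    return n_neutral / len(sel_coeffs)


# Hypothetical selection coefficients of four fixed mutations (s < 0 means deleterious).
example_fixed = [-1e-6, -5e-6, -1e-4, -1e-2]
fraction = effectively_neutral_fraction(example_fixed, N=25_000)  # threshold is 2e-5
```

Comparing this fraction among fixed mutations with the fraction among new mutations gives the contrast the paragraph relies on: the closer the two are, the less purifying selection has filtered what reaches fixation.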

Appearance of haplotype structuring
The fitness degradation of one or both arrangements that we describe above was occasionally (10/1,228 runs without GC) halted by a mechanism we term haplotype structuring. When haplotype structuring occurred, the subpopulation of one arrangement split into two or more divergent haplotype clusters that carried partially complementary sets of deleterious recessive alleles (see Figures 4 and 5). Homokaryotypes with two divergent haplotypes that each have a high mutational load are still relatively fit (e.g. $I_jI_k$ and $S_jS_k$) because deleterious mutations will be masked when divergent haplotypes are paired. Notably, this is equivalent to what is happening in heterokaryotypes (IS). Homokaryotypes with similar haplotypes (e.g. $I_jI_j$ or $S_jS_j$) tend to be inviable because the mutational load is no longer masked. This means that the fitness distribution of a given homokaryotype (e.g. II) has two modes: one corresponding to extremely unfit individuals and the other to relatively fit ones (see Figure 5 for a schematic). Thus, a signature of haplotype structuring in a given arrangement is that the fitness of the corresponding homokaryotypes shifts from a unimodal to a bimodal distribution (Figure S5). Haplotype structuring is stable against recombination, as a new recombinant will express both mustard and cyan mutations, leading to a lower fitness, whenever it is associated with either of the two major haplotypes. Haplotype structuring requires a significant level of within-arrangement diversity. Namely, the mutational load of the segregating haplotypes has to be high to create a large fitness difference between homokaryotype homozygotes (e.g. $I_jI_j$ or $S_jS_j$) and homokaryotype heterozygotes (e.g. $I_jI_k$ or $S_jS_k$), which in turn generates within-arrangement genic selection. Therefore, haplotype structuring is not possible in small populations or at high GC rates.
At high GC rates, the mutational load of the majority arrangement is not sufficiently large for haplotype structuring to occur and there are not enough copies of the minority arrangement present to create the necessary diversity. Similarly, in small populations, the haplotype diversity necessary for haplotype structuring cannot build up or be maintained because it is overwhelmed by the diversity-reducing force of genetic drift.
The divergent haplotype clusters that result from haplotype structuring are stable and are not disrupted by recombination. This is because recombination between divergent haplotypes creates new haplotypes that expose deleterious recessive mutations to selection when paired with either one of the parental haplotypes. Therefore, any recombinant haplotype is swiftly removed from the population even though its deleterious mutations are not exposed to selection in a heterokaryotype. Haplotype structuring has previously been described by Charlesworth and Charlesworth in a model of a diploid non-recombining population with deleterious recessive mutations [37]. To confirm this similarity, we triggered haplotype structuring in simulations of whole genomes with greatly reduced recombination rates. Haplotype structuring was possible across the full range of GC rates we tested as long as crossing-over rates were low (20% or less of our default value, Figure S6). Thus, similar to how heterokaryotype advantage maintains an inversion polymorphism, heterozygote advantage at the level of the haplotype maintains the haplotype polymorphism (i.e. haplotype structuring). Importantly, although haplotype structuring halts the fitness decay of homokaryotypes, mutation accumulation continues; the ratchet is therefore not stopped.
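The masking argument, and the reason recombinants are selected against, can be illustrated with a toy calculation. The sketch below makes several simplifying assumptions that are not taken from the simulations: mutations are fully recessive, all have the same effect `s`, fitness is multiplicative, and the two haplotypes are fully (rather than partially) complementary.

```python
def fitness(hap_a: frozenset, hap_b: frozenset, s: float = 0.05) -> float:
    """Multiplicative fitness where only loci homozygous for a mutation reduce fitness."""
    homozygous_loci = hap_a & hap_b
    return (1.0 - s) ** len(homozygous_loci)


def recombine(hap_a: frozenset, hap_b: frozenset, breakpoint: int) -> frozenset:
    """Single crossover: loci left of the breakpoint from hap_a, the rest from hap_b."""
    left = frozenset(m for m in hap_a if m < breakpoint)
    right = frozenset(m for m in hap_b if m >= breakpoint)
    return left | right


# Two divergent haplotypes within one arrangement, carrying complementary mutations
# at 100 hypothetical loci (even loci on hap_j, odd loci on hap_k).
hap_j = frozenset(range(0, 100, 2))
hap_k = frozenset(range(1, 100, 2))
recombinant = recombine(hap_j, hap_k, breakpoint=50)

w_jj = fitness(hap_j, hap_j)          # all 50 mutations exposed
w_jk = fitness(hap_j, hap_k)          # every mutation masked
w_rj = fitness(recombinant, hap_j)    # 25 mutations exposed
w_rk = fitness(recombinant, hap_k)    # 25 mutations exposed
```

In this toy setting the recombinant is less fit than the $I_jI_k$ combination whichever parental haplotype it is paired with, which is the intuition behind the stability of the clusters.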

Discussion
Chromosomal inversions are dynamic variants that behave in qualitatively different ways from other polymorphisms (SNPs, indels). Specifically, both their allelic content and their frequency change over time, leading to two intertwined levels of evolution. We demonstrate here that the allelic content of an arrangement can degrade rapidly via a Muller's ratchet-like process. While the inversion remains polymorphic in the population, we observe an accumulation of deleterious recessive mutations in one or both of the arrangements, which can result in at least one of the homokaryotypes becoming inviable. In our simulations, this fitness decay is slowed by gene conversion but can only be stopped by haplotype structuring, the appearance of multiple highly-divergent haplotypes within an arrangement. Together, our results imply that inversions observed in nature can be substantially different from the original invader even without the action of directional selection. Furthermore, we predict that they may harbor subhaplotypes within arrangements that can distort population genetic statistics.
We show that a mutation accumulation process similar to Muller's ratchet happens within the arrangements, resulting in an excess of deleterious mutations within the inverted region compared to the rest of the genome. The relationship between recombination and the efficiency of selection is well documented [38][39][40]. The increased accumulation of deleterious mutations in polymorphic inversions compared to collinear regions has previously been noted in multiple empirical studies. In seaweed flies (Coelopa frigida), a significant proportion of the observed heterokaryotype advantage could be ascribed to associative overdominance caused by deleterious recessive mutations [41]. Likewise, in Drosophila melanogaster, minority arrangements in wild populations contained significantly more p-elements [9]. A follow-up study also found increased numbers of TEs in low frequency inversions [42]. Here, the authors argued that the rate of back mutation (i.e. removal of TEs) was too high to allow for continued accumulation as predicted under Muller's ratchet. Other studies have shown that selection is reduced in inversions. In the laboratory, lethal alleles located within inversions in Drosophila melanogaster were maintained at similar frequencies for over 100 generations indicating that selection was not effective [43].
Next generation sequencing has allowed more detailed surveys of inversion content. A recent study by Jay et al. [44] examined the content of the P supergene in Heliconius, which encompasses two chromosomal inversions. They found an enrichment of non-synonymous relative to synonymous substitutions, negative selection on the arrangements, and a larger proportion of transposable elements compared to the rest of the genome [44]. Overall, these results indicate that mutation accumulation may be a common process in natural inversions, where the types of mutations that are accumulated can vary.
The rate of mutation accumulation differs between the standard and inverted arrangements. The extent of this difference depends on the relative frequency of the two homokaryotypes, as most "genome shuffling" occurs within homokaryotypes. Mutation accumulation is magnified in the minority arrangement as this subpopulation experiences both a stronger reduction in population size and a lower effective recombination rate (approx. $rp^2$, with $r$ the recombination rate and $p$ the frequency of the minority arrangement). Moreover, purging of recessive deleterious mutations is less efficient in the minority arrangement as the respective mutations are only exposed to selection in few individuals. Eanes et al. developed a model showing that the minority arrangement accumulated more p-elements at lower frequencies and predictions from this model matched empirical data from D. melanogaster [9]. Other empirical studies have also illustrated the relationship between arrangement frequency and mutational load [45][46][47]. Most notably, Tuttle et al. examined the $2^m$ allele (an arrangement of an inverted region on chromosome 2) in white-throated sparrow (Zonotrichia albicollis), which exists almost exclusively in the heterokaryotypic state [48]. They found that $2^m$ contained an excess of non-synonymous fixed mutations, which is consistent with functional degradation. Here, by revealing the feedback loop between arrangement frequency and mutational load, we present an intuitive reasoning for these observations. The accumulation of recessive deleterious mutations in the arrangements led to heterokaryotype advantage caused by the masking of recessive mutations. In the theoretical literature, the role of recessive deleterious mutations has been addressed previously, mainly regarding the invasion of an inverted arrangement [22,29]. In contrast, we do not know of theoretical work that has addressed the role of deleterious mutations in the long-term maintenance of an inversion polymorphism. In nature, a contribution of deleterious recessive alleles to heterokaryotype advantage has been inferred in seaweed flies [41], but empirical tests in other taxa remain scarce. As heterokaryotypes are often observed to be fitter than homokaryotypes [49][50][51], mutation accumulation may commonly play a role in the maintenance of inversion polymorphisms.
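The approximation $rp^2$ quoted above is easy to tabulate. The sketch below is only a worked illustration of that approximation; applying the same expression to the majority arrangement with frequency $1 - p$ is an assumption made here, by symmetry, and is not stated in the text.

```python
def effective_recombination(r: float, freq: float) -> float:
    """Approximate effective crossover rate of an arrangement at the given frequency,
    assuming crossovers are only effective in homokaryotypes (frequency freq**2)."""
    return r * freq**2


r = 2.4e-8  # per bp per meiosis, the value used in the text
for p in (0.01, 0.1, 0.5):
    minority = effective_recombination(r, p)
    majority = effective_recombination(r, 1.0 - p)  # assumed symmetric treatment
    print(f"p={p:<5} minority={minority:.2e} majority={majority:.2e}")
```

Because the rate scales with the square of the frequency, a tenfold drop in the frequency of the minority arrangement reduces its effective recombination rate a hundredfold, while the majority arrangement is barely affected.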
In the age of next generation sequencing, the genomic landscape of many inversions is being dissected to elucidate the processes driving inversion evolution [7,52]. Divergence observed between arrangements is often assumed to be adaptive and/or to predate the inversion itself, whereas the process of deleterious mutation accumulation is largely ignored [7,12]. However, as we show in Figure 3, it is possible that fixed mutations between different arrangements are neither adaptive nor predating the inversion. The strong divergence between arrangements that results from deleterious mutation accumulation can produce a similar population genetics signature to that of a cluster of (co-)adapted alleles within an arrangement [53][54][55].
We were specifically interested in the long-term evolutionary fate of the inversion, when both arrangements were maintained in the population. We identified multiple stable evolutionary outcomes for each arrangement under deleterious recessive mutation accumulation (over 600N generations). They can be divided into three general categories, depending on the mutational load of the arrangement and the fitness of its corresponding homokaryotype.
First, if the mutation accumulation and the associated gradual decrease in homokaryotype fitness continued, then the corresponding homokaryotype eventually became inviable. This often occurred in only the minority arrangement. In this case the polymorphism was maintained but the minority arrangement only appeared in heterokaryotypes. When the corresponding homokaryotypes of both arrangements are inviable, only heterokaryotypes contribute to subsequent generations. Thus, the mutation accumulation process shown here is a credible model for the evolution of a balanced lethal system. Our results show that low population size and reduced gene flux favor the evolution of balanced lethality. Several empirical examples of balanced lethal systems associated with structural variants exist. These include multiple overlapping structural variants in crested newts [35], inversions in Drosophila tropicalis [32], and translocations (as with inversions, effective recombination in the translocated regions is reduced) in multiple genera of plants such as Isotoma [33], Rhoeo [34], Gayophytum [36] and Oenothera [31]. Using a mathematical model inspired by the latter system, de Waal Malefijt and Charlesworth proposed that the accumulation of deleterious recessive mutations could create sufficient mutational load for the maintenance of translocation heterozygosity in a selfing population, assuming a large enough mutational target [56]. To provide evidence for the evolution of balanced lethal systems through mutation accumulation in structural variants, inference of the demographic history of these populations will be essential in the future.
The second long-term outcome is the maintenance of a highly fit homokaryotype due to the low mutational load of the corresponding arrangement. This outcome was only observed in the majority arrangement and at high GC rates, where the mutation accumulation process was halted. Note that here the ratchet is truly stopped as opposed to the case of haplotype structuring, where the consequences of the ratchet are bypassed. While the majority homokaryotype maintains a stable, high fitness, the fitness of the minority homokaryotypes drops to 0. When this occurs, the minority arrangement remains at very low frequency ($s_{HET}/(1 + 2s_{HET})$ if the fitness differences are only due to the imposed initial heterozygote advantage). Thus, this outcome is the least stable, as the high frequency of the majority arrangement combined with a small fitness difference between heterokaryotypes and majority homokaryotypes facilitates fixation of the majority arrangement.
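The equilibrium frequency quoted above follows from standard one-locus overdominance with fitnesses 1 (majority homokaryotype), $1 + s_{HET}$ (heterokaryotype) and 0 (minority homokaryotype). The following sketch checks it numerically with a deterministic recursion; it assumes an infinite population with random mating, so it ignores the drift that matters in the simulations, and the function names are chosen here for illustration.

```python
def next_freq(q: float, s_het: float, w_major: float = 1.0, w_minor: float = 0.0) -> float:
    """One generation of deterministic selection on the minority arrangement frequency q."""
    p = 1.0 - q
    w_het = 1.0 + s_het
    mean_w = p * p * w_major + 2.0 * p * q * w_het + q * q * w_minor
    return q * (p * w_het + q * w_minor) / mean_w


def equilibrium(s_het: float, q0: float = 0.1, generations: int = 20_000) -> float:
    """Iterate the recursion from q0 until it has effectively converged."""
    q = q0
    for _ in range(generations):
        q = next_freq(q, s_het)
    return q


s_het = 0.003
q_numeric = equilibrium(s_het)
q_analytic = s_het / (1.0 + 2.0 * s_het)
```

With $s_{HET} = 0.003$ the minority arrangement settles just below 0.3%, which makes clear why drift can readily remove it in a finite population.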
The third category of long-term stable outcomes involves haplotype structuring in one or both of the arrangements. Haplotype structuring halts the fitness decay of the corresponding homokaryotype but it does not stop the mutation accumulation process. As illustrated in Figure 5, the existence of two (or more) divergent haplotype clusters within an arrangement implies that most mutations will be masked in homokaryotype heterozygotes (e.g. $I_jI_k$ or $S_jS_k$). Similarly to what happens between arrangements, mutations tend to be private to haplotype clusters. Therefore, a subset of homokaryotypes still contributes to the next generation. The fitness consequences of the ratchet are merely bypassed due to the recessivity of the deleterious mutations.
Essentially, haplotype structuring occurs when a continual input of deleterious mutations results in associative overdominance in regions of low recombination, where it increases genetic diversity by maintaining complementary heterozygous haplotypes. Thus, the occurrence of haplotype structuring is not unique to inversions. It can also occur in diploid low-recombination systems with segregation of chromosomes. We were able to reproduce haplotype structuring using simulations with similar conditions but without assuming an inversion, provided there was a strong decrease in crossing-over rate (Figure S6). Using a theoretical model, Gilbert et al. recently derived that haplotype structuring can occur in regions of low recombination under quite general conditions, especially if deleterious selection coefficients are of intermediate strength [57]. Importantly, they demonstrated that the phenomenon is also sustained with moderate amounts of dominance. Moreover, the predicted pattern of increased diversity was observed in human genomic data [57].
Haplotype structuring has been described previously [37] [35], where the authors modeled the accumulation of deleterious recessive mutations in a diploid, non-recombining, random-mating, sexual population and noted that the population could become crystallized into two divergent haplotypes.
Although we recovered the crystallization part of the process, we sometimes observed more than two haplotype clusters (Figure S7). In this case, fitness could be multimodal (Figure S7b).

Whereas various examples of balanced lethals are known (discussed above), we are not aware of existing empirical evidence for haplotype structuring in inversions. This could be for two reasons. First, compensatory evolution and/or selective sweeps of beneficial mutations within the arrangements could erase haplotype structuring. We are currently ignoring beneficial mutations; adding these to the model would lead to selective sweeps that should reduce the diversity within the population. Therefore, the initial requirement of strongly divergent haplotypes would possibly not be met. Second, the pattern may have remained invisible to date due to the low density of markers available in the past as well as the current common practice of pooled sequencing, which does not reveal haplotypes. Additionally, other aspects of experimental design (for example, breeding designs that allow the fitness of offspring of each mating pair to be measured) are necessary to detect the predicted bimodal fitness distribution. Future empirical work could investigate these patterns, testing explicitly for bimodal fitness distributions and for the existence of clusters of haplotypes within arrangements using individual re-sequencing data.
Our results show that inversions are dynamic variants whose allelic content can evolve and impact their evolutionary fate. We also show that non-adaptive processes in inversions can nevertheless generate "adaptive-like" signatures. These results stress that the evolution of the allelic content of the inversion should be included in future models and in interpretations of sequence variation in inversions. Our study suggests several particular evolutionary outcomes of inversion evolution, which are potentially also applicable to regions of low recombination. The advent of improved methods for genome assembly should make it possible to determine how often haplotype structuring and balanced lethals occur in nature.

Materials and Methods
Simulations were implemented in SLiM v2.6 [16]

Figure 2
Fitness decay of the homokaryotypes and accumulation of mutations in the different arrangements. (A,C,E) Fitness of the different karyotypes and frequency of the inversion (green) over 500,000 generations following the introduction of the inversion under (A) a scenario with no gene conversion, (C) a scenario with 1/10 of the D. melanogaster gene conversion rate, and (E) a scenario with the D. melanogaster gene conversion rate. (B,D,F) Corresponding cumulative distribution of fixed mutations per kb in the inverted arrangement (red), the standard arrangement (blue), the inverted region (turquoise), and the collinear region (black) depending on the generation when the mutation appears. Results were obtained from 1,000 replicates, of which we display only those in which the inversion polymorphism was successfully maintained (5 cases with the high GC rate, 60 cases with 1/10 of that GC rate, and 61 cases without GC).

(A-D) Fitness of the different karyotypes and frequency of the inversion for all 4 outcomes. Fitness of the standard homokaryotype is given by the dotted blue line, of the inverted homokaryotype by the red dashed line, and of the heterokaryotype by the dash-dotted purple line. The frequency of the inversion is given by the solid green line. (A) Balanced lethals; (B) inverted homokaryotype is inviable, standard homokaryotype remains viable through haplotype structuring; (C) inverted homokaryotype is viable, standard homokaryotype is inviable until the inversion fixes; (D) haplotype structuring in both the inverted and standard arrangements. (E-H) Allelic content of the inversion; each horizontal line represents a haplotype in the population and each vertical line represents a genomic locus. Yellow denotes that an individual possesses the derived allele and blue the ancestral one. The black circle indicates where the haplotypes were taken from. (E) Mutation accumulation in the minor arrangement, (F) haplotype structuring in the standard arrangement, (G) purifying selection in the majority arrangement, (H) haplotype structuring in the inverted arrangement.

Figure 5
Schematic representation of the consequences of haplotype structuring on the fitness distribution of the homokaryotypes. Red, cyan, and mustard represent deleterious mutations. Homokaryotypic homozygotes have a fitness near 0, whereas homokaryotypic heterozygotes have a positive fitness because only the mutations that are fixed in the arrangement (in red) are expressed, while the mutations unique to each haplotype (in mustard and cyan) are masked. This leads to the bimodal distribution of fitness illustrated here. For reference, the vertical lines correspond to the mean fitness of heterokaryotypes (dashed purple) and homokaryotypes (black line). Haplotype structuring is stable against recombination because a new recombinant carries both mustard and cyan mutations and therefore has a lower fitness whenever it is paired with either of the two major haplotypes.
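The logic of this schematic can be illustrated with a minimal numerical sketch. It is not the simulation model used in this study: it assumes fully recessive mutations (h = 0), multiplicative fitness effects, and hypothetical values for the selection coefficient and for the numbers of fixed (red), cyan, and mustard mutations. The function and variable names are likewise illustrative.

```python
def fitness(hap1, hap2, s=0.01, h=0.0):
    """Multiplicative fitness of a diploid built from two haplotypes.

    Each haplotype is a set of deleterious mutation ids. Mutations shared by
    both haplotypes are homozygous (cost s); mutations carried by only one
    are heterozygous (cost h * s, zero when fully recessive).
    All parameter values here are illustrative assumptions.
    """
    homozygous = len(hap1 & hap2)
    heterozygous = len(hap1 ^ hap2)
    return (1 - s) ** homozygous * (1 - h * s) ** heterozygous


# Hypothetical mutation sets within one arrangement.
fixed = frozenset(range(0, 50))        # red: fixed in the arrangement
cyan = frozenset(range(50, 350))       # private to haplotype A
mustard = frozenset(range(350, 650))   # private to haplotype B

hap_a = fixed | cyan
hap_b = fixed | mustard

# A recombinant carrying half of the cyan and half of the mustard mutations.
recombinant = fixed | frozenset(range(50, 200)) | frozenset(range(500, 650))

w_aa = fitness(hap_a, hap_a)        # homozygote: all private mutations exposed
w_ab = fitness(hap_a, hap_b)        # heterozygote: only fixed mutations exposed
w_ra = fitness(recombinant, hap_a)  # recombinant paired with haplotype A
w_rb = fitness(recombinant, hap_b)  # recombinant paired with haplotype B

print(f"A/A: {w_aa:.3f}  A/B: {w_ab:.3f}  R/A: {w_ra:.3f}  R/B: {w_rb:.3f}")
```

Under these assumed values, the A/A homozygote has a fitness of about 0.03 and the A/B heterozygote about 0.61, which gives the two modes of the distribution, while the recombinant reaches only about 0.13 with either major haplotype and is therefore selected against.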