Evolution of Mutational Robustness in the Yeast Genome: A Link to Essential Genes and Meiotic Recombination Hotspots

Deleterious mutations inevitably emerge in any evolutionary process and are speculated to decisively influence the structure of the genome. Meiosis, which is thought to play a major role in handling mutations on the population level, recombines chromosomes via non-randomly distributed hot spots for meiotic recombination. In many genomes, various types of genetic elements are distributed in patterns that are currently not well understood. In particular, important (essential) genes are arranged in clusters, which often cannot be explained by a functional relationship of the involved genes. Here we show by computer simulation that essential gene (EG) clustering provides a fitness benefit in handling deleterious mutations in sexual populations with variable levels of inbreeding and outbreeding. We find that recessive lethal mutations enforce a selective pressure towards clustered genome architectures. Our simulations correctly predict (i) the evolution of non-random distributions of meiotic crossovers, (ii) the genome-wide anti-correlation of meiotic crossovers and EG clustering, (iii) the evolution of EG enrichment in pericentromeric regions and (iv) the associated absence of meiotic crossovers (cold centromeres). Our results furthermore predict optimal crossover rates for yeast chromosomes, which match the experimentally determined rates. Using a Saccharomyces cerevisiae conditional mutator strain, we show that haploid lethal phenotypes result predominantly from mutation of single loci and generally do not impair mating, which leads to an accumulation of mutational load following meiosis and mating. We hypothesize that purging of deleterious mutations in essential genes constitutes an important factor driving meiotic crossover. Therefore, the increased robustness of populations to deleterious mutations, which arises from clustered genome architectures, may provide a significant selective force shaping crossover distribution. 
Our analysis reveals a new aspect of the evolution of genome architectures that complements insights about molecular constraints, such as the interference of pericentromeric crossovers with chromosome segregation.


D) Supplementary Figures, Tables and Videos
Suppl. Figure 1: Mutations and selection for reproductive fitness in S. digitalis
Suppl. Figure 2: R max of S. cerevisiae chromosome IX and random chromosomes
Suppl. Figure 3: The effect of EG clustering on R max at different inbreeding rates
Suppl. Figure 4: Evolution of pericentromeric EG clustering requires a mating type
Suppl. Figure 5: Evolution of EG clustering in S. cerevisiae X-like chromosomes
Suppl. Figure 6: Competition analysis of mating type switching populations vs. non-switching populations at different levels of deleterious pre-load
Suppl. Figure 7: Number of crossovers and ORFs in S. cerevisiae chromosomes
Suppl. Figure 8: Competition analysis of crossing over rates in yeast chromosomes
Suppl. Figure 9: FACS sorting of spores and dyads
Suppl. Figure 10: Germination efficiency and colony size distribution of FACS-sorted single spores
Suppl. Figure 11: Mating success rates in the presence of mutational load
Suppl. Figure 12: Purging and survival rates in the evolution of clustering experiment
Table 1: Essential gene clustering in S. cerevisiae
Table 2: The parameters and simulation modules of S. digitalis
Video S1: Maintenance of EG clustering at low and at high mutation rates
Video S2: Evolution of EG clustering

E) S. digitalis Simulation Settings
F) References
A) The Computer Simulation S. digitalis

1. Overview of the computer simulation
The computer simulation termed S. digitalis has been designed to model the basic life cycle of populations of unicellular diploid individuals containing simplified descriptions of their genomes. In this section, a general outline of the simulation is given, including descriptions of all simulation rules. Since the simulation source code is included as supplementary online material, a detailed documentation of all modules (including some optional features not used in the present study) will be provided in the following sections (see Sections A2-A12, Table 2), allowing users to set up their own in silico experiments using S. digitalis.
The life cycle of the simulation considers diploid individuals that are subjected to alternating rounds of vegetative (mitosis) or sexual (meiosis) divisions. Meiotic progenies immediately return to a diploid life cycle via mating. Mating type loci (MAT), of which two opposite types exist (MATa and MATα), are optional. If active, mating requires that engaged individuals have opposite MAT loci. Mating can occur between individuals from the same meiosis (inbreeding, automixis, intratetrad mating) or between individuals from different meioses (outbreeding, amphimixis).
The framework of the simulation represents each individual by a genome that consists of one diploid chromosome. The basic building blocks of these chromosomes are essential genes (EGs), non-essential genes (NEGs) and intergenic elements (IEs) (see Section A2). IEs are either proficient for meiotic recombination (hotspots) or not (coldspots). Additionally, mating type loci (MAT) can be introduced. The position of the MAT loci is then either linked to a discrete location in the chromosome, or present externally on another chromosome. If present on another chromosome, the MAT is linked to the simulated chromosome via the centromere, which can be simulated at a specified position. If MAT loci are enabled, a diploid individual carries two MAT loci of opposite type, one on each chromosome of the homologous pair, at identical positions.
The population of individuals is implemented as a two-dimensional matrix. Each pair of adjacent columns in this matrix defines the structural composition of one specific genome present in one diploid individual in the population (see Figure 3A in Main Text). Each element within a column accounts for one gene or one intergenic element. Genes and intergenic elements alternate. EGs exhibit identity; a viable individual needs at least one functional copy of each essential gene. NEGs do not exhibit identity.
An initial set of genomes (the "seeding genome") is provided at the start of the simulation and generated either by the user or by an integrated random genome generator that constructs genomes according to specifications (number of elements, specific or random distribution of individual elements). In some experiments (see Figure 4D in Main Text), we analyzed the performance of digital yeast chromosomes under mutagenic stress. These chromosomes were modeled according to genome-wide data on crossing over sites published by Mancera et al. (2008) [1] (see Section A3). The size of the initial population is adjustable; however, at least one individual must be present at any time during the simulation. If the number of individuals is reduced to zero, the program stops and the population is considered extinct. Genomes are removed from the matrix, if one of the following situations arises: the genome lacks a functional copy of at least one EG, or a genome has been randomly selected for removal ("starvation") in an overpopulated matrix, i.e. in a matrix with a number of columns that is larger than the specified population size cap.
S. digitalis iteratively applies a simulated life cycle to the initial set of genomes (see Figure 3B-D in Main Text). The three main modules of this cycle are mitosis, meiosis and mating (in this order). Mitosis is initiated by a duplication of the genomes in the population matrix, i.e. a copy of each pair of adjacent columns is generated. Subsequently, mutations that lead to the functional inactivation of essential genes are applied at a specified rate R. In addition, structural rearrangements (optionally either swapping of genomic sites, which may either be genes or IEs, or inversions of entire chromosomal fragments within a column) can be applied to the population matrix at random (see Section A4). The structural fitness of each genome is then evaluated. The fitness is defined as "1" if at least one functional copy of each essential gene is present within a genome; otherwise it is defined as "0". Zero-fitness genomes are removed from the matrix. If the total number n of viable genomes is larger than a pre-defined limiting value m (the "population size cap"), n - m genomes are removed from the matrix at random.
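The fitness rule and the population size cap described above can be illustrated with a minimal Python sketch. It is not taken from the S. digitalis source code: the function names, the representation of a genome as a pair of lists, and the use of Python's `random` module are assumptions made for illustration. The identifier code (leading "1" for functional and leading "2" for mutated essential genes) follows the description of the genomic elements given in this supplement.

```python
import random

def is_functional_eg(identifier):
    """Functional essential genes carry a four-digit code with a leading 1."""
    return 1000 <= identifier <= 1999

def fitness(homologue_a, homologue_b, essential_ids):
    """Return 1 if every essential gene has a functional copy on either homologue, else 0."""
    functional = {x for x in list(homologue_a) + list(homologue_b) if is_functional_eg(x)}
    return 1 if set(essential_ids) <= functional else 0

def select_survivors(population, essential_ids, cap, rng):
    """Drop zero-fitness genomes, then remove n - m genomes at random if n exceeds the cap m."""
    viable = [g for g in population if fitness(g[0], g[1], essential_ids) == 1]
    if len(viable) > cap:
        viable = rng.sample(viable, cap)  # hypothetical "starvation" step
    return viable

# Example: one genome complements its mutations, the other lacks a functional copy of gene 001.
healthy = ([1001, 0, 2002], [2001, 1, 1002])
dead = ([2001, 0, 1002], [2001, 1, 1002])
survivors = select_survivors([healthy, dead, healthy, healthy], {1001, 1002}, 2, random.Random(1))
```

In this sketch, `essential_ids` holds the functional identifiers of all essential genes, so a genome is viable exactly when that set is covered by the union of both homologues.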
Meiosis also starts with a duplication of the matrix's columns, i.e. four haploids are generated from each diploid genome, and subsequently mutations in essential genes can be applied. Unlike in mitosis, each quadruplet of haploids then undergoes meiotic recombination. In the recombination module, fragments are exchanged between the haploids. The locations of these fragments are determined by computation of crossover events between the haploids. A crossover can only occur at hotspots and only between non-sister chromatids. Three different algorithms are implemented for the computation of crossover events in the genome (see Section A5). The distribution of crossover events in the genome is modeled by the mathematical description of "crossover interference" that quantifies the likelihood of finding two crossover events at a certain distance to each other ( Figure 3C in Main Text).
Mathematically, crossover interference is described by an Erlang probability density distribution [2,3] (F. W. Stahl, personal communication). The total number of resulting crossover events is then statistically defined by this probability density distribution. Each crossover event is fully characterized by its location in the genome, the two participating haploids and the parity of the resulting exchange of fragments (the decision of exchanging fragments upstream or downstream of the crossover location). The simulation framework also allows the user to switch off crossover interference, but this option was not used in the present work (see Section A5, paragraphs on "Erlang-based computation of random crossover locations" and "Random recombination").
Having determined the structural composition of all haploids on the basis of the calculated crossover events, new diploid genomes are defined by the mating module. Three mating concepts have been implemented: inbreeding, outbreeding and mating type switching (see Section A6). In inbreeding ("intratetrad mating" [4], automixis), non-sister or sister chromatids from the same parent individual are combined to form diploid genomes. Depending on the presence of a mating type locus, four (+MAT) or six (-MAT) different pairings of haploids are possible for each set of four haploids (see Figure 3D in Main Text). In outbreeding, haploids from different parent individuals, i.e. columns from different tetrad subgroups in the population matrix, are combined. In mating type switching, haploids are selected at random, duplicated and combined with their own copy. The simulation framework allows defining an inbreeding fraction i (the probability of inbreeding used for determining the mating events) as well as (optionally) a mating type switching probability s. In scenarios involving outbreeding (i < 1), the outbreeding mating partners are chosen at random from the entire population matrix, but in agreement with the rules of inbreeding and outbreeding (see Section A6). If a mating type locus is present, only haploids of opposite mating types can be combined. If mating results in a lethal structural constellation (see above), the corresponding pair of columns is removed from the matrix. At the end of each mitotic and meiotic cycle, the population size cap is evaluated and, if necessary, random removals of genomes are performed in order to limit the number of genomes to the pre-defined maximum value ("starvation").
The simulated life cycle of the population matrix is iterated until the pre-defined number of cycles has been reached or the population becomes extinct (e.g. due to a high mutation rate R that does not sustain growth).
S. digitalis is able to perform a wide range of experiments. The major applications shown in this study are mutational robustness benchmarks (see Section A7), evolution/maintenance of essential gene clustering (see Section A8), evolution of complex genome architectures under variable inbreeding/outbreeding conditions (see Section A9) and survival competition scenarios (see Section A10). In mutational robustness benchmarks, the maximum mutation rate R max that a given genome architecture is able to endure is determined. In evolution/maintenance of essential gene clustering experiments, the formation and disruption of essential gene clusters is monitored and analyzed (see Section A11), with a focus on the determination of the boundary conditions (properties of meiosis/recombination, breeding strategy and mating type, mutation and EG and hotspot distributions) that provide higher fitness (competitive advantage) or mutational robustness (maximum value of R a population can withstand) or that allow for the evolution of particular non-random distributions of chromosomal elements. Survival competition experiments simulate the coexistence of two populations with different properties (different architectures or different mating/recombination behaviors) in an environment that sustains a (predefined) maximum total population (sum of the two individual subpopulations). Survival competition experiments make it possible to determine the population that is best designed for survival in a given environment (i.e. in a specific parameter space).
Most of the modules and parameters of the simulation can be modified by the user. Therefore, we provide an overview table that lists all properties as well as brief instructions on how to operate the program (see Section A12 and Table 2).

The simplified concept of genomic structure in the simulation framework
Each of the genomic elements in the population matrix is characterized by an integer identifier according to the following code: Recombination deficient intergenic elements (recombination coldspots) are represented by the value "0". Recombination proficient intergenic elements (recombination hotspots) are indicated by the value "1". Intergenic elements are always flanked by two genes. Since genes and intergenic elements alternate in each column, each genome with a total length of 2k+1 structural components consists of k+1 genes and k intergenic elements. Genes are categorized as non-essential and essential genes. Since a deleterious mutation in a non-essential gene has no further effect within the simulation framework, both functional and non-functional non-essential genes are indicated by the same value ("2") in the matrix. Essential genes, however, have a unique identity and are therefore represented by a unique identifier. Depending on their mutagenic state the identifier starts with a leading "1" (functional essential gene) or a leading "2" (mutated essential gene), followed by three digits that define the gene's identity.
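As an illustration of this identifier code, the following Python sketch decodes single identifiers and checks that a column alternates between genes and intergenic elements. The function names are hypothetical and the sketch is not part of the S. digitalis source code; it uses 0-based list indices, so genes sit at even and intergenic elements at odd indices.

```python
def describe(identifier):
    """Translate an integer identifier into a readable element type."""
    if identifier == 0:
        return "coldspot"
    if identifier == 1:
        return "hotspot"
    if identifier == 2:
        return "non-essential gene"
    if 1000 <= identifier <= 1999:
        return f"functional essential gene #{identifier - 1000}"
    if 2000 <= identifier <= 2999:
        return f"mutated essential gene #{identifier - 2000}"
    raise ValueError(f"unknown identifier: {identifier}")

def is_well_formed(column):
    """Check the 2k+1 layout: k+1 genes alternating with k intergenic elements."""
    if len(column) % 2 == 0:
        return False
    for index, identifier in enumerate(column):
        is_intergenic = identifier in (0, 1)
        if is_intergenic != (index % 2 == 1):
            return False
    return True
```

For example, the column `[2, 1, 1001, 0, 2]` has k = 2 intergenic elements and k + 1 = 3 genes and passes the check.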
The structural composition of S. cerevisiae chromosome IX shall serve as an example. The digital version of this chromosome consists of 413 elements (n genes = 207 genes and n IEs = 206 intergenic elements). n EGs = 35 of the 207 genes are essential genes and represented by the identifier "10XX" with "XX" ranging from "01" to "35". The remaining 172 genes are non-essential genes. n hotspots = 58 of the 206 intergenic elements are recombination hotspots. One chromosome IX is thus represented by a sequence of 413 such identifiers.

Since the investigation of the impact of essential gene clustering in regions of low meiotic recombination is a major focus of this work, a module for the automated generation of clustered genome structures (which then function as seeding genomes) was implemented. Using this module, the user may provide the parameters n genes , n EGs and n hotspots as well as a fourth parameter, the number of essential gene clusters n cluster (with n cluster ≤ n EGs ). n cluster defines the total number of recombination hotspot-free genome fragments that are separated by at least one recombination hotspot. The essential genes are evenly distributed amongst these fragments, while the recombination hotspots are evenly distributed amongst the interfragment regions. If n cluster = n EGs , any two essential genes in the seeding genome are separated from each other by at least one recombination hotspot. This situation is referred to as a "maximally unclustered genome". If n cluster = 1, all essential genes are located in one single large essential gene cluster that contains only recombination coldspots as intergenic elements. This situation is referred to as a "maximally clustered genome". Additionally, there are modules for the generation of randomly structured genomes (based on the layout parameters n genes , n EGs and n hotspots ) and for the initialization of seeding genomes from a user-provided database.
Finally, since the seeding genome does not have to contain all essential genes in two functional copies, all modules can be combined with a seeding mutation module that randomly mutates essential genes (indicated by the identifier "2XXX") according to a user-defined ratio prior to the start of the simulation.

Digitalization of yeast chromosome architectures
Genome-wide information about the position of genes, recombination hotspots and centromeres as well as the categorization of the genes (essential/non-essential character) was obtained from www.yeastgenome.org (for the positions of essential genes), from Gerton et al. (2000) [5] (for the positions of recombination hotspots) and from Mancera et al. (2008) [1] (for the positions of the break points of crossovers). We used the following algorithm to convert the Mancera et al. data into digital chromosomes in the S. digitalis framework:
1. For each chromosome, an array of the length (2n genes - 1) was generated as a structural template. The numbers "2" or "1XXX" were assigned to the odd-indexed slots from 1 to (2n genes - 1), depending on whether Mancera et al. defined the corresponding gene as being essential or not. As explained in Section A2, "2" represents non-essential genes, while a four-digit code (leading "1" followed by a unique identification number) represents the essential genes.
2. The numbers "0" (coldspot) or "1" (hotspot) were assigned to the even-indexed slots from 2 to (2n genes - 2), depending on whether Mancera et al. found at least one recombination event between the genes flanking the respective intergenic slot. It should be noted that by assigning the hotspot character to each intergenic fragment with at least one detected recombination event, we are likely to underestimate the level of essential gene clustering in the real chromosome. Despite this conservative approach, we find a significantly better performance of the digital yeast chromosomes when compared to random architectures in a survival competition assay.
3. The index of the gene or intergenic element closest to the measured centromere position was defined as the centromere position in the respective digital chromosome.
An analogous algorithm was applied to the data from Gerton et al. (2000) [5], which we initially used for digitalization of chromosome IX (at this time point of the work the data by Mancera et al. (2008) [1] was not yet available).
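The three conversion steps can be sketched in Python as follows. This is an illustrative reconstruction rather than the original conversion script: the input format (one Boolean per gene, one Boolean per intergenic slot, one coordinate per element) and all function names are assumptions, and the sketch uses 0-based list indices instead of the 1-based slot numbers used in the text.

```python
def digitalize(essential_flags, recombination_flags):
    """Build a digital chromosome of length 2 * n_genes - 1.

    essential_flags[i] is True if gene i is essential (step 1);
    recombination_flags[i] is True if at least one recombination event
    was detected between gene i and gene i + 1 (step 2).
    """
    n_genes = len(essential_flags)
    if len(recombination_flags) != n_genes - 1:
        raise ValueError("expected one intergenic flag per pair of adjacent genes")
    chromosome = []
    next_identity = 1
    for i, essential in enumerate(essential_flags):
        if essential:
            chromosome.append(1000 + next_identity)  # functional essential gene "1XXX"
            next_identity += 1
        else:
            chromosome.append(2)  # non-essential gene
        if i < n_genes - 1:
            chromosome.append(1 if recombination_flags[i] else 0)  # hotspot or coldspot
    return chromosome

def centromere_index(element_positions, centromere_position):
    """Step 3: index of the element closest to the measured centromere position."""
    return min(range(len(element_positions)),
               key=lambda i: abs(element_positions[i] - centromere_position))
```

With four genes, of which the second and fourth are essential, and a single detected recombination event between the first two genes, `digitalize([False, True, False, True], [True, False, False])` yields `[2, 1, 1001, 0, 2, 0, 1002]`.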

Mutations and genomic rearrangements in mitosis
Deleterious mutations and structural rearrangements occur at random at the end of mitosis. Alternatively, mutations may also be activated after the duplication step in meiosis, or optionally both in mitosis and meiosis. The mutation rate R defines the average number of essential gene inactivating mutations per mitosis and per genome (or per life cycle and per genome, if the meiotic module is active as well). The statistical probability defined by R is assumed to be identical for all genes. Intergenic elements cannot mutate. Optionally, the simulation framework allows for an adaptive mutation rate (a feature that was not studied in this work). In this scenario, the mutation rate is continuously adjusted during simulation runtime, based on an evaluation of the population size and the speed of population growth. The framework subjects the population to the maximum mutagenic stress that still allows for the population's survival. In other words, the mutation rate is increased if the population grows rapidly, while it is decreased if the population is in danger of becoming extinct. While this type of experiment in principle allows for the determination of the mutational robustness R max of genome architectures, we employed a different and more robust approach for this purpose (see Section A7).
Similarly, the genomic rearrangement rate r defines the probability of a restructuring event per genomic element and per mitosis. Rearrangements can be applied in two different ways, either by site swapping or fragment inversions. Both mechanisms are related, since the effect of a site swapping can also be achieved by a pair of fragment inversions (see below). The simulation framework provides a switch that allows the user to assign the active module. For this work, we restricted the simulation to the application of the swapping module.
If a swapping event is applied to a genomic element e 1 (either a gene or an intergenic element), a random swapping target e 2 is determined within the same column of the population matrix (i.e. on the same chromatid). The algorithm then determines the homologous site e 1 ' on the second chromatid of the same chromosome. If e 1 is an essential gene, the algorithm defines the genomic element with the same identity on the second homologue as e 1 '. In any other case (identifiers "0", "1" or "2") the geometrically closest region with the same identifier is assigned as the homologous region e 1 '. e 2 ' is determined in the same way. In the last step, the algorithm swaps the identifiers of e 1 and e 2 on the first homologue and, correspondingly, those of e 1 ' and e 2 ' on the second homologue.
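The swapping procedure can be sketched in Python as shown below. The sketch rests on two assumptions that are not spelled out in the text: the final step exchanges e 1 with e 2 on the first homologue and e 1 ' with e 2 ' on the second, and the caller chooses e 1 and e 2 of the same kind (both genes or both intergenic elements) so that the alternating layout is preserved. Function names are hypothetical.

```python
def find_homologous_site(identifier, position, other_homologue):
    """Locate the site on the other homologue that corresponds to the given element.

    Essential genes are matched by identity (regardless of mutation state);
    all other elements are matched to the closest site with the same identifier.
    """
    if identifier >= 1000:
        identity = identifier % 1000
        candidates = [i for i, x in enumerate(other_homologue)
                      if x >= 1000 and x % 1000 == identity]
    else:
        candidates = [i for i, x in enumerate(other_homologue) if x == identifier]
    if not candidates:
        return None
    return min(candidates, key=lambda i: abs(i - position))

def swap_sites(first, second, e1, e2):
    """Swap e1 and e2 on the first homologue and their counterparts on the second.

    Returns False (and changes nothing) if a homologous site cannot be found.
    """
    e1_prime = find_homologous_site(first[e1], e1, second)
    e2_prime = find_homologous_site(first[e2], e2, second)
    if e1_prime is None or e2_prime is None:
        return False
    first[e1], first[e2] = first[e2], first[e1]
    second[e1_prime], second[e2_prime] = second[e2_prime], second[e1_prime]
    return True
```

Because essential genes are matched by identity, a mutated copy ("2XXX") on the second homologue moves together with its functional counterpart ("1XXX") on the first, which keeps the two homologues structurally aligned.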
Fragment inversions are performed similarly. Since genomic rearrangements can lead to a decrease in the level of homology of the two strands of a chromosome in outbreeding situations (but not in inbreeding situations), fragment inversions are only allowed in inbreeding experiments (see below). If an inversion event is assigned to a genomic element e 1 , a second element e 2 is determined in the same column of the population matrix in order to mark the end point of the inversion. The homologous elements e 1 ', e 2 ' are then determined as described above and the sections [e 1 e 2 ] on the first homologue and [e 1 ' e 2 '] on the second are inverted.
Additional rules apply for outbreeding in the presence of genomic rearrangements (a scenario that was not applied in this work). We observed amplification of EGs as a consequence of outbreeding in populations containing individuals with different chromosome architectures.
To prevent or restrict this phenomenon additional parameters have to be provided, including the required number of NEGs and the reproductive barrier (the level of homology that chromosomes require in order to allow a faithful meiosis I). While such parameters can in principle be specified in S. digitalis, we did not activate these modules in the present work.
Mitotic rearrangements can also influence the position of the mating type locus. If the genomic location of the mating type locus is linked to a site that becomes subject of a swapping or inversion event, the mating type locus remains linked to this site and is therefore also repositioned.

Computation of crossover events on the basis of crossover interference
The computation of crossover events on the basis of crossover interference is facilitated via an Erlang probability density distribution (see Figure 3C in Main Text). This mathematical model quantifies the probability p of measuring a distance d between two crossovers. The distance d between the two crossovers can be defined as the number of intermediate recombination hotspots (genetic distance definition) or as the total number of intermediate genetic elements (physical distance definition).
The Erlang distribution with the shape factor k is defined as follows (given here in its standard form, with λ denoting the rate parameter):

$$p(d) = \frac{\lambda^{k} \, d^{\,k-1} \, e^{-\lambda d}}{(k-1)!}$$

A shape factor k = 4 describes crossover interference in S. cerevisiae best (information kindly provided by Frank Stahl). This results in the probability distribution:

$$p(d) = \frac{\lambda^{4} \, d^{\,3} \, e^{-\lambda d}}{6}$$

An optional scaling factor s was implemented in the simulation. This factor allows adjusting the average number of crossover events per genome, a degree of freedom that was investigated in the analyses shown in Figure 10 in Main Text. Depending on the distance definition, the discrete distance unit u is normalized by n max = n genes (physical definition) or n max = n hotspots (genetic definition).
In the simplified framework of the simulation, each recombination hotspot contributes equally to the computation of the distance d, i.e. all hotspots are assumed to be characterized by the same quantitative intrinsic ability of facilitating double strand breaks. In order to compute a probability distribution q 1 for the discrete distances of the simulation framework, the probability density distribution must be integrated stepwise.
The integration over q 1 results in a probability distribution q 2 that returns the probability for observing the next crossover event within n discrete distance units.
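The stepwise integration can be illustrated with the following Python sketch, which uses the closed-form cumulative distribution function of the standard Erlang distribution. The rate parameter `rate`, the step width `unit` and all function names are assumptions; the scaling factor s and the normalization by n max used in S. digitalis are not reproduced here.

```python
import bisect
import itertools
import math
import random

def erlang_cdf(x, k=4, rate=1.0):
    """Cumulative Erlang distribution: 1 - exp(-rate*x) * sum_{j<k} (rate*x)^j / j!."""
    if x <= 0:
        return 0.0
    lx = rate * x
    return 1.0 - math.exp(-lx) * sum(lx ** j / math.factorial(j) for j in range(k))

def discrete_distributions(n_max, k=4, rate=1.0, unit=1.0):
    """Return (q1, q2) for the discrete distances 1..n_max.

    q1[n-1] is the probability mass of the interval ((n-1)*unit, n*unit];
    q2[n-1] is the probability of finding the next crossover within n units.
    """
    q1 = [erlang_cdf(n * unit, k, rate) - erlang_cdf((n - 1) * unit, k, rate)
          for n in range(1, n_max + 1)]
    q2 = list(itertools.accumulate(q1))
    return q1, q2

def sample_distance(q2, rng):
    """Draw a discrete distance from q2; None means the next event lies beyond n_max."""
    u = rng.random()
    index = bisect.bisect_left(q2, u)
    return index + 1 if index < len(q2) else None

q1, q2 = discrete_distributions(50, k=4, rate=0.5)
distance = sample_distance(q2, random.Random(0))
```

Returning `None` for draws beyond the last tabulated distance mirrors the situation in which the next target region would fall outside the simulated chromosome.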
The occurrence of recombination is a statistical phenomenon in the simulation, i.e. in addition to activating crossover interference the user can define a probability per genome for the initiation of the computation of crossover events. The computation of a complete series of crossovers for a given tetrad starts with localizing all hotspots within the four haploids. One of these hotspots is chosen at random for the first crossover event. The probability distribution q 2 is then iteratively employed to determine the location of neighboring events (both up- and downstream of the initial event). If a recombination hotspot is present on all four haploids (as compared to only two haploids) at a specific genomic location and the distance definition mode is set to genetic distance, the probability of a crossover event at that position is doubled. The computation of crossover events is aborted if the next target region is located outside the population matrix.
In the last step, the identity of the involved haploids is randomly determined for each crossover event. Four configurations are possible. For a tetrad A A' B B', which is created by duplication of the genome A B, the crossover constellations AB, AB', A'B, A'B' are allowed. Crossovers cannot occur between sister chromatids. Having computed the entire set of crossovers and the corresponding haploid configurations, the algorithm sequentially evaluates the events (starting at one end of the chromatid). A random flag determines whether the first chromosomal fragment, i.e. the fragment that stretches from the chromatids' end to the first active recombination hotspot, is recombined or not. Subsequent fragments are then recombined in an alternating fashion. The mating types of haploids are swapped if the mating type locus is located on a chromosomal fragment that is affected by meiotic recombination.
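A single crossover between two non-sister chromatids can be sketched in Python as follows. The sketch is a simplified illustration and not the S. digitalis implementation: the tetrad is assumed to be stored as a list `[A, A', B, B']`, the `upstream` flag stands in for the parity decision, and a crossover is assumed to require a hotspot on at least one of the two participating chromatids. Mating type handling is omitted.

```python
# Indices 0 and 1 are sister chromatids (A, A'), indices 2 and 3 are sisters (B, B').
ALLOWED_PAIRS = [(0, 2), (0, 3), (1, 2), (1, 3)]

def crossover(tetrad, position, pair, upstream=False):
    """Exchange chromatid fragments between two non-sister chromatids at a hotspot.

    With upstream=False the fragments from `position` to the end are exchanged,
    with upstream=True the fragments before `position` are exchanged.
    """
    if pair not in ALLOWED_PAIRS:
        raise ValueError("crossovers are only allowed between non-sister chromatids")
    i, j = pair
    a, b = tetrad[i], tetrad[j]
    if a[position] != 1 and b[position] != 1:
        raise ValueError("crossovers can only occur at recombination hotspots")
    if upstream:
        tetrad[i], tetrad[j] = b[:position] + a[position:], a[:position] + b[position:]
    else:
        tetrad[i], tetrad[j] = a[:position] + b[position:], b[:position] + a[position:]

tetrad = [[1001, 1, 1002], [1001, 1, 1002], [2001, 1, 2002], [2001, 1, 2002]]
crossover(tetrad, 1, (0, 2))
```

A full series of events could be processed by calling `crossover` for each event in order of position, starting at one end of the chromatid.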

Computation of random crossover locations in the absence of crossover interference
The simulation allows investigating the effect of crossover interference. We did not use this module in the current work, but, since it is provided in the code, a brief description of its implementation shall be given: In order to achieve identical levels of crossover as compared to the crossover interference situation, the total number of crossovers is first computed using the Erlang-based computation of crossovers. The actual positions of crossovers are discarded and replaced by locations that are computed from a uniform probability distribution within the set of coordinates provided by the recombination hotspots. This approach ensures that in the statistical average the number of crossovers is identical to a situation in which crossover interference is active. The locations of the crossovers, however, are not affected by the Erlang distribution.

Random recombination with fixed crossover frequencies
In this scenario, the number of crossover events is not subjected to statistical fluctuations, but rather given by the statistical average c defined in equation 6. The locations of the crossover events are determined by a uniform probability distribution. This module was not employed in the experiments presented in this study.

Inbreeding, outbreeding, the mating type locus and mating type switching
Starting with n genomes at the onset of meiosis, the meiotic recombination module will provide 4n restructured haploids that subsequently undergo the mating procedure. Each haploid h p,q (with p = 1..n and q = 1..4) originates from the parental genome p and is distinguished from the other haploids of the same tetrad by the identifier q. The simulation interface allows defining an arbitrary inbreeding/outbreeding-ratio i (with 0 ≤ i ≤ 1). Thus, taking into account the optional presence of a mating type locus, four different mating situations can occur. The mating type locus introduces an additional parameter α that is assigned either "-1" or "1", depending on the mating type of the haploid. The following rules apply to the six identifiers a, b, c, d, α and β of the two mating haploids h a,c (with mating type α) and h b,d (with mating type β), where a and b denote the parental genomes and c and d the positions within the respective tetrads:
1. inbreeding without a mating type locus: a = b; α and β are undefined.
2. inbreeding with a mating type locus: a = b; choice of c and d must result in α·β = -1.
3. outbreeding without a mating type locus: a ≠ b; α and β are undefined.
4. outbreeding with a mating type locus: a ≠ b; choice of c and d must result in α·β = -1.
Inbreeding and outbreeding events are computed in agreement with these four rules, but at random with respect to the existing degrees of freedom.
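The four rules can be expressed compactly in a Python sketch. The `Haploid` record and the function names are hypothetical; a haploid is described by its parental genome, its position within the tetrad and its mating type (+1 or -1, or `None` if no mating type locus is simulated). The additional condition that a haploid cannot be paired with itself is an assumption of the sketch.

```python
from itertools import combinations
from typing import NamedTuple, Optional

class Haploid(NamedTuple):
    parent: int                 # index of the parental genome
    tetrad_index: int           # position within the tetrad (0..3)
    mating_type: Optional[int]  # +1 or -1, None without a mating type locus

def may_mate(h1, h2, inbreeding, mat_enabled):
    """Check a candidate pairing against the inbreeding/outbreeding rules."""
    same_parent = h1.parent == h2.parent
    if same_parent and h1.tetrad_index == h2.tetrad_index:
        return False  # a haploid is not paired with itself in this module
    if inbreeding != same_parent:
        return False  # inbreeding needs the same parent, outbreeding different parents
    if mat_enabled:
        return h1.mating_type * h2.mating_type == -1
    return True

def count_intratetrad_pairings(tetrad, mat_enabled):
    """Count the admissible inbreeding pairings within one tetrad."""
    return sum(may_mate(a, b, True, mat_enabled) for a, b in combinations(tetrad, 2))

tetrad = [Haploid(0, q, m) for q, m in enumerate([1, 1, -1, -1])]
```

For a tetrad with two haploids of each mating type, the sketch reproduces the four (+MAT) and six (-MAT) possible pairings mentioned in the overview of the simulation.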
Optionally, a mating type switching probability s may be defined. If s > 0, a corresponding number of haploids is picked at random and subjected to a module for mating type switching. These haploids do not participate in inbreeding or outbreeding. Instead, their architectures are duplicated and the haploids are combined with their own copy. Since this step will inevitably lead to the homozygotisation of any essential gene mutation, an individual with a diploid genome that has been created by mating type switching can only survive if none of its essential genes are mutated.
The mating type can be exchanged between haploids by means of crossovers during meiosis and can be the target of structural rearrangements during mitosis. The genome duplications that occur at the onset of mitosis and meiosis transfer the identity of the mating type to the new chromatids.
The simulation framework provides the option to define an "external" mating type locus position. This option considers a scenario, in which the simulated chromosome is linked to the mating type located on another (non-simulated) chromosome in the same cell. Mathematically, this situation corresponds to a 50% probability per meiosis for an inversion of the mating type of the four haploids in a tetrad.

Determination of the mutational robustness R max
The simulation interface provides the option to screen for the population survival boundary with respect to the mutation rate R. This parameter, the mutational robustness R max , is defined as the average mutation rate at which the transition from population survival to population extinction occurs. In other words, R max is the largest average mutation rate at which the simulated population still survives for a predefined number of life cycles. If R max can be approximated from previous investigations, the computation time for R max can be significantly reduced by specifying the desired resolution R res and an interval of mutation rates that includes the mutational robustness. The simulation algorithm then starts the first simulation run at the specified maximum mutation rate R 1 (the upper bound of this interval) and iterates the experiment with reduced mutation rates (using the step size R res ) until the first successful experiment is completed, i.e. until a population survives the predefined number of simulation cycles. The simulation returns the mutation rate that corresponds to this run.
Since $R_{max}$ is subject to statistical fluctuations, the user can specify the level of averaging a performed in the calculation. If a > 1, the procedure of computing $R_{max}$ is repeated (a − 1) times, and an array of the resulting mutation rate survival boundaries as well as the average mutation rate survival boundary $\bar{R}_{max,a}$ and its standard deviation are returned.
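The downward scan described above can be sketched as follows. This is only an illustration under stated assumptions: `survives` is a hypothetical stand-in for one full simulation run (here replaced by a deterministic toy rule), and the function and argument names are not taken from the simulation code.

```python
import statistics

def find_r_max(survives, r_high, r_low, r_res):
    """Scan from r_high down to r_low in steps of r_res and return the first
    mutation rate at which the population survives (None if none is found)."""
    steps = int(round((r_high - r_low) / r_res))
    for i in range(steps + 1):
        rate = r_high - i * r_res
        if survives(rate):
            return rate
    return None

def averaged_r_max(survives, r_high, r_low, r_res, a=1):
    """Run the scan a times in total; return all boundaries, their mean and SD."""
    results = [find_r_max(survives, r_high, r_low, r_res) for _ in range(a)]
    found = [r for r in results if r is not None]
    mean = statistics.mean(found) if found else None
    sd = statistics.stdev(found) if len(found) > 1 else 0.0
    return results, mean, sd

def toy_survives(rate, threshold=0.42):
    """Hypothetical placeholder for a simulation run: survival below a fixed threshold."""
    return rate <= threshold

boundaries, mean_boundary, sd_boundary = averaged_r_max(toy_survives, 1.0, 0.0, 0.1, a=3)
```

In a real run, `survives` would be stochastic, so the repeated scans would return different boundaries and a non-zero standard deviation.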

The clustering score
The investigation of clustering of essential genes in genomic regions is a major focus of the simulation. Therefore, a parameter has been introduced that quantifies the level of clustering and allows different structural constellations to be compared. This parameter is termed the "clustering score" ν. Its purpose is to rate genomes, assigning a low score to unclustered genomes and a high score to strongly clustered ones. In a maximally unclustered genome, any pair of essential genes on a chromosome is separated by at least one recombination hotspot. This structure yields the lowest ν-value. The highest level of clustering is reflected by a genome with one large essential gene cluster, i.e. a continuous fragment that contains all essential genes but no recombination hotspots as intergenic elements. This situation yields the highest ν-value.
The clustering score analysis of a genome is performed as follows. In the first step, the algorithm detects all positions $e_i$ (i = 1..n) in the genome at which at least one of the genome's two homologues has a recombination hotspot. Additionally, the positions $e_0$ and $e_{n+1}$ are defined as the two end coordinates of the genome. In the second step, the algorithm computes the numbers $s_i$ (i = 1..(n + 1)) of essential genes that are located in each fragment $[e_{i-1}, e_i]$. The clustering score ν is then defined as:

$$\nu = \frac{n_{hotspots} + 1}{n_{EGs}} \sum_{i=1}^{n+1} s_i^{2} \qquad (8)$$

The clustering score is normalized by the total number of essential genes $n_{EGs}$ and the total number of hotspot-separated structural fragments $n_{hotspots} + 1$ in the genome, as indicated in equation 8. The normalization factor allows the level of essential gene clustering to be compared between different genome layouts.
The seeding genome of the "creation of clustering in small genomes" experiment shall serve as an example (see Figures 7C/D in Main Text). The genome layout is characterized by n EGs = 5 essential genes and a total of n genes = 10 genes and n hotspots = 4 recombination hotspots.
Due to the nature of its definition, the clustering score is particularly sensitive to large clusters. Some non-random distributions may yield relatively low scores if they are accompanied by an unusually large number of single EGs that are flanked by hotspots. While this can be seen as the main disadvantage of the "clustering score", it should also be noted that this parameter is particularly robust, analytically accessible and suitable for comparing chromosomes of different lengths, due to the normalization by the total hotspot and essential gene numbers. These were our main reasons for using this score. For a discussion of an alternative score, the "grouping score", see Section A11 ("Simulation protocols and sliding window analyses").
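As an illustration, the score can be computed for simple layouts with the sketch below. It assumes the score takes the form ν = (n_hotspots + 1)/n_EGs · Σ s_i², a form that reproduces the values quoted in this text (5 and 25 for the small five-gene genome, 59 for the unclustered chromosome IX-like genome). The single-string genome encoding (`E` = essential gene, `N` = non-essential gene, `H` = hotspot, `C` = coldspot) is a hypothetical simplification that treats both homologues as identical.

```python
def clustering_score(layout):
    """Clustering score of a layout string (assumed encoding: E, N, H, C).

    Hotspots split the layout into n_hotspots + 1 fragments; s_i is the
    number of essential genes in fragment i.
    """
    fragments = layout.split("H")
    s = [fragment.count("E") for fragment in fragments]
    n_egs = sum(s)
    if n_egs == 0:
        raise ValueError("layout contains no essential genes")
    return len(fragments) * sum(x * x for x in s) / n_egs

# Small genome: 5 essential genes, 5 non-essential genes, 4 hotspots.
unclustered_small = "ENHENHENHENHEN"
clustered_small = "EEEEEHNHNHNHNN"

# Chromosome IX-like content: 35 essential genes, 58 hotspots, unclustered.
unclustered_ix = "H".join(["E"] * 35) + "H" * 24

scores = [clustering_score(g) for g in (unclustered_small, clustered_small, unclustered_ix)]
```

Non-essential genes and coldspots do not enter the score in this sketch; only the number of essential genes per hotspot-delimited fragment matters.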

Evolution of complex architectures under variable inbreeding/outbreeding conditions
This simulation module allows defining initial populations with arbitrary sizes and arbitrary levels of complexity. The aspect of complexity arises from the option of categorizing genomes with different architectures in sub-populations within the main population. Only individuals within the same sub-population can be partners in outbreeding. Arbitrary mutation rates, genome rearrangement rates and inbreeding/outbreeding ratios can be applied during the life cycle. The simulation framework keeps track of the modification of individuals by stochastic rearrangements and defines new sub-populations upon changes in the architecture that lead to repositioning of essential genes, recombination hotspots and coldspots and mating type loci. This dynamic grouping of the population matrix into subpopulations allows applying a global user-defined inbreeding/outbreeding-ratio without the need to consider a potential architectural incompatibility of outbreeding individuals. For all aspects of the life cycle other than mating (e.g. nutritional supply, stochastic mutations), the population matrix is considered as one entity irrespective of the sub-population groupings and all individuals are therefore subjected to the same conditions. As a consequence, this simulation module is particularly useful for the evolution and monitoring of large genome architectures under variable inbreeding/outbreeding conditions. This aspect of the simulation has been implemented to facilitate the experimental analysis of the evolution of MAT-linked and peripheral essential gene clustering in large populations with yeast chromosome IX-like genome content (see Figure 8A in Main Text). Initially, the 35 essential genes, 172 non-essential genes, 58 recombination hotspots and 148 recombination coldspots of the chromosome IX-like genomes were arranged such that a genome architecture with a minimal level of clustering resulted. 
In this architecture, each pair of essential genes was separated by at least one recombination hotspot (clustering score = n hotspots + 1 = 59). The evolution of the architectures in the initial seeding populations was then monitored over a time period of 100,000 generations with active/inactive mating types and at low and high mutation rates (see Video S2).

Survival competition experiments
Some of the results shown in this paper are based on survival competition experiments, in which two populations with different initial genomes compete for the same pool of nutrients. Nutrient limitation is mimicked by a maximum population size that applies to the sum of individuals from both sub-populations. Binary identifiers map the individuals in the initial population matrix to the sub-populations. The simulation then monitors and analyzes the evolution time-course of both populations. Mating occurs strictly within the sub-populations, but both populations together are subjected to the population size cap, i.e. if the size of one population stagnates but the other population grows rapidly, the stagnating population will also be affected by starvation. The populations can be subjected to different reproduction mechanisms (vegetative/sexual) or different recombination rates. The mating types in the two populations can be positioned differently (e.g. externally in one population and internally in the other, or simply in different regions of the chromosomes).
Each competition experiment starts with the same number of individuals in both populations. The simulation run lasts either until all individuals of (at least) one of the sub-populations are extinct (cases 1, 2 and 4) or until a specified maximum number of generations is reached (case 3). There are four possible outcomes:
1. Population A prevails, i.e. population B is extinct and at least one individual of population A is still alive.
2. Population B prevails, i.e. population A is extinct and at least one individual of population B is still alive.
3. Both populations survive a predefined number of simulation cycles (generations).
4. Both populations simultaneously become extinct before the predefined number of simulation cycles is reached.
This competition scenario is a simple and straightforward way of comparing the mutational robustness provided by different genome architectures, mating behavior or recombination frequencies.
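The shared population cap and the four outcomes can be sketched as follows. This is a minimal illustration, not the simulation code: the function names, the use of the labels "A" and "B" as identifiers, and the uniform random removal used to mimic starvation are assumptions.

```python
import random

def apply_population_cap(labels, cap, rng):
    """Randomly remove individuals (starvation) until the combined size is <= cap.

    labels holds one sub-population identifier ("A" or "B") per individual.
    """
    if len(labels) <= cap:
        return list(labels)
    return rng.sample(list(labels), cap)

def competition_outcome(size_a, size_b, generation, max_generations):
    """Return the outcome case (1-4), or None while the run is still in progress."""
    if size_a > 0 and size_b == 0:
        return 1  # population A prevails
    if size_b > 0 and size_a == 0:
        return 2  # population B prevails
    if size_a == 0 and size_b == 0:
        return 4  # both populations extinct
    if generation >= max_generations:
        return 3  # both populations survive the predefined number of cycles
    return None

rng = random.Random(0)
capped = apply_population_cap(["A"] * 80 + ["B"] * 40, cap=100, rng=rng)
```

Because the cap is applied to the combined population, a rapidly growing sub-population indirectly increases the starvation losses of the other one.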

Simulation protocols and sliding window analyses
At simulation run-time, protocols are generated that provide easy access to pre-processed data obtained directly from the population matrix. In order to save disk space, the population matrix itself is stored on the hard disk only every one-hundredth generation. Using the dynamic protocols, every single event that occurred within the simulation framework, and thus also the population matrix at any given time point, can be reconstructed. The core protocols monitor the position of recombination hotspots in the matrix, the size of essential gene clusters, the "starvation" of individuals (i.e. the removal of genomes induced by the population size cap), the crossovers during meiotic recombination, mitotic rearrangements, mitotic mutations and the computation of mating partners.
Other protocols monitor the population size after mitosis and after meiosis, the average number of mutations in the population, the average number of functional deleterious mutations (affecting essential genes), the number of genome removals due to homozygous mutations during mitosis and during meiosis, the average number of rearrangements, the average number of recombination events, the number of attempted and successful inbreeding and outbreeding events, the number of genome removals due to a lack of homology or due to a lack of recombination hotspots (only for outbreeding experiments in combination with a non-zero rearrangement rate), the amount of mutagenic load in the population during mitosis and meiosis and the amount of mutagenic load in deleted genomes during mitosis and meiosis (mutagenic purging).
Finally, two types of structural analyses are performed at simulation run-time: the determination of the average population clustering score (see Section A8) and a sliding-window histogram analysis of the genetic contents of the population matrix.
In the sliding-window analysis, each column in the population matrix is analyzed with respect to the local presence of essential genes and recombination hotspots. A sliding window filter, typically of the size $n_{window}$ = 10 genetic units, is moved unit-wise from the top to the bottom of each column, while the total numbers of essential genes and hotspots within the window are recorded in two protocol arrays. In the histogram analysis, the frequencies of the sliding-window counts are determined. The "grouping score" g is defined as the standard deviation of the array that results from subtracting the sliding-window array for essential genes from the array for recombination hotspots. g is another measure of essential gene clustering in the population matrix. We found both the grouping score g and the clustering score ν to be useful in the analysis of structural phenomena in the simulation. However, the clustering score ν (in contrast to the grouping score g) gives access to a simple analytical assessment of the simulation's results (see e.g. Figure 7B-D in Main Text). Therefore, the results derived in this study are based on the clustering score ν.
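The sliding-window counts and the grouping score g for a single column can be sketched as follows. The string encoding of a column (`E` = essential gene, `H` = hotspot, other characters for the remaining elements) is hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator is used.

```python
import statistics

def sliding_counts(column, symbol, window=10):
    """Number of occurrences of symbol in each window, moved one unit at a time."""
    return [column[i:i + window].count(symbol)
            for i in range(len(column) - window + 1)]

def grouping_score(column, window=10):
    """Standard deviation of (hotspot counts - essential gene counts) per window."""
    eg_counts = sliding_counts(column, "E", window)
    hotspot_counts = sliding_counts(column, "H", window)
    diff = [h - e for h, e in zip(hotspot_counts, eg_counts)]
    return statistics.pstdev(diff)

alternating = "EH" * 20           # essential genes and hotspots evenly interleaved
segregated = "E" * 20 + "H" * 20  # one block of essential genes, one of hotspots

g_alternating = grouping_score(alternating)
g_segregated = grouping_score(segregated)
```

In the interleaved column every window contains equal numbers of essential genes and hotspots, so g vanishes, whereas spatial separation of the two element types produces a large spread and thus a high g.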
All protocols and analyses described in this section are stored in individual arrays in the subdirectories "clustering_plots" and "protocol_plots". Graphical visualizations in plots and histograms are also generated and provided as JPEG images. A movie of the evolution of inbreeding populations is provided as Video S1.

Open and internal simulation parameters
Most of the simulation's parameters are assigned at the command line when invoking the simulation program. An overview of all open and non-open parameters is provided in a table (see Table 2). The table contains the identifiers used in the simulation code, a brief description of the parameters, the valid numerical ranges and experimental values that are known from literature (if applicable).

Estimating the mutational robustness of asexual populations
A deactivation of meiosis in the simulation framework leads to a situation in which the deleterious mutagenic load k of the genomes increases monotonically until the saturation level k = $n_{EGs}$ is reached (there are $2 n_{EGs}$ essential gene copies in the diploid genome). At this stage, any additional deleterious mutation will inevitably lead to the death of the affected individual. Since recombination of the genomes is not possible, neither structural rearrangements nor mating types or recombination hotspots have any effect on the handling of the deleterious mutational load. The mutational robustness of the population depends on the number of essential gene copies per diploid genome ($2 n_{EGs}$) and the size s of the population (expressed as the number of diploid genomes). We define m as the mutation rate per essential gene copy and per mitosis, i.e. m = R / ($2 n_{EGs}$). The probability $p_{survival}(m)$ for an equilibrated genome (a genome that has reached the saturation level of deleterious mutations) not to experience a mutation in one of its $n_{EGs}$ remaining functional essential gene copies at a mutation rate m is then

$$p_{survival}(m) = (1 - m)^{n_{EGs}} \qquad (9)$$

Thus, $p_{survival}$ is the probability of survival for an equilibrated genome. In a population of s individuals, the probability $p_s(m,n)$ that precisely n genomes survive the random application of deleterious mutations is therefore:

$$p_s(m,n) = \binom{s}{n} \, p_{survival}(m)^{n} \, \left(1 - p_{survival}(m)\right)^{s-n} \qquad (10)$$

The survival boundary is reached at the mutation rate m for which the expected number of surviving genomes equals half the population size:

$$\sum_{n=0}^{s} n \, p_s(m,n) = \frac{s}{2} \qquad (11)$$

At this value of m, on average half of the population of size s is removed due to homozygous deleterious mutations. Since each mitotic cycle duplicates the population, the population is exactly at the edge of survival: on average, it neither shrinks nor grows.
The mutational robustness of an asexual population is independent of the population size s (a result that can also be directly derived from equation 9). Inserting equations 9 and 10 in equation 11 yields:

$$s \, (1 - m)^{n_{EGs}} = \frac{s}{2} \quad \Rightarrow \quad m = 1 - 2^{-1/n_{EGs}} \qquad (12)$$

We obtain the formula for the mutational robustness $R_{max}$ of an asexual population:

$$R_{max} = 2 \, n_{EGs} \left(1 - 2^{-1/n_{EGs}}\right) \qquad (13)$$
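The survival boundary of an asexual population can be evaluated numerically with the short sketch below. It rests on two assumptions that follow from the description above but are spelled out here for clarity: an equilibrated genome with n_EGs remaining functional copies survives a mitosis with probability (1 − m)^n_EGs, and the boundary is the rate at which exactly half of the genomes survive each population doubling. The function names are illustrative.

```python
import math

def survival_probability(m, n_egs):
    """Probability that none of the n_egs remaining functional copies is hit."""
    return (1.0 - m) ** n_egs

def asexual_r_max(n_egs):
    """Mutation rate R = 2 * n_egs * m at which half of the genomes survive."""
    m_boundary = 1.0 - 0.5 ** (1.0 / n_egs)
    return 2 * n_egs * m_boundary

r_max_35 = asexual_r_max(35)  # e.g. a chromosome with 35 essential genes
m_35 = r_max_35 / (2 * 35)
```

Under these assumptions the boundary does not depend on the population size, and for large numbers of essential genes it approaches 2 ln 2, i.e. roughly 1.39 mutations per diploid genome and mitosis.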

Nature of the lethality caused by depletion of Msh2 during vegetative growth
In order to analyze the deleterious mutations that accumulate during vegetative growth in cells depleted for Msh2, we used cells that had been grown on YPD three times for 10-12 generations using serial transfer in order to allow for accumulation of mutations. Thereafter, $2 \times 10^{7}$ cells were grown for approximately 1 day on YP-Gal/Raf plates in order to induce the GalS-promoter, followed by sporulation and tetrad dissection. Under these specific conditions we found that ascus formation occurred with a frequency of > 99% within a period of 36 hours. 400 tetrads were dissected and spore viability was scored. Replica plating onto YPD containing G418 or ClonNat was used to investigate the segregation of the two GalS-MSH2 loci, one marked with kanMX (which confers resistance to G418) and the other one with natNT2 (which confers resistance to ClonNat), and replica plating onto SC-LEU medium was used to follow the segregation of the leu2 allele. All tetrads that produced two viable spores were scored, and the linkage of the lethal phenotype was analyzed using the formula cM = (100/2) × T/(PD + NPD + T) [6]. The segregation of lethality in 131 tetrads with two viable spores was PD:NPD:T = 27:34:70 (versus MSH2, marked with kanMX and natNT2) and PD:NPD:T = 21:18:92 (for leu2). T stands for tetratype tetrads; in our case, a tetrad was scored as T when the two viable spores carried different alleles of the investigated marker (leu2/LEU2 and MSH2-kanMX/MSH2-natNT2). PD and NPD tetrads were scored when both viable spores carried the same marker allele. In the case of GalS-MSH2 (which is not centromere-linked), no linkage to the load was measured (35 cM; a linkage ≥ 35 cM cannot be calculated using this formula), whereas the lethal load showed a linkage of 27 cM to leu2/LEU2, indicating that some lethal mutations exhibited centromere linkage. Since spore lethality must be caused by a different mutation in each tetrad with two viable spores, the global linkage of 27 cM of the load to leu2 represents an average. 
This average is composed of mutations in essential genes, which may themselves exhibit centromere linkage. Additionally, some linkage may be caused by meiosis I non-disjunction of a chromosome, which results in two (or zero, in the case of several non-disjoined chromosomes) viable spores.
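The linkage formula used above can be applied to the two PD:NPD:T count triples quoted in the text with the following sketch. The function name is illustrative; the counts are taken directly from the text, and no additional data are assumed.

```python
def linkage_cm(pd, npd, t):
    """Genetic distance in cM from tetrad counts: (100 / 2) * T / (PD + NPD + T)."""
    total = pd + npd + t
    if total == 0:
        raise ValueError("no tetrads scored")
    return 100.0 / 2.0 * t / total

# The two count triples (PD, NPD, T) reported for the 131 two-spore tetrads.
count_triples = [(27, 34, 70), (21, 18, 92)]
distances = [linkage_cm(*counts) for counts in count_triples]
```

The two triples yield approximately 26.7 cM and 35.1 cM, which round to the 27 cM and 35 cM values discussed in the text.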
In order to mate, a spore needs to be able to break open the spore wall and show some indication of germination. Microscopic inspection of single dead spores that failed to form visible colonies revealed that the majority (93.5%; n=200) were able to germinate. In most cases they formed micro-colonies of up to approximately 20 cells (the histogram of the observed colony sizes is shown in Supplementary Figure 10). This is likely due to maternal contribution of functional mRNA or protein. Moreover, most of the spores that did not show signs of germination nevertheless appeared swollen, and we could often see faint remnants of what may have been a spore wall (after touching the spore with the needle of the micromanipulator). This indicates some metabolic activity, but it is unclear whether those spores would still be able to mate. Together, these results provide additional support for single mutations as the cause of the lethal phenotypes, since spores that lack entire chromosomes (either due to meiosis I non-disjunction or meiosis II chromosome missegregation) usually fail to germinate. For methods, see Section C below.
For successful mating, spores furthermore require a functioning complement of the proteins required for mating. In order to estimate the frequency of mutations that inactivate the mating machinery, we tested the viable spore colonies for pheromone secretion and mating. We found that 1.2% (5 out of 425) of the viable spore colonies were deficient in mating, four of which also failed to secrete pheromone. One spore colony secreted mating pheromone normally but was completely mating deficient. One spore colony could be induced to undergo haploid meiosis, which could be due to an extra chromosome III. For the remaining three spore colonies we could not easily determine the defect that prevented them from mating.
These tests provide additional support for our claim that mutagenized yeast genomes accumulate mutations that, even though they may be lethal, do not prevent the transmission of genomes through meiosis in the vast majority of the cases, provided that the spores have access to a mating partner immediately after germination.

Essential gene clustering in pericentromeric regions
Our simulations predict a fitness advantage for a linkage of essential gene clusters to centromeres. While clustering can be caused by the absence of meiotic recombination in pericentromeric regions, an enrichment of essential genes near centromeres (our previous finding [7]) constitutes evidence for a force underlying chromosome organization that arises from the mutational robustness. In Taxis et al. (2005) [7], we have calculated the essential gene enrichment in a region of 10 kb on either side of the centromeres, which corresponds to the physical distance that shows significant centromere linkage (within 35 cM). The correlation between physical distance p (in kbp) and genetic distance g (in cM) in pericentromeric regions (up to approximately 40 cM) can be approximated using the formula: The formula was derived from a compilation of the genetic and physical distances present in the yeast genome available at www.yeastgenome.org. Batada and Hurst [8] reported that the essential gene enrichment for this interval is not (or borderline) significant (P = 0.077), while we found it to be significant (P = 0.03) [7]. The deviation arises from the different statistical tests employed in both studies. Batada and Hurst used the non-parametric Mann-Whitney U test to decide whether the two distributions are significantly different. Non-parametric tests have less statistical power than the corresponding exact tests, which conversely can only be used if the correct theoretical distribution is known. In our case, we are comparing independent counts in two samples, rendering the theoretical distribution hypergeometric, and thus employ Fisher's exact test. In conclusion, Batada and Hurst arrive at a P-value of 0.077 whereas we obtain the P-value 0.03 for the same data.

Mitotic versus meiotic mutations
For most in silico experiments we implemented mutations only in mitosis. With respect to all conclusions drawn from our simulation in the present work, this is qualitatively and quantitatively identical to the situation, in which mutations would occur during pre-meiotic DNA replication. This is demonstrated in a series of comparative and competitive benchmarks (Supplementary Figure 2A-D). However, our simplified simulation framework does not consider the possibility of other meiotic mutations (e.g. during meiotic recombination) that may have a quantitative or qualitative impact on the simulation readout. This possibility is disregarded, because there is currently no data available that reports on this specific class of mutations.

Yeast mutator experiment
Haploid yeast strains NKY289 (MATa lys2 ura3 ho::hisG) and NKY292 (MATα lys2 ura3 leu2::hisG ho::LYS2) [9] in the well-sporulating SK1 background were used. The endogenous MSH2-promoter was substituted using PCR targeting with plasmids pYM-N30 and pYM-N31 as templates, as described [10]. The correct integration of the PCR product was validated using PCR. Preparation of competent cells and selection of transformants were done using YP-medium containing 2% galactose and 2% raffinose (YP-gal/raff). Upon mating, the spore viability of diploids was tested using tetrad dissection and was found to be identical to that of the wild type strains. For sporulation, cells were grown for 24 h on YP-gal/raff plates at 30°C, followed by sporulation on plates containing 1% KAc and 0.02% each of raffinose and galactose at 23°C for 36-48 hours. For sporulation of cells during a mutation accumulation experiment, 0.2 ml of cells grown in liquid YPD culture ($2 \times 10^{7}$ cells) were plated on a YP-gal/raff plate, grown for 24 h, washed off with sterile water and plated on SPO plates. Random spores were prepared by treating washed-off cells in water with Zymolyase 100T (0.2 mg/ml, Seikagaku) and Sulfatase (10%, Sigma) for 1 hour, followed by vigorous vortexing of the cells with acid-cleaned glass beads (1 volume beads, 1 volume cells) for 6 min in total. Spores were then washed two times with 100 mM sodium citrate (pH 5.8) containing 1% Triton X-100 and once with water. This procedure disrupted all non-sporulated cells and all asci, and also disrupted most of the interspore bridges that were reported to keep pairs of spores together [11]. Most of the dyads observed in the FACS formed after the Triton X-100 washes upon dilution of the spores in the buffer used for FACS, but a few remaining dyad pairs linked by interspore bridges may not have been disrupted. Interspore bridges are likely to hold together spores whose genomes were separated during meiosis I. This explains why the rate of diploid formation in sorted dyads was 53% in the context of the MAT-centromere linkage and not exactly 50% in both the wild type strain and at time point 0 h of the GalS-MSH2 strain.
Germination of FACS-sorted spores was investigated following growth on YPD plates for three days by looking at the individual spores using a tetrad dissection microscope (Singer Instruments). Successful germination was scored when a spore had formed an extension (which may be the shape of a bud or more tube-like) of at least one spore diameter in size. In most cases spores formed micro-colonies that consisted frequently of cells of aberrant shape. A spore was considered to be "dead" if it failed to form a colony visible by eye. Counting or approximate estimation of the number of cells per colony was used to classify the lethal phenotypes. The experiment was conducted with two different clones of the GalS-MSH2 strain, which both yielded identical results.
For mating testing, two assays were used. Pheromone secretion was tested using a halo-assay and by using pheromone-sensitive strains. Mating was tested using complementation of rare auxotrophic mutations.

Supplementary Figure 1: Mutations and selection for reproductive fitness in S. digitalis
(A) Mutations that inactivate essential genes occur at random in mitosis. Mutation of both alleles of an essential gene leads to the death of the individual. Surviving individuals (that carry at least one functional copy of each essential gene) are subjected to meiotic recombination followed by mating of the meiotic products (spores) with other spores. Mating can occur within the tetrad (intratetrad mating/inbreeding) or among spores from different tetrads (outbreeding/amphimixis). The chromatids involved in mating carry a specific load of lethal mutations. Only homozygotisation of mutated essential genes leads to death of an individual.
(B) Two mechanisms lead to the death of individuals in the simulation: a random removal of individuals due to limitations in the nutritional supply (starvation) and a selective removal of individuals with homozygous mutations in essential genes. The parameters that govern the recombination of genomes in meiosis and mating (chromosome architecture, inbreeding and outbreeding, crossover frequency) affect the mutational load and its distribution across the genome and thereby the frequency of homozygotisation of haploid lethal mutations.

Supplementary Figure 2: Mutational robustness R max of S. cerevisiae chromosome IX and random chromosomes
Scatter plots and histograms of mutational robustness (R max ) benchmarks performed for populations with S. cerevisiae chromosome IX and for populations with random architectures. In the first series of experiments, mutations were applied in mitosis (A), in meiosis (B) or in mitosis and meiosis (C). Notably, almost the same mutational robustness results for these different scenarios. As an additional test, two populations of S. cerevisiae chromosome VI carrying individuals were subjected to a competitive advantage simulation. Mutations were applied exclusively in mitosis in population #1 and exclusively in meiosis in population #2. The histogram in (D) shows the competition wins, indicating comparable performance of both populations.
In the second series of experiments, populations with S. cerevisiae chromosome IX and populations with random architectures were analyzed for their mutational robustness (R max ), considering a contribution of mating type switching of either 10% (E) or 50% (F) during breeding following meiosis. As a reference, the maximal robustness in the absence of mating type switching is indicated by a grey dashed line. At 10% mating type switching the maximal robustness is decreased by 8%, while it is increased by 42% at 50% mating type switching.

Supplementary Figure 3: The effect of essential gene clustering on mutational robustness at different inbreeding rates
Mutational robustness $R_{max}$ of chromosomes with different levels of essential gene clustering (1-100 clusters, 100 EGs in total) for different mating type configurations (+/-MAT) and for different inbreeding fractions (0-100%). In a chromosome architecture with n essential gene clusters, each cluster contains x/n essential genes, where x is the total number of essential genes in the chromosome. Thus, the architecture with one essential gene cluster represents a maximally clustered genome, while the architecture with 100 essential gene clusters represents a maximally unclustered genome. $R_{max}$ is color-encoded (left) and shown as a function of the genome architecture and of the inbreeding fraction. The plot (right) shows the average mutational robustness $\bar{R}_{max}$ (average of $R_{max}$ over the entire inbreeding domain). Error bars indicate SD.

Supplementary Figure 4: Evolution of pericentromeric EG clustering requires a mating type
Levels of essential gene clustering obtained in the evolution of the small model architecture shown in Figure 7C in Main Text. The evolution experiment was performed for different rearrangement rates, mutation rates and mating type configurations (+/-MAT). The chromosomes contain five essential genes, five non-essential genes, four recombination hotspots and five recombination coldspots. The initial genome architecture is maximally unclustered (resulting in a clustering score of ν = 5). The evolution of the architectures was simulated for 100,000 generations. Without a mating type, the clustering score does not increase beyond the level found in random populations (between 8 and 9). In the presence of a mating type, the architectures evolve towards a clustering score between 10 and 13 for rearrangement rates below $10^{-3}$ per genetic element and generation. The highest possible level of essential gene clustering is reflected by a clustering score of 25 (all five essential genes in a single cluster without recombination hotspots). "rnd" indicates the average clustering score of randomly generated genomes (ν = 8.5).

MAT were allowed to evolve at different mutation rates R as indicated ($n_{red}$ = 495, $n_{green}$ = 500, $n_{blue}$ = 500 experiments). Similar to the analysis in Figure 8A in Main Text, clustering was scored by measuring both the size of the largest essential gene cluster and the average size of the remaining clusters. For genomes containing a MAT, the largest cluster was always observed to be linked to the MAT. The scatter plot shows the scores obtained after an evolution period of 200,000 generations. Score distributions for the different populations are spanned along the axes (including reference distributions for randomly generated populations of the same size as in the +MAT scenario at R = 1). The unclustered starting architecture provides a clustering score of ν = 131. 
Random architectures with a chromosome X genetic content provide a clustering score of ν = 221 ± 23 (SD), whereas the chromosome X architecture itself scores ν = 408. The percentages indicate the fraction of experiments yielding genomes with a clustering value ν at least 2σ above the average score of randomly generated genomes (i.e. ν ≥ 267).

Supplementary Figure 6: Competition analysis of mating type switching populations vs. non-switching populations at different levels of deleterious pre-load
Survival competition experiments of populations with S. cerevisiae chromosome VI subjected to different levels of mating type switching (2%, 10% and 25%) versus populations that do not perform mating type switching (n = 2,000 experiments per matrix). Before the start of the competition, 0, 2, 4, 6, 8 or 10 essential genes were deactivated (distributed on both homologues). When starting with two or fewer essential gene mutations in the entire chromosome, the switching populations clearly outperformed the non-switching populations for mutation rates between $10^{-3}$ and $10^{-1}$. The situation is reversed in the presence of more than two mutations in the chromosome: non-switching populations win against switching populations. At high mutation rates (R = 1), non-switching populations have a strong quantitative advantage, even in the absence of a mutational pre-load. Only populations subjected to high fractions of mating type switching (≥ 25%) with a mutational pre-load of fewer than four essential gene mutations outperform non-switching populations at R = 1.

Supplementary Figure 7: Number of crossovers and ORFs in S. cerevisiae chromosomes
Crossover frequencies and ORF content of the 16 chromosomes of S. cerevisiae. Data was compiled from www.yeastgenome.org.

Supplementary Figure 8: Competition analysis of crossing over rates in yeast chromosomes
Average competitive advantage in the experiments shown in Figure 10B/C in Main Text. The direct competition of the S. cerevisiae chromosome IX crossover rate ($c_{IX}$) versus altered crossover rates (shown in red) reveals a global performance maximum at the S. cerevisiae chromosome IX rate. In this regime, S. cerevisiae chromosome IX also performs particularly well against random architectures (shown in blue). The SEM error bars are smaller than the data circles.

Supplementary Figure 9: FACS sorting of spores and dyads
(A) Spores were gated based on green autofluorescence of the cells and ultraviolet autofluorescence of the spore wall (R1, excitation/emission: 326/404 nm). Spores (R3) and dyads (R2) could be distinguished using shape parameters determined from side and forward scatter.
(B) To assess the sorting specificity, 500 spores and 500 dyads were sorted onto an agar plate in 50 groups of 10 and counted using a microscope equipped with a 20x magnification air lens. The dyad plate was contaminated with tetrads (0.2%), triads (3.2%) and single spores (0.7%), whereas the spore plate was effectively contamination-free.
(C) Examples of sorted spores and dyads grown on YP-galactose/raffinose (2% each) plates. Colony formation was scored using ImageJ (NIH, Bethesda) to count colonies in the plate images. A spore was considered to be viable when it formed a colony that could be clearly identified by the image analysis procedure.

Supplementary Figure 10: Germination efficiency and colony size distribution of FACS-sorted single spores
Histogram of the distribution of colony sizes of sorted single spores upon mutation accumulation for 33-36 generations in the conditional Msh2 mutator strain. Sorted spores that did not give rise to visible colonies were investigated with microscopy and categorized according to their approximate cell counts. Visible colonies were categorized according to their diameter relative to the average wild type colony size (wild type = colonies with no obvious growth defect).

Supplementary Figure 11: Mating success rates in the presence of mutational load
A simplified model that illustrates the increase in mating success due to essential gene clustering and MAT-linkage in the case of pure inbreeding and a two-gene configuration with a single recombination hotspot. The model considers a static scenario of maximum load (50% inactivated EGs). The genome undergoes a meiotic duplication and a meiotic recombination of two sister chromatids. The six possible pairings of the resulting four haploids are shown as schematic illustrations. Large red crosses indicate lethal combinations. All possible architectures (unclustered = hotspot between genes, clustered = non-separated genes), mutation distributions (on one chromosome or on homologue strands) and mating type configurations (with/without MAT) were considered. Clustered architectures provide higher survival rates than unclustered architectures. The presence of a MAT further increases the chance of obtaining a viable combination. No lethal combination is possible for a clustered MAT-linked architecture (100% mating success).
This scenario was chosen to highlight the correlation between clustering and the viability after mating. It does not reflect the complex processes of the dynamic life cycle, in which the load is also affected by the linkage relationship of the essential genes, as determined by their distribution and the distribution of meiotic recombination hotspots. In the dynamic scenario, populations with different chromosome architectures will accumulate different levels of load, which are specific to their architecture and the environmental conditions. This circumstance will affect the lethality caused by homozygotisation of lethal mutations and the associated purging of lethal load.
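The pairing enumeration of this simplified model can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the figure, and it rests on assumptions that the legend does not spell out: a diploid is taken to be viable unless both copies of an essential gene are inactivated; the recombination event is taken to be a single crossover at the hotspot between the two genes, involving one chromatid of each homologue; MAT is taken to be tightly linked to the first gene; and with MAT only haploids of opposite mating type can pair. All function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def spores(cis, clustered):
    """Return the four haploids as (mating_type, gene1_ok, gene2_ok) tuples.

    cis=True  : both mutations sit on the same homologue.
    cis=False : one mutation on each homologue.
    """
    if cis:
        hom_a, hom_b = (False, False), (True, True)
    else:
        hom_a, hom_b = (False, True), (True, False)
    # Duplication: two chromatids per homologue.
    # Assumption: mating type "a" rides on hom_a, "alpha" on hom_b.
    chromatids = [("a",) + hom_a, ("a",) + hom_a,
                  ("alpha",) + hom_b, ("alpha",) + hom_b]
    if not clustered:
        # Assumption: one crossover between gene 1 and gene 2,
        # exchanging gene 2 between one chromatid of each homologue.
        x, y = chromatids[1], chromatids[2]
        chromatids[1] = (x[0], x[1], y[2])
        chromatids[2] = (y[0], y[1], x[2])
    return chromatids


def viable(h1, h2):
    # Assumption: lethal only if an essential gene is inactive in both haploids.
    return (h1[1] or h2[1]) and (h1[2] or h2[2])


def mating_success(cis, clustered, with_mat):
    """Fraction of allowed haploid pairings that yield a viable diploid."""
    pairs = [(p, q) for p, q in combinations(spores(cis, clustered), 2)
             if not with_mat or p[0] != q[0]]
    return Fraction(sum(viable(p, q) for p, q in pairs), len(pairs))
```

Under these assumptions, `mating_success(cis=True, clustered=True, with_mat=True)` evaluates to 1, in line with the 100% mating success stated above for the clustered MAT-linked architecture.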

Supplementary Figure 12: Purging and survival rates in the evolution of clustering experiment
Average clustering scores (A), purging ratios (mitosis over meiosis, B), inbreeding success rates (C) and outbreeding success rates (D) as a function of simulated generations for the three types of evolution experiments discussed in Figure 8A in

Video S1: Maintenance of EG clustering at low and at high mutation rates
Maintenance of essential gene clustering in evolving inbreeding populations for two of the data points in Figure 7A in Main Text. The right panel corresponds to data point #1, while the one to the left corresponds to data point #2. The graphs at the top show the genomic element densities for the entire population. Chromosomes are vertically aligned. The color-coding reports the differences in density of EGs and recombination hotspots (sliding window analyses of individual genomes, window size is 20 elements). Histograms of the sliding window analyses are shown at the bottom. Initially, all EGs are located on one side of the chromosomes, while recombination hotspots form a cluster in the middle of the non-essential genes (at the other end of the chromosome). At high mutation rates, the EG clustering is maintained (left, movie shows a period of 100,000 generations). At low mutation rates, EG clustering is not maintained and the EGs become distributed in random patterns (right). Both simulations used identical values for all parameters except for R.
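The sliding window analysis mentioned above can be illustrated as follows. This is a sketch under stated assumptions rather than the analysis code behind the video: the single-letter encoding of chromosome elements ("E" essential gene, "N" non-essential gene, "H" hotspot, "C" coldspot) and the normalization by window length are hypothetical choices; only the window size of 20 elements is taken from the legend.

```python
def density_difference(chromosome, window=20):
    """Density of essential genes minus density of hotspots per window.

    `chromosome` is a sequence of element labels (hypothetical encoding:
    "E", "N", "H", "C"). One value is returned per window start position.
    """
    diffs = []
    for start in range(len(chromosome) - window + 1):
        win = chromosome[start:start + window]
        diffs.append((win.count("E") - win.count("H")) / window)
    return diffs


# Toy chromosome: EG cluster at one end, hotspot cluster at the other.
toy = "E" * 20 + "N" * 10 + "H" * 20
profile = density_difference(toy)
```

Positive values mark windows dominated by essential genes and negative values mark windows dominated by hotspots; in the toy example the profile runs from 1.0 at the first window to -1.0 at the last.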

Video S2: Evolution of EG clustering
Visualization of the evolution of clustered genome architectures as a function of time in a population representative for the high-R +MAT evolution experiment shown in Figure 8A in

Conservative estimate of essential gene clustering using whole-genome data from Mancera et al. (2008) [1]. Note that these data yield a clustering score for yeast chromosome IX that differs from the data provided by Pal and Hurst (2003) [12]. The data set used for the initial chromosome IX architecture is based on the hotspot distribution from Gerton et al. (2000) [5], whereas the data from Mancera et al. (2008) [1] used for the analyses underlying Table 1 report on the distribution of actual crossover events. Detailed information about the architectures used in our simulation experiments is provided in Text S1, Section E.

Table 2: The parameters and simulation modules of S. digitalis
The table lists all parameters, modules and simulation functionality in S. digitalis. The valid parameter space is indicated for each parameter. Some parameters are not accessible via the function interface and are therefore listed as "internal" parameters (see second part of the table).

S. cerevisiae chromosome IX building blocks (35 essential genes, 172 nonessential genes, 58 recombination hotspots, 148 recombination coldspots)
Note: random chromosomes contained the same number of elements as the yeast chromosome, but in a random distribution that was calculated independently for each simulation run.
Experimental settings for Figure 4C: Survival competition of S. cerevisiae chromosome IX vs. random architectures for a wide range of mutation rates and for different population sizes
population size: 150; 1,500; 15,000
crossover interference: active
mating type: active and linked to the centromere of a (virtual) chromosome; the centromere of the simulated chromosome is located at position 333, which is equivalent to the centromere position of chromosome IX
recombination rate: S. cerevisiae chromosome IX rate
inbreeding percentage: 0-100% in 25% steps
mutation rate R: 10^-4, 3x10^-4, 10^-3, 3x10^-3, 10^-2, 3x10^-2, 10^-1, 3x10^-1, 1
simulation time: max. 40,000 generations; the simulation stopped as soon as one species became extinct
total statistics: 50 experiments per mutation rate and inbreeding fraction
chromosome architecture: S. cerevisiae chromosome IX building blocks (35 essential genes, 172 nonessential genes, 58 recombination hotspots, 148 recombination coldspots)
Note: random chromosomes contained the same number of elements as the yeast chromosome, but in a random distribution that was calculated independently for each simulation run.
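The construction of a random architecture from the chromosome IX building blocks can be sketched as below. This is an illustrative assumption about the procedure, not the S. digitalis implementation: the elements are simply shuffled uniformly, whereas the simulator may impose additional constraints on the ordering that are not described here. The single-letter labels and the function name are hypothetical; only the element counts are taken from the settings above.

```python
import random
from collections import Counter

# Element counts of S. cerevisiae chromosome IX as listed in the settings.
# Labels are hypothetical: E = essential gene, N = nonessential gene,
# H = recombination hotspot, C = recombination coldspot.
BUILDING_BLOCKS = {"E": 35, "N": 172, "H": 58, "C": 148}


def random_architecture(rng):
    """Return a uniformly shuffled list with the chromosome IX element counts."""
    elements = [label for label, n in BUILDING_BLOCKS.items() for _ in range(n)]
    rng.shuffle(elements)
    return elements


# A fresh random generator state per simulation run gives an independent
# architecture each time; a fixed seed is used here only for reproducibility.
chromosome = random_architecture(random.Random(0))
```

Each call returns a chromosome of 413 elements with the same composition as the yeast chromosome, so that only the arrangement differs between runs.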
Experimental settings for Figure 4D: S. cerevisiae chromosomes were digitized using information about the distribution of approximately 4,000 crossovers in 50 meioses [1]. Essential gene locations were taken from www.yeastgenome.org. Note: random chromosomes contained the same number of elements as the yeast chromosome, but in a random distribution that was calculated independently for each simulation run.
Experimental settings for Figure 5A: Mutational robustness R_max of a clustered chromosome (seven clusters) as compared to random chromosomes with the same genetic building blocks
S. cerevisiae chromosome IX building blocks (35 essential genes, 172 nonessential genes, 58 recombination hotspots, 148 recombination coldspots)
Note: random chromosomes contained the same number of elements as the yeast chromosome, but in a random distribution that was calculated independently for each simulation run.
Experimental settings for Figure 10A: Mutational robustness at different recombination rates
population size: 100
crossover interference: active
mating type: active on the same chromosome
recombination rate: between 1 and 58 crossovers per chromosome and meiosis
inbreeding percentage: 0-100% in 20% steps (figure shows averaged results)

active on the same chromosome, location at the respective evolved position in the evolution products and at position 427 for S. cerevisiae chromosome X
recombination rate: S. cerevisiae chromosome X rate
inbreeding percentage: 0-100% in 25% steps
mutation rate R: 10^-3, 10^-2, 10^-1, 1
simulation time: max. 10,000 generations; the simulation stopped as soon as one species became extinct
total statistics: competitions were performed for n = 100 randomly chosen evolution products from each subset