Cause and consequences of genome duplication in haploid yeast populations

Whole genome duplications (WGD) represent important evolutionary events that shape future adaptation. WGDs are known to have occurred in the lineages leading to plants, fungi, and vertebrates. Changes to ploidy level impact the rate and spectrum of beneficial mutations and thus the rate of adaptation. Laboratory evolution experiments initiated with haploid Saccharomyces cerevisiae cultures repeatedly experience WGD. We report recurrent genome duplication in 46 haploid yeast populations evolved for 4,000 generations. We find that WGD confers a fitness advantage, and this immediate fitness gain is accompanied by a shift in genomic and phenotypic evolution. The presence of ploidy-enriched targets of selection and structural variants reveals that autodiploids utilize adaptive paths inaccessible to haploids. We find that autodiploids accumulate recessive deleterious mutations, indicating an increased capacity for neutral evolution. Finally, we report that WGD results in a reduced adaptation rate, indicating a trade-off between immediate fitness gains and long-term adaptability.


The natural life cycle of budding yeast alternates between haploid and diploid phases. Both ploidies can be stably propagated asexually through mitotic division. Both theory and experimental work show that haploids adapt faster than diploids, likely due to recessive beneficial mutations (Orr and Otto

Diploids have a greater tendency to form copy number variants (CNVs), especially large deletions (Zhang et al. 2013). Likewise, aneuploidies accumulate at a significantly higher rate in diploids in the absence of selection (N. Sharp, personal communication, Aug. 2017). If structural variants are more frequent, more variable, and more tolerable in diploids, genome duplication may enable access to novel adaptive paths. Given the repeated observation of displacement of haploids by diploids (Table 1), and the absence of clear evidence for instantaneous fitness advantages of isogenic diploidy that is broadly applicable across experiments, it is possible that selection for and maintenance of diploidy is a complex process involving both direct selection on ploidy state and second-order selection, or selection for indirect fitness benefits associated with higher ploidy.

Here we show recurrent WGD in 46 haploid-founded populations during 4,000 generations of laboratory evolution in rich media. We track the dynamics of genome duplication across the haploid-founded populations, revealing that autodiploids fix by generation 1,000 in all 46 populations.

Competitive fitness assays show that WGD provides a 3.6% fitness benefit in the selective environment. We find that the immediate fitness gain is accompanied by a loss of access to recessive beneficial mutations. As a consequence, the rate of adaptation of autodiploids slows. Sequencing of the evolved genomes indicates that autodiploids have increased access to structural variants and largely utilize a different spectrum of mutations to adapt compared to haploids. Finally, we show that autodiploids are buffered from the effects of recessive deleterious mutations, consistent with a long-term benefit to maintaining a diploid genome and loss of redundancy following WGD.

Kolmogorov-Smirnov test, α = 0.05) with a mean of 91 ± 20 (Fig. S2A)

Autodiploids are detected early, sweep quickly, and exhibit a fitness advantage

We determined the fitness effect of genome duplication by directly competing MATa/a autodiploids against an otherwise isogenic haploid MATa reference. To control for possible artifacts of construction, we independently constructed and competed 10 MATa/a diploids. All 10 MATa/a autodiploid reconstructions exhibit a relative fitness advantage significantly higher than a control haploid strain (Welch's t-test, t = 16.28, df = 19, p < 0.001). Genome duplication alone, in the absence of any other variation, provides a mean fitness benefit of 3.6% in these experimental conditions (Fig. 1A).
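For readers unfamiliar with the statistic, Welch's t-test compares two group means without assuming equal variances. The Python sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom from first principles. The fitness values and variable names are hypothetical placeholders chosen only to make the example run; they are not data from this study, and the sketch does not compute a p-value.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple

def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n).
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical relative fitness values (percent), for illustration only.
diploid_fitness = [3.1, 3.9, 3.4, 4.0, 3.6]
haploid_control = [0.2, -0.3, 0.1, 0.4, -0.1]
t_stat, dof = welch_t(diploid_fitness, haploid_control)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The degrees of freedom always fall between the smaller group's n - 1 and the pooled n1 + n2 - 2, which is why Welch's test can report non-integer values.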

To determine the timing of duplication events, we performed time-course DNA content staining on cryoarchived samples for 16 populations (8 of each mating-type). Autodiploids arise quickly in all 16 populations, fixing by generation 1,000 in all but 2 populations (Fig. 1B, Fig. S3, Fig. S4). Diploids are present at 2% to 11% in 11 of 16 populations at generation 60, the earliest time point available for assay.

We examined whether the degree of parallelism observed in ploidy dynamics can be attributed to ancestral ploidy polymorphisms present at the onset of the experiment. Three lines of evidence support the independent origin of autodiploidy in this experiment. First, the cultures were initiated from two starting strains (MATa and MATα). There is no significant difference in autodiploid frequency between mating-types at any generation (Fig. S3), meaning that if autodiploids did, in fact, arise in both independent inoculating cultures, they would have had to achieve roughly the same frequency, which is highly unlikely. Second, no diploids were detected by DNA content staining in any populations at generation 0, indicating autodiploids were not present in the inocula above our detection limit of 1%. Third, computational simulations show that low frequency autodiploids are insufficient to explain the recurrent observation of autodiploid fixation events in all 46 replicate populations. Autodiploids with a 3.6% fitness advantage starting at a frequency of 0.01, the highest frequency we modeled, have a probability of fixation in a given population of 0.88, and therefore the chance of fixation in all 46 populations would be 2.5 × 10⁻³ (Fig. S5). Taken together, this argues that, while ancestral autodiploids may have swept in some populations, ancestral ploidy variation is insufficient to explain autodiploid fixation in all 46 populations. Therefore independent, parallel WGD events during the evolution experiment are necessary to explain the recurrent fixation reported here.
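The arithmetic behind this argument can be checked directly. The short Python sketch below assumes that fixation events in the 46 populations are independent, so the joint probability is the per-population probability raised to the 46th power; the variable names are illustrative. With the rounded per-population value of 0.88 the product comes out near 2.8 × 10⁻³, the same order of magnitude as the reported 2.5 × 10⁻³. A small difference of this kind is what one would expect if the unrounded simulated probability were slightly below 0.88.

```python
# Probability that a rare ancestral autodiploid lineage fixes in every replicate,
# assuming the populations are independent (an assumption of this sketch).
p_fix_single = 0.88   # per-population fixation probability from the text (rounded)
n_populations = 46    # number of replicate populations

p_fix_all = p_fix_single ** n_populations
print(f"P(fixation in all {n_populations} populations) = {p_fix_all:.1e}")
```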

To examine how the shift to diploidy impacted the dynamics of adaptive evolution, we measured population fitness for all populations at ~300-generation intervals. Mean time-course fitness estimates show a change in slope following 1,000 generations. This corresponds roughly to the time that autodiploids have fixed in most focal populations and are at high frequency in the remaining populations (Fig. 1B). We compared the rate of adaptation before and after the fixation of diploids in 13 focal populations for which high-quality fitness data were available. Because many factors, including epistasis, could explain a change in adaptation rate over time, we used a repeated measures ANOVA to compare the effect of ploidy on adaptation rate using time-course fitness data from diploid-founded populations that were evolved in parallel (Marad and Lang 2017) (Fig. 1C). The interaction of founding ploidy and

genes whose protein products localize to the cell periphery (p = 0.001). Cell periphery targets include CCW12 and KRE6, which both appear to be under extremely strong selective pressure when using the probability metric as a proxy for strength of selection. Interestingly, a tRNA gene, tL(GAG)G, was also identified as a common target of selection (Fig. S6). This is the first evidence of adaptive tRNA mutations in laboratory yeast evolution.

To better understand the functional basis of adaptation, we examined the distribution of mutations within each gene (Fig. 2B). Three broad patterns emerge. First, we observe selection for loss-of-function alleles; for example, 9 of 11 mutations in WHI2 are high impact (frameshift or nonsense). Adaptive loss-of-function alleles are common in experimental microbial evolution (Cooper et al. 2001;

alleles. For example, only missense and synonymous mutations are seen in PDR5. Finally, we observe mutations in common targets that cluster within specific domains. This is illustrated by the clustering of mutations in the C-terminus of both KRE6 (n=21) and STE4 (n=6).

We compared the common targets of selection identified in autodiploid clones to those identified with the same approach in a comparable haploid dataset (Lang et al. 2013) (Fig. S7). We identify several haploid- and autodiploid-enriched targets (Fig. 2C)

mutations. We find that the homozygous mutations are not distributed randomly throughout the genome; instead, they tend to cluster in particular regions of the genome (Fig. 3). These clusters,

Essential genes are also present within two of the large deletions observed in autodiploids (Table 2).

To experimentally validate that recessive lethal mutations accumulate in autodiploids, we sporulated three MATa/a clones from three different populations and performed tetrad dissections. Clones A02a, B01a, and C03b were selected because they contain no identifiable aneuploidies that would complicate measures of spore viability. Out of 20 dissected tetrads (80 total spores) per clone, spore viability ranged from 4% to 66% in evolved autodiploid clones (Fig. 5B). Further, a substantial fraction of germinated spores developed morphologically small colonies relative to controls. We compared observed spore viability to expected viability based on the number of high impact mutations in genes annotated as essential. The only clone for which we observed four-spore viable tetrads, B01a, is also the only clone with no predicted recessive lethal mutations. Nonetheless, both A03a and B01a have significantly lower spore viability than expected (Fig. 5B). This may be due in part to a genetic load imposed by segregating deleterious alleles. Consistent with our sequencing data, these data indicate

Here, we show that experimental evolution of haploid Saccharomyces cerevisiae results in rapid and recurrent WGD. Clones with duplicated genomes arise early in all 46 populations and fix rapidly. We show that the fixation of autodiploids is due to a high rate of occurrence and a large fitness effect conferred by WGD.

Although the invasion and subsequent fixation of autodiploids in haploid-founded lineages has been reported before in yeast (see Table 1), a clear fitness advantage to diploidy has not always been

The recurrent and remarkably parallel manner in which autodiploids arise and fix points to not only a large fitness effect, but also a high rate of occurrence. Our previous work has shown that parallel

that some autodiploids were present in the founding inoculum, they are below our 1% detection limit. Autodiploids at such a low frequency in the inoculum are not sufficient to explain the extent of fixation

The same masking effect that stifles recessive beneficial mutations is also predicted to permit

To measure the fitness effect of autodiploidy, fitness assays were performed as described

The Nextera protocol was followed as described previously (

Zygosity was determined based on read depth and allele frequency (Fig. S2B). Mutations were classified as fixed if present in all clones from a population. Clones were genotyped for MAT alleles by identifying mating-type specific sequences within the demultiplexed FASTQ files.
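As a rough illustration of these two classification rules, the Python sketch below calls a mutation fixed when it appears in every sequenced clone of a population, and assigns zygosity from allele frequency alone. The 0.9 allele-frequency cutoff, the mutation labels, and the function names are hypothetical; the text states only that read depth and allele frequency were used, not which thresholds were applied.

```python
from typing import Dict, Set

def fixed_mutations(clone_calls: Dict[str, Set[str]]) -> Set[str]:
    """Return the mutations present in every clone of one population."""
    call_sets = list(clone_calls.values())
    if not call_sets:
        return set()
    return set.intersection(*call_sets)

def call_zygosity(allele_frequency: float, hom_cutoff: float = 0.9) -> str:
    """Label a variant in a diploid clone by its allele frequency.

    hom_cutoff is a hypothetical threshold, not a value taken from the text.
    """
    return "homozygous" if allele_frequency >= hom_cutoff else "heterozygous"

# Hypothetical variant calls for two clones from one population.
calls = {
    "clone_1": {"mutA", "mutB", "mutC"},
    "clone_2": {"mutA", "mutC", "mutD"},
}
fixed = fixed_mutations(calls)
print(sorted(fixed))
print(call_zygosity(0.48), call_zygosity(1.0))
```

In practice a read-depth filter would precede the zygosity call, since allele frequencies from low-coverage sites are unreliable.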

Clone genomes were each independently queried for structural variants. Following BWA alignment, coverage at each position across the genome was calculated. Aneuploidies were detected by calculating median chromosome coverage and dividing this by median genome-wide coverage for each chromosome, producing an approximate chromosome copy number relative to the duplicated genome (Fig. 4; Dataset 2). CNVs were detected by visual inspection of chromosome coverage plots created in R (Fig. S10; Dataset 3).
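The coverage-ratio calculation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the pipeline used in the study: the function name, the toy read depths, and the choice to pool all positions when taking the genome-wide median are assumptions. Multiplying the ratio by the baseline ploidy (2 for autodiploids) is one way to express the result as an approximate copy number.

```python
from statistics import median
from typing import Dict, List

def relative_copy_number(coverage: Dict[str, List[int]]) -> Dict[str, float]:
    """Median per-chromosome coverage divided by median genome-wide coverage."""
    # Pool every position to get the genome-wide median (an assumption here).
    all_positions = [depth for depths in coverage.values() for depth in depths]
    genome_median = median(all_positions)
    return {chrom: median(depths) / genome_median for chrom, depths in coverage.items()}

# Toy per-position read depths (hypothetical values, three short "chromosomes").
toy_coverage = {
    "chrI": [40, 42, 38, 41],
    "chrII": [39, 41, 40, 40],
    "chrIII": [60, 61, 59, 62],  # roughly 1.5x the depth of the others
}
baseline_ploidy = 2  # autodiploid baseline, inferred from DNA content staining
ratios = relative_copy_number(toy_coverage)
for chrom, ratio in ratios.items():
    print(chrom, round(ratio, 2), "approx. copies:", round(ratio * baseline_ploidy, 1))
```

A ratio near 1.0 indicates euploidy relative to the baseline, while a ratio near 1.5 in an autodiploid would suggest a third copy of that chromosome.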

Variants identified by SnpEff were used to infer a phylogeny based on 7,932 sites containing 4,742 variable sites, either SNPs or small indels (Fig. S8).

defined as genes with five or more CDS mutations and a corresponding probability of less than 0.1% (Fig. 2A). Notably, analysis using only nonsynonymous mutations identified largely the same set of common targets of selection as did analysis using all CDS mutations. To determine which targets of selection are impacted by ploidy, our recurrence approach was used to analyze mutations in a previously published MATa haploid dataset (Fig. S7) (Lang et al. 2013). We compared the probability of the observed number of CDS mutations in each gene between ploidies (Fig. 2C)

For each haploid-founded population, adaptation rate was calculated before and after autodiploid fixation, which occurred on average at generation 600. Adaptation rates for diploid-founded populations were calculated from Gen 0-600 and Gen 600-4000.

Fig. 4 Detection of Aneuploidies. For each sequenced sample, coverage across each chromosome was compared to genome-wide coverage. Based on DNA content staining, baseline ploidy was assumed to be 1N for haploids and 2N for autodiploids. Euploidy is indicated by empty circles (haploids, green; autodiploids, blue). Aneuploidies are shown as filled circles and labeled by clone.

(frameshift, nonsense) and low impact mutations (synonymous, intronic) in essential genes in haploids (green) and autodiploids (blue). Above each bar is the ratio of mutations in essential genes to mutations in all genes. B) Clones from three evolved diploid populations were sporulated and dissected. Spore viability and small colony size reflect recessive lethal and recessive deleterious mutations, respectively.