Differential paralog divergence modulates evolutionary outcomes in yeast

Evolutionary outcomes depend not only on the selective forces acting upon a species, but also on the genetic background. However, large timescales and uncertain historical selection pressures can make it difficult to discern such important background differences between species. Experimental evolution is one tool to compare the evolutionary potential of known genotypes in a controlled environment. Here we utilized a highly reproducible evolutionary adaptation in Saccharomyces cerevisiae to investigate whether other yeast species would adopt similar evolutionary trajectories. We evolved populations of S. cerevisiae, S. paradoxus, S. mikatae, S. uvarum, and interspecific hybrids between S. uvarum and S. cerevisiae for ~200-500 generations in sulfate-limited continuous culture. Wild-type S. cerevisiae cultures invariably amplify the high affinity sulfate transporter gene, SUL1. However, while amplification of the SUL1 locus was detected in S. paradoxus and S. mikatae populations, S. uvarum cultures instead selected for amplification of the paralog, SUL2. We measured the relative fitness of strains bearing deletions and amplifications of both SUL genes from different species, confirming that, in contrast to S. cerevisiae, S. uvarum SUL2 contributes more to fitness in sulfate limitation than S. uvarum SUL1. By measuring the fitness and gene expression of chimeric promoter-ORF constructs, we were able to attribute this differential fitness effect primarily to the promoter of S. uvarum SUL1. Our data show evidence of differential sub-functionalization among the sulfur transporters across Saccharomyces species through recent changes in noncoding sequence. Furthermore, these results show a clear example of how such background differences due to paralog divergence can drive significant changes in evolutionary trajectories of eukaryotes.

Understanding how organisms adapt to their environment is a fundamental goal of evolutionary biology. This goal has been complicated by the dependence on the reconstruction of historical events to make inferences about selective pressures and evolutionary mechanisms. Furthermore, it can be difficult to pinpoint genetic variation that causes new phenotypes of interest amid very divergent genomes. One approach to [...]

[...] greater selective effect in S. uvarum. To test this, we performed additional experiments to determine the functional contribution of each gene from both species.

From comparisons with the reconstructed ancestral genome [33], SUL2 appears to be the ancestral copy of the sulfur transporter, with SUL1 being a more recent gene duplicate after a small-scale duplication (SSD) event. SUL1 and SUL2 share 62.5% amino acid identity in S. cerevisiae and 61.3% in S. uvarum, whereas SUL1 from S. cerevisiae and SUL1 from S. uvarum share 84% identity and SUL2 from S. cerevisiae and SUL2 from S. uvarum share 87% identity, indicating that the sulfate transporter genes are correctly annotated.

To test whether the functions of these genes may have diverged between these species, we measured the fitness effects of having additional copies of each gene. Previous studies have shown that the addition of SUL1 on a low copy plasmid in S. cerevisiae increases the fitness of the strains by ~40% [26]. To determine the effect of additional copies of SUL1 and SUL2 from S. cerevisiae and S. uvarum, we transformed S. cerevisiae with ARS/CEN plasmids individually containing each SUL gene along with 500bp upstream of the coding region. Fitness assays were also attempted in the S. uvarum background; however, the S. cerevisiae CEN plasmid we used was lost at a high rate, as previously observed [19], precluding clear results. We performed chemostat competition experiments between strains harboring additional copies of each gene in S. cerevisiae.

The pairwise competitions provided fitness data that allowed us to more precisely [...] SuSUL2. The strain with the SuSUL1 gene had the lowest fitness effect of all genes tested (Fig 3). This result suggests that SUL2 may have maintained a similar function between the two species, but SUL1 function may have diverged. In support of our original hypothesis, the SUL2 gene from S. uvarum (SuSUL2) conferred a greater fitness effect than the S. uvarum SUL1 (SuSUL1). This is also consistent with our predictions based on the evolution experiments, suggesting that SuSUL2 amplification may have a greater selective benefit than amplification of SuSUL1.

In addition to testing the fitness effects of each SUL1 and SUL2 gene independently, we also investigated the amplification preference in the context of having all alleles present in one genome. Given the results from the single gene plasmid experiments above, we predicted that ScSUL1 would be the preferred allele for amplification. We created de novo S. cerevisiae/S. uvarum hybrid strains and subjected them to hundreds of generations of growth in sulfate-limited continuous culture. Evolved strains were then analyzed by aCGH to determine differences in genome content from their ancestral strains.

Amplification of segments containing the SuSUL1 or SuSUL2 gene was never observed in 16 clones from 8 independent populations, and SuSUL1 was even found deleted in one evolved clone, displaying loss of heterozygosity at this locus (S3 Fig). In contrast, the S. cerevisiae copy of SUL1 was found amplified in 14/16 evolved clones (Fig 4). Copy numbers estimated from the array CGH data ranged from 3 to as many as 20 copies of SUL1. Centromere-proximal breakpoints varied from population to population, but amplicons extended to the most distal telomeric probe in all cases. Additional rearrangements were rarely observed in these strains (S3 and S4 Figs).
When all four alleles are present in the same genome, ScSUL1 amplifications are preferentially recovered, suggesting that ScSUL1 amplification yields the greatest fitness advantage in this particular environment and genomic context.

Deletion of sulfate transporter genes displays differential effects between S. cerevisiae [...]

We have shown that the addition of extra copies of each gene results in an increased fitness in S. cerevisiae, with ScSUL1 yielding the greatest fitness increase, a result that corresponds to the amplification preferences in evolved strains derived from an interspecific hybrid. Unfortunately, we were not able to test the fitness effect of plasmids carrying additional copies of SUL1 vs. SUL2 directly in S. uvarum. However, we were able to delete SUL1 and SUL2 in both S. cerevisiae and S. uvarum backgrounds to determine the relative fitness contributions of these loci in each background. We created sul1Δ and sul2Δ haploid strains and measured the competitive fitness of each null mutant in sulfate-limited conditions. We competed the sul1Δ and sul2Δ strains within each species against each other to calculate the fitness effect of each mutant. In S. cerevisiae, the sul2Δ strain outcompeted the sul1Δ strain, suggesting that SUL1 in S. cerevisiae is the gene that is more important for growth in sulfate-limited conditions. Conversely, in S. uvarum, the sul1Δ strain outcompeted the sul2Δ strain, suggesting that SuSUL2 is instead the gene that is more important for growth in sulfate-limited conditions (Fig 5). Taken together with the fitness data from increasing the copy number of each gene, these data suggest differential SUL1 and SUL2 fitness contributions across these two species despite the genes' similarity in amino acid composition.

In order to determine where the divergence in relative fitness effects between SUL1 and SUL2 in S. cerevisiae and S. uvarum occurred in evolutionary history, we tested the fitness of SUL1 and SUL2 from S. paradoxus and S. mikatae, two other species of the sensu stricto clade, and SUL2 from Naumovozyma castellii, a more distant species that has not undergone gene duplication of this locus. We cloned the genes along with 500bp [...] when competed against a plasmid-free strain. This experiment allowed us to calculate the relative fitness coefficient of each strain. All strains showed significantly higher fitness than wild type S. cerevisiae, with the relative fitness coefficients ranging from 12.7% to 37.2%. The S. cerevisiae SUL1 (ScSUL1) plasmid conferred the greatest fitness benefit of 37.2% (Fig 6). SUL1 from S. paradoxus and S. mikatae conferred a greater fitness advantage than SUL2 from the respective species. The singleton SUL2 from N. castellii conferred a fitness advantage of 36.3% (Fig 6). These results suggest that the last common ancestor of S. cerevisiae, S. paradoxus, and S. mikatae may have acquired adaptive mutations in SUL1. Alternatively, the S. uvarum SUL1 paralog may have acquired mutations that decreased its fitness only in that lineage.

From these data, we can make predictions about the types of genomic events that would occur if we evolved S. paradoxus and S. mikatae under sulfate-limited conditions. Since [...] populations with an amplification containing the SUL1 locus in S. paradoxus and one population in S. mikatae (Fig 7). Other aneuploidy and segmental amplifications occurred in addition to the SUL1 locus amplification in the evolved populations (S5 and S6 Figs); however, none of these copy number variants included the SUL2 locus.

Overall, these data are consistent with the previous gene function measurements of each allele in S. cerevisiae, indicating that SUL1 is more adaptive when amplified in S. paradoxus and S. mikatae.
The species-specific relative fitness contributions among SUL genes are largely driven by promoter sequences.

Based on the similar results across S. cerevisiae, S. mikatae, and S. paradoxus, we [...] were transformed with the individual plasmids carrying chimeric SUL constructs and competed against a plasmid-free strain to calculate the relative fitness coefficient of each strain in sulfate-limited media. The non-chimeric alleles were also tested against a plasmid-free strain, with a total of 16 alleles tested. The fitness coefficient values ranged from 0.2 to 38% after correcting for the cost of [...] promoter, the SuSUL1 ORF now had a fitness advantage versus the SuSUL2 ORF, opposite to the result obtained when each ORF was driven by its native promoter. All [...] fitness. This result suggests that expression differences between the two species may largely explain the differential fitness effects of the two SUL1 genes. Interestingly, the [...] ORFs, with the ScSUL1 coding region yielding the highest relative fitness. However, when promoters of SuSUL1 or SuSUL2 were paired with the other three ORFs, we identified a different ranking of fitness patterns, with the SuSUL1 coding region yielding the highest fitness. We did not attempt to further dissect these apparent epistatic interactions between the promoters and coding regions; however, such complex genetic interactions have been observed in other contexts [35][36][37].

Each bar corresponds to the mean of 4 or more replicates ±SD.

Since the results from the chimeric constructs suggested that the promoter region is largely responsible for the differences in fitness, we sought to measure gene expression [...] cerevisiae strains grown at steady state in sulfate limitation. We found that the expression level of the ScSUL1 chimera with the promoter from SUL1 from S. uvarum (P_SuSUL1-ScSUL1) was significantly reduced in comparison to the other promoters (Fig 9A).
We also found a modest correlation between expression level and the fitness value of each construct (R² = 0.75) (Fig 9B). This result suggests that the differences between the fitness contributions of the two transporter genes may be due to gene expression differences.

In this work, we used comparative experimental evolution to investigate how paralog [...] the Saccharomyces clade. We identified differential sub-functionalization of gene duplicates between paralogs that encode sulfate transporter genes in S. cerevisiae and S. [...]

The other populations that did not amplify SUL1 or SUL2 may contain other events that may be equally or more beneficial than either amplification, or may require additional time for the amplification event to occur and rise to high frequency (>200 generations). [...] suggesting that amplification events are dynamic and may depend on longer time scales to occur and/or achieve high frequency. These findings also demonstrate that other means to achieve an increased fitness in sulfate limitation exist, since both S. paradoxus and S. mikatae are able to adapt to this condition without amplifying either of the SUL genes. Further work will be required to understand the genetic differences that mediate these other evolutionary trajectories.

Our results contribute to ongoing efforts to understand the mutations that drive adaptation, a long-standing question in evolutionary biology. There are examples of parallel molecular evolution that occur across genetic backgrounds for many traits [4,38-42], suggesting that genetic background plays a relatively unimportant role in determining the outcome of adaptation at the molecular level.
A more recent study, however, tested how genetic differences between strains of bacteria influence their adaptation to a common selection pressure and found that parallel evolution was more common within strains than between strains, implying that genetic background has a detectable impact on adaptation [43]. Taken together, it is unclear to what degree genetic background impacts the mechanism and rate of adaptation to a novel selection pressure.

Our study has identified differential locus parallelism between sulfate transporter loci in [...]

To further investigate the effect of genetic context and whether this was due to coding or non-coding variation, we generated chimeric alleles of promoter and coding regions between S. cerevisiae and S. uvarum SUL1 and SUL2 genes. We identified poor fitness [...] that are crucial for growth in sulfate limitation [34]. This same approach could be applied to the promoter region of SUL1 in S. uvarum to determine which sequences are responsible for these differences in activity.

In addition to metal exposure, nutrient limitation is also a likely scenario experienced by wild and industrial yeast strains. Growing evidence suggests that domesticated Saccharomyces species have been exposed to sulfate-related selective pressures through the selection for favorable characteristics associated with brewing. In lager brewing yeast, increased sulfite production is important for its antioxidant properties and for preserving favorable flavor profiles [53]. Saccharomyces pastorianus is a lager brewing species found only in the brewing environment and appears to be an allotetraploid hybrid [...]

The strains used in this study are listed in Table S1. The S. cerevisiae strains used in this [...] targeting approximately 175bp upstream of the ATG (Table S3).

To test the fitness due to the amplification of SUL1 or SUL2 from each species, we transformed DBY7283, a ura3 S. cerevisiae MATα strain, with a low-copy plasmid [58].

Phusion PCR was used to amplify 500bp upstream and 5bp downstream of the stop codon of SUL1 and SUL2 from S. cerevisiae, S. uvarum, S. paradoxus, and S. mikatae, and SUL2 from N. castellii. Each SUL1 and SUL2 gene was blunt cloned into pIL37 using primers listed in Table S3. All plasmids used in this study are listed in Table S2. The [...] competitive assay against strains also containing additional copies of each gene.

The chimeric plasmids were created by amplifying 500bp upstream of the start codons of the SUL1 and SUL2 ORFs from S. cerevisiae and S. uvarum and cloning each upstream region into YMD2307 using primers with added SnaBI sites at the 3' end (Table S3). Each plasmid was digested with SnaBI and SUL1 or SUL2 from S. cerevisiae or S. uvarum was ligated immediately adjacent to the previously cloned upstream region, creating a total of twelve different chimeric strains.

[...] were then rinsed with water for 5 minutes and spun dry.

Hybridization conditions were optimized to maximize specificity. DNA from S. [...] volume and temperature, and wash stringency conditions. As expected because of the 2-tier design strategy, less than 5% of probes (563/11263) showed evidence of cross-hybridization with signal significantly over background levels in both channels. These probes were filtered out of all hybrid datasets.

Data were linearly normalized and filtered for spots with intensity of at least 2 times over background in at least one channel. Manually flagged spots were also excluded. These filters were adequate to routinely filter out >95% of empty spots and retain >95% of hybridizing spots.

To determine if SUL1 would amplify in S. uvarum, a single colony from a sul2Δ strain was inoculated into sulfate-limited chemostat medium and inoculated from four independent clones into four ministat chambers as previously described.
Array CGH was performed on the four populations after 260 generations using the ancestral strain as the reference (sul2Δ deletion strain).

Yeast samples for real-time PCR analysis were collected directly from the culture vessels, when the cultures reached steady state (approximately 3 days at ~25 generations). The cells were filtered on Nylon membrane (0.45μm pore size) and immediately frozen in liquid nitrogen and stored at -80°C until RNA extraction.

The pairwise competition experiments were performed in ministats. Each competitor strain was cultured individually. Upon achieving steady state, the competitors were mixed in the indicated ratio. Each competition was conducted in two biological replicates for 15 generations after mixing. Samples were collected and analyzed three times daily.
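
This section does not state how fitness coefficients were derived from the competition time courses. A common approach for pairwise competitions, sketched here as an assumption rather than as the authors' method, is to regress the natural log of the ratio of the two competitors on elapsed generations and take the slope as the relative fitness coefficient per generation. The function name, its arguments, and the convention that the GFP-marked strain is the reference competitor are all hypothetical.

```python
import math


def relative_fitness(generations, gfp_fraction):
    """Estimate a per-generation relative fitness coefficient.

    Assumption (not from the source): the GFP-marked strain is the
    reference, so the tested strain's share is 1 - gfp_fraction.
    The coefficient is the ordinary least-squares slope of
    ln(tested / reference) against generations.
    """
    if len(generations) != len(gfp_fraction) or len(generations) < 2:
        raise ValueError("need two or more paired time points")

    # Log ratio of tested strain to GFP-marked reference at each time point.
    log_ratio = []
    for p in gfp_fraction:
        if not 0.0 < p < 1.0:
            raise ValueError("GFP fraction must lie strictly between 0 and 1")
        log_ratio.append(math.log((1.0 - p) / p))

    n = len(generations)
    mean_x = sum(generations) / n
    mean_y = sum(log_ratio) / n

    # Least-squares slope: covariance(x, y) / variance(x).
    sxx = sum((x - mean_x) ** 2 for x in generations)
    if sxx == 0:
        raise ValueError("generations must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(generations, log_ratio))
    return sxy / sxx
```

Under this convention, a positive slope means the tested strain is gaining on the GFP-marked reference, and multiplying the slope by 100 expresses it as a percentage per generation.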

The proportion of GFP+ cells in the population was detected using a BD Accuri C6 flow [...] using the NanoDrop. 90μg of RNA was cleaned up using the Qiagen RNeasy kit according to the manufacturer's instructions (Qiagen). Contaminating DNA was removed by using Rapid DNase out removal kit on 2μg of RNA in a 100μL reaction (Thermo).

Oligonucleotides for real-time PCR are listed in Table S3. One microgram of total RNA [...]

Genomic DNA libraries were prepared for Illumina sequencing using the Nextera sample [...]

Read data have been deposited at the NCBI under BioSample accessions: [...]

Copy-number variations (CNVs) were detected by averaging the per-nucleotide read depth data across 100bp windows. For each window, the log2 ratio in read depth between the evolved and parental strain was calculated. The copy number was calculated from the log2 ratios and plotted using the R package DNAcopy [62].

[...] Smukowski Heil for their helpful comments on the manuscript. We also thank Lory Koshland for contributing to the initial experimental design, creating yeast strains, and purchasing the oligonucleotides used for the microarrays. This work was funded by NSF [...] Research and a Rita Allen Foundation Scholar. MS was funded by NSF GSRF DGE- [...] Society for Microbiology. ABS was funded by F30CA165440 and IL by F32GM090561.

Funding was also from P50 GM071508 to the Lewis-Sigler Institute and from the Howard Hughes Medical Institute to Doug Koshland and Yixian Zheng.