Conceived and designed the experiments: DG DB MJD. Performed the experiments: DG CMT HTJ DAP AW CGD MJD. Analyzed the data: DG MMD MJD. Wrote the paper: DG MMD DB MJD.
The authors have declared that no competing interests exist.
The experimental evolution of laboratory populations of microbes provides an opportunity to observe the evolutionary dynamics of adaptation in real time. Until very recently, however, such studies have been limited by our inability to systematically find mutations in evolved organisms. We overcome this limitation by using a variety of DNA microarray-based techniques to characterize genetic changes—including point mutations, structural changes, and insertion variation—that resulted from the experimental adaptation of 24 haploid and diploid cultures of
Adaptive evolution is a central biological process that underlies diverse phenomena from the acquisition of antibiotic resistance by microbes to the evolution of niche specialization. Two unresolved questions regarding adaptive evolution are what types of genomic variation are associated with adaptation and how repeatable is the process. We evolved yeast populations for more than 200 generations in nutrient-limited chemostats. We find that the phenotype of adapted individuals, as measured using global gene expression, is much less variable in clones adapted to sulfate limitation than either glucose or phosphate limitation. We comprehensively analyzed the genomes of adapted clones and found that those adapted to sulfate limitation almost invariably carry amplifications of the gene encoding a sulfur transporter, but the mutations in individuals adapted to glucose and phosphate limitation are much more diverse. This parallelism holds true at the level of single-nucleotide mutations. Although there may be other paths to adapt to sulfate limitation, one path confers a much greater advantage than all others so it dominates. By contrast, there are a number of ways to adapt to glucose and phosphate limitation that confer similar advantages. We conclude that the reproducibility of evolution depends on the specific selective pressure experienced by the organism.
The study of organismal evolution at the molecular level is a potent means of understanding how genomes evolve in response to selective pressures. Most kinds of evolutionary analysis are necessarily retrospective: individuals are sampled from a population in the present, genetic variation is assessed, and inferences about the past action of evolutionary forces are drawn from the patterns of observed variation. Yet by their nature, retrospective analyses based on variation at a snapshot in time cannot directly address the dynamics of evolution. Experimental evolution of microbes provides an alternative to this retrospective approach: short generation times and the ease of maintaining sizable populations make it feasible to observe adaptation in real time. The study of experimental evolution of microbes in controlled laboratory environments has a long history, beginning with the demonstration by Luria and Delbrück
First, we do not know how many mutations we expect cells to accumulate in a given time in a given environment, nor what fraction of these mutations will be neutral or contribute to large or small increases in fitness, although a few recent studies in phage
Microarray-based genomic technologies provide tools to tackle some of these basic questions, by allowing us to systematically find most mutations genome-wide in evolved strains, and track their fates through the experiments
In this paper we begin to address the above questions by following evolutionary adaptation of cultures of the single-celled eukaryote,
Consistent with earlier work
These factors also determine the level of both genotypic and phenotypic variation between independent populations evolving in response to the same nutrient limitation. We found that the phenotype of adapted individuals, as measured using global gene expression, is much less variable in clones from cultures adapted to sulfate limitation than either glucose or phosphate limitation. This is also reflected in the genotypic diversity among cultures; sulfate adapted clones almost invariably carried amplifications of
We studied 24 prototrophic populations evolving in chemostats in defined media in one of three conditions: glucose limitation, sulfate limitation, or phosphate limitation (
Population Name | Limitation | Strain Background | Ploidy | Mating Type | Generations of evolution |
G1 | glucose | S288c | 1N | MATα | 182 |
G2 | glucose | S288c | 1N | MATa | 311 |
G3 | glucose | S288c | 2N | MATa/α | 238 |
G4 | glucose | S288c | 2N | MATa/α | 237 |
G5 | glucose | CEN.PK | 1N | MATa | 164 |
G6 | glucose | CEN.PK | 1N | MATα | 162 |
G7 | glucose | CEN.PK | 2N | MATa/α | 237 |
G8 | glucose | CEN.PK | 2N | MATa/α | 328 |
P1 | phosphate | S288c | 1N | MATα | 180 |
P2 | phosphate | S288c | 1N | MATa | 316 |
P3 | phosphate | S288c | 2N | MATa/α | 222 |
P4 | phosphate | S288c | 2N | MATa/α | 205 |
P5 | phosphate | CEN.PK | 1N | MATa | 161 |
P6 | phosphate | CEN.PK | 1N | MATα | 217 |
P7 | phosphate | CEN.PK | 2N | MATa/α | 225 |
P8 | phosphate | CEN.PK | 2N | MATa/α | 201 |
S1 | sulfur | S288c | 1N | MATa | 297 |
S2 | sulfur | S288c | 1N | MATa | 188 |
S3 | sulfur | S288c | 2N | MATa/α | 306 |
S4 | sulfur | S288c | 2N | MATa/α | 256 |
S5 | sulfur | CEN.PK | 1N | MATa | 297 |
S6 | sulfur | CEN.PK | 1N | MATα | 122 |
S7 | sulfur | CEN.PK | 2N | MATa/α | 303 |
S8 | sulfur | CEN.PK | 2N | MATa/α | 250 |
Generation zero is defined as the point at which chemostat flow was initiated.
A clone from this population, S6c1, has previously been reported in
We performed 24 independent experimental evolutions in chemostats under three different nutrient limitation regimes. Evolutions were performed using two different strain backgrounds that are amenable to long-term cultivation in chemostats. All strains were wildtype prototrophs.
In order to characterize the nature and diversity of the adaptive responses to nutrient limitation, we measured culture parameters and gene expression patterns in the final evolved populations and clones derived from these populations.
We detected changes in culture physiology over the course of the evolutionary experiments, consistent with improved fitness. Clones isolated from the evolved cultures and established in independent chemostats displayed physiological properties similar to those of the populations from which they were derived. Eighteen of the 24 cultures showed an increase in dry weight yield, averaging 11%, when compared to their respective ancestral cultures. Phosphate- and glucose-limited cultures tended to increase in dry weight more often than sulfate-limited ones, consistent with the smaller proportion of cell mass contributed by sulfate compared to phosphate or glucose
To investigate how adaptation was reflected in altered transcriptional programs, we determined global gene expression phenotypes of two evolved clones from each population (N = 48) and a subset of complete evolved populations (N = 15). RNA from evolved and ancestral strains grown in matched conditions in chemostats was co-hybridized to DNA microarrays (see
We performed two-dimensional hierarchical clustering of the resulting data matrix (
(A) Gene expression data, presented as the log2-transformed ratio of each gene's expression value in the evolved versus ancestral strain, were hierarchically clustered on both axes (y-axis, 5443 genes; x-axis, 48 clone and 15 population samples). The dendrogram for the clustered experiments (x-axis) is color-coded by nutrient limitation (sulfate-limitation in red, glucose-limitation in green, and phosphate-limitation in blue). Orange horizontal bars represent groupings where the two clones and their corresponding population sample are more correlated with each other than with any other experiments. Glucose expression states fall into three phenoclusters (Gluc1, Gluc2, Gluc3) while phosphate expression states fall into four (Phos1, Phos2, Phos3, Phos4). (B) Density estimates of the distribution of pairwise Pearson correlations (N = 112) of the expression states of clones selected under three different nutrient limitations. Clonal isolates from independent sulfate-limitation evolutions (red) were more similar to each other (median Pearson correlation = 0.425) than those obtained from independent glucose (green, median Pearson correlation = 0.152) or phosphate (blue, median Pearson correlation = 0.088) evolutions. The three distributions were compared using the Wilcoxon-Mann-Whitney rank-sum test. The distributions of pairwise correlations between sulfate and glucose clones are significantly different (U = 3097, p-value = 5.9×10^−11) as are the distributions between sulfate and phosphate clones (U = 2545, p-value = 1.54×10^−14). The distributions of pairwise correlations between phosphate and glucose clones are not significantly different (U = 7103, p-value = 0.08681).
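The rank-sum comparison named in panel (B) can be illustrated with a minimal sketch of the Mann-Whitney U statistic. The helper names and the example values are hypothetical and are not data from this study; the sketch computes only U (with average ranks for ties), not the p-value.

```python
from itertools import chain

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U_x, U_y); the two statistics always sum to len(x) * len(y)."""
    ranks = average_ranks(list(chain(x, y)))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    return u_x, len(x) * len(y) - u_x

# Hypothetical pairwise correlations for two limitation regimes.
sulfate_like = [0.45, 0.40, 0.52, 0.38]
glucose_like = [0.10, 0.20, 0.41, -0.05]
u_s, u_g = mann_whitney_u(sulfate_like, glucose_like)
```

With 112 pairwise correlations per regime, the two U statistics sum to 112 × 112 = 12544, so a reported U far from half that value indicates that one distribution tends to rank above the other.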
To quantify these observations, we calculated the distribution of pairwise Pearson correlations for all expression profiles. This metric, which ranges from −1 to 1, provides a measure of the similarity between phenotypic states. Positive values indicate similarity between pairs of expression states, negative values reveal divergent expression states, and values near zero indicate the absence of any relationship. Using this measure, we found substantial phenotypic homogeneity within the individual evolved populations. On average the two clones within a population are well correlated in expression (average pairwise correlation greater than 0.70;
Selection | Phosphate | Sulfur | Glucose |
Intrapopulation pairwise Pearson correlation between clones within populations | 0.84±0.17 (n = 8) | 0.70±0.17 (n = 8) | 0.70±0.19 (n = 8) |
Intrapopulation pairwise Pearson correlation between clones and population sample | 0.66±0.13 (n = 16) | 0.76 (n = 2) | 0.68±0.20 (n = 12) |
Interpopulation pairwise Pearson correlation between clones within limitations | 0.09±0.31 (n = 112) | 0.40±0.20 (n = 112) | 0.16±0.32 (n = 112) |
We computed pairwise Pearson correlations. Each value is the mean of all pairwise Pearson correlations ± one standard deviation; n indicates the number of pairwise correlations calculated.
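As an illustration of how the interpopulation comparisons in the table could be enumerated, the following sketch computes a Pearson correlation and counts clone pairs drawn from different populations. The function names and clone labels are hypothetical; only the design (8 populations per limitation, 2 clones each) comes from the text.

```python
from itertools import combinations
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def interpopulation_pairs(clones):
    """Keep only pairs of (population, clone) labels from different populations."""
    return [(c1, c2) for c1, c2 in combinations(clones, 2) if c1[0] != c2[0]]

# Eight populations with two clones each, as in one limitation regime.
clones = [(f"S{p}", f"c{c}") for p in range(1, 9) for c in (1, 2)]
# 16 clones give 120 pairs; removing the 8 within-population pairs leaves 112.
n_pairs = len(interpopulation_pairs(clones))
```

The count of 112 matches the n reported for the interpopulation row, which suggests that within-population pairs were excluded from that comparison.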
The distribution of the pairwise Pearson correlations can also be used to compare the diversity of the patterns of gene expression among the several populations that were evolved independently under the same nutrient limitation. Analysis of these distributions for the three different nutrient limitation regimes (
One motivation for undertaking experimental evolution studies is to try to understand the adaptive metabolic strategies available to yeast cells. Conceivably, there might be condition-independent efficiencies possible. We could expect to observe these as red or green stripes across all of
We performed gene ontology (GO) term enrichment analysis for clusters of genes with correlation coefficients greater than 0.7 that contained more than 25 genes using the program GOLEM
Glucose | Phosphate | Sulfur | |||||
Term | p-value | Nclass/Ngenome (%) | p-value | Nclass/Ngenome (%) | p-value | Nclass/Ngenome (%) | |
Cellular lipid catabolic process | 0.0056 | 7/23 (30.4) | |||||
Sulfur metabolic process | 1.39×10^−9 | 23/73 (31.5) | |||||
Response to toxin | 6.85×10^−8 | 20/65 (30.8) | |||||
Nitrogen metabolic process | 8.25×10^−7 | 42/280 (15.0) | |||||
Oxidation reduction | 0.00517 | 33/268 (12.3) | |||||
Carbohydrate metabolic process | 8.86×10^−9 | 47/261 (18.0) | |||||
Vitamin metabolic process | 0.00075 | 19/88 (21.6) | |||||
Cell wall organization/biogenesis | 0.00401 | 30/203 (14.8) | |||||
Transmembrane transporter activity | 0.00051 | 31/308 (10.1) | 0.00037 | 38/308 (12.3) | |||
Oxidoreductase activity | 0.00456 | 35/304 (11.5) | |||||
Hydrolyzing O-glycosyl compounds | 1.25×10^−5 | 14/40 (35.0) | |||||
Peroxisome | 0.00825 | 10/54 (18.5) | |||||
Integral to membrane | 0.00135 | 90/1065 (8.4) | 9.44×10^−6 | 107/1065 (10.0) | |||
Cell wall | 8.98×10^−13 | 32/103 (31.1) | |||||
Alcohol biosynthetic process | 5.91×10^−10 | 20/56 (35.7) | |||||
Cofactor metabolic process | 2.15×10^−8 | 36/184 (19.6) | |||||
Tricarboxylic acid cycle | 4.2×10^−8 | 14/29 (48.3) | 0.0071 | 9/29 (31.0) | |||
Nitrogen compound metabolic process | 8.3×10^−7 | 43/280 (15.4) | |||||
Oxidation reduction | 2.34×10^−6 | 41/268 (15.3) | |||||
Iron homeostasis | 0.0055 | 10/33 (30.3) | 1.47×10^−5 | 11/33 (33.3) | |||
Nucleotide metabolism | 0.00093 | 26/171 (15.2) | |||||
Vitamin metabolic process | 0.00447 | 17/88 (19.3) | |||||
Response to toxin | 0.0078 | 14/65 (21.5) | |||||
Generation of precursor metabolites and energy | 0.00381 | 38/313 (12.1) | 0.0048 | 36/313 (11.5) | |||
Amino acid and derivative metabolism | 0.00345537 | 11/33 (33.3) | |||||
Regulation of translation | 0.0058 | 32/358 (8.9) | |||||
Posttranscriptional regulation of expression | 0.00915 | 32/366 (8.7) | |||||
Oxidoreductase activity | 1.18×10^−8 | 48/304 (15.8) | |||||
Iron ion binding | 9.79×10^−5 | 22/115 (19.1) | 5.76×10^−7 | 21/115 (18.3) | |||
Transmembrane transporter activity | 0.00015 | 40/308 (13.0) | |||||
ATP-dependent helicase activity | 3.83×10^−6 | 18/71 (25.4) | |||||
Pyrophosphatase activity | 0.00119 | 37/321 (11.5) | |||||
Isocitrate dehydrogenase activity | 0.00775 | 4/5 (80) | |||||
Transition metal ion binding | 0.0038 | 43/519 (8.3) | |||||
Plasma membrane | 0.00012 | 40/312 (12.8) | |||||
Phosphopyruvate hydratase complex | 0.0071 | 4/5 (80) |
Genes whose summed expression changes across all experiments lay more than 1.5 SD above or below the mean were analyzed for statistically significant GO term enrichment by computing a p-value using the hypergeometric distribution (the background set comprised the 5443 genes measured in the microarray experiments). Nonredundant, significantly enriched GO terms (Bonferroni-corrected p<0.01) are listed. The number of genes annotated to each GO term that were more than 1.5 SD from the mean (Nclass) was divided by the number of genes in the genome annotated to that GO term (Ngenome) to give the fractional representation of each term (%).
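The hypergeometric enrichment test described in this legend can be sketched as follows. This is an illustration rather than the GOLEM implementation: the background size (5443) and the 23/73 counts are taken from the table, whereas the size of the selected gene list and the number of GO terms tested are hypothetical placeholders, so the resulting p-value is not expected to reproduce the tabulated one.

```python
from math import comb

def hypergeom_sf(k, background, annotated, selected):
    """P(X >= k) when `selected` genes are drawn without replacement from
    `background` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, selected)
    total = comb(background, selected)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(k, upper + 1)
    )
    return favourable / total

BACKGROUND = 5443      # genes measured on the arrays (from the text)
N_SELECTED = 500       # hypothetical size of the >1.5 SD gene list
N_TERMS_TESTED = 1000  # hypothetical number of GO terms tested

# Example: 23 of the 73 genes annotated to a term fall in the selected list.
p_raw = hypergeom_sf(23, BACKGROUND, 73, N_SELECTED)
# Bonferroni correction multiplies by the number of tests, capped at 1.
p_bonferroni = min(1.0, p_raw * N_TERMS_TESTED)
```

Exact integer binomial coefficients are used here to avoid floating-point underflow in the tail; a statistics library would normally be preferred for large-scale use.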
Most differentially expressed genes showed increases or decreases in expression in only a single selective regime. Notable in this regard is the strongly reduced expression of genes explicitly involved in fermentation (“alcohol biosynthetic process” in
In order to assess the coordination in gene expression changes more directly at the level of metabolic pathways, we analyzed gene expression data using the Saccharomyces Genome Database (SGD) Pathways Tools Omic Viewer (
To identify the range of genotypic responses to adaptation to nutrient-limited environments, we comprehensively characterized the genomes of evolved clones using several microarray-based methods that in combination identify the suite of structural, insertional and nucleotide variants in each clone
We analyzed structural variation in the genomes of evolved clones using microarray comparative genomic hybridization (CGH). We identified extensive structural variation in the genomes of clones recovered from the three nutrient limitation regimes (
(A) Amplified fragments that include the gene
We computed a running average of log2 ratios between evolved and ancestral genomes determined using CGH across 7 consecutive genes. Contiguous regions deviating from wildtype ploidy levels are colored red for amplifications and green for deletions. Regions that did not deviate from wildtype copy number are in gray. Centromeres of each of the 16 chromosomes are indicated by black dots.
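The smoothing described in this legend can be sketched as a centered moving average over 7 consecutive genes. The truncation of windows at chromosome ends and the ±0.5 classification threshold are assumptions made for illustration; the legend does not state how either was handled.

```python
def running_average(values, window=7):
    """Centered moving average; windows are truncated at the chromosome ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def classify(smoothed, threshold=0.5):
    """Label each gene position; the threshold is a hypothetical cutoff."""
    return [
        "amplified" if v > threshold else "deleted" if v < -threshold else "unchanged"
        for v in smoothed
    ]

# Hypothetical log2 ratios along a chromosome: a 9-gene duplication
# (log2 ratio of 1) flanked by unchanged genes.
log2_ratios = [0.0] * 5 + [1.0] * 9 + [0.0] * 5
smoothed = running_average(log2_ratios)
labels = classify(smoothed)
```

Smoothing over neighboring genes suppresses single-probe noise at the cost of blurring breakpoints by a few genes, which is why breakpoints are usually refined by other methods.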
Amplification of
We searched for identical sequence at these boundaries and found that the longest identical sequence match between the left and right breakpoints was only 7 bp for S2c1 and 6 bp for S4c1. Previously, we had identified minimal sequence homology (3bp) bounding a deletion of the gene
In all but one case, clones isolated from the same population shared breakpoints. Because independent events are unlikely to produce identical breakpoints by chance, mutations found in both clones most likely reflect a single initial event that then spread to high frequency in the population.
We detected a second class of prevalent structural variation in evolved clones, consisting of gross chromosomal events. Segmental changes in copy number were detected by CGH (
Certain regions of the genome appeared to be particularly susceptible to large-scale genomic events. For example, one copy of the right arm of chromosome XIV was deleted in three diploid sulfate- or glucose-limited cultures (
In addition to their association with chromosomal rearrangements, transposons are themselves a rich source of genomic variation with important roles in evolution
Two retrotransposition events occurred within genes. We identified a novel insertion site on chromosome IX in
To identify single-nucleotide changes (SNPs) accumulated over the course of the evolutions, we hybridized DNA from 10 evolved clones to overlapping tiling microarrays and used the SNPScanner algorithm to detect candidate mutations
We sequenced predicted SNPs using targeted Sanger sequencing reactions. In total we confirmed 34 mutation events in ten clonal isolates (
Population | Clone ID | SNPs | Intergenic | Genic | Synonymous | Missense | Nonsense | Alleles (% in final population±95% CI) |
G1 | G1c1 | 9 | 1 | 8 | 2 | 6 | 0 | |
Intergenic between |
||||||||
G1 | G1c2 | 1 | 1 | 0 | 0 | 0 | 0 | |
Intergenic between |
||||||||
G2 | G2c2 | 4 | 0 | 4 | 0 | 3 | 1 | |
G4 | G4c1 | 3 | 0 | 3 | 0 | 2 | 1 | |
P1 | P1c2 | 2 | 0 | 2 | 0 | 2 | 0 | |
P2 | P2c2 | 1 | 0 | 1 | 1 | 0 | 0 | |
P3 | P3c2 | 3 | 1 | 2 | 0 | 2 | 0 | |
Intergenic between |
||||||||
S1 | S1c1 | 3 | 0 | 3 | 2 | 0 | 1 | |
S2 | S2c1 | 3 | 0 | 3 | 0 | 1 | 2 | |
S4 | S4c1 | 3 | 2 | 1 | 0 | 1 | 0 | |
Intergenic between |
||||||||
Intergenic between |
||||||||
We predicted the presence of SNPs using the SNPScanner algorithm on tiling microarray data. In order to confirm the prediction and identify the sequence change we sequenced the locus using PCR and Sanger sequencing.
Of the 27 mutations that occur within genes, 22 (81%) result in nonsynonymous codon changes or truncating mutations. Notably, no mutations were found in genes encoding transporters or with obvious connection to nutrient import in the cell. No significant gene ontology (GO) term enrichment was found for the 23 different genes in which nonsynonymous SNPs or insertion events occurred. A number of mutated loci in glucose evolved clones have known roles in carbon metabolism:
We investigated the extent to which selection drove the observed genotypic changes and the dynamics by which adaptation occurred.
In order to measure the frequency of genomic rearrangements and to possibly detect structural variants not identified in the selected clones, we subjected DNA extracted from eleven population cultures to CGH. We detected measurable changes in segmental copy number in five population samples (
In order to assess the frequencies of point mutations, we developed a quantitative sequencing protocol (
To ascertain the history of each mutation through the experiment, we determined allele frequencies in archived population samples throughout the course of the evolution for the six populations in which we identified significant mutations (
We determined allele frequencies for SNPs identified at detectable frequencies in the final population sample using quantitative sequencing. (A) clone G1c1 (red,
Clone | Mutation | Selection Coefficients (coefficient±95% CI) | Generations to maximum frequency | Average final allele frequency | Estimated proportion at t = 0 | Subpopulation size at t = 0 |
1.126±0.058 | ||||||
1.076±0.029 | ||||||
1.074±0.021 | ||||||
1.101±0.023 | ||||||
1.067±0.020 | ||||||
181.5 | 26.2 | 2.06×10^−8 | 2000 | |||
1.045±0.013 | ||||||
1.063±0.360 |
||||||
276.7 | 20.9 | 4.90×10^−7 | 50000 | |||
1.025±0.098 | 188.5 | 15.55 | 1.75×10^−3 | 1.75×10^8 | ||
1.069±0.0107 | ||||||
1.068±0.0130 | ||||||
191.5 | 57.9 | 2.87×10^−6 | 2.87×10^5 |
not statistically significant.
We determined allele frequencies during the populations’ histories. In order to determine the maximum fitness advantage attributable to the identified mutations we identified the generation at which the allele frequency was the greatest and determined fitness coefficients based on the rate of allele frequency increase to that point. Assuming the fitness benefit has been constant over the evolution experiment we inferred the size of the subpopulation at the commencement of the evolution experiment.
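The back-extrapolation described in this legend can be sketched under the assumption of constant selection, in which ln(p/q) changes linearly with generations at a slope equal to the selection coefficient minus one. The function names are hypothetical, and the population size of 10^11 is an assumption inferred from the ratio of the last two columns of the table rather than a value stated in the text.

```python
from math import exp, log

def logit(p):
    """Log odds of an allele frequency p (0 < p < 1)."""
    return log(p / (1.0 - p))

def initial_frequency(p_t, relative_fitness, generations):
    """Extrapolate the allele frequency back to generation zero, assuming
    ln(p/q) rose linearly at slope s = relative_fitness - 1 per generation."""
    s = relative_fitness - 1.0
    logit_0 = logit(p_t) - s * generations
    return 1.0 / (1.0 + exp(-logit_0))

# Row of the table with coefficient 1.025, 188.5 generations to maximum
# frequency, and a final allele frequency of 15.55%.
p0 = initial_frequency(0.1555, 1.025, 188.5)

POPULATION_SIZE = 1e11  # assumption inferred from the table, see lead-in
founders = p0 * POPULATION_SIZE
```

For this row the sketch returns an initial proportion on the order of 10^−3, close to the tabulated 1.75×10^−3; the small difference is consistent with rounding of the selection coefficient.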
While most allele frequencies increased monotonically, we observed three anomalous alleles. A mutation in
We sought to directly verify the selective advantage conferred by mutations in evolved clones. To do so, we performed fitness assays for two informative cases by competing mutant strains against the ancestral strain (see
For strains containing multiple significant point mutations, we sought to determine which mutations are adaptive, which are neutral hitchhikers, and which may interact epistatically. Competition among segregants derived from a backcross breaks the whole genome linkage imposed by asexual propagation and allows for comparison of all combinations of alleles. This results in a mixed population: if there are only two unlinked mutations of interest, the probability that a segregant carries exactly one of them is 0.5, and the probabilities that it carries both or neither are 0.25 each. We backcrossed P1c2, in which we had identified mutations in
Given the inferred selection coefficients and the observed frequencies of mutations over the course of the evolutions, we can estimate the time at which adaptive mutations must have occurred. Specifically, the frequency of a mutation at time t, p(t), increases as
Based on our measurements of the selective advantages of mutant clones, which except in sulfate limitation ranged from 5–12%, and the times at which they reached detectable frequencies, we estimate that the mutations must have been present in 10^2 to 10^6 cells (depending on the specific mutation) at the initiation of chemostat flow (
In these calculations we have assumed that the fitness advantage of a mutant allele is constant over the course of the evolution. Yet we have calculated this fitness advantage on the basis of allele frequency data from when the mutant is relatively common. We cannot rule out the possibility that the mutations have frequency-dependent effects, such that the fitness advantage of a mutant is greater when it is rare than when it is common. It is also possible that the effective fitness advantage of a mutant declines as that mutant becomes more common because other beneficial mutants in competing backgrounds are also becoming more common. If this were the case, then mutations may have occurred later in the experiments than we have estimated. More sensitive allele frequency measurements will help resolve these questions.
Despite the importance of evolutionary ideas in every aspect of biology, there have been relatively few direct experimental data describing the processes and mechanisms that underlie evolution. Only recently, through rapidly advancing genome technology, has it become practical to study directly the genetic basis of evolutionary change in an experimental setting. In this study, we analyzed experimental evolution in chemostats with DNA microarray technology, to assess genome-wide variation in gene expression and DNA copy number, and with a practical and affordable method for detecting single-nucleotide changes relative to the sequences of our starting yeast strains. We used these tools to begin to understand the phenotypic and genetic changes characteristic of the evolution of yeast in response to consistent glucose, sulfate, and phosphate limitation in the chemostat.
The main findings of our study concern the nature, identity and dynamics of the mutations that occur over the course of these evolution experiments. These mutations confer a selective advantage ranging from ∼5% to as much as 50% per generation. One prevalent class of mutations consists of massive structural genomic alterations, consistent with our earlier observations
The majority of structural variation, however, is found in other regions of the genome. The reasons for the selective advantage of these variants are less clear, but their repeated observation points to their adaptive value. Interestingly, a recently reported mutation accumulation experiment in yeast also revealed significant aneuploidy
Whole genome resequencing using tiling microarrays and Sanger sequencing revealed that only a small number of point mutations accumulate during these experiments. We estimate that we have found >85% of these mutations, and as new technologies develop, we hope to eventually detect all mutations, including those in difficult repetitive sequences. The number of acquired mutations we observe is consistent with other microbial experimental evolution studies that have attempted to comprehensively identify mutations. Using a combination of microarray- and mass spectrometry-based sequencing, Herring et al.
The relative merits of haploid and diploid states with respect to adaptation have been hotly contested
We inferred that the batch phase of growth has a large effect on the parallelism of evolutionary paths in our experiments. During the initial batch phase of growth, the population size doubles every generation, which tightly constrains the time at which mutations occur and means that beneficial mutations are very unlikely to be lost by genetic drift. For example, if there is a class of mutations with a total mutation rate such that on average one such mutation will typically occur after 13 generations, such a mutation will almost always occur sometime between generation 10 and 16. However, when a mutation occurs later than average, it will be present at much lower frequency at the end of batch phase, and hence take substantially longer to spread through the population. The length of this delay depends dramatically on the fitness effect of the mutation: a mutation providing a 50% fitness advantage which occurs 3 generations later than average in batch phase will take 3 extra generations to reach a population frequency of 5%; a mutation of 10% advantage will take 20 extra generations, and a mutation of 1% advantage will take an extra 200 generations.
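The delay figures quoted above can be approximated with a short calculation. This sketch assumes a simple model that is not spelled out in the text: a mutation arising d doublings later in batch phase starts at a 2^d-fold lower frequency, and while it is rare its log frequency relative to the ancestor rises by s per generation, so the extra time needed is d·ln(2)/s. The function name is hypothetical.

```python
from math import log

def extra_generations(doublings_late, s):
    """Extra generations needed to reach a given frequency when a mutation
    with selective advantage s arises `doublings_late` doublings late."""
    return doublings_late * log(2) / s

# The three cases discussed in the text: 50%, 10% and 1% advantage,
# each arising 3 generations later than average during batch phase.
delays = {s: extra_generations(3, s) for s in (0.5, 0.1, 0.01)}
```

Under this approximation the 10% and 1% cases give roughly 20 and 200 extra generations, in line with the text, while the 50% case gives about 4 rather than 3; the quoted figures are presumably rounded or derived from a slightly different model.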
It follows that if a beneficial mutation that provides a large fitness advantage (of order 50%) occurs at a high enough rate to happen during the batch phase, it will almost always reach a substantial frequency within the population by the end of our experiment. On the other hand, a beneficial mutation of small fitness effect (of order 1%) will typically not do so unless it happens to occur very early in the batch phase (and even in this case if a larger-effect mutation occurs much later, the larger-effect mutation can reach high frequency more quickly). A mutation with effect of order 10% is an intermediate case; it will only reach substantial frequency within the population by the end of our experiment if it occurs early enough in batch phase. However, if mutations of roughly this effect occur at a rate of 10^−7 or more, at least one such mutation will almost always occur early enough in batch phase to be observed at substantial frequency by the end of our experiment. In this case, the mutation that happens to occur first will typically be the one we observe. In other words, if beneficial mutations of large effect (of order 50%) are sufficiently common that they occur during batch phase, they will almost always occur and take over regardless of what smaller-effect mutations are already present. On the other hand, if large-effect mutations are rare, then the first beneficial mutation of intermediate effect (of order 10%) will typically dominate, because later mutations (even of slightly larger fitness effect) will be at large initial numerical disadvantage.
These dynamics appear to explain the extent of parallelism we observe between populations. In the sulfate-limited evolution, there is a class of
Our observations point to an important principle for adaptive phenomena in natural populations and disease: the diversity of adaptive outcomes will vary as a function of the distribution of fitness effects of beneficial mutations, which differs dramatically depending on the selective pressure. If there is a single “solution” that confers a vastly greater selective advantage, that path will be repeatedly observed. Conversely, a diversity of equally beneficial “solutions” will result in a reduction in the reproducibility of adaptation. An illustrative example of this principle is the recent report of selection for resistance of lung cancer cells to gefitinib or erlotinib using longterm culturing of cells in the presence of the drugs
Our experiments have identified the outcomes of adaptation to defined environments and revealed the diversity of genomic variation in clonal representatives of adapted populations. Our findings suggest a number of questions that should be addressed. First, it is critical to determine a neutral mutation rate for genome amplifications, deletions and rearrangements as well as the neutral mutation rate for retrotransposition. A recent report has provided new insights into these rates, indicating that large genome events occur at much greater frequencies than nucleotide changes
Detailed protocols can be found at
Continuous cultures were established using published methods
Experiments were started by growing cultures in 300 mL of the appropriate defined media in batch phase. Once the cultures reached saturation, chemostat flow was initiated. Cultures were grown at a dilution rate of 0.17 volumes/hour. Daily samples were taken from the overflow in order to determine optical density at 600 nm, cell count and viability; perform microscopy; and make archival glycerol stocks. We confirmed that all evolved haploid clones maintained the same mating type as the founder by backcrossing the evolved strain to the isogenic ancestral strain of the opposite mating type. Clones from three of the twelve evolved diploid populations exhibited reduced sporulation efficiency, but did not mate inappropriately.
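The dilution rate can be converted into generations with a short calculation. This sketch assumes steady-state chemostat growth, where the specific growth rate equals the dilution rate, and assumes that one generation is counted as one culture doubling time (ln 2 divided by the dilution rate); the function names are hypothetical.

```python
from math import log

DILUTION_RATE = 0.17  # volumes per hour, from the text

def doubling_time_hours(dilution_rate):
    """Culture doubling time at steady state, where growth rate = dilution rate."""
    return log(2) / dilution_rate

def generations_elapsed(hours, dilution_rate):
    """Number of doublings completed in `hours` of steady-state growth."""
    return hours * dilution_rate / log(2)

doubling_time = doubling_time_hours(DILUTION_RATE)
generations_per_day = generations_elapsed(24, DILUTION_RATE)
```

Under these assumptions the doubling time is about 4.1 hours, or just under 6 generations per day, so 200 generations correspond to roughly five weeks of continuous culture.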
Chemostat samples were harvested by fast filtration and frozen immediately in liquid nitrogen. Gene expression differences can be caused by differences in strain background, mating type, ploidy, and nutrient limitation in cultures
To obtain a measure of experimental variation in cultivation and microarray procedures we established two independent cultures under identical conditions. We co-hybridized labeled mRNA from both cultures to an expression microarray and analyzed the distribution of the ratios at each microarray feature. The mean of the normally distributed data was equal to zero with a standard deviation of 0.19 log2 units. We defined a threshold of three standard deviations, corresponding to a 1.5-fold expression change (i.e. ±0.585 log2 units), for significant gene expression changes, consistent with previous reports for chemostat experiments
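The relationship between the two thresholds mentioned above can be checked directly. The sketch below is illustrative; the function name is hypothetical, and only the standard deviation (0.19 log2 units) and the 1.5-fold cutoff come from the text.

```python
from math import log2

CONTROL_SD = 0.19  # log2 units, from the self-versus-self control

threshold_3sd = 3 * CONTROL_SD   # three standard deviations: 0.57 log2 units
threshold_fold = log2(1.5)       # 1.5-fold change: about 0.585 log2 units

def is_significant(log2_ratio, threshold=threshold_fold):
    """Flag a gene whose expression change meets or exceeds the cutoff
    in either direction."""
    return abs(log2_ratio) >= threshold
```

Three standard deviations (0.57) is slightly below the 1.5-fold cutoff (0.585), so applying the 1.5-fold threshold is marginally more conservative than a strict 3 SD rule.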
CGH in
PFGE and gel band DNA extraction were performed as described
Transposon Specific Extraction (TSE) was performed as described
Affymetrix Yeast Tiling Array 1.0R arrays were hybridized with total genomic DNA and analyzed as previously described
The raw data files (.CEL) are available at
Clonal competition assays were performed using two different drug resistant markers. For testing the fitness of the haploid clone G1c1 a spontaneous canavanine drug-resistant mutant (CanR) was selected. Two 300 mL chemostats were inoculated with either the evolved strain marked with CanR or the ancestral strain, which is sensitive to canavanine (CanS). Cultures were brought to steady-state conditions over a period of ∼10 generations. 15 mL from the chemostat containing the ancestral strain was removed and replaced with 15 mL from the chemostat containing the CanR marked clone, corresponding to an initial population mix of 5% evolved clone and 95% ancestral clone. We sampled the chemostat an average of every 3 generations for approximately 50 generations. Cells were sonicated, diluted and plated on rich nonselective media and grown for 2 days at 30°C. We counted >200 colony forming units using sterile methods. Cells were then replica-plated to synthetic complete minus arginine media containing 60 mg/L canavanine and allowed to grow at 30°C for 3 days. CanR cells were identified as fully formed colonies.
To test the fitness increase due to amplified copies of
As a control, we marked the ancestral wild-type strain with CanR and competed it against the isogenic CanS ancestor. This analysis revealed a slight relative fitness advantage of 1.015 for CanR in glucose-limited chemostats. Given that this selective advantage is an order of magnitude less than that determined for the evolved clone, we considered this contribution to fitness to be negligible.
We calculated the proportion of evolved cells (p) in the population and determined the proportion of ancestral cells q = 1−p. Fitness coefficients were computed by regressing ln(p/q) against the number of generations as described
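The regression of ln(p/q) on generations can be sketched as an ordinary least-squares slope. This is a minimal illustration, not the authors' code: the function name and the synthetic data are assumptions, and the 5% starting proportion is borrowed from the competition assay described above.

```python
import math

def fitness_coefficient(generations, p_values):
    """Least-squares slope of ln(p/q) against generations, with q = 1 - p."""
    ys = [math.log(p / (1.0 - p)) for p in p_values]
    n = len(ys)
    mean_x = sum(generations) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in generations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(generations, ys))
    return sxy / sxx

# Synthetic example (hypothetical data): an evolved clone starting at 5% of the
# population and increasing with a known slope of 0.1 per generation.
true_s = 0.1
intercept = math.log(0.05 / 0.95)
gens = [0, 3, 6, 9, 12, 15]
props = [1.0 / (1.0 + math.exp(-(intercept + true_s * g))) for g in gens]
estimated_s = fitness_coefficient(gens, props)
```

On noise-free synthetic data the slope recovers the value used to generate it; with real colony counts the estimate also carries sampling error, which this sketch does not quantify.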
We used quantitative sequencing to determine allele frequencies for all single nucleotide polymorphisms. Following confirmation of mutations identified using SNPScanner, we PCR amplified these genomic segments directly from the frozen glycerol stocks of heterogeneous populations. PCR products were cleaned up using Qiagen PCR cleanup kits and then sequenced in two different reactions using the forward and reverse primer. Control reactions using pure ancestral and evolved clones were also performed.
The relative allele frequency was determined from the resulting sequence traces using the program PeakPicker
We developed TaqMan allelic discrimination assays for a subset of single nucleotide polymorphism alleles. Custom TaqMan SNP Genotyping Assay probe/primer sets, each consisting of allele-specific FAM and VIC probes and common primers, were obtained from Applied Biosystems. Sequences for primers and probes are available upon request. To facilitate high-throughput analysis, we developed a protocol for directly genotyping whole cells from frozen glycerol stocks. We diluted glycerol stocks 1∶5 in water and then added 2.25 µL to 2.50 µL of 2X TaqMan Universal PCR Mix (Applied Biosystems) and 0.25 µL of 20x Primer/Probe mix to a final volume of 5 µL. Otherwise, we followed standard procedures for allelic discrimination plate reads as suggested by the manufacturer (Applied Biosystems).
We used allele frequency estimates from the evolution experiments to estimate the fitness coefficients for each mutation and each clone. For each mutation we used data from the first non-zero allele frequency until the allele frequency reached a maximum. This is not necessarily the last measured point, as some alleles appeared to stabilize or even decrease after initially increasing. Because those later points are excluded, our computed values are maximal estimates of the fitness coefficient. Fitness coefficients were calculated by regressing ln(p/q) against the number of generations, as described for the fitness assays above. We also computed a fitness coefficient for each clone by treating all allele frequency measurements at each time-point as independent measurements of a given clone's frequency.
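The windowing rule (first non-zero frequency through the maximum) can be sketched as follows. The function names and the example trajectory are hypothetical. Dropping any point with a frequency of exactly 0 or 1 is an added assumption, needed only because ln(p/q) is undefined there.

```python
import math

def rising_window(freqs):
    """Indices of the first non-zero frequency and of the (first) maximum after it."""
    start = next(i for i, f in enumerate(freqs) if f > 0)
    peak = max(range(start, len(freqs)), key=lambda i: freqs[i])
    return start, peak

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def max_fitness_estimate(generations, freqs):
    """Regress ln(p/q) on generations using only the rising part of the trajectory."""
    start, peak = rising_window(freqs)
    pts = [(g, math.log(f / (1.0 - f)))
           for g, f in zip(generations[start:peak + 1], freqs[start:peak + 1])
           if 0.0 < f < 1.0]  # assumption: skip points where ln(p/q) is undefined
    return slope([g for g, _ in pts], [y for _, y in pts])

# Hypothetical trajectory: the allele rises, peaks at generation 50, then declines.
gens = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [0.0, 0.0, 0.02, 0.10, 0.40, 0.80, 0.75, 0.70]
window = rising_window(freqs)
s_max = max_fitness_estimate(gens, freqs)
```

In this example the two declining points after generation 50 are ignored, so the slope reflects only the period of increase.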
Mutation rates were determined by fluctuation test
Pairwise Pearson correlations were computed between the 16 clonal isolates within each selection. We excluded comparisons between the two clones derived from the same population and analyzed the resulting 112 comparisons. Probability density estimates were computed in R. To compare the three distributions we performed a Mann-Whitney rank-sum test.
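The count of 112 follows from 16 clones giving 120 possible pairs, minus the 8 same-population pairs. A minimal sketch of the pair enumeration and correlation step is shown below. It assumes 8 populations with 2 clones each, uses (population, clone) tuples as illustrative labels, and runs on made-up expression profiles; the density estimation and Mann-Whitney test are not reproduced here.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def between_population_pairs(clones):
    """All clone pairs except those drawn from the same population."""
    return [(a, b) for a, b in combinations(clones, 2) if a[0] != b[0]]

def between_population_correlations(profiles):
    """Pearson correlation for every between-population pair of clones."""
    pairs = between_population_pairs(sorted(profiles))
    return [pearson(profiles[a], profiles[b]) for a, b in pairs]

# Assumed layout: 8 populations x 2 clones, labeled (population, clone).
clones = [(pop, clone) for pop in range(1, 9) for clone in (1, 2)]
pairs = between_population_pairs(clones)

# Made-up profiles (not real data): each is linear in the feature index,
# so every pairwise correlation equals 1.
profiles = {(pop, c): [pop * i + c for i in range(5)] for pop, c in clones}
correlations = between_population_correlations(profiles)
```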
Microarray analysis of predicted translocation PFGE bands. (A) Sample gel showing new bands in both clones from population G8. Chromosome ladder from wt strain shown at right. (B) Microarray analysis of all gel bands for predicted translocations (see
(0.54 MB PDF)
Identification of a long terminal repeat (LTR) insertion in
(0.12 MB PDF)
CGH analysis of population samples to determine allele frequencies. We performed CGH on DNA samples derived from population samples harvested from the endpoint or near the endpoint of the evolutions. Copy number variants present at detectable frequencies in the population are indicated in red or green. Only population samples with detectable copy number changes are shown. Calculated frequencies: G6, 2 copies of chromosome 12 (24%); P3, 1 copy of chrIII segment (20%), 3 copies of chrV segment (47%); P5, 2 copies of chrVI segment (17%), 2 copies of chrXIII (17%); P6, 2 copies chrV segment (23%), 2 copies chrVI segment (23%); P7, 3 copies of chrIV (69%), 3 copies of chrVI (77%), 3 copies of chrX (75%), 3 copies of chrXVI (73%), 4 copies of chrXIII (66%).
(0.41 MB PDF)
Representative results of quantitative sequencing. We estimated allele frequencies by analyzing the electropherogram data using the program PeakPicker
(3.36 MB PDF)
Representative results of TaqMan allelic discrimination. We analyzed a subset of allele frequencies using TaqMan allelic discrimination assays. Custom probes and primer sets were manufactured for each allele. 96 samples were analyzed in quadruplicate using an ABI 9700T plate reader. In each plate we included a no-template control (green asterisk), allele control (pink triangle) and evolved allele control (gray plus signs). We used a custom k-medians clustering algorithm to assign genotypes to the ancestral (blue circles) or evolved (red circles) state.
(0.10 MB PDF)
Comparison of SNP allele frequency estimations using quantitative sequencing and TaqMan allelic discrimination. We compared allele frequency estimates using quantitative sequencing (red) with those obtained by genotyping clonal isolates using TaqMan allelic discrimination analysis (green) for two alleles: (A)
(0.28 MB PDF)
Clonal competition assays reveal fitness benefit per culture generation. We competed clonal isolates against the ancestral strain as described (see
(0.25 MB PDF)
Meiotic separation of alleles identifies advantageous allele. Following a backcross of evolved clone P1c2 to the ancestral strain meiotic segregants were isolated and competed against one another in a chemostat. We performed two experiments: (A) one in which only the MATa segregants were included and (B) one in which only MATα segregants were included. In both experiments the
(0.22 MB PDF)
Complete GO term enrichment results for clusters 1-13 summarized in
(0.21 MB XLS)
Summary of structural variation in genomes of evolved clones.
(0.09 MB DOC)
Segregation analysis of
(0.04 MB DOC)
Transposon specific extraction results.
(0.04 MB DOC)
Processed gene expression microarray data for all clones and populations.
(7.69 MB XLS)
Processed CGH data for all clones and populations.
(4.12 MB XLS)
We thank Samuel Leachman for TaqMan results.