An evolutionary innovation is contingent on maintaining adaptive potential until competition subsides

After 15 years of the Lenski experiment, one of twelve Escherichia coli populations evolved the ability to utilize an abundant but previously untapped carbon source, citrate. Mutations responsible for the appearance of rudimentary citrate utilization (Cit+ phenotype) and for refining this ability have been characterized. However, the complete nature of the genetic and/or ecological events that set the stage for this key innovation remains unknown. We found that there was a slight fitness benefit for introducing an activated citT cassette that mimics the mutation causing Cit+ into the ancestor of the evolution experiment and strains isolated from the population close to when it evolved. However, there was no benefit or even a large deleterious effect in intermediate strains. We conclude that achieving Cit+ was contingent on both an evolutionary trajectory that maintained a potentiated genetic state and the slowing rate of adaptation in this population late in the experiment.

Introduction
time points were tested to determine whether they were capable of evolving citrate utilization.
Cit + cells rarely arose in these replay experiments. When they did, the Cit + trait re-evolved more often in clones selected from later time points that were closer to when the citT duplication first arose in the LTEE population.
The phylogenetic distribution of the LTEE strains giving rise to Cit + variants in the replay experiments suggests the existence of at least two critical junctures at which the potential for evolving Cit + increased 8 . By 20,000 generations, the LTEE population had diversified into three long-lived clades that co-existed at least until full citrate utilization (Cit ++ ) evolved at ~33,000 generations. E. coli isolates from all of these groups evolved Cit + in the replay experiments, whereas no strains from earlier than 20,000 generations did, suggesting that all three clades share some determinant of potentiation. A significantly higher proportion of clones from the clade that gave rise to citrate utilization in the LTEE were able to evolve Cit + in the replays, suggesting that they share a second determinant for increased potentiation not present in the other clades. Due to the extreme rarity of Cit + arising in these replay experiments even after months of evolution, it is not realistic to use this approach to further narrow down the genetic basis of potentiation.
In this study, we tested the viability of the critical actualizing mutation for Cit + evolution in a series of pre-Cit + isolates from the LTEE by measuring the effect of activating citT expression on competitive fitness. We found that activating citT expression slightly increased the fitness of the ancestral strain and some later pre-Cit + clones. Unexpectedly, activating citT expression was highly deleterious in certain strains from intermediate time points, and we did not find any strains that benefitted significantly more from this mutation than the ancestral strain did. We conclude that potentiation for the evolution of citrate utilization in the LTEE is due to the interplay of genetic factors in specific strains and the population at large. First, adaptation had to occur via a genetic trajectory that maintained the potential for evolving Cit + by a beneficial mutational step in order for the innovation to remain accessible. Second, the rate of adaptation in the overall population needed to slow to a pace at which early variants with the weakly beneficial Cit + trait could avoid being driven extinct by competitors before refining mutations arose.

Cit + was only slightly beneficial when it evolved in the LTEE
By definition, the first E. coli cell that evolved the citT-activating mutation that was ultimately successful in the LTEE was fully potentiated when this mutation arose. The earliest Cit + descendant of this cell that has been identified is strain ZDB564 from 31,500 generations. At this time Cit + cells were still extremely rare in the population 10 , which means that it is likely that the suite of mutations in ZDB564 is identical to those in the first Cit + cell, or nearly so. Previously, strain ZDB706, a Cit − revertant of ZDB564, was isolated by passaging ZDB564 on DM medium lacking citrate to allow for the spontaneous collapse of the rnk-citG duplication to the ancestral single-copy state that lacks a copy of the rnk promoter upstream of citT (Fig. 1a) 10 .
We co-cultured ZDB564 and ZDB706 in DM medium to estimate the effect that the citT duplication had on competitive fitness when it originally arose. These experiments involved reverting an arabinose-utilization allele in one of the two strains to be competed from the inactivated state present in all strains from this LTEE population (Ara − ) to the active state (Ara + ) so that cells of each type can be distinguished by the colors of the colonies that they form on indicator plates (Fig. S1 and Methods) 4 . These Ara + strain variants were assayed to establish that the genetic marker was neutral with respect to fitness and that no secondary mutations affecting fitness had accumulated during strain construction prior to further competition experiments (Fig. S2). When competing ZDB564 and ZDB706, we found a slight fitness advantage of 2.2% for the presence of the citT-activating duplication in ZDB564 (Fig. 1b). This result was consistent between the competitions utilizing Ara + ZDB706 or Ara + ZDB564 marked variants (two-tailed t-test, P = 0.33, n = 12 and 18, respectively).
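The percentage advantages reported from these co-culture competitions can be illustrated with a short calculation. The following is a minimal sketch, assuming the ratio of realized growth rates that is conventionally used to score LTEE competitions; the Methods are not reproduced in this section, so the formula, the function name `relative_fitness`, and the example counts are all assumptions made for illustration.

```python
import math

def relative_fitness(a_initial, a_final, b_initial, b_final):
    """Relative fitness of competitor A versus competitor B.

    Assumed definition (not taken from this text): the ratio of the
    realized Malthusian parameters, ln(final/initial), of the two
    competitors over one competition cycle. A value of 1.022 would
    correspond to a 2.2% fitness advantage for A.
    """
    m_a = math.log(a_final / a_initial)
    m_b = math.log(b_final / b_initial)
    return m_a / m_b

# Hypothetical densities: both competitors start equal, A ends slightly ahead.
w = relative_fitness(1.0e5, 1.0e7, 1.0e5, 9.0e6)
print(f"relative fitness of A versus B: {w:.4f}")
```

Because the measure is a ratio of log fold-changes, small differences in final colony counts translate into fitness differences of only a few percent, which is why many replicate competitions are needed to resolve effects of this size.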

Development of P rnk -citT knock-in assay for potentiation
We next wanted to add the citT-activating mutation to pre-Cit + strains in order to test our hypothesis that there was a transition in the lineage leading to Cit + such that this mutation became more beneficial once a potentiated genetic background evolved. The effect of adding a plasmid containing the evolved P rnk -citT unit has been tested in previous studies 8,9 , but this approach is problematic because these plasmids are multicopy, whereas only a single activated copy of the citT gene was present in the initial Cit + strains. However, engineering the authentic rnk-citG duplication into the chromosome of a strain is difficult because this configuration is genetically unstable. It readily collapses via homologous recombination if there is no selection to maintain citrate utilization, a property that was exploited to revert ZDB564 to the Cit − variant ZDB706.
To address these shortcomings, we developed a P rnk -citT knock-in assay, in which a mimic of the evolved configuration is integrated into the chromosome of a pre-Cit + LTEE clone (Fig. 1c).
Briefly, we created an activated citT module linked to an antibiotic selection marker in which the rnk promoter is upstream of the truncated rnk-citG fusion ORF formed by the duplication followed by the complete citT reading frame. To control for any fitness cost imposed by the selection marker, we also made a null module containing only the antibiotic resistance gene.
Both of these cassettes are targeted to integrate into the E. coli chromosome such that they replace the lac operon, which is unrelated to citrate or glucose metabolism. We validated this approach by adding the P rnk -citT module to the fully potentiated Cit − revertant, ZDB706, and adding the null module to its neutral Ara + variant. Addition of the P rnk -citT cassette to the fully potentiated Cit − strain ZDB706 resulted in increased citT mRNA levels equivalent to those seen in ZDB564, the original Cit + isolate with the actual rnk-citG duplication that evolved in the LTEE (two-tailed t-test, P = 0.65, n = 3) (Fig. 1d). The resulting Cit + variant of ZDB706 had a fitness advantage of 2.4% over the corresponding Cit − variant with the null knock-in cassette (Fig. 1b), which was not significantly different from the fitness advantage found for the authentic citT-activating mutation in the pooled ZDB564 versus ZDB706 competitions (two-tailed t-test, P = 0.71, n = 6 and 30, respectively). Therefore, applying the P rnk -citT knock-in assay to additional strains allows us to ask: if the citT-activating mutation had evolved in a genetic background that existed earlier in the LTEE, would it have been as beneficial?

Cit + would have been modestly beneficial if it evolved in the LTEE ancestor
As a first step in further elucidating the fitness consequences of evolving rudimentary Cit + on other strains from the LTEE, we performed the P rnk -citT knock-in assay on the ancestral LTEE strain, REL606. We found a slight fitness benefit of 1.0% for the Cit + mutation (Fig. 1b). This effect size is near the limit for the smallest differences that can be distinguished in these types of competitive fitness assays, resulting in relatively weak support for the hypothesis that there was any fitness advantage at all for the REL606 variant with the P rnk -citT module relative to the one with the null module (one-tailed t-test, P = 0.033, n = 12). There was evidence, though also not very strong, that the benefit of the P rnk -citT module in the fully potentiated strain ZDB706 was greater than it was in REL606 (one-tailed t-test, P = 0.018, n = 12 and 6, respectively).
Expression of citT was not quite as high in the REL606 strain with the P rnk -citT module as it was in ZDB706 with the same module (two-tailed t-test, P = 0.00016, n = 3) (Fig. 1d), suggesting that mutations during the LTEE on the lineage leading to Cit + may have altered the strength of the rnk promoter. Overall, the REL606 measurements indicated, surprisingly, that there was likely a modest benefit for a mutation activating expression of citT at the very beginning of the LTEE, and that this benefit may have only slightly improved after further mutations that occurred during the potentiation stage in the evolution of this metabolic innovation.

No evidence for ecological potentiation
Why did the appearance of citrate utilization take so long and why has it not evolved in other LTEE populations? One hypothesis for its rarity is that the evolution of a particular ecology in the population was important for enabling the evolution of Cit + . This type of situation is known to occur, for example, when nutrient cross-feeding between genetically diverged subpopulations yields negative frequency dependence, such that the competitive advantage for a newly evolved strain or a certain subpopulation is greater when it is rare within the population than when it is common 12 . The pre-Cit + clade was rare during the time period when the rnk-citG duplication evolved. It constituted <1-5% of the population from 30,000 to 32,500 generations 10 .
To test whether this kind of 'ecological potentiation' was important for the evolution of Cit + in the LTEE, we repeated the P rnk -citT knock-in assay competition for strain ZDB706 in the context of the full diversity that existed in the population at 31,000 generations (Fig. 1b). The Cit + and Cit − variants were mixed together equally and added such that they comprised ~1% of the cells in a mixture with the evolved population sample. In this context, the Cit + strain had a 0.9% fitness advantage over the Cit − strain, which was lower than, though only marginally different from, the result obtained when the two strains were competed directly against one another (two-tailed t-test, P = 0.053). Thus, we find no support for the ecological potentiation hypothesis. If anything, the more diverse mixed population context may slightly reduce the benefit of Cit + evolution.

Anti-potentiated strains evolved at intermediate time points
We next performed the P rnk -citT knock-in assay on 23 additional clones isolated from the LTEE population (Fig. 2a). Our goal was to determine whether activating citT expression was similarly beneficial in other evolved genetic backgrounds. We measured citT mRNA levels in five of the constructed strains with the P rnk -citT cassette and found them to be similar in all of these strains (Fig. S5), indicating that the strength of the rnk promoter was largely unchanged by the specific suites of evolved mutations present in each of these strains. For four of the evolved strains we found strong evidence that the citT cassette significantly increased fitness versus the control with the null cassette, as it had in the fully potentiated strain ZDB706 (one-tailed bootstrap test incorporating Ara + /Ara − marker and Cit + /Cit − competitions described in Methods, P < 0.05). In nine strains, citT activation had no significant effect on fitness (two-tailed bootstrap test, P < 0.05), though our measurements did not achieve sufficient precision to rule out that there was a fitness benefit of 1% or greater in seven of these cases (one-tailed bootstrap test, P < 0.05).
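To make the idea of a one-tailed bootstrap test concrete, the sketch below resamples replicate relative-fitness values and asks how often the resampled mean fails to exceed 1. It is a deliberately simplified stand-in: the actual procedure, which also incorporates the marker-control competitions, is described in the Methods and is not reproduced here. The function name, the number of resamples, and the replicate values are hypothetical.

```python
import random
import statistics

def bootstrap_p_benefit(fitness_values, n_resamples=10000, seed=0):
    """One-tailed bootstrap p-value for 'mean relative fitness > 1'.

    Simplified illustration only: resample the replicate values with
    replacement and report the fraction of resampled means that are
    at or below 1 (i.e. that show no benefit).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(fitness_values)
    no_benefit = 0
    for _ in range(n_resamples):
        resample = [rng.choice(fitness_values) for _ in range(n)]
        if statistics.mean(resample) <= 1.0:
            no_benefit += 1
    return no_benefit / n_resamples

# Hypothetical replicate competitions of a Cit+ variant versus its null control.
replicates = [1.031, 0.996, 1.027, 1.022, 1.008, 1.029]
print(f"one-tailed bootstrap p = {bootstrap_p_benefit(replicates):.4f}")
```

Testing for a deleterious effect instead would count resampled means at or above 1, and a two-tailed version would double the smaller of the two tail fractions.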
Unexpectedly, the Cit − variant outcompeted the Cit + variant for the 11 remaining strains of the 23 we tested (one-tailed bootstrap test, P < 0.05). The actualizing step needed for subsequently evolving full citrate utilization (Cit ++ ) would have been effectively blocked if it occurred in these strain backgrounds; they are 'anti-potentiated'. For five of these strains, activating citT expression was extremely detrimental, decreasing competitive fitness by >20% (one-tailed bootstrap test, P < 0.05). For ZDB483 and ZDB14, two of the severely anti-potentiated strains, we investigated the nature of this defect by comparing growth curves of the Cit + and Cit − variants. There was very little difference in the growth curve for the LTEE ancestor REL606 whether the activated citT cassette or null control cassette was added to its genome, which is in keeping with its almost imperceptible effect on the competitive fitness of this strain. In contrast, we found that activating citT expression drastically increased the lag phase of growth in the severely anti-potentiated strains (Fig. 2b). This additional lag time can explain the sizable competitive disadvantage versus the Cit − strain, even though the Cit + variants are able to reach a higher final cell density if cultured alone.
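A toy calculation shows why extra lag alone can produce a large competitive deficit. The sketch below is not the authors' analysis; it assumes two variants that start at equal density and grow exponentially at the same rate, that one of them begins growing later by a fixed number of doubling times, that growth stops for both once the shared culture has expanded 100-fold in total, and that fitness is scored as the ratio of log fold-changes. The function name and all parameter values are hypothetical.

```python
import math

def fitness_with_extra_lag(extra_lag_doublings, fold_growth=100.0):
    """Toy relative fitness of a variant whose growth starts late.

    Assumptions (illustrative only): equal starting densities, identical
    exponential growth rate once growing, and a shared resource that is
    exhausted when the combined population has grown `fold_growth`-fold.
    The lag is expressed in units of the doubling time.
    """
    x = extra_lag_doublings * math.log(2)  # lag in units of 1/growth-rate
    total_final = 2.0 * fold_growth        # in units of one strain's start density
    # Log fold-change of the undelayed competitor when resources run out.
    m_reference = math.log(total_final / (1.0 + math.exp(-x)))
    if m_reference <= x:
        # Resources run out before the delayed variant starts growing.
        return 0.0
    return (m_reference - x) / m_reference

for lag in (0.0, 0.5, 1.0, 2.0):
    print(lag, round(fitness_with_extra_lag(lag), 3))
```

Under these assumptions, an extra lag of a single doubling time already costs on the order of ten percent in relative fitness, even though the delayed variant grows just as fast as its competitor once it starts.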

Mapping potentiation onto phylogeny
Identifying specific mutations that contributed to potentiation and anti-potentiation requires interpreting the fitness data from the P rnk -citT knock-in assays in a phylogenetic context. To improve the resolution of a previously published whole-genome phylogenetic tree of 29 clonal isolates from this LTEE population 8 , we sequenced the genomes of 20 new clones (Table S1) and also incorporated 12 other clones sequenced in another recent study of the rate of genome evolution through 50,000 generations in all LTEE populations 13 . The 20 newly sequenced isolates were selected to improve our ability to temporally order mutations that occurred near when citrate utilization evolved: they were minimally diverged from the line of descent to the Cit + progenitor and were mostly sampled at later time points.
The updated phylogenetic tree (Fig. 3) includes all 25 clones we tested with the P rnk -citT knock-in assay. We used these strains to identify branches in the tree within which the adaptive potential of activating citT expression changed due to one or more mutations. Specifically, we clustered phylogenetically adjacent strains into groups within which all pairwise comparisons of the fitness effect of the P rnk -citT module were not significantly different (Bonferroni-corrected two-tailed bootstrap tests, P > 0.05). Overall, this analysis suggests that there were at least three major step-like changes in the potential for evolving the rudimentary Cit + trait along the pre-Cit + lineage that eventually evolved citrate utilization (Fig. 4).
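The grouping criterion described above can be sketched as a simple check on pairwise p-values. This is a minimal illustration under stated assumptions: the number of comparisons used for the Bonferroni correction is taken to be the number of pairs within the candidate group, which may differ from the correction actually applied, and the strain labels and p-values are hypothetical.

```python
from itertools import combinations

def is_homogeneous_group(strains, pairwise_p, alpha=0.05):
    """Return True if no pair of strains differs significantly.

    `pairwise_p` maps frozenset({strain_a, strain_b}) to an uncorrected
    two-tailed p-value. Each p-value is Bonferroni-corrected by the
    number of pairs in the group (an assumption of this sketch).
    """
    pairs = list(combinations(strains, 2))
    n_comparisons = len(pairs)
    for a, b in pairs:
        corrected = min(1.0, pairwise_p[frozenset((a, b))] * n_comparisons)
        if corrected <= alpha:
            return False  # this pair differs, so the group must be split
    return True

# Hypothetical p-values for three phylogenetically adjacent strains.
p_values = {
    frozenset(("S1", "S2")): 0.40,
    frozenset(("S1", "S3")): 0.60,
    frozenset(("S2", "S3")): 0.20,
}
print(is_homogeneous_group(["S1", "S2", "S3"], p_values))
```

A group boundary would then be placed on the branch separating two adjacent strains whenever adding the next strain makes this check fail.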
Proceeding backward in the tree from the earliest known Cit + isolate (ZDB564), two earlier clones (ZDB19 and ZDB13) from as early as 29,000 generations are as fully potentiated as the key Cit − revertant (ZDB706). The overall fitness effect of evolving Cit + in this group was +2.4% [+1.4%, +3.4%] (95% confidence interval). The next-earliest group comprises three clones isolated at time points from 25,000 to 27,000 generations (ZDB478, ZDB486, and ZDB309). Activation of citT had little to no impact on this set of strains, with an estimated group-wise effect on fitness of +0.4% [-1.3%, +2.0%]. One intermediate strain, ZDB310 from 27,000 generations, was not significantly different from either of these two groups immediately before and afterward, although the two groups were significantly different from one another.
These clones appear to be genetically typical of the pre-Cit + lineage. ZDB425 at 10,000 generations and ZDB458 at 20,000 generations have only one and two 'private' mutations not shared with the main pre-Cit + lineage, respectively, though we cannot rule out that other changes in the impact of citT activation may have occurred on the main line of descent within this interval. Before these anti-potentiated clones, there is an initial cluster that groups ZDB409 and ZDB429 with the REL606 ancestor. In these three isolates, evolution of Cit + would have been slightly beneficial with a fitness impact of +1.7% [+0.0%, +3.3%].
Other strains are not classified into these major groups. It is less likely that they are representative of how potentiation evolved in the lineage leading to Cit + . For example, the four most highly anti-potentiated clones (ZDB467, ZDB483, ZDB14, ZDB18) appear to have evolved this property independently, due to 'private' mutations not shared with the main pre-Cit + lineage (Fig. 3); at least, this is the most parsimonious explanation. Similarly, the fitness effects measured in the P rnk -citT knock-in assay for two three-member subclades (ZDB334, ZDB339, ZDB317; and ZDB23, ZDB27, ZDB25) indicate that each likely shared one or more mutations that altered Cit + potentiation only within that subclade, though the effects are much smaller in these cases. Finally, we excluded ZDB446 from this analysis because it was so deeply branched: removed by >5,000 generations from the pre-Cit + lineage. It would have been clustered with the earliest group containing REL606 according to our criteria.

Cit + evolution in the context of competition with other beneficial mutations
During the time period when the pre-Cit + lineage was anti-potentiated, from approximately 10,000 to 20,000 generations, invasion of a new Cit + subpopulation would have been nearly impossible. Lineages that lost fitness by evolving the rudimentary version of this new trait would be rapidly purged by selection before refining mutations (e.g., activating dctA) could accumulate to give the decisive benefit of full citrate utilization (the Cit ++ phenotype). What about the earlier and later time periods when the evolution of Cit + was neutral or slightly beneficial? During these epochs, a newly evolved Cit + lineage would still have had to compete with not only its own ancestor, but also against other lineages that were evolving at the same time, many of which would have other beneficial mutations. That is, an incipient Cit + lineage had to survive in competition with alternative adaptive pathways, such as those improving fitness on glucose.
To understand when the fitness effects we measured for evolving Cit + by citT activation would have made this metabolic innovation a viable evolutionary pathway in the context of competition within the LTEE population, we compared the group-wise fitness effects determined from the P rnk -citT knock-in assays to two models of the fitness effects of beneficial mutations that were successful at different generations in this LTEE population (Fig. 4). The models demonstrate that even if Cit + evolution was marginally beneficial in the REL606 ancestor and other early isolates, it was initially much less beneficial than was needed to be successful at this point. Even by 5,000 generations, citT activation appears to have been average, at best, in terms of its fitness effect among all possible beneficial mutations. It would have been unlikely for the Cit + trait to appear and persist at this point because there were so many alternative mutations, such as those that required only single-base substitutions or IS insertions that knocked out gene function, which would have occurred at a higher rate than the specific duplications or IS element insertions needed to activate citT expression 7 . After anti-potentiation appeared and receded in this lineage, competition would have continued to suppress Cit + evolution when the citT mutation was again neutral. In striking contrast, evolving Cit + was clearly superior to a typical successful beneficial mutation in the final group of strains that first evolved by 29,000 generations. It was a viable adaptive pathway at this point. Thus, by comparing P rnk -citT knock-in assays to models of the rates of population evolution, we can explain how a variant with a rudimentary Cit + trait was able to appear and avoid extinction long enough to achieve the decisive dctA mutation that led to the dominant Cit ++ trait.

Discussion
Our work reframes and further elucidates why the emergence of citrate utilization is so rare in the Lenski long-term evolution experiment (LTEE). Rudimentary citrate utilization (the Cit + phenotype) can apparently evolve at any time when a mutation switches on expression of the CitT transporter under the aerobic conditions of the experiment. However, the success of a new Cit + variant is far from guaranteed. It is contingent on whether its descendants can survive long enough to incorporate a second mutation, such as one activating expression of the DctA transporter, that enables full citrate utilization (the Cit ++ phenotype). The chance that Cit ++ will be realized by this evolutionary pathway is dependent on two major factors. First, the initial mutational step conferring the weak Cit + phenotype must be beneficial to fitness. Whether it is advantageous or not depends on the context of other mutations present in an evolved genome in which citT activation occurs. Second, the benefit of the mutation conferring weak Cit + must be great enough that it can survive in competition with other adaptive mutations. Whether it is sufficiently beneficial depends on the population context in which it arises. We found that both genetic and population factors limited Cit ++ evolution at different times in the LTEE (Fig. 4).
Unexpectedly, evolution of Cit + by activating citT expression appears to have already been slightly beneficial to fitness in the ancestral strain used to found this E. coli population on the first day of the LTEE and to have remained so in other early evolved isolates. Even though Cit + strains that evolved in the LTEE population at this point would have been capable of displacing their own Cit − ancestors, this first step on the pathway to the full Cit ++ innovation was suppressed due to competition with mutations on adaptive pathways that improve fitness in the original glucose niche. New cells with highly beneficial mutations related to this primary component of the LTEE environment were essentially guaranteed to arise in the population and outcompete any cells with mutations activating citT expression. By 10,000 generations, the lineage in which Cit + eventually evolved became 'anti-potentiated' after it accumulated additional mutations. Now, the pathway to innovation was blocked because it was deleterious to evolve rudimentary Cit + in this genetic background. There was a fitness valley separating the evolved Cit − strains from the full Cit ++ phenotype. Finally, further mutations appeared in the focal LTEE lineage by 29,000 generations that altered the fitness impact of activating citT expression such that it was again beneficial to evolve the Cit + phenotype, and perhaps even more so than it had been in the ancestor. At this point, the rate of adaptation of the population had slowed enough that evolving rudimentary Cit + was now among the most beneficial mutational steps remaining. The two-step mutational pathway to Cit ++ was no longer suppressed by genetic or population factors, and the Cit ++ innovation evolved.
Cit ++ mutants of E. coli capable of growth on citrate as a sole carbon source under aerobic conditions have been isolated in other studies 7,8,17,18 . In all of these cases, multiple mutations have been required to achieve the Cit ++ phenotype. When they have been identified, the mutations that yield Cit ++ activate expression of the CitT and DctA transporters, as is observed in the LTEE. These studies have isolated Cit ++ mutants in much shorter periods of time (<1-8 weeks) than it took to evolve in the LTEE (~15 years) because they involve starving E. coli cells for days to weeks under conditions in which citrate was present as a potential carbon source. In the context of our results and as previously noted by others 19
Why is evolution of Cit + beneficial in some evolved genetic backgrounds and deleterious in others under the conditions of the LTEE? Activation of CitT expression under these aerobic conditions via the rnk-citG duplication leads to coupled import of citrate (a C 6 -tricarboxylate) and export of C 4 -dicarboxylates (e.g., succinate) 20 . In wild-type E. coli strains, CitT is normally expressed only under anaerobic conditions, and the imported citrate can only be assimilated when a fermentable co-substrate, such as glucose, is also present 21 . Under these conditions, citrate is cleaved to acetate and oxaloacetate by citrate lyase. The structural proteins and accessory factors necessary for producing this enzyme complex are encoded in the same operon as citT. When glucose is co-utilized with citrate, the resulting oxaloacetate is reduced to succinate by reverse tricarboxylic acid (TCA) cycle reactions. This process consumes reduced cofactors produced by breakdown of the sugar to balance redox metabolism without the need for O 2 . The succinate or other C 4 -dicarboxylates produced can be exchanged for more citrate import via CitT to continue this mixed fermentation mode of growth, or these TCA cycle intermediates can be siphoned off into biosynthetic pathways as necessary for cellular replication.
Under the aerobic conditions of the LTEE, citrate lyase is not expressed and succinate to balance citrate import by CitT must be produced in a different manner, from citrate or glucose using reactions of central metabolism. The availability of O 2 makes it possible to maintain redox balance while synthesizing succinate via the TCA cycle, the glyoxylate bypass, or anaplerotic reactions (e.g., phosphoenolpyruvate carboxylase). E. coli growing under aerobic conditions ferments glucose to acetate, and mutations in genes related to the ability to re-uptake and utilize acetate are widespread in the LTEE 13,22,23 . These mutations affect acetate transporters and also pathways for assimilating acetate as acetyl-CoA through citrate synthase, the TCA cycle, and the glyoxylate bypass. Therefore, how these pathways are altered by adaptation to better utilize glucose and acetate is likely an important determinant of the genetic background that affects the ability to evolve citrate utilization. If introduction of the CitT transport reaction misbalances the redox state of the cell or the distribution of carbon compound intermediates between anabolism and catabolism, then it would be deleterious to fitness. Therefore, mutations altering central metabolism are candidates for explaining the changes in the fitness effect of citT activation along the LTEE lineage that ultimately evolved citrate utilization (Fig. 4).
Starting with the ancestor and examining when changes in the potential for evolving Cit + were observed in the LTEE, a mutation in nadR, a repressor of NAD coenzyme biosynthesis 24 , occurs along the branch in the phylogenetic tree when anti-potentiation first evolved, before 10,000 generations. Mutations in nadR have appeared and swept to fixation in all twelve LTEE populations. These mutations include frameshift mutations and IS element insertions 13 , indicating that they are loss-of-function mutations, and deleting this gene from the genome of the LTEE ancestor has been shown to be beneficial 15 . Reducing or eliminating NadR activity is predicted to increase the NAD/NADH pool in the cell and could enable increased rates of glucose fermentation. Since NADH is a potent allosteric regulator of enzymes in central metabolism, including citrate synthase (gltA) for entry into the TCA cycle, this mutation may also reconfigure other cellular fluxes in ways that make CitT transport deleterious to fitness.
Between 10,000 and 25,000 generations, mutations occurred in this LTEE population in three key genes that affected the activities of enzymes in central metabolism: iclR, arcB, and gltA1. These mutations have all been shown to improve growth on acetate 10 . Two of these mutations are in negative regulators; they are expected to derepress enzymes of the glyoxylate bypass (iclR) 25 and TCA cycle (arcB) 26 . Both the arcB and gltA1 mutations occurred on a branch in the phylogenetic tree for the citrate LTEE population when the effect of citT activation reverted to being neutral with respect to competitive fitness, so they are candidates for reversing anti-potentiation. The iclR mutation does not seem to have had an effect on genetic potentiation on its own, but it may have interacted with the arcB and/or gltA1 mutations in a way that contributes to this anti-potentiation effect.
Only one mutation in a gene known to be involved in central metabolism occurred around 27,000 to 29,000 generations, at the point in the phylogenetic tree when adding the citT mutation seems to have again become beneficial to fitness. This mutation is upstream of the ilv operon for branched chain amino acid biosynthesis in the yifB/ilvL intergenic region. This pathway consumes pyruvate and acetyl-CoA, and its products can be used to synthesize the pantothenate moiety of coenzyme A (CoA) 27 . If this mutation affects gene expression of the ilv operon, then it could impact the balance of citric acid cycle intermediates flowing into or out of the TCA cycle to sustain cellular growth directly or indirectly via changing CoA/acetyl-CoA availability.
While the functions of the genes that we have highlighted in central metabolism suggest that they may be especially important for altering the potential for Cit + evolution, other mutations also accumulated on the branches in the phylogenetic tree where the effects of citT activation on E. coli fitness changed (Fig. 3). In future work, the P rnk -citT knock-in assay can be used to further dissect this adaptive pathway by testing strains in which various evolved alleles have been removed or added. As an example of this type of approach, we have previously shown that removing the gltA1 mutation from the earliest Cit + isolate (ZDB564) makes the citT-activating duplication highly deleterious because it introduces a growth lag like that observed in the strongly anti-potentiated LTEE isolates in this study 10 . Similar studies could be conducted on strains that represent as closely as possible the genotypes present at critical junctures in the phylogenetic tree to determine which mutations altered the chances of achieving this innovation.
Another remaining question is whether the Cit + innovation will ever evolve in the other eleven LTEE populations. It has not as of more than 60,000 generations 23 , nearly twice the amount of time that was required for it to evolve in the population analyzed here 7 . The 'innovation interference' of other highly beneficial mutations within a population suppressing Cit + evolution has undoubtedly faded in all eleven of these populations as the pace of fitness increase has slowed similarly in all of them 14,28 . However, the ubiquity of nadR mutations in the LTEE may indicate that other populations similarly descended into a genetically anti-potentiated state. Our results suggest that Cit ++ may still appear in the future if mutations suitably adjust fluxes in central metabolism to make evolving rudimentary Cit + by activating citT expression a beneficial step on the pathway to innovation, as long as no critical components have been irrecoverably lost from the genome. Through 50,000 generations, no population has deleted either citT or dctA, and these genes have not accumulated any mutations in most populations 13 , so the latent genetic potential to evolve Cit + seems to have remained intact so far.
The LTEE is an open-ended evolution experiment 29 ; it did not begin with the aim of isolating E. coli that utilize citrate. There was never strong selection for this novel capability. Because evolving citrate utilization allowed the new Cit ++ clade to colonize an untapped nutrient niche and rapidly diversify, this new metabolic capacity is an example of a key evolutionary innovation 30 . The evolution of Cit ++ initiated a new round of rapid evolutionary optimization that included mutations that reduced the activity of citrate synthase (gltA2) and eliminated flux through the glyoxylate shunt (aceA), both of which reversed the effects of pre-Cit + adaptive mutations 10 . The many new possibilities for improving fitness in this alternative niche also likely contributed to the evolution of hypermutation within the Cit ++ clade by 36,000 generations 8 .
Lastly, new ecological interactions arose in this population such that Cit - and Cit ++ types coexisted via negative frequency-dependent interactions for at least 10,000 generations after Cit ++ evolved 7,8 . Continuing evolution of interactions between these and other E. coli lineages led to the emergence of an ecology that is unique to this flask in the LTEE 11 .
We found that a metabolic innovation in a laboratory population of E. coli was contingent on both a history of genetic adaptation and ongoing population dynamics. Evolution of metabolic capabilities has been found to be crucial to the emergence and continued success of bacterial pathogens in several instances 31,32 . For example, Salmonella acquired the ability to use tetrathionate as an electron acceptor, giving it a growth advantage relative to other bacteria in the environment that it creates in the gut during infection by inducing inflammation 33 . On a shorter timescale, mutations in the opportunistic pathogen Pseudomonas aeruginosa that accumulate during chronic infections in the cystic fibrosis lung lead to an increased ability to acquire iron from hemoglobin 34 . Even in the simple environment of the LTEE, both genetic and population factors suppress the evolution of an innovation that allows a new niche to be exploited by a new bacterial species. It may be useful in the treatment of disease to understand when these and other factors, including competition for specific nutrients by commensal species in a microbiome, can be used to suppress evolutionary outcomes that are harmful to human health 35 .

Media conditions and strains. E. coli were cultured in Davis-Mingioli (DM) medium and Lysogeny Broth (LB) 10 . As necessary, media were supplemented with 50 µg/mL kanamycin and 80 µg/mL 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (X-gal). Evolved clones characterized in this study from archived LTEE populations and strain ZDB706 (the spontaneous Cit - revertant of ZDB564) were isolated in previous studies 7,8,10 . New strains constructed in this study are listed in Table S3.
P rnk -citT knock-in assay. The activated P rnk -citT module was constructed by amplifying the evolved rnk-citG duplication junction from the pCit plasmid along with a linked kanamycin resistance gene (Kan r ) 9 . The P rnk -citT construct in pCit is originally from evolved strain CZB154 9 . As a control, another module was created that contains only the Kan r marker. These modules were integrated into the genomes of several Cit - strains (REL607, REL1166A, ZDB429, ZDB467, and ZDB483) via lambda Red recombination 36 such that they replaced the lac locus (lacA to lacZ), spanning positions 333,862-337,485 in the REL606 genome (GenBank:NC_012967.1) 37 . We transferred the cassettes to other strains using P1 bacteriophage transduction 38 . Successful transductants were scored based on blue/white screening in the presence of X-gal and kanamycin. All Cit + strains were made by transduction of the P rnk -citT module into an Ara - LTEE clone. Isogenic Cit - strains were constructed by insertion of the control Kan r module into an Ara + version of the same clone generated as described in the next section. To determine whether any other mutations present in the evolved strains from the LTEE were altered during transduction, we screened for mutations identified by whole-genome sequencing in the recipient strain that were within 100 kb upstream or downstream of the P rnk -citT insertion site. Strains from three Cit + /Cit - pairs were found to have gained or lost evolved alleles in this process (Table S2).
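The 100 kb screening window around the insertion site can be illustrated with a short sketch. The coordinates of the replaced lac region are taken from the text; the record layout, the function names, and the example mutation positions are hypothetical and serve only to show the filtering step. The sketch treats the genome as linear, which is sufficient here because the window does not cross the origin of the reference coordinates.

```python
# Coordinates of the replaced lac region in the REL606 reference (from the text).
LAC_START = 333_862
LAC_END = 337_485
WINDOW = 100_000  # 100 kb flanking distance screened on each side


def near_insertion(position, start=LAC_START, end=LAC_END, window=WINDOW):
    """Return True if a reference position lies within the screened window."""
    return (start - window) <= position <= (end + window)


def mutations_to_check(mutations):
    """Keep only mutations close enough to be co-transduced with the cassette.

    `mutations` is a list of dicts with a 'position' key (hypothetical layout).
    """
    return [m for m in mutations if near_insertion(m["position"])]


# Hypothetical example records; names and positions are placeholders, not real alleles.
example = [
    {"name": "mutation_A", "position": 250_000},
    {"name": "mutation_B", "position": 1_500_000},
]
print([m["name"] for m in mutations_to_check(example)])
```

Only the mutations returned by such a filter would need to be checked in each transductant.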
Selection for spontaneous Ara + mutants. All Ara - strains inherited a point mutation in araA present in the REL606 LTEE ancestor that prevents arabinose utilization 15 . To isolate spontaneous Ara + mutants, Ara - strains were revived overnight at 37°C in DM containing 1 mg/mL glucose (DM1000). For each strain, three separate flasks containing 10 mL of DM1000 were each inoculated with ~500 cells from the first DM1000 culture to reduce the chance that they might share any secondary mutations affecting fitness. After incubating overnight at 37°C, cells were harvested by centrifugation at 4,000 rpm for 15 min and the entire volume was plated on minimal arabinose (MA) plates. Plates were incubated for 36-48 h and colonies were streaked and grown on new MA plates before picking single colonies as candidate Ara + revertants. The presence of secondary mutations affecting fitness was assessed by competing the original Ara - and selected Ara + strains, as described below. In most cases, we identified an Ara + revertant with a fitness that was not significantly different from its Ara - progenitor (Fig. S3).
Competition assays. Relative fitness was measured using co-culture competition assays 4,39 . The two strains to be competed are differentiated based on their ability to ferment arabinose. Ara - strains form red colonies on tetrazolium arabinose (TA) media, and Ara + strains form pink colonies. Strains were revived overnight in LB and then diluted 10,000-fold into separate cultures for each replicate competition assay in DM containing 25 µg/mL glucose (DM25). These cultures were preconditioned and competed under the same conditions as used in the LTEE 4,40 , in 10 mL of DM25 in 50 mL Erlenmeyer flasks shaken at 120 rpm over a diameter of 1 inch with incubation at 37°C. After 24 h of growth separately to precondition strains to these conditions, two replicate cultures for each Ara - and Ara + pair were mixed at equal volumes in fresh DM25 media such that there was an overall 1:100 dilution. Dilutions of these initial mixtures were plated on TA plates to determine the initial representation of each strain in each replicate flask. Then, the competition was carried out over three days of transferring 1:100 dilutions into fresh medium each day. A dilution of each culture after growth on day three was again plated to determine the final representation of each strain. Relative fitness was calculated as the ratio of the realized growth rates of each strain between the final and initial platings 4,39 .
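As a minimal sketch of this calculation, assume that the realized growth rate of each strain is the natural logarithm of its net fold increase between the two platings, corrected for the serial dilutions in between. This is a common convention for such assays but is not spelled out above. The function names, the example densities, and the number of 1:100 dilutions separating the platings are illustrative assumptions.

```python
import math


def realized_growth_rate(initial_density, final_density,
                         dilution_factor, n_dilutions):
    """Natural log of one strain's net fold increase between the two platings.

    Densities must share the same units (e.g. CFU/mL in the flask). Each
    serial transfer dilutes the culture, so total growth is the observed
    density ratio multiplied by dilution_factor ** n_dilutions.
    """
    fold_increase = (final_density / initial_density) * dilution_factor ** n_dilutions
    return math.log(fold_increase)


def relative_fitness(a_initial, a_final, b_initial, b_final,
                     dilution_factor=100.0, n_dilutions=2):
    """Fitness of strain A relative to strain B as a ratio of growth rates.

    n_dilutions is the number of 1:100 transfers between the initial and
    final platings; the default of 2 is an assumption, not a value from the text.
    """
    rate_a = realized_growth_rate(a_initial, a_final, dilution_factor, n_dilutions)
    rate_b = realized_growth_rate(b_initial, b_final, dilution_factor, n_dilutions)
    return rate_a / rate_b


# Hypothetical densities (CFU/mL) for one replicate flask.
w = relative_fitness(a_initial=5.0e5, a_final=6.0e7,
                     b_initial=5.0e5, b_final=4.0e7)
print(f"relative fitness of A versus B: {w:.3f}")
```

A value of 1 indicates that the two competitors grew equally well, and values above 1 indicate that strain A outgrew strain B over the course of the competition.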
For comparisons of the effect of the authentic rnk-citG duplication versus the addition of the P rnk -citT module to REL606 and ZDB706 (Fig. 1), we first established neutrality of an Ara + revertant and then judged whether there was a significant difference between the fitnesses of the Cit - and Cit + strain pairs. For comparing the fitness impact of evolving Cit + in other strains (Fig. 2), we measured the relative fitness of the Ara - Cit + variant of the strain with the P rnk -citT module added versus the Ara + Cit - revertant of its Cit - progenitor (Cit competition) and multiplied this by the relative fitness of the Ara + Cit - revertant versus the Ara - Cit - clone with the null module added (Ara competition) (Fig. S2). To account for how error in each of these two competitions impacts confidence in the overall fitness change inferred for evolving Cit + , we performed 10,000 bootstrap resamplings of the Ara and Cit competition replicates to estimate 95% confidence intervals and significance on the combined measurements. The same bootstrapping procedure was used for comparing the fitnesses of different strains in the population phylogeny in the procedure that combined them into equivalence groups along the lineage to Cit + (Fig. 4).
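One way such a combined bootstrap could be carried out is sketched below. The text specifies only that the Ara and Cit competition replicates were resampled 10,000 times. Resampling each set independently with replacement, multiplying the resampled means, taking percentile intervals, and using a two-sided test against a relative fitness of 1 are all assumptions made for illustration, as are the function name and the replicate values.

```python
import random
import statistics


def bootstrap_combined_fitness(cit_reps, ara_reps, n_boot=10_000, seed=0):
    """Bootstrap the product of two mean relative fitness measurements.

    Returns (point estimate, lower 95% bound, upper 95% bound, p-value).
    The p-value is a two-sided percentile test against a fitness of 1.
    """
    rng = random.Random(seed)
    products = []
    for _ in range(n_boot):
        # Resample each competition's replicates independently, with replacement.
        cit_mean = statistics.fmean(rng.choices(cit_reps, k=len(cit_reps)))
        ara_mean = statistics.fmean(rng.choices(ara_reps, k=len(ara_reps)))
        products.append(cit_mean * ara_mean)
    products.sort()
    lower = products[int(0.025 * n_boot)]
    upper = products[int(0.975 * n_boot) - 1]
    point = statistics.fmean(cit_reps) * statistics.fmean(ara_reps)
    frac_below = sum(p < 1.0 for p in products) / n_boot
    p_value = min(1.0, 2.0 * min(frac_below, 1.0 - frac_below))
    return point, lower, upper, p_value


# Hypothetical replicate relative fitness values for the two competitions.
cit = [1.02, 1.01, 1.03, 1.02, 1.00, 1.02]
ara = [1.00, 0.99, 1.01, 1.00, 1.00, 1.01]
point, lower, upper, p_value = bootstrap_combined_fitness(cit, ara)
print(f"combined fitness {point:.3f} (95% CI {lower:.3f}-{upper:.3f}), p = {p_value:.4f}")
```

Resampling the two competitions separately lets the uncertainty in the Ara marker correction propagate into the interval for the overall effect of evolving Cit + .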
qRT-PCR measurement of citT expression. Cells were cultured according to the method
We initially predicted mutations in each re-sequenced genome by comparing Illumina reads to the REL606 reference genome 37 using breseq (v0.31.1) 42,43 . Then, we further curated the lists of predicted mutations as previously described 13 . Briefly, a maximum-parsimony phylogenetic tree for all 61 strains from the LTEE population was constructed using the DNAPARS program from the PHYLIP package (v3.69) 44 . Where necessary, we manually corrected mutation predictions, including adding mutations that were hidden by later deletions or splitting sequence differences into multiple mutational events to construct the most parsimonious phylogeny possible. In the current study, we did not discard mutations in repetitive regions before analysis, except we did ignore changes in the hypervariable 7×CCAG repeat at reference coordinates
change had no effect on competitive fitness in each case (Fig. S3). Error bars are 95% confidence intervals. (c) Schematic of the gene cassettes used in the P rnk -citT knock-in assay showing how they were integrated into the E. coli chromosome in a way that replaces the native lac locus. (d) citT mRNA expression levels measured relative to the REL606 LTEE ancestor in the evolved Cit + isolate from the LTEE (ZDB564) and strains with the P rnk -citT and corresponding empty control cassettes integrated into their chromosomes. Error bars are 95% confidence intervals.
of the timing of mutations on the lineage leading to Cit + (names in italics). In order to identify changes in the degree of potentiation due to mutations, we mapped the results of the P rnk -citT knock-in assay onto this phylogenetic tree. Colored symbols reflect the Cit + to Cit - relative fitness measured for those strains. The ancestor and 61 evolved isolates were used to construct this phylogenetic tree (Table S1). Two clones isolated at 50,000 generations are not shown. Two strains that evolved citrate utilization in replay experiments under the LTEE conditions in a previous study 7 are marked with plus signs (++), and three strains that had evolved alleles added or removed during strain construction as described in Table S2 are starred (*).