In Vivo Evolution of a Catalytic RNA Couples Trans-Splicing to Translation

How does a non-coding RNA evolve in cells? To address this question experimentally we evolved a trans-splicing variant of the group I intron ribozyme from Tetrahymena over 21 cycles of evolution in E.coli cells. Sequence variation was introduced during the evolution by mutagenic and recombinative PCR, and increasingly active ribozymes were selected by their repair of an mRNA mediating antibiotic resistance. The most efficient ribozyme contained four clustered mutations that were necessary and sufficient for maximum activity in cells. Surprisingly, these mutations did not increase the trans-splicing activity of the ribozyme. Instead, they appear to have recruited a cellular protein, the transcription termination factor Rho, and facilitated more efficient translation of the ribozyme’s trans-splicing product. In addition, these mutations affected the expression of several other, unrelated genes. These results suggest that during RNA evolution in cells, four mutations can be sufficient to evolve new protein interactions, and four mutations in an RNA molecule can generate a large effect on gene regulation in the cell.


Introduction
Answering the question of how a specific macromolecule evolved requires understanding the circumstances in its evolutionary history that led to its current role [1]. However, information about the evolutionary context and about evolutionary intermediates is usually lost to history. Instead, the biological evolution of macromolecules can be recapitulated using experimental evolution systems. Our focus is on the evolution of catalytic RNAs (ribozymes), which was studied previously by in vitro evolution experiments [2,3,4,5,6,7,8,9]. We set out to study the evolution of RNAs in cells [10] because the biological evolution of RNAs may be strongly affected by their interaction with the cellular environment.
The model RNA for our evolution was a trans-splicing variant of the group I intron ribozyme from Tetrahymena (Figure 1). Group I introns are ribozymes that do not require the spliceosome for their removal from primary transcripts. Instead, they fold into three-dimensional structures that catalyze their own excision and the joining of their flanking exons [11]. These cis-splicing ribozymes have been re-engineered to act in trans, by removing their 5′-exon and replacing it with a short substrate recognition sequence [12]. In this format, the trans-splicing ribozymes specifically recognize a target site on a substrate RNA by base pairing, and replace the 3′-portion of the substrate RNA with their own 3′-exon. In cells, these trans-splicing ribozymes usually repair less than 10% of the target RNAs [12,13,14,15,16], probably because group I intron ribozymes were evolutionarily optimized for cis-splicing and not for trans-splicing. Indeed, evolving these ribozymes in the lab could increase their efficiencies [10]. Such an evolution of trans-splicing group I intron ribozymes is a good model system to study RNA evolution in cells because in addition to sampling the protein repertoire of the cell, the ribozymes report on a range of functions, such as the formation of a complex three-dimensional structure [17], the recognition of a substrate in trans [12,18], the catalysis of two transphosphorylation reactions [11], and conformational changes in the RNA structure [19].
To evolve trans-splicing ribozymes in cells it is possible to express large ribozyme libraries in cells, and design the ribozymes such that they can repair the mutated mRNA of chloramphenicol acetyl transferase (CAT) [10,13]. The CAT enzyme catalyzes the O-acetylation of the antibiotic chloramphenicol, with acetyl-CoA as the acetyl donor [20]. This acetylation renders chloramphenicol unable to inhibit the ribosome, and thereby mediates resistance to chloramphenicol [21]. Upon inactivation of the CAT mRNA by a mutation, chloramphenicol resistance is lost. Efficient trans-splicing ribozymes are able to repair the mutation-inactivated CAT mRNA in bacterial cells, thereby enabling the bacteria to grow on medium containing chloramphenicol. This allows for the selection of active trans-splicing ribozymes from populations with more than 10⁶ ribozyme variants [10,13]. Because selections contain all sequence diversity in the initial pool, we were interested to see what solutions a ribozyme population could find during evolution. Unlike a selection, an evolution introduces sequence diversity between multiple selection steps and thereby mimics more closely the biological evolution of RNAs [2].
Here we show the evolution of a trans-splicing group I intron ribozyme from Tetrahymena in E.coli cells, for 21 cycles of evolution. The resulting ribozyme population mediated bacterial growth at more than 10-fold higher chloramphenicol concentrations than the parent ribozyme. The focus of this study is to determine how the most efficient, evolved ribozyme was able to achieve high efficiency in cells. The most efficient ribozyme contained four mutations that caused its increased activity. Interestingly, these four mutations did not improve trans-splicing but appear to have recruited the transcription termination factor Rho and improved translation of the repaired mRNAs. These results shed light on how RNAs evolve in cells, by showing that a handful of mutations can be sufficient in an RNA to evolve binding to a protein, and mediate major effects on gene expression.

In vivo Evolution
The in vivo evolution was conducted essentially as described [10,13,22,23]. In short, a library plasmid was generated from a variant of the plasmid pUC19, by cloning the CAT cassette from plasmid pLysS between SphI and HindIII, and the ribozyme expression cassette between BamHI and SacI. This placed the CAT gene under the control of a constitutive promoter and the ribozyme cassette under the control of a downregulated version of the IPTG-inducible trc promoter, in which the −35 box was mutated from TTGACA to TTTACA [13,24]. The CAT gene contained a frameshift mutation, the deletion of T273 (position 1 is the A of the ATG start codon). In contrast to published evolution experiments [10], the splice site in the mutated CAT mRNA was at position 177 (counted from the A of the ATG translation start codon) and not at position 258. In each cycle of evolution the plasmids were isolated from the selected cells, and the ribozyme sequence was amplified by PCR and purified by agarose gel electrophoresis. Mutations or recombination events were introduced into the ribozyme gene during its amplification by mutagenic PCR [22] or the staggered extension process (StEP) [23], respectively. The mutagenesis made use of the lower fidelity of Taq DNA polymerase at higher magnesium concentrations and in the presence of 0.5 mM Mn²⁺. The recombination approach used 40 cycles of PCR with annealing times of five seconds and temperature ramping rates of 6 °C per second, without extension steps at 72 °C. Under these conditions, the PCR primers were incompletely extended in each cycle, dissociated from one template and annealed to a different template in the next cycle for further elongation, thereby facilitating recombination. These recombination conditions were chosen to mediate ~1 recombination event per ribozyme sequence, on average. The ribozyme gene was then ligated into fresh library plasmid, and ligation products were transformed into electrocompetent E.coli cells.
The cells were then plated on LB medium containing ampicillin, incubated, washed, and frozen as glycerol stocks. The replating efficiency and the percentage of plasmids containing the ribozyme insert were determined by plating on LB medium containing ampicillin, and colony PCR. The concentration of chloramphenicol in the plates of the selective step of the evolution was adjusted over successive evolution cycles to the fitness of the pool, increasing from 4 µg/mL in the first cycle to 70 µg/mL in the last cycle of the evolution. Because the high mutagenesis rate in the first cycle of evolution (~7.3 mutations per ribozyme) did not allow mutagenesis in the second cycle (otherwise no viable transformants resulted), cycles 3 to 8 were separated into branch A and branch B, with medium and low mutagenesis rates (4.8 and 2.4 mutations per ribozyme, respectively). Branch A required the lower chloramphenicol concentration of 2 µg/mL to avoid collapse whereas branch B was stable at 4 µg/mL or more. Therefore, the low mutagenesis rate was chosen in later cycles after the material from branches A and B was combined in cycle 9. In each cycle of the evolution, ten ribozyme sequences were obtained to follow the progress of the evolution. Site-directed mutagenesis was employed to generate mutations in the evolved ribozymes, using the QuikChange kit (Stratagene) according to the manufacturer's instructions.

Measurement of E.coli Growth Rates
The doubling times of E.coli cells were determined essentially as described [13], by treating a fresh overnight culture for 1 hour with 1 mM IPTG to induce ribozyme expression, diluting the cells to an OD600 of 0.05 in LB medium containing 1 mM IPTG and the appropriate concentration of chloramphenicol, shaking the cells at 37 °C, and measuring the increase in the OD600. The values of OD600 up to 0.6 were fitted to the exponential equation $\mathrm{OD}_{600} = a + b \cdot 2^{\mathrm{time}/c}$. The parameters a, b, and c were fitted by the least squares method, with the constraints that a ≥ 0 and b ≥ 0.025. The parameter a described non-dividing cells, b described dividing cells, and c corresponded to the doubling time. Doubling times larger than 100 minutes showed large variations between experiments and were therefore described as ">100 minutes".
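The fit described above can be sketched in a few lines of Python. This is a minimal illustration and not the software used in the study: the function names, the grid of candidate doubling times (10 to 200 minutes in 0.1-minute steps), and the strategy of solving a and b in closed form for each candidate c are all assumptions made for this example. For a fixed c the model is linear in a and b, so the constrained optimum lies either in the interior or on one of the two boundaries a = 0 and b = 0.025.

```python
def _sse(a, b, xs, ys):
    """Sum of squared residuals for the model y = a + b * x."""
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys))


def _fit_ab(xs, ys, a_min=0.0, b_min=0.025):
    """Least-squares a and b for fixed x values, with a >= a_min and b >= b_min."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    candidates = []
    if sxx > 0:
        b = sxy / sxx
        a = my - b * mx
        if a >= a_min and b >= b_min:
            candidates.append((a, b))  # unconstrained optimum is feasible
    # Boundary a = a_min: one-dimensional optimum in b, clamped to b_min.
    b_edge = sum(x * (y - a_min) for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    candidates.append((a_min, max(b_min, b_edge)))
    # Boundary b = b_min: one-dimensional optimum in a, clamped to a_min.
    candidates.append((max(a_min, my - b_min * mx), b_min))
    a, b = min(candidates, key=lambda ab: _sse(ab[0], ab[1], xs, ys))
    return a, b, _sse(a, b, xs, ys)


def fit_growth_curve(times_min, od_values, c_grid=None):
    """Fit OD600 = a + b * 2**(time / c); returns (a, b, c, sse)."""
    if c_grid is None:
        c_grid = [10 + 0.1 * i for i in range(1901)]  # hypothetical search range
    best = None
    for c in c_grid:
        xs = [2 ** (t / c) for t in times_min]
        a, b, sse = _fit_ab(xs, od_values)
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best


# Synthetic example: a 30-minute doubling time, sampled every 10 minutes.
times = list(range(0, 120, 10))
ods = [0.01 + 0.04 * 2 ** (t / 30) for t in times]
a, b, c, sse = fit_growth_curve(times, ods)
print(f"a={a:.3f}, b={b:.3f}, doubling time={c:.1f} min")
```

In practice only readings with OD600 up to 0.6 would be passed in, and a fitted doubling time above 100 minutes would be reported as ">100 minutes" rather than as a number.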

In vitro Trans-splicing Assay
Assays were performed as described previously [18]. In short, ribozymes or the full-length CAT MUT mRNA were generated by run-off transcription from PCR products and purified by denaturing polyacrylamide gel electrophoresis (PAGE). The ribozyme 3′-exon was truncated at its 3′-terminus so that reaction products could be size separated from substrates and intermediates. Purified CAT MUT mRNA was 5′-dephosphorylated by Antarctic phosphatase (New England Biolabs) and 5′-radiolabeled with T4 PNK (Invitrogen) and γ-[³²P]ATP. After PAGE purification, trace amounts of radiolabeled products were incubated with 1 mM ribozyme in a buffer containing 1 mM MgCl2, 135 mM KCl, 50 mM MOPS/KOH pH 7.0, 20 mM GTP, and 2 mM spermidine at 37 °C for 3 hours. Before the reaction, ribozymes were pre-incubated in reaction buffer without magnesium. Reaction products were separated on denaturing 7 M urea, 5% PAGE and quantified by phosphorimaging (PMI; Bio-Rad) using the Image Quant software. The percent of repaired CAT mRNA was calculated using the signal intensities of substrate, reaction intermediate, and reaction product. The values are the averages from three experiments.

Measurement of CAT Activity
The CAT activity was measured as described [25] but with 10-fold more cells because the highest levels of CAT activity were ~2.5-fold below the level resulting from the expression of functional CAT mRNA (data not shown). Cells were grown under the same conditions as the assay measuring the growth rate and harvested at an OD600 of 0.5. The cells from 2 mL culture were concentrated to 200 µL by centrifugation and frozen. After thawing, 200 µL of 200 mM Tris/HCl pH 7.8 and 10 mM Na2EDTA were added, then 4 µL of toluene were added and mixed. Fifteen µL of this solution were mixed with 135 µL of reaction buffer for the final concentrations of 1.0 mM 5,5′-dithiobis(2-nitrobenzoic acid) (DTNB), 0.2 mM acetyl-CoA, and 0.2 mM chloramphenicol. The absorption at 412 nm was measured every 15 seconds in a Nanophotometer (Implen). The slope in the time interval from 6 minutes to 15 minutes was obtained by linear least squares fitting. The units of CAT activity were calculated based on the extinction coefficient 13,600 M⁻¹ cm⁻¹ for the reaction product 5′-thio-2-nitrobenzoic acid, and the unit definition of CAT activity where one unit catalyzes the acetylation of 1 µmol of chloramphenicol per minute [25].
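The conversion from the absorbance slope to an acetylation rate follows the Beer-Lambert law and can be sketched as below. This is an illustrative sketch only: the function names, the 1 cm path length, and the 150 µL reaction volume are assumptions for the example (the actual optical path of the instrument is not stated here), and the result is expressed in µmol of product per minute, which equals CAT units under the common convention that one unit corresponds to 1 µmol per minute.

```python
def linear_slope(times_min, absorbances):
    """Ordinary least-squares slope of absorbance versus time (per minute)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    den = sum((t - mean_t) ** 2 for t in times_min)
    return num / den


def cat_rate_umol_per_min(times_min, absorbances, window=(6.0, 15.0),
                          volume_l=150e-6, path_cm=1.0, epsilon=13600.0):
    """Rate of product formation in µmol/min from A412 readings.

    window   : time interval (minutes) used for the linear fit
    volume_l : assay volume in liters (assumed value)
    path_cm  : optical path length in cm (assumed value)
    epsilon  : extinction coefficient of the product in M^-1 cm^-1
    """
    points = [(t, a) for t, a in zip(times_min, absorbances)
              if window[0] <= t <= window[1]]
    slope = linear_slope([t for t, _ in points], [a for _, a in points])
    molar_rate = slope / (epsilon * path_cm)  # mol per liter per minute
    return molar_rate * volume_l * 1e6        # µmol per minute


# Synthetic readings every 15 seconds for 20 minutes, rising by 0.0136 A per minute.
times = [i * 0.25 for i in range(81)]
readings = [0.1 + 0.0136 * t for t in times]
print(cat_rate_umol_per_min(times, readings))
```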

Fractionation of Ribosomes
Ribosome fractionations were done as described [26]. 100 mL of cell culture were grown as described above for growth rate measurements. Cells for replicate experiments were from three separate biological samples. The cells were treated with 400 µg/mL chloramphenicol to stall ribosomes and immediately cooled on ice. Cells were lysed using lysozyme (1 mg/mL in 20 mM Tris/HCl pH 7.5, 15 mM MgCl2, and 400 µg/mL chloramphenicol) and freeze-thawed, followed by treatment with 0.5% (w/v) sodium deoxycholate. After sedimentation of cell debris and genomic DNA the A260 was measured and 180 µg of RNA were loaded on each 10%-40% sucrose gradient (11 mL volume with 20 mM Tris/HCl pH 7.5, 10 mM MgCl2, 100 mM NH4Cl, and 2 mM β-mercaptoethanol), at 0 °C. After centrifugation (3 hours at 260,000 × g at 0 °C) the gradients were fractionated into 1 mL fractions, with the 70 S peak collected in the first fraction. The observed, increased abundance of RNAs on polysomes was not caused by higher cellular ribosome concentrations because the samples loaded on the sucrose gradients were normalized for their absorption at 260 nm, which is dominated by ribosomal RNA.

RNA Isolation
RNA was isolated from 100 µL of each fraction of the sucrose gradient, immediately after fractionation of the gradients, using the Nucleospin RNA II kit (Macherey Nagel) according to the manufacturer's instructions, with on-column DNase digestion. Total RNA was isolated from 2 mL of E.coli culture grown logarithmically in LB medium containing 100 µg/mL ampicillin and 1 mM IPTG. Immediately after the OD600 had reached 0.5, the cells were pelleted by centrifugation, and the RNA was isolated. RNA isolation from cells was done with the RNeasy mini kit (Qiagen) according to the manufacturer's instructions with on-column DNase digestion. For each replicate experiment, three separate biological samples were used to prepare total RNA.

RT-qPCR
For reverse transcription, 200 ng of total RNA or the RNA corresponding to 13 µL of sucrose gradient (up to 310 ng, as estimated from the A260) were used as templates per 20 µL reaction, with Superscript III reverse transcriptase (Invitrogen) according to the manufacturer's instructions. The reaction products were diluted with water such that the subsequent qPCR quantification cycles were between ~15 and ~30 cycles. For gradient fractions these dilutions were 15-fold for cysG and GAPDH, 50-fold for CAT pre-mRNA, ribozyme, and repaired CAT mRNA, and 500-fold for 16S rRNA. For total RNA a dilution of 500-fold for all samples gave consistent results. At least two dilutions were tested for each sample, which confirmed the linearity of all assays. Quantitative PCR was performed on the Fast 7500 machine (Applied Biosystems), using the SYBR green qPCR master mix (Applied Biosystems), and an amplification protocol of 95 °C/30 seconds, 57 °C/30 seconds, and 72 °C/30 seconds. In all cases, melting profiles confirmed that specific PCR products were quantitated. The amounts of RNAs were calculated by correlating the quantification cycle value with qPCR of a plasmid with known concentration (confirming an amplification of about 2-fold per PCR cycle), assuming that a cell density (OD600) of 1 corresponded to 2 × 10⁸ cells/mL, and assuming that no losses occurred during sample preparations. The RT primer was the same for substrate, ribozyme, and product (5′-CACCGTCTTTCATTGC). The 5′-PCR primers and 3′-PCR primers were 5′-CCGTTCAGCTGGATATTACG and 5′-CATACGGAATTCCGGATGAG (CAT pre-mRNA), 5′-AGTGATGCAACACTGGAGCC and 5′-TACTACCGATACGTACACTG (ribozyme), and 5′-CCGTTCAGCTGGATATTACG and 5′-TACTACCGATACGTACACTG (CAT mRNA). The silent mutations in the ribozyme 3′-exons were insufficient to rigorously differentiate between CAT pre-mRNA and repaired CAT mRNA, which was visible in some cross-amplification between samples in pilot experiments (data not shown).
Therefore, the RT-qPCR experiments were conducted with a modified ribozyme 3′-exon sequence containing a specifically generated primer binding site, and a complementary 3′-PCR primer. This made it possible to discriminate between mutated CAT mRNA and repaired CAT mRNA without detectable cross-amplification. Note that this also introduced a stop codon into the 3′-exon but this does not affect the conclusions because the cells were not grown in the presence of chloramphenicol, and because similar results (with some cross-amplification between primers) were obtained when the 3′-primer sites for mutated CAT mRNA and repaired CAT mRNA differed only in the silent mutations contained in the ribozyme 3′-exon (not shown). Although GAPDH is frequently used as reference RNA in RT-qPCR, its reliability has been questioned; therefore we included an additional reference RNA with supposedly more stable expression, cysG [27]. The RT primer, 5′-primer, and 3′-primer were 5′-GTTGTCGTACCAGGATAC, 5′-ACTTACGAGCAGATCAAAGC, and 5′-AGTTTCACGAAGTTGTCGTT for GAPDH, respectively. The same primers were 5′-TTAACATGCCTGCATCTG, 5′-TTGTCGGCGGTGGTGATGTC, and 5′-ATGCGGTGAACTGTGGAATAAACG for cysG, respectively. Additionally, 16S rRNA was included to serve as control in the polysome fractionation experiments. For 16S rRNA the RT primer was 5′-GTATTACCGCGGCTGCTG and the 5′- and 3′-PCR primers were 5′-CTCTTGCCATCGGATGTGCCCA and 5′-CCAGTGTGGCTGGTCATCCTCTCA [27].
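The quantification logic of this section (comparison of quantification cycles with a plasmid standard, about 2-fold amplification per cycle, and a fixed number of cells per OD600 unit) can be illustrated with the short sketch below. The function names and the example numbers are hypothetical and chosen only for illustration. The sketch also follows the stated simplification that no material is lost during sample preparation.

```python
def copies_from_cq(cq_sample, cq_standard, standard_copies, efficiency=2.0):
    """Template copies in a qPCR well, relative to a plasmid standard.

    Each cycle by which the sample appears earlier than the standard
    corresponds to an `efficiency`-fold higher starting amount.
    """
    return standard_copies * efficiency ** (cq_standard - cq_sample)


def cells_in_sample(od600, volume_ml, cells_per_ml_per_od=2e8):
    """Number of cells harvested, assuming a fixed cell count per OD600 unit."""
    return od600 * volume_ml * cells_per_ml_per_od


def copies_per_cell(cq_sample, cq_standard, standard_copies,
                    dilution_factor, fraction_of_sample_in_well,
                    od600, volume_ml):
    """Copies per cell, scaling the well content back to the whole sample.

    dilution_factor            : dilution of the RT reaction before qPCR
    fraction_of_sample_in_well : share of the undiluted sample represented
                                 by the volume pipetted into the well
    """
    in_well = copies_from_cq(cq_sample, cq_standard, standard_copies)
    total = in_well * dilution_factor / fraction_of_sample_in_well
    return total / cells_in_sample(od600, volume_ml)


# Hypothetical numbers: the sample crosses the threshold 3 cycles before
# a standard containing 1000 plasmid copies.
print(copies_from_cq(cq_sample=20, cq_standard=23, standard_copies=1000))
print(cells_in_sample(od600=0.5, volume_ml=2))
```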

Pull-down Experiments with Biotinylated RNA Hairpins
E.coli cells were grown in LB medium to an OD600 of 0.5, the cells were washed twice in cold 0.2× PBS, and frozen in 1/150 of the cell culture volume. The cells were thawed and suspended in cold 0.2× PBS with 1 mM Na2EDTA, 0.1% (w/v) Triton X-100 and 1 mg/mL lysozyme. After 10 minutes incubation on ice the cells were frozen in liquid nitrogen and thawed five times, then centrifuged. The supernatant was used in the following steps. To 0.3 mg of washed streptavidin-magnetic beads (Promega), 500 pmol of heat-renatured, biotinylated RNA hairpins in 0.2× PBS were added and incubated for 10 minutes on ice. The supernatant was removed from the beads and 660 µL of cell lysate supernatant with 6.6 µL of 300 mM MgCl2 were added. After incubation on ice for 10 minutes the beads were washed three times with 0.2× PBS and 0.01% of Triton X-100. Proteins were eluted with LDS sample buffer under heat denaturation (2 minutes at 80 °C). Samples were run on SDS polyacrylamide gradient gels (4%-12%) and silver stained with the Focus Fast-silver kit (G-Biosciences). Specific bands were excised, destained, and analyzed by the UCSD mass spec facility. Peptides were compared to the E.coli database, and only peptides with a confidence of at least 95% were reported. Proteins resulting from sample handling (human keratin and porcine trypsin) were omitted. The GenBank Accession Numbers of the identified proteins are 170083270 (Rho transcription termination factor), 170079798 (protease Do), 170082766 (periplasmic protease), 170080341 (methylthio transferase), 170081402 (succinarginyl dihydrolase), 170083054 (metal dependent hydrolase), 170080578 (inner membrane protein), and 170083396 (glycerol kinase).

Evolution of Trans-splicing Ribozymes in Cells
The experimental procedure to evolve trans-splicing variants of the Tetrahymena group I intron ribozyme in E.coli cells was similar to a previously published procedure [10,13]. In short, a trans-splicing ribozyme was co-expressed with a mutation-inactivated mRNA of chloramphenicol acetyltransferase (CAT) (Figure 2A).
The ribozyme's 5′-terminal targeting region was complementary to a splice site on the mutated CAT mRNA, and its 3′-exon was designed to repair the mutated 3′-portion of the CAT mRNA. Therefore, efficient ribozymes facilitated the expression of functional CAT enzyme and allowed their host cells to grow on medium containing chloramphenicol (Figure 2B). Mutagenic PCR [28] was used to introduce mutations into the population of ribozyme genes. In each cycle of the evolution, an average of 7 × 10⁵ viable bacterial cells was plated on medium containing chloramphenicol, thereby selecting for ribozymes that worked efficiently in cells. To avoid artifacts based on mutations in the E.coli genome or the plasmid, the library plasmids were isolated in each evolutionary cycle from the grown bacterial colonies, and their ribozyme genes were isolated, amplified by PCR with mutagenesis or recombination, purified by agarose gel electrophoresis, and re-cloned into fresh library plasmids. These plasmids were transformed into fresh E.coli cells, completing one cycle of the evolution. The evolution procedure used here differed from that in our related study [10] by its population sizes, chloramphenicol concentrations, the application of recombination, and the number of evolution rounds.
Starting from a single sequence, the Tetrahymena group I intron ribozyme gene was subjected to 21 cycles of evolution (Figure 2C). The selection pressure was adjusted in each cycle of the evolution to the fitness of the evolving population such that an average of 2 × 10⁴ clones (~3%) formed visible colonies. This allowed building and maintaining population diversity and enriching for increasingly active ribozyme variants. The second cycle of evolution did not use mutagenic PCR because the first cycle used such high mutagenesis that no viable colonies resulted when mutagenesis was included in the second cycle. The following cycles of evolution (cycles 3-8) used two different levels of mutagenesis, a medium level in branch A and a low level in branch B. After cycle 8 the material from both branches was combined and mutagenesis was used only at the low error rate. Note that both branches resulted in the same average increase of 1.1 mutations per ribozyme and per cycle (Figure 2D) despite the different mutagenesis rates. This is only 23% and 46% of the mutations introduced by mutagenic PCR into branch A and branch B, respectively. Therefore, most of the introduced mutations were culled from the population during the selective step, and only a fraction of the introduced mutations ended up increasing the genetic diversity of the evolving ribozyme population.
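The retained fractions quoted above follow from dividing the observed accumulation of 1.1 mutations per ribozyme and cycle by the mutagenesis rates given in the methods (4.8 and 2.4 mutations per ribozyme for branch A and branch B). A brief sketch of that arithmetic, with a hypothetical function name:

```python
def retained_fraction(accumulated_per_cycle, introduced_per_cycle):
    """Fraction of newly introduced mutations that persist after selection."""
    return accumulated_per_cycle / introduced_per_cycle


# Mutations introduced per ribozyme and cycle in each branch.
introduced = {"branch A": 4.8, "branch B": 2.4}

for branch, rate in introduced.items():
    fraction = retained_fraction(1.1, rate)
    print(f"{branch}: {fraction:.0%} of introduced mutations retained")
```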
Recombination was introduced as a PCR-based technique [23] into the evolution procedure starting at cycle 9 (Figure 2C). Our rationale to include recombination was to allow the combination of multiple beneficial mutations from separate ribozyme sequences into one ribozyme sequence, and to allow the removal of deleterious mutations from otherwise efficient ribozymes. The recombination likely did not benefit the accumulation of the four most important, beneficial mutations (see two paragraphs below) because these four mutations occurred within 6 nucleotides, which made it very unlikely that a recombination event occurred between them. To test whether recombination was successful in removing deleterious mutations we compared 40 sequences from cycles 7A, 7B, 8A, and 8B (before recombination) to 40 sequences from cycles 15 to 18 (after recombination). As a measure for deleterious mutations we counted the mutations that occurred in the conserved core of the ribozyme [17], with exception of the P1 helix. This conserved core consisted of 93 nucleotides (G96-G117, C204-U221, A252-G282, C296-A314, and U412-G414). Before the recombination cycles, each set of 10 sequences contained 4.3 ± 1.3 mutations in the conserved core, whereas after the recombination cycles, this value was reduced to 1.0 ± 0.8 mutations. This 4.3-fold reduction of deleterious mutations by recombination is an underestimate because the total number of mutations was 24 ± 10% higher in cycles 15-18 than in cycles 7A, 7B, 8A, and 8B (Figure 2D). These results suggested that the recombination events played an important role in removing deleterious mutations from the evolving ribozyme population.
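The counting of conserved-core mutations can be illustrated as follows. The sketch encodes the five nucleotide ranges listed above, confirms that they add up to 93 positions, and counts how many of a clone's mutated positions fall inside them. The function names and the example mutation list are hypothetical, and the actual sequence analysis may have been carried out differently.

```python
# Conserved core of the ribozyme as inclusive nucleotide ranges
# (G96-G117, C204-U221, A252-G282, C296-A314, U412-G414).
CORE_RANGES = [(96, 117), (204, 221), (252, 282), (296, 314), (412, 414)]


def core_size(ranges=CORE_RANGES):
    """Total number of nucleotide positions covered by the ranges."""
    return sum(end - start + 1 for start, end in ranges)


def count_core_mutations(mutated_positions, ranges=CORE_RANGES):
    """Number of mutated positions that lie within the conserved core."""
    return sum(
        1 for pos in mutated_positions
        if any(start <= pos <= end for start, end in ranges)
    )


print(core_size())  # 93 positions

# The four P6b mutations lie outside the conserved core.
print(count_core_mutations([236, 238, 239, 241]))

# Hypothetical clone with mutations at positions 100, 236, and 300.
print(count_core_mutations([100, 236, 300]))

# Fold reduction in core mutations per set of 10 sequences.
print(4.3 / 1.0)
```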
To enrich for the most efficient ribozymes in the population, the selection pressure was raised after 17 cycles of evolution. This was done by increasing the chloramphenicol concentrations in the selection medium to 70 mg/mL while omitting mutagenesis and recombination. Satisfyingly, the average fitness of the evolving pool increased strongly ( Figure 2E).
The enrichment of specific mutations during the evolution was followed by the analysis of 10 ribozyme sequences in every evolution cycle. Several regions of the ribozyme accumulated many mutations, such as the P8 stem-loop and the P9-P9.2 regions, with the highest frequency of mutations in the P6b stem-loop (Figure 3A). The frequency of these mutations rose during the evolution, most markedly after evolution cycle 9, where recombination was introduced (Figure 3B). The high frequency of mutations in the P6b stem-loop, namely at positions 236, 238, 239, and 241, was clearly visible in the sequencing chromatogram of the ribozyme pool after 21 cycles of evolution, where three of these four mutations appeared to represent the dominant sequence (Figure 3C).
To identify individual ribozyme sequences with high activity the sequences of 30 ribozymes were obtained from these last three cycles of evolution ( Figure S1). Comparison of these sequences revealed 15 clones that jointly possessed all mutations that appeared at least twice among the 30 sequences. These 15 clones were individually tested in E.coli cells, for their effect on the doubling time in suspension culture in the presence of chloramphenicol. The clone with the highest activity showed cell-doubling times about 2-fold below the cell doubling times with the parent ribozyme, over a wide range of chloramphenicol concentrations ( Figure 4). Therefore, this most active ribozyme clone was chosen for further analysis.

Four Clustered Mutations Mediate High Ribozyme Efficiency in Cells
The ribozyme sequence with the highest activity in cells contained twelve mutations relative to the parent ribozyme ( Figure 5). To identify the mutations that were necessary for the highest activity, we individually reverted each of the 12 mutations to the parent ribozyme sequence and measured the effect of these reversions on the ribozyme activity in cells. Only four of the 12 revertants had decreased activity, thereby identifying the mutations U236C, U238C, U239C, and U241A as necessary for full activity. To test whether these four mutations were also sufficient for full activity, we constructed the ribozyme that differed only in these four mutations from the parent ribozyme sequence, and termed it M4. Indeed, this M4 ribozyme facilitated the same cell doubling time as the most efficient ribozyme from the evolution. Therefore, the four mutations were necessary and sufficient for full activity in vivo. The same four mutations were identified in a similar evolution experiment [10]. However, it was unknown how these four mutations were able to mediate higher antibiotic resistance of the cells.
Interestingly, all four mutations of M4 were positioned in the P6b stem-loop (Figure 5). These positions are exposed to the solvent and distant from the catalytic site of the ribozyme [29,30]. To investigate whether these mutations acted intramolecularly (e.g. by aiding folding) or intermolecularly (e.g. by binding to a cellular factor) we transplanted the sequence of the mutated P6b stem-loop to the P8 stem-loop. The P8 stem-loop appeared to be a good target for this transplantation because like the P6b stem-loop, the P8 stem-loop is distant from the active site, not involved in tertiary interactions, and solvent-exposed [29]. The results showed that the ribozyme with the evolved mutations in the P8 loop was similarly efficient in cells, demonstrating that the precise position of the mutated stem-loop was not crucial for its activity. These results suggested that the mutations did not act intramolecularly (e.g. by aiding folding) but served to interact with a cellular factor.

Figure 3. ... [51] are given at the top. The short helices P3, P7, and P9.0 are not labeled due to space constraints. The position of the P6b loop is indicated by an empty triangle. B, The frequency of mutations that appeared at least 15 times in total is shown over the cycles of evolution. The nucleotide position of these mutations is given in the left-most column. Ten sequences were obtained for each cycle, with blue denoting at least 2 mutations and red denoting at least four mutations detected for that cycle and nucleotide position. For cycles 3-8, where the evolution was split into two branches, the color-coding is averaged over both branches. C, The sequencing chromatogram of the evolved plasmid library after cycle 21 shows the enrichment of mutations 238, 239, and 240 in the ribozyme population (top), compared to the parent sequence (bottom). Note that at position 236, "C" (blue) represents only a small proportion of the evolved pool. doi:10.1371/journal.pone.0086473.g003

The Four Evolved Mutations Facilitate Protein Binding
To identify a cellular factor that bound to the evolved P6b stem-loop we performed pull-down experiments with lysates from E.coli cells (Figure 6). The pull-down experiments utilized biotinylated RNA stem-loops that contained the sequence of the parent P6b stem-loop or that of the M4 P6b stem-loop, with four additional base pairs to stabilize the helices (Figure 6A). The proteins that were pulled down via the RNA stem-loops were separated by denaturing SDS-polyacrylamide gels, visualized by silver staining (Figure 6B), and identified by mass spectrometry (Figure 6C). The transcription termination factor Rho was the only protein that preferentially interacted with the M4 hairpin compared to the parent hairpin, in three independent experiments.
The known RNA binding specificity of Rho fits well with the C-rich sequence of the evolved P6b stem-loop (Figure 6A): Rho is known to bind oligo(C) sequences [31], and the mutations in the evolved P6b stem-loop generated a (C)5 sequence. When we extended the oligo(C) sequence in the P6b stem-loop by mutating 5′-AGACCCCCA-3′ of the M4 ribozyme to 5′-ACCCCCCCA-3′ and 5′-CCCCCCCCC-3′ it resulted in the same cell doubling times as M4 (34 ± 2 minutes for the M4 ribozyme and 35 ± 1 minutes and 36 ± 1 minutes for the mutants, respectively). These results were consistent with the model that the characteristic of a high C-content in the P6b stem-loop of the M4 ribozyme mediated the recruitment of Rho.
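The length of the oligo(C) stretch in each tested P6b variant can be read off with a small helper, shown here as a minimal sketch. The function name is hypothetical. The three sequences are the ones quoted in the text above, written 5′ to 3′.

```python
def longest_c_run(sequence):
    """Length of the longest uninterrupted stretch of C in a sequence."""
    best = run = 0
    for base in sequence.upper():
        run = run + 1 if base == "C" else 0
        best = max(best, run)
    return best


# P6b loop sequences of the M4 ribozyme and the two extended oligo(C) mutants.
variants = {
    "M4": "AGACCCCCA",
    "mutant 1": "ACCCCCCCA",
    "mutant 2": "CCCCCCCCC",
}

for name, seq in variants.items():
    print(name, seq, "longest C run:", longest_c_run(seq))
```

For the M4 sequence this returns 5, matching the (C)5 stretch described in the text. That the longer runs in the two mutants did not shorten the doubling time further is consistent with the five consecutive cytidines of M4 already being sufficient for the observed effect.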

Effects of the Four Evolved Mutations on Ribozyme Function
To investigate how the four mutations in the M4 ribozyme increased ribozyme efficiency in cells we measured the effects of the mutations on ribozyme function in vitro and in vivo (Figure 7). The in vitro trans-splicing activity of the M4 ribozyme was not increased over the parent ribozyme ( Figure 7A). We were surprised to find that the amount of repaired CAT mRNA was also not significantly increased in cells with the M4 ribozyme relative to the parent ribozyme, as judged by RT-qPCR analysis of total RNA from cells ( Figure 7B). In contrast, the CAT enzyme activity in cell lysate was increased by 9-fold, in cells with the M4 ribozyme relative to the parent ribozyme ( Figure 7C). This suggested that the M4 ribozyme led to more efficient translation of the trans-spliced CAT mRNA because the cells containing the M4 ribozyme generated 9-fold higher CAT enzyme activity than the parent ribozyme, from similar levels of trans-spliced CAT mRNA.
To test whether the M4 ribozyme led to a more efficient recruitment of ribosomes to the CAT mRNA we isolated ribosomes from E.coli cells and fractionated them on sucrose gradients into 70 S fractions and polysome fractions (Figure S2). Efficiently translated RNAs were expected to be associated more with polysomes, whereas inefficiently translated RNAs were expected to be associated more with single ribosomes. To determine which RNA was associated with polysomes or single ribosomes we isolated the RNA from these fractions and measured the concentration of mutated CAT mRNA, ribozyme, repaired CAT mRNA, 16S rRNA, and two control mRNAs (GAPDH and cysG) (Figure S3). The percentage of RNA associated with polysomes compared to single ribosomes was between 5-fold and 55-fold higher in cells with the M4 ribozyme than in cells with the parent ribozyme (Figure 7D). The association of the ribozyme and the repaired CAT mRNA with polysomes was increased ~50-fold, and the association of other mRNAs with polysomes was increased 5- to 10-fold. The percentage of 16S rRNA on polysomes was increased at the intermediate level of 16-fold. This suggested that the M4 mutations caused a non-specific upregulation of polysome assembly in the E.coli cells. These results (a weaker, non-specific upregulation of polysome assembly for several unrelated mRNAs and 16S rRNA, and a stronger, specific assembly of polysomes on CAT mRNA) support the interpretation that an increase in translation efficiency is the reason why the four mutations in the M4 ribozyme lead to 9-fold higher levels of CAT enzyme from similar levels of the repaired CAT mRNA.

Discussion
This study describes the experimental evolution of a trans-splicing group I intron ribozyme for increased efficiency in E.coli cells. Four mutations mediated the activity increase of the most efficient evolved ribozyme. The mutations created a (C)5 sequence in the P6b stem-loop of the ribozyme, which facilitated binding to the protein Rho and caused a widespread effect on gene expression in E.coli cells. The 9-fold increase in activity of the gene product of the targeted mRNA, chloramphenicol acetyl transferase (CAT), appeared to be caused not by an increase in trans-splicing activity but by an increase in translation of the trans-spliced CAT mRNA.
Because the four evolved ribozyme mutations facilitated the binding of Rho, understanding the function of Rho is important for understanding why the four ribozyme mutations were beneficial. The biological function of Rho is transcription termination (for recent overviews see [32,33,34,35]). Rho forms circular homohexamers that fluctuate between an open and a closed conformation. Upon binding to the C-rich rut sites on mRNA, the Rho hexamers close, with the mRNA threaded through the center of the donut-shaped hexamer [34,36,37,38]. The ATP-dependent motor of Rho then forces the Rho hexamer to migrate along the mRNA in the 5′-to-3′ direction [34,39]. Because this process is co-transcriptional, Rho can catch up with the transcription elongation complex (TEC) when the TEC pauses. Upon reaching the TEC, the helicase activity of Rho separates the nascent RNA transcript from the DNA transcription bubble [40,41], thereby terminating transcription. This function of Rho takes place on a genome-wide scale to match transcription with translational needs [35].
How could the interaction between the evolved M4 ribozyme and the transcription termination factor Rho cause the observed effects on gene expression in E.coli? Rho is known to have a micromolar affinity towards (C)7 and (C)8 [31], and poly(C) can trap Rho in a state that is termination-inactive [42]. Therefore, it appears plausible that the (C)5 sequence in the M4 ribozyme reduced Rho-mediated transcription termination activity in E.coli cells.
The apparent effect of the M4 mutations on the presence of RNAs on polyribosomes was about 50-fold for the repaired CAT mRNA and for the ribozyme, which probably stayed associated with the repaired CAT mRNA, and 5- to 10-fold for unrelated RNAs (see the 5- to 10-fold effect on GAPDH and cysG in Figure 7). A non-specific reduction of Rho activity by the M4 mutations would be possible because the estimated 17,000 M4 ribozyme molecules per cell (Figure 7B) outnumbered the ~5,000 Rho monomers per cell (Rho contributes 0.1-0.15% to the E.coli cell protein [43] and each cell has ~340 fg of protein [44]). This interpretation of the non-specific effect is also consistent with the widespread effect of Rho on E.coli gene expression [35]. In contrast, the stronger, specific effect of the M4 ribozyme on repaired CAT mRNA (~50-fold, see Figure 7) can be explained by the specific co-localization of the ribozyme with CAT mRNA: the M4 ribozyme is targeted to its splice site on the emerging CAT mRNA, and the resulting higher local concentration of the (C)5 sequence of the M4 ribozyme would preferentially inhibit Rho molecules near CAT mRNA. In summary, we propose that the difference in strength between the non-specific effect (5- to 10-fold) and the specific effect (~50-fold) on the presence of RNAs on polysomes was caused by the localization of the M4 ribozyme on the CAT mRNA but not on unrelated mRNAs. It is currently unclear by what exact mechanism Rho could modulate the assembly of polysomes. Hypotheses for such a mechanism could be based on Rho's transcription terminator function, the observation that translation in E.coli is co-transcriptional [45], and the finding that Rho coordinates transcription with translation on a genome-wide level [35].
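The copy-number comparison in this paragraph can be retraced with a short calculation. The sketch below is illustrative only: it takes the protein mass per cell and the Rho mass fraction from the text, and it assumes a Rho monomer mass of about 47 kDa, a value that is not stated in the text. The function name and parameters are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def rho_monomers_per_cell(protein_fg, rho_mass_fraction, rho_monomer_da=47_000.0):
    """Estimate Rho monomers per cell from total protein mass and Rho's mass fraction.

    rho_monomer_da is an ASSUMED monomer mass (about 47 kDa); it is not given in the text.
    """
    rho_grams = protein_fg * 1e-15 * rho_mass_fraction  # fg -> g, then Rho's share
    return rho_grams / rho_monomer_da * AVOGADRO        # g -> mol -> molecules


# Values quoted in the text: ~340 fg protein per cell, Rho at 0.1-0.15% of cell protein
low = rho_monomers_per_cell(340.0, 0.0010)
high = rho_monomers_per_cell(340.0, 0.0015)
ribozymes_per_cell = 17_000  # estimate quoted in the text

print(f"Rho monomers per cell: {low:,.0f} to {high:,.0f}")
print(f"Ribozymes per Rho monomer: {ribozymes_per_cell / high:.1f} to {ribozymes_per_cell / low:.1f}")
```

Under the assumed monomer mass, the range brackets the ~5,000 monomers quoted above, so the ribozyme would be in several-fold excess over Rho monomers and in larger excess over Rho hexamers.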
Because the (C)5 sequence in the P6b stem-loop of the M4 ribozyme had such a profound effect on gene expression in E.coli, we hypothesized that this effect would exert a strong selection pressure on the abundance of oligo(C) sequences in the E.coli genome. Indeed, (C)n homopolymers are strongly underrepresented in the E.coli genome (Figure 8). The sequences (C)5, (C)6, and (C)7 are represented at only 27%, 16%, and 12% of the value expected from an unbiased distribution. This effect is not caused by a nucleotide bias in the E.coli genome, which contains 25.4% C and 25.37% G. The underrepresentation is visible even at the level of triplet sequences, where CCC and its reverse complement GGG are present at only 53% and 52% of their expected frequencies, respectively. Only the triplet TAG and its reverse complement CTA are represented less (both at 36%), perhaps due to the role of TAG as a stop codon. This strong underrepresentation of oligo(C) in the E.coli genome suggests that (C)n oligomers have a biological role. This role may be the interference with Rho, consistent with the postulated role of the evolved (C)5 sequence in the M4 ribozyme (see above).
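One way to express such an observed-to-expected comparison is sketched below. This is a minimal illustration, not the procedure used for Figure 8: it assumes that the expected count follows from independent base frequencies on a single strand, and that overlapping occurrences are counted. Both choices, and all function names, are assumptions.

```python
from collections import Counter


def count_overlapping(seq, kmer):
    """Count occurrences of kmer in seq, allowing overlapping matches."""
    k = len(kmer)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer)


def observed_over_expected(seq, kmer):
    """Ratio of the observed k-mer count to the count expected from base composition alone.

    ASSUMPTION: bases are treated as independent, so the expected count is
    (number of windows) * product of the frequencies of the k-mer's bases.
    """
    seq, kmer = seq.upper(), kmer.upper()
    n, k = len(seq), len(kmer)
    if k == 0 or n < k:
        raise ValueError("sequence must be at least as long as a non-empty k-mer")
    freqs = Counter(seq)
    expected = float(n - k + 1)
    for base in kmer:
        expected *= freqs[base] / n
    if expected == 0.0:
        return float("nan")  # k-mer contains a base that is absent from seq
    return count_overlapping(seq, kmer) / expected


# Toy sequence for illustration only (not genome data)
toy = "ACGT" * 25
print(observed_over_expected(toy, "CC"))   # no CC in the toy repeat
print(observed_over_expected(toy, "CG"))   # CG occurs once per repeat
```

Applied to a full genome sequence loaded as a string, a ratio below 1 for "CCCCC" would correspond to the kind of underrepresentation described above; the reverse complement would be queried separately to cover the other strand.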
The mutations evolved in the P6b stem-loop showed a similar effect in cells when they were transplanted to the P8 stem-loop (Figure 5A). Therefore, one might wonder why the same activity-mediating mutations did not evolve in the P8 stem-loop. The answer may lie in the sequence of the stem-loops and the frequency of different mutation events in the mutagenic PCR method used [28]. The evolution of the P6b stem-loop from the sequence 5′-AGAUCUUCU-3′ to 5′-AGACCCCCA-3′ required three U to C transitions and one U to A transversion. These two mutation types were the two most frequent we found among the six possible transitions and transversions. In contrast, the P8 stem-loop 5′-GAUGUAUUC-3′ would most likely have had to arrive at the sequence 5′-GAUCCCCCA-3′ in order to widen the loop and generate a (C)5 sequence in the same position. The six mutations required for this change would have included not only four frequent mutations (one G to A, three U to C) but also two rare mutations (one A to C and one G to C). Therefore, the (C)5 sequence would have been much less likely to evolve in the P8 stem-loop.
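The substitution count for the P6b stem-loop can be checked position by position. The following sketch compares two equal-length RNA sequences and labels each difference as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion; the function name is hypothetical and the sequences are those quoted in the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def classify_substitutions(before, after):
    """Return (position, from_base, to_base, kind) for each mismatch between two RNA sequences.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(before) != len(after):
        raise ValueError("sequences must have equal length")
    changes = []
    for pos, (a, b) in enumerate(zip(before, after), start=1):
        if a == b:
            continue
        same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
        changes.append((pos, a, b, "transition" if same_class else "transversion"))
    return changes


# P6b stem-loop before and after evolution, as quoted in the text
p6b_changes = classify_substitutions("AGAUCUUCU", "AGACCCCCA")
for pos, a, b, kind in p6b_changes:
    print(f"position {pos}: {a} to {b} ({kind})")
```

For the P6b pair this lists three U to C transitions and one U to A transversion. The same function can be applied to the P8 pair quoted above, which differs at six positions.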
The evolution of a ribozyme population in cells presented here and in a related study [10] differs from previous selection studies [46,47,48,49,50] because evolution introduces sequence diversity over multiple selection steps, whereas selection introduces sequence diversity only in the initial library [2]. Five previous studies selected ribozymes from ribozyme libraries in cells. (i) Variants of the cis-splicing group I intron ribozyme from Tetrahymena were selected in E.coli cells, and showed improved splicing due to mutations in the P1 stem [48]. (ii) The selection of trans-splicing group I intron ribozymes from Tetrahymena identified the fittest ribozymes from a library of thirteen variants, thereby establishing a selection system in yeast cells [47]. (iii) Optimal hammerhead ribozyme cleavage sites in HIV-1 RNA were determined by randomizing the ribozyme's target recognition sequence and selecting the most efficient ribozymes in human embryonic kidney cells [49]. (iv) The linker region between hammerhead ribozymes and a theophylline aptamer was optimized from a library of 64 different linker sequences in E.coli cells [50]. (v) The stem II and loop II sequences of a self-cleaving hammerhead ribozyme were optimized by selection from 70,000 sequence variants in human HeLa cells [46]. These five studies led to ribozymes with improved folding and/or target binding characteristics, but none of them resulted in the recruitment of a cellular protein. In contrast, the evolution in the present study facilitated the binding of a protein by the ribozyme. Therefore, the introduction of mutations over many cycles of evolution, which mimics the natural evolution of RNAs more closely, may be an important factor in the evolution of RNA-protein interactions.

Figure 8. Frequency of oligo(C) sequences in the genome of E.coli. The graph shows the relative abundance of homopolymers in the genome of E.coli as a function of the length of the homopolymer sequence. Note that the values for oligo(C) (filled, black squares) and oligo(G) (filled, black circles) show the frequency of oligo(C) on the top and bottom strand of the genome, respectively. The values for oligo(A) and oligo(T) (filled triangles) are so similar that they cannot be distinguished in this graph. For comparison, the average of all triplet codons (white circle with standard deviation as error bars) and the frequency of the stop codon TAG and its reverse complement CTA (circles with grey filling) are shown. The genome used for this analysis was from E.coli K12 sub-strain MG1655 (gi|49175990). doi:10.1371/journal.pone.0086473.g008

Supporting Information
Figure S1 Mutations in the 30 ribozymes isolated from the three last cycles of the evolution (cycles 19, 20, and 21). (TIF)
Figure S2 Polyribosome analysis. Shown is the absorption at 254 nm during the fractionation of ribosomes and polyribosomes on a representative sucrose gradient, and the abundance of RNAs in gradient fractions as judged by RT-qPCR. (TIF)
Figure S3 Abundance of RNAs on ribosomes from E. coli strains expressing the parent ribozyme or the M4 ribozyme. The percentage of RNAs on polysomes is given. Data are shown for the mutated, primary transcript of CAT mRNA, the ribozyme, the repaired CAT mRNA, 16S rRNA, and the mRNAs of GAPDH and cysG. (TIF)