EPP, GCPA, MH, and JDD conceived and designed the experiments. EPP and GCPA performed the experiments. EPP, GCPA, JLJ, MH, and JDD analyzed the data. JLJ and BM contributed reagents/materials/analysis tools. MH and JDD wrote the paper.
The authors have declared that no competing interests exist.
A wide range of RNA viruses use programmed −1 ribosomal frameshifting for the production of viral fusion proteins. Inspection of the overlap regions between ORF1a and ORF1b of the SARS-CoV genome revealed that, similar to all coronaviruses, a programmed −1 ribosomal frameshift could be used by the virus to produce a fusion protein. Computational analyses of the frameshift signal predicted the presence of an mRNA pseudoknot containing three double-stranded RNA stem structures rather than two. Phylogenetic analyses showed the conservation of potential three-stemmed pseudoknots in the frameshift signals of all other coronaviruses in the GenBank database. Though the presence of the three-stemmed structure is supported by nuclease mapping and two-dimensional nuclear magnetic resonance studies, our findings suggest that interactions between the stem structures may result in local distortions in the A-form RNA. These distortions are particularly evident in the vicinity of predicted A-bulges in stems 2 and 3. In vitro and in vivo frameshifting assays showed that the SARS-CoV frameshift signal is functionally similar to other viral frameshift signals: it promotes efficient frameshifting in all of the standard assay systems, and it is sensitive to a drug and a genetic mutation that are known to affect frameshifting efficiency of a yeast virus. Mutagenesis studies reveal that both the specific sequences and structures of stems 2 and 3 are important for efficient frameshifting. We have identified a new RNA structural motif that is capable of promoting efficient programmed ribosomal frameshifting. The high degree of conservation of three-stemmed mRNA pseudoknot structures among the coronaviruses suggests that this presents a novel target for antiviral therapeutics.
A new structural and conserved element is identified within the SARS virus genome. The element is important for gene expression, and might be a useful target for antiviral drugs.
Severe acute respiratory syndrome (SARS) first appeared in Guangdong Province, China, late in 2002. Its rapid transmission and high rates of mortality and morbidity resulted in a significant threat to global health by the spring of 2003, and the epidemic had a significant effect on the public health and economies of locales affected by SARS outbreaks. The rapid response of the World Health Organization is credited with containing this contagion by late June 2003, and only a few cases were reported during the winter cold season of 2003–2004. The severity of this crisis mobilized the scientific community as well: by March 24, 2003, scientists at the Centers for Disease Control and Prevention and in Hong Kong had announced that a new coronavirus had been isolated from patients with SARS (reviewed in [
Coronaviruses are enveloped animal viruses that cause respiratory and enteric diseases. Analysis of the SARS-CoV genome revealed that, as in all coronaviruses, approximately 70% of its large, single-stranded (+) RNA genome, starting from the 5′ end, consists of two sizable genes called ORF1a and ORF1b. The 3′ ORF1b overlaps, and is out of frame with, its 5′ neighbor, ORF1a, and similar to other coronaviruses, a programmed −1 ribosomal frameshift (−1 PRF) was posited to be used by the virus to produce an ORF1a/1b fusion protein [
−1 PRF signals typically have a tripartite organization. From 5′ to 3′, these are composed of a heptameric “slippery site,” a “spacer” region, and a stable mRNA secondary structure, typically an mRNA pseudoknot (reviewed in [
(A) Two-stemmed H-type mRNA pseudoknot proposed by Thiel et al. [
(B) Three-stemmed mRNA pseudoknot structure investigated in this study.
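The tripartite organization of a −1 PRF signal (slippery site, spacer, downstream structure) can be illustrated with a minimal Python sketch. It is not the RNAMotif descriptor used in this study. The `X XXY YYZ` heptamer rule, the function names, the default spacer length, and the toy sequence are all assumptions introduced here for illustration.

```python
import re

# Assumed canonical slippery-site motif X XXY YYZ (taken from the general
# frameshifting literature, not defined in this section):
#   X = three identical bases, Y = AAA or UUU, Z = any base except G.
# The lookahead lets overlapping candidates be reported.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2(AAA|UUU)[ACU]))")


def find_slippery_sites(rna):
    """Return (0-based start, heptamer) for every candidate slippery site."""
    rna = rna.upper().replace("T", "U")  # accept DNA or RNA alphabets
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(rna)]


def split_signal(rna, start, spacer_len=6):
    """Split a candidate signal into (slippery site, spacer, downstream region).

    spacer_len is a hypothetical default; real spacers vary in length.
    """
    site = rna[start:start + 7]
    spacer = rna[start + 7:start + 7 + spacer_len]
    downstream = rna[start + 7 + spacer_len:]  # region to test for a pseudoknot
    return site, spacer, downstream


if __name__ == "__main__":
    toy = "GGCUUUAAACGGGUUUGCGG"  # made-up sequence, not the SARS-CoV signal
    print(find_slippery_sites(toy))  # [(3, 'UUUAAAC')]
    print(split_signal(toy, 3))
```

In the toy sequence, the heptamer `GGGUUUG` is rejected only because its final base is G. A scan of this kind nominates candidates; whether the downstream region can fold into a stimulatory structure still has to be assessed separately.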
The presence of a long, 29-nt loop 2 seemed to be unusual, prompting us to subject the sequence from positions 13392–13472 to additional computational analyses in an effort to further define the structure of this mRNA pseudoknot. The nucleotide sequence suspected of featuring a −1 PRF signal between ORF1a and ORF1b was scanned by RNAMotif [
To address the question of whether the potential to form a three-stemmed mRNA pseudoknot is unique to the SARS-CoV, we searched for such structures in all of the known viral −1 PRF signals listed in the RECODE 2003 database [
AIBV, avian infectious bronchitis virus; BCoV, bovine coronavirus; HCoV-229E, human coronavirus 229E; HCoV-HKU1, human coronavirus HKU1; HCoV-NL63, human coronavirus NL63; HCoV-OC43, human coronavirus OC43; MHV, murine hepatitis virus; PEDV, porcine epidemic diarrhea virus; SARS, SARS coronavirus; TGV, transmissible gastroenteritis virus. Heptameric slippery sites are indicated in brown; dashes indicate gaps in the sequence alignments; basepairing positions involved in the consensus first, second, and third helices are denoted by blue, red, and green nucleotides, respectively. Downstream regions homologous to the kissing loop known to promote frameshifting in HCoV-229E [
Unrooted tree constructed based on the multiple sequence alignment from
In light of the computational findings, we conducted biochemical analyses of the SARS-CoV frameshift signal using a [32P] 5′ end labeled SP6 RNA polymerase product spanning nucleotides 13399–13475. RNase A cleaves preferentially at single-stranded pyrimidine bases, RNase T1 cuts at single-stranded guanosine residues, and RNase V1 cleaves double-stranded RNA. We also examined alkaline hydrolysis cleavage patterns at low concentrations of sodium hydroxide to identify exposed phosphodiester bonds. Representative autoradiograms of the reactions are shown in
(A and B) The results of nuclease cleavage of RNA from nucleotides 13400–13470 of SARS-CoV. RNAs were 5′ end labeled with 32P and subjected to enzymatic digestion, as described in
(C) Interpretation of nuclease digestion analyses mapped onto the proposed secondary structure of the SARS-CoV frameshift signal. Nuclease cleavage sites, proposed basepairs, and specific bases protected from nuclease attack are indicated.
The nuclease mapping data are generally consistent with the computational predictions, showing double-stranded regions corresponding to all three stems. Some notable deviations from the predicted structure were observed, however. These fell into three general classes. One class consisted of distortions in predicted helical A-RNA structures, typified by bases that were equally digested by both single- and double-strand-specific nucleases and by nearby bases that were refractory to nuclease attack. These clustered in the middle of stem 1 (13406–13410 and 13427), near the middle of stem 2 (13419–13420), and in the middle of stem 3 (13436–13439 and 13458–13461). Another major group consisted of bases located in regions predicted to link the three stems that were completely protected from nuclease attack. Specifically, these were G13405 and G13435 at the stem 1/stem 3 junction, G13414 and G13423–C13425 at the stem 1/stem 2 junction, and C13463 and U13464, which link stem 2 with stem 3. We also observed enhanced susceptibility of the three pyrimidines in the predicted loop 2 region (C13447, U13448, and U13451) to attack by both single- and double-strand-specific endonucleases, suggesting that this region is structurally dynamic under the conditions assayed. The ability of the bulged adenosine residue at position 13467 to be recognized by RNase V1 indicates that it is involved in a basepairing interaction, whereas A13446, A13452, and A13456 were not recognized by RNase V1, indicating that they are not. The three bases at the 5′ and 3′ terminal ends of the molecule could not be meaningfully resolved.
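The logic used to interpret the nuclease mapping can be sketched as follows. The enzyme specificities are those stated above (RNase A for single-stranded pyrimidines, RNase T1 for single-stranded guanosines, RNase V1 for double-stranded RNA). The function names, category labels, and example calls are hypothetical and do not reproduce the measured data.

```python
# Enzyme classes as described in the text.
SINGLE_STRAND = {"RNase A", "RNase T1"}
DOUBLE_STRAND = {"RNase V1"}


def classify_position(enzymes_cutting):
    """Classify one nucleotide from the set of nucleases that cleave it."""
    cuts = set(enzymes_cutting)
    ss = bool(cuts & SINGLE_STRAND)
    ds = bool(cuts & DOUBLE_STRAND)
    if ss and ds:
        return "ambiguous"        # cut by both classes: distorted or dynamic
    if ss:
        return "single-stranded"
    if ds:
        return "double-stranded"
    return "protected"            # refractory to every nuclease tested


def informative_single_strand_enzyme(base):
    """Return the single-strand-specific enzyme that can report on this base."""
    if base in ("C", "U"):
        return "RNase A"          # pyrimidines
    if base == "G":
        return "RNase T1"         # guanosine
    return None                   # adenosine: neither enzyme is informative


if __name__ == "__main__":
    # Hypothetical cleavage calls for three positions, for illustration only.
    example = {
        "pos_a": {"RNase V1"},
        "pos_b": {"RNase A", "RNase V1"},
        "pos_c": set(),
    }
    for name, cuts in example.items():
        print(name, classify_position(cuts))
```

The "ambiguous" and "protected" categories correspond to the first two classes of deviation described above. Because neither single-strand-specific enzyme reports on adenosines, the status of bulged adenosines rests on RNase V1 alone.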
Given the ambiguity of the nuclease mapping, homo- and heteronuclear two-dimensional (2D) nuclear magnetic resonance (NMR) experiments were used to confirm the predicted basepairing interactions of the three stems. The presence of 21 hydrogen-bonded guanine and uracil residues for the sequence including residues 13405–13472 of the SARS-CoV genome (
(A) Secondary structure of the SARS-CoV frameshift pseudoknot (residues 13405–13472). Different color coding was used to denote basepaired regions in stems 1 (cyan), 2 (green), and 3 (grey and blue). Only the last two digits of the wild-type sequence numbering are used for clarity.
(B) Imino region of a one-dimensional jump-return echo spectrum of SARS-CoV pseudoknot.
(C) Portion of a 2D 1H,1H-NOESY. Sequential imino-imino proton NOE assignment paths are shown by different colors for stem 1 (cyan), stem 2 (green), and stem 3 (black and blue).
(D) 2D Quantitative
Assignments were made for 8 bp in stem 1, 4 bp in stem 2, and 5 bp in stem 3 (
The quantitative
To address the question of whether the predicted SARS-CoV −1 PRF signal functions similarly to −1 PRF promoting elements from other viruses, this sequence was cloned into bicistronic dual luciferase reporter constructs designed to assay programmed ribosomal frameshifting using in vitro and in vivo systems [
The ability of the SARS-CoV sequence to promote −1 PRF was assessed using two in vitro and two in vivo assay systems. The results of these experiments are shown in
(A) The wild-type SARS-CoV frameshift signal promotes efficient frameshifting in vitro and in vivo. Programmed −1 ribosomal frameshifting was monitored in wheat germ and rabbit reticulocyte lysates in vitro, and in Vero epithelial cells and yeast in vivo, as described in
(B) SARS-CoV-directed −1 PRF was monitored in wild-type yeast cells with or without anisomycin (20 μg/ml), or in isogenic
Given that Vero cells more closely resemble the natural host of SARS-CoV than do yeast, a series of mutants of the SARS-CoV frameshift signal were developed to functionally dissect the mRNA pseudoknot in this cell type. Typically, mutagenesis experiments are constructed so as to change one or another side of a stem to disrupt basepairing, and then to combine the two mutants to re-form the stem (e.g., see [
Constructs used to examine the contributions of stem structures and bulged adenosine residues to programmed −1 ribosomal frameshifting are depicted. Shading is used to indicate mutagenized bases. Programmed −1 ribosomal frameshifting promoted by the wild-type SARS-CoV −1 PRF signal was monitored in Vero cells, as described in the
Six different stem 2 mutants were assayed for their ability to promote efficient −1 PRF (
Bulged adenosine residues are known to stimulate assembly of higher-order RNA structures by helping to link helices together [
Similar to the approach described above, five mutants were constructed to investigate stem 3 (
Similar to stem 2, a bulged adenosine is predicted in stem 3 at position 13456, and the phylogenetic analysis showed that this base was conserved among all of the coronaviruses. Substitution of this base with cytosine (S3C′) promoted a moderate but significant reduction in −1 PRF (26% inhibition,
An analogous series of constructs were also assayed in yeast (data not shown). In general, the trends were similar, though the actual baseline frameshifting efficiencies were lower. For example, in mutants S2A, S2A′, and S2B′, frameshifting was equally reduced by 85%–90%; the S3A and S3A′ mutations also resulted in moderate (35%–89%) decreases in yeast, and deletion of stem 3 (ΔS3) also presented a slight increase in −1 PRF (35%) in yeast cells. There were some notable contrasts, however: though the S2C′ and S2C constructs dramatically reduced frameshifting in Vero cells (95%), they only reduced −1 PRF by approximately 25%–33% in yeast. More strikingly, some of the mutations that resulted in a 25%–30% decrease in −1 PRF in Vero cells (A13465G, S3B, and S3C′) did not affect the overall rate of −1 PRF in yeast at all. The potential significance of these findings is discussed below.
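The percentages quoted for the mutants are changes relative to the wild-type signal. The arithmetic can be sketched as below, assuming the commonly used dual-luciferase normalization: the firefly/Renilla ratio of the test construct divided by that of an in-frame control. This section does not spell out that formula, and the function names and all numbers are hypothetical.

```python
def frameshift_efficiency(firefly_test, renilla_test, firefly_ctrl, renilla_ctrl):
    """Percent frameshifting from a bicistronic dual-luciferase readout.

    Assumes Renilla is the upstream (normalizing) reporter and firefly the
    downstream reporter, and that the control construct has both in frame.
    """
    test_ratio = firefly_test / renilla_test
    ctrl_ratio = firefly_ctrl / renilla_ctrl
    return 100.0 * test_ratio / ctrl_ratio


def percent_change(mutant_eff, wild_type_eff):
    """Signed percent change relative to wild type (negative = inhibition)."""
    return 100.0 * (mutant_eff - wild_type_eff) / wild_type_eff


if __name__ == "__main__":
    # Hypothetical luminescence readings, not data from this study.
    wt = frameshift_efficiency(150.0, 1000.0, 1000.0, 1000.0)
    mutant = frameshift_efficiency(111.0, 1000.0, 1000.0, 1000.0)
    print(round(wt, 2), round(mutant, 2), round(percent_change(mutant, wt), 1))
```

Normalizing to the in-frame control removes differences in transfection efficiency and mRNA abundance between samples. Expressing each mutant as a change relative to wild type allows trends to be compared across systems, such as Vero cells and yeast, whose baseline efficiencies differ.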
Though the first descriptions of the mRNA secondary structure stimulating −1 PRF [
Specifically, loop 3 is predicted to be long and positioned between stems 3 and 2 in the group 2 coronaviruses. In contrast, the group 1 viruses contain little or no loop 3 but, rather, have an extended loop 2 positioned between stems 1 and 3. The notable exception is TGV, in which the relative structure and orientation of stem 3 and loop 3 more closely resemble those observed in SARS-CoV and AIBV.
Structurally, the nuclease analyses, showing distortions of the regular helical structures in the stems, protection of specific bases from nuclease attack, and the apparent involvement of bases in loop 3 and of A13467 in basepairing interactions, suggest that the three stems fold back on one another to form a complex, globular RNA structure. Long-range interaction anchors mediated by adenosine residues such as those at positions 13456 (stem 3) and 13467 (stem 2), making contact with the shallow minor grooves of two stacked basepairs of A-form helical stems, are a recurring theme in RNA structural biology. For example, the crystal structure of the ribosome reveals that RNA has a remarkable propensity for contributing adenine bases to such A-minor interactions [
Molecular genetic analysis of stem 2 of the SARS-CoV pseudoknot, demonstrating that frameshifting was reduced in all cases, including our attempts to make complementary mutations, indicates that primary sequences as well as structures are important for maximal frameshifting. Possible reasons for the observed sequence specificity could include aberrant folding or disruption of an essential interaction required for formation of the complex tertiary mRNA structure. For example, the findings that changing the identities of the bulged adenosine residues in stems 2 and 3 from adenosine to cytosine (S2C′ and S3C′), or deleting the bulge in stem 2 altogether (S2C), abrogated the stimulatory effects of the pseudoknot support the notion that this structural property of bulged adenosine residues is functionally important in this context. In addition, water-nucleobase “stacking” in the form of H-π and lone pair-π interactions has been demonstrated at the junctions between the stems in the BWYV pseudoknot [
Though alterations to stem 3 significantly reduced frameshifting levels, these effects were one to two orders of magnitude smaller than those of analogous mutations of stem 2. This is supported by the observation in another study that alteration of the sequence in stem 3 also promoted decreases in −1 PRF [
If stem 3 is not required to promote efficient frameshifting, why then has it been so highly conserved among the coronaviruses? It may be that frameshifting levels in coronaviruses need to be regulated in a manner not supported by a two-stem pseudoknot. For example, the frameshift signal marks the boundary between proteins required during the immediate early phase of infection (e.g., ORF1a-encoded proteases used to prepare the cell for virus production) and those required for intermediate functions in the viral life cycle (i.e., ORF1b-encoded RNA-dependent RNA polymerase and helicase used in transcription of subgenomic mRNAs, [−] strand synthesis, and genome replication). One of the fundamental problems in the biology of (+) RNA viruses concerns the switch between translation and replication. An elegant model proposes that the −1 ribosomal frameshift in barley yellow dwarf virus plays a central role in remodeling the (+) strand from translation competent to replication competent: frameshifting enables synthesis of the replicase, which in turn is able to denature the frameshift-promoting
In a previous study, frameshifting in HCoV-229E was shown to be stimulated by a short sequence approximately 200 nt downstream from the slippery site, and it was shown that efficient frameshifting is promoted by kissing-loop interactions [
A final finding of interest derives from the observed differences between yeast- and metazoan-derived frameshift assay systems. This represents a potentially exciting avenue of exploration, as it may be indicative of mRNA folding differences between the two systems or of differences in how yeast versus metazoan ribosomes interact with downstream mRNA structures. This could be a result of relative size differences in the ribosomes. Alternatively, the lower levels of frameshifting in yeast relative to wild-type in the Vero cells could reflect a higher sensitivity of these ribosomes to subtle changes in the frameshift signal. The normal levels of frameshifting in yeast promoted by the S3B and S3C′ mutants further support the notion that the reason for stem 3 may lie with some function other than programmed ribosomal frameshifting.
The SARS-CoV −1 PRF signal was identified from the complete genome sequence, using a combined approach. First, a pattern matching descriptor of known −1 PRF signals was used in conjunction with RNAMotif [
Oligonucleotides were synthesized and purified by IDT (Coralville, Iowa, United States). These are listed in
An additional set of plasmids was constructed from the parental plasmids described above that lacked the yeast-specific markers but contained the SV40 early promoter, T7 promoter, and SV40 late poly (A) signal. These were used for programmed ribosomal frameshifting analyses in epithelial cells, wheat germ, and rabbit reticulocyte lysates. The BamHI and EcoRI fragment from pJD465 was purified and ligated into BamHI- and EcoRI-digested p2luc [
The SP6SARS and revSARS oligonucleotides were used to generate a PCR amplicon from which an RNA transcript was made using the Ambion MEGAscript SP6 kit. The RNA was treated with calf intestinal phosphatase and 5′ end labeled with [γ-32P]ATP, using T4 polynucleotide kinase. The labeled RNA was gel purified and then eluted with 0.5 M NH4Ac, 1 mM EDTA, 0.1% SDS. Nuclease treatment with RNase A (1.0–0.01 ng), RNase T1 (1.0–0.01 U), and RNase V1 (0.1–0.001 U) from Ambion was performed according to the manufacturer's instructions for 15 min at room temperature. Digested RNA was electrophoresed through a 10% polyacrylamide gel and analyzed using a Storm PhosphorImager (Sunnyvale, California, United States).
A DNA construct (residues 13405–13472 of the SARS-CoV genome) was generated by PCR from pJD465 containing the wild-type SARS-CoV frameshift pseudoknot sequence. Two oligodeoxynucleotides (Invitrogen, Carlsbad, California, United States) were designed with a 5′ primer, including a T7 promoter sequence. The resulting PCR product was cloned into a pUC18 plasmid. To prepare milligram quantities of the SARS-CoV frameshift pseudoknot (residues 13405–13472), 7.5- to 20-ml in vitro transcription reactions with phage T7 polymerase from a linearized plasmid template were performed [
All NMR spectra were recorded at 5 °C, 15 °C, and 25 °C on a Bruker Avance 900 MHz spectrometer (Rheinstetten, Germany) equipped with a standard 5-mm triple axis pulsed field gradient 1H/13C/15N probehead optimized for proton detection. NMR experiments were performed on samples of 500-μl volume containing 0.2–2 mM SARS-CoV frameshift pseudoknot RNA. Data were processed using NMRPipe [
The first page provides a summary of the final statistics for the frameshifting experiments shown in
(178 KB XLS).
The GenBank (
We thank Drs. Wenxia Song, David Mosser, and Deborah Taylor and the members of their laboratories for their generous help with cell culture. This work was supported by a grant to JDD from the National Institutes of Health (GM58859).
−1 PRF, programmed −1 ribosomal frameshift
2D, two-dimensional
MFE, minimum free energy
NMR, nuclear magnetic resonance
SARS, severe acute respiratory syndrome