The coronavirus causing the COVID-19 pandemic, SARS-CoV-2, uses −1 programmed ribosomal frameshifting (−1 PRF) to control the relative expression of viral proteins. As modulating −1 PRF can inhibit viral replication, the RNA pseudoknot stimulating −1 PRF may be a fruitful target for therapeutics treating COVID-19. We modeled the unusual 3-stem structure of the stimulatory pseudoknot of SARS-CoV-2 computationally, using multiple blind structural prediction tools followed by μs-long molecular dynamics simulations. The results were compared for consistency with nuclease-protection assays and single-molecule force spectroscopy measurements of the SARS-CoV-1 pseudoknot, to determine the most likely conformations. We found several possible conformations for the SARS-CoV-2 pseudoknot, all having an extended stem 3 but with different packing of stems 1 and 2. Several conformations featured rarely-seen threading of a single strand through junctions formed between two helices. These structural models may help interpret future experiments and support efforts to discover ligands inhibiting −1 PRF in SARS-CoV-2.
The coronavirus that causes COVID-19 controls the production of key viral proteins through a process known as programmed ribosomal frameshifting. Frameshifting is triggered by a particular structure in the viral RNA, a pseudoknot, which is a promising drug target. Here we model the structure of this pseudoknot through atomistic molecular dynamics simulations. Surprisingly, we find that the pseudoknot can take on distinct fold topologies, two of which involve unusual threading of a single strand of RNA through helical junctions, something not seen before in frameshifting pseudoknots. All of the folds are generally consistent with previous experimental studies of the closely-related SARS coronavirus pseudoknot. These results should assist in the analysis and interpretation of future experimental studies of the pseudoknot structure, and support structure-based drug-discovery efforts.
Citation: Omar SI, Zhao M, Sekar RV, Moghadam SA, Tuszynski JA, Woodside MT (2021) Modeling the structure of the frameshift-stimulatory pseudoknot in SARS-CoV-2 reveals multiple possible conformers. PLoS Comput Biol 17(1): e1008603. https://doi.org/10.1371/journal.pcbi.1008603
Editor: Alexander MacKerell, University of Maryland School of Pharmacy, UNITED STATES
Received: June 8, 2020; Accepted: December 2, 2020; Published: January 19, 2021
Copyright: © 2021 Omar et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: Funding was received for this work by MTW and JAT from the Canadian Institutes of Health Research (https://cihr-irsc.gc.ca/e/193.html), reference no. OV3–170709, and from Alberta Innovates (https://albertainnovates.ca/), reference no. G2020000270. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
The COVID-19 pandemic caused by the novel Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2) has spread across the globe since the virus emerged in late 2019 . Given the high infectivity of SARS-CoV-2 and the novel immunological challenge it poses to human hosts, epidemiological modeling suggests that recurring outbreaks with elevated mortality can be expected even despite successful public-health responses, until vaccines or preventive drugs can be found to inhibit transmission . The discovery of effective treatment therapeutics is thus one of the central goals of research into COVID-19 .
One potential target for treatment is the frameshift-stimulatory pseudoknot found between the overlapping ORF1a and ORF1b in the SARS-CoV-2 genome . Like other human coronaviruses , SARS-CoV-2 depends on −1 programmed ribosomal frameshifting (−1 PRF) to produce essential proteins at regulated levels . In −1 PRF, a shift in reading frame is stimulated at a specific location in the RNA message by a structure in the mRNA—typically a pseudoknot—located 5–7 nucleotides downstream of a ‘slippery’ sequence, thereby generating more than 1 protein from the same message [7,8]. The level of frameshifted gene products must often be held within a tight range for optimal propagation of the virus, hence disrupting −1 PRF by modulating the efficiency of frameshifting can attenuate the virus. Indeed, inhibiting −1 PRF was found to suppress replication of the close relative SARS-CoV-1 by orders of magnitude [9,10], suggesting that the same strategy may be effective against SARS-CoV-2.
The structure of the frameshift-stimulatory pseudoknot has not yet been solved at high resolution for either SARS-CoV-1 or SARS-CoV-2, however, hindering structure-based drug-discovery efforts. The primary sequences of these two pseudoknots are almost identical, differing by a single nucleotide, hence their secondary structure is expected to be the same. Evidence from computational methods, nuclease protection assays, and 2D NMR spectroscopy applied to the SARS-CoV-1 pseudoknot [11,12] indicates it has a 3-stem architecture that is unusual for frameshift-stimulatory pseudoknots: whereas such pseudoknots typically consist of 2 interleaved stems and loops, here the second loop is greatly extended and a third stem-loop combination forms within it (Fig 1). Bulged adenine residues in stem 2 (S2) and stem 3 (S3) seem to play important functional roles, as mutating them to cytosine abolished or reduced −1 PRF (respectively for the bulges in S2 and S3) . Cryo-electron microscopy has been used to model the SARS-CoV-2 pseudoknot structure on its own  and on the ribosome , but with conflicting results; the full 3D structure has not yet been solved for any other 3-stem pseudoknot.
The sequence is color-coded by secondary structure (S1: cyan, S2: orange, S3: purple, loops: grey). The only difference from SARS-CoV-1 is that A59 (red) is changed from C59 in the latter. Bases shown in italic are protected against nuclease digestion in SARS-CoV-1. Slippery sequence shown in green.
Computational modeling provides an alternative approach to characterizing the structure of these pseudoknots, but such modeling has been limited to date. One study assembled a 3D structure of the SARS-CoV-1 pseudoknot by hand with the Sybyl chemical modeling package before equilibrating briefly with 1 ns of molecular dynamics (MD) simulation , and another study used the Rosetta FARFAR2 platform [16,17] to make a ‘blind’ prediction of the structure of the SARS-CoV-2 pseudoknot . Here, we have modeled the structure of the SARS-CoV-2 pseudoknot more extensively, using blind predictions from multiple platforms as inputs for μs-long MD simulations to examine the stability of the structures. We also assessed the ensemble of structures observed in the simulations for their consistency with previous work on the biochemical and biophysical properties of the SARS-CoV-1 pseudoknot to identify the most likely structural models. We found several possibilities, all sharing an extended S3 helix but differing in the S1/S2 packing and junction with S3.
Blind structure prediction
Initial structures for input into MD simulations of the monomeric pseudoknot were obtained using multiple platforms for blind RNA structure prediction: SimRNA , Rosetta FARFAR2 [16,17], RNAComposer , RNAvista , MC-Sym , RNA2D3D , and Vfold . For blind predictions, we assumed the secondary structure shown in Fig 1, based on previous characterization of the secondary structure of the SARS-CoV-1 pseudoknot . Blind predictions of pseudoknot dimer structures were made using FARFAR2, which allows for dimer structure prediction, or constructed manually using the Molecular Operating Environment software based on monomeric models from other prediction platforms, which do not directly allow for dimer predictions.
Models from blind structure predictions were used as starting structures for all-atom MD simulations in explicit solvent using Amber 18 . The models were protonated at pH 7 using Molecular Operating Environment. The pseudoknots were parameterized using the f99bsc0_chiOL3 force-field and were solvated in optimal point charge water boxes with minimum margins of 12 Å using the tleap module of Amber. The solvated systems were first neutralized using sodium ions, then their salinities were adjusted to 0.15 M NaCl using Joung-Cheatham monovalent ion parameters . Each pseudoknot model was simulated under two conditions: without Mg2+ ions, or with six Mg2+ ions placed initially at the junction between S1 and S3 as well as along the backbone of S2. The solvated systems were energy-minimized then heated to 310 K with heavy restraints of 10 kcal/mol/A2 on the backbone phosphate atoms. These restraints were gradually removed and the unrestrained systems were then simulated on graphical processing units for 1 μs at constant pressure.
Analysis of simulated models
Analysis was performed using the CPPTRAJ module of AmberTools . Different conformations of the pseudoknot within each simulation were clustered based on the root-mean-squared deviation (RMSD) of residues G1–G40 and C49–G66 (omitting the residues in L2 and at the 3′ terminus, which tended to have large fluctuations), using the hierarchical agglomerative approach. The representative structures of the three most populated clusters of each model (S1 Data) were visually assessed for helical distortions in S1, S2 and S3. The hydrogen bonding of the residues identified as protected in SARS-CoV-1 by nuclease-protection assays  as well as of bulged adenines and residues in L1 were also calculated, reporting the interactions formed by hydrogen bond donors and acceptors between two bases or between a base and the backbone atoms of other residues that were present for at least the last 100 ns of the trajectory. The root-mean-squared fluctuation (RMSF) of each residue was also calculated.
We made initial estimates of the pseudoknot structure using a variety of tools that have been developed for blind prediction of RNA structure [16,17,19–21,23], where the only input was the sequence and the expected secondary structure, as shown in Fig 1. Some of these predictions were rejected as being implausible because they contained topological knots, which cannot occur but were seen in predictions from MC-Sym, RNAvista, and RNA2D3D (S1 Table). The remainder are illustrated in Fig 2; note that those in Fig 2F–2H were also reported previously in a separate study . These predictions were quite varied. They all showed S3 as an extended helix lacking obvious contacts with the rest of pseudoknot except right near the junction with S1. However, the arrangement of S1 and S2 and L1 and L3 differed significantly between the predictions. In several models (Fig 2A–2D), S1 and S2 were packed very tightly, leading to distorted or broken base-pairs in order to accommodate the packing. A few models (Fig 2E, 2G, and 2J) showed an unusual quasi-knotted structure where the 5′ end was threaded through the junction between S1 and S3, a fold topology that has only been seen previously in exoribonuclease-resistant RNAs . Another model (Fig 2H) showed a similar knot-like topology, but this time with L3 (near the 3′ end) threaded through the junction between S1 and S2. Note that in several cases, the first three nucleotides upstream of the 5′ end of the pseudoknot (UUU) were included in the modeling, in order to distinguish the possibilities for 5′-end threading.
Structures predicted by (A) RNAvista, (B) RNAComposer, (C–E) SimRNA, (F–H) Rosetta FARFAR2, (I, J) Vfold. In each case, secondary structure is color-coded (S1: cyan, S2: orange, S3: purple, loops: grey).
To examine if the blind predictions were dynamically stable, we used them to initiate extended all-atom molecular dynamics simulations. Each structure in Fig 2 was simulated for at least 1 μs in explicit solvent under two conditions: with NaCl only, or with both NaCl and Mg2+ ions. Both conditions were used because not all pseudoknots require Mg2+ ions to fold [28,29], and it is unclear if Mg2+ ions are essential for the SARS-CoV-2 pseudoknot. The first part of the simulation was treated as an equilibration phase and only the last 500 ns of the simulation was examined in each case. Because the simulations were dynamic, we clustered the structures occupied in the simulations by RMSD and examined the centroid (representative) structures of the three most occupied clusters.
Several of the initial models led, after equilibration in MD simulations, to structures that featured various combinations of significant defects in the expected base-pairing for S1, defects in S2, and/or a lack of tertiary contacts (examples shown in S1 Fig). In some cases, these structures were sufficiently unstable that they unfolded substantially. Any model containing more than one broken base-pair in a given stem without the un-pairing being compensated by alternative interactions was therefore rejected as unlikely to be correct (model rejections listed in S1 Table). The other initial models yielded structures under at least one of the MD simulation conditions that were more plausible, and they were thus analyzed in more detail. The results could be arranged into three groups: structures without any threading at either end from Rosetta FARFAR2 and Vfold (Fig 3), structures with the 5′ end threaded through the S1/S3 junction from FARFAR2 and SimRNA (Fig 4), and a structure with L3 threaded through the S1/S2 junction from FARFAR2 (Fig 5).
(A–C) Structure from simulation of Fig 2F with Mg2+. (A) Secondary structure with tertiary contacts. Interactions indicated with the notation from Ref. (30). Bases shown in italic are protected against nuclease digestion in SARS-CoV-1. (B) Representative 3D structure of most populated cluster. Green spheres: Mg2+ ions. (C) Close-up view of key tertiary contacts. (D–F) Same for simulation of Fig 2F without Mg2+, showing fewer tertiary contacts than with Mg2+ in (A). (G–I) Same for simulation of Fig 2I with Mg2+.
(A–C) Structure from simulation of Fig 2G with Mg2+. (A) Secondary structure with tertiary contacts, notated as in Fig 3. (B) Representative 3D structure of most populated cluster. Mg2+ ions shown in green. (C) Close-up view of key tertiary contacts. (D–F) Same for simulation of Fig 2E without Mg2+. (G–I) Same for simulation of Fig 2G without Mg2+ showing opening of the top of S3.
(A–C) Structure from simulation of Fig 2H with Mg2+. (A) Secondary structure with tertiary contacts, notated as in Fig 3. Bases shown in italic are protected against nuclease digestion in SARS-CoV-1. (B) Representative 3D structure of most populated cluster. Mg2+ ions shown in green. (C) Close-up view of key tertiary contacts.
Considering first the structures that were more similar to standard H-type pseudoknots, without any threading of either end through stem junctions, the representative structures of the most populated clusters with and without Mg2+ are shown in Fig 3, indicating the canonical and non-canonical interactions with standard notation . From the simulations of the FARFAR2 model with Mg2+ (Fig 3A–3C), several triples and triple-like interactions were identified in S2/L1 (Fig 3A), and the Watson-Crick edge of U60 in L3 was seen to interact with the sugar edges of both G10 and U22 in S1. Base-pairs in S1 (G6:C25) and S3 (G31:U58) were disrupted, in favor of base-backbone bonding between C25 and both G57 and U58, and a wobble pair with the end of the spacer (G31:U0). A network of non-canonical base-pairs was also seen in L2. The stabilization from the hydrogen-bond networks led to low fluctuations, even in L2, and the RNA spent almost all of its time in the top 3 clusters (S2A Fig). There was no stacking between S1 and S2, but G31:U0 stacked on S1. For the simulations of the FARFAR2 model without Mg2+ (Fig 3D–3F), the disrupted base-pairs in S1 and S3 were restored, but the triples in S2/L1 were absent and the pairing in the lower part of S2 was disrupted in favor of non-canonical interactions between U11:C62 and G19:A63. However, the sparser network of tertiary interactions led to higher fluctuations and a greater diversity of clusters than with Mg2+ present (S2B Fig). There was no stacking of S1 with S2 or S3 in this structure. Finally, in the simulations of the Vfold model with Mg2+ (Fig 3G–3I), the G10:C21 pair in S1 was disrupted in favor of interactions with C62 in S2, which also interacted with U11. Non-canonical interactions were again formed between G19 and A63, but the sparse network of tertiary interactions led to relatively high fluctuations, even though the RNA spent almost all of its time in the top 3 clusters (S2C Fig). As with Fig 3A, there was no stacking between S1 and S2, but G31:U58 at the end of S3 stacked on S1.
Turning next to the representative structures of the most populated clusters with the 5′ end threaded through the S1/S3 junction, we again found three possibilities. In the simulations of the FARFAR2 model with Mg2+ present (Fig 4A–4C), several triples/triple-like interactions were identified in S2/L1, two of them the same as in Fig 3A–3C. The opening G:U pair in S1 was disrupted in favor of interactions with G57 and U58, to accommodate the threading of the 5′ end through the S1/S3 junction, and S3 was extended by a non-canonical pair between G40 and A48. L2 was again structured by a network of hydrogen bonds, but L3 did not interact with any other part of the structure other than a coordinated Mg2+ ion. In the simulations of the SimRNA model without Mg2+ (Fig 4D–4F), the 5′ end was threaded through the S1/S3 junction without disrupting the base-pairing near the junction, stabilized by interactions between the Watson-Crick edge of G31 and the Hoogsteen edge of U0. Only one triple was seen in S2/L1, but it also interacted with the Hoogsteen edge of the bulged A63 in S2. The U20:A61 pair at one end of S2 was distorted to include interactions between U11 and A61, and U60 in L3 was bonded to both G10 and C21, analogous to the situation in Fig 3A–3C. L2 was partially structured by interactions between G40 and U47. Finally, in the simulations of the FARFAR2 model without Mg2+ (Fig 4G–4I), the first two base-pairs in S3 next to the S1/S3 junction were opened to facilitate the 5′-end threading, stabilized by a U0:G31 wobble pair, with the unpaired G57 and U58 forming H-bonds with the backbone at A24 and U23. Triples/triple-like interactions in S2 analogous to some of those seen in Figs 3A–3C and 4A–4C were present, and once again U60 in L3 interacted with G10 in S1 through the sugar edges. The bottom of S3 was also re-arranged, forming a triple with the bulged A52 and extending the stem with the pair G40:C49 and non-canonical bonding between C41 and A48. As with the unthreaded structures, the simulation with Mg2+ showed smaller fluctuations than those without Mg2+; the fraction of the trajectory spent in the top 3 clusters was very high (over 90%) for the simulation with Mg2+ and the first model without Mg2+, but lower (66%) for the second model without Mg2+ (S3 Fig). Unlike in the unthreaded structures, however, S1 and S2 were stacked in all three 5′-threaded structures, whereas S1 and S3 were unstacked.
Considering the representative structure of the most populated cluster for the third fold topology, with L3 threaded through the S1/S2 junction, only the result with Mg2+ is shown (Fig 5A–5C), as the structure was unstable without Mg2+. Here, the threading of L3 disrupted base-pairing in the bottom of S2 and the middle of S1, although the two strands of S2 continued to interact via base-backbone hydrogen bonds and S1 still retained a fairly regular helical structure; U60 bonded with the backbone of G1 to help stabilize the threading of L3. In S3, the same triple formed as in Fig 4G–4I, but without the re-configuration and extension of the stem. L2 was partially structured by base-backbone bonds between G40, A48, and the closing base-pair of S3. The threading of L3 prevented any stacking of the helices. The top three clusters comprised 78% of the MD trajectory, with relatively low fluctuations (S4 Fig).
Because it is unclear if the nucleotides downstream of the pseudoknot might affect the stability of these structures, we repeated the simulations of each of the models shown in Figs 3–5 while extending the pseudoknot at the 3′ end by the next 4 nucleotides (UUUG). For the unthreaded and 5′-threaded models from Figs 3 and 4 (Fig 6), the network of tertiary interactions in L1 and S2 became more dense in half of the structures (Fig 6B, 6C, and 6E), in one it was little changed (Fig 6F), and in two of the structures with Mg2+ it became more sparse (Fig 6A and 6D). The network of interactions in L2 also showed changes. Helix stacking was generally unchanged, except that S1 and S2 became stacked in the unthreaded model without Mg2+ (Fig 6B), whereas they became unstacked in the 5′-threaded model with Mg2+ (Fig 6C). In contrast to these modest changes, however, half of S1 unfolded with the additional nucleotides present on the 3′ end for the L3-threaded model from Fig 5, suggesting that this fold topology is not as stable as the others.
Secondary structures and representative 3D structures of the most populated clusters. (A) Unthreaded model from FARFAR2 with Mg2+. (B) Unthreaded model from FARFAR2 without Mg2+. (C) Unthreaded model from Vfold with Mg2+. (D) 5′-threaded model from FARFAR2 with Mg2+. (E) 5′-threaded model from SimRNA without Mg2+. (F) 5′-threaded model from FARFAR2 without Mg2+.
Finally, the pseudoknot for SARS-CoV-1 is known to dimerize through a palindromic sequence in L2, and dimerization modulates the stimulation of −1 PRF . We therefore simulated dimeric pseudoknots to explore if the structures from the monomeric models were altered significantly upon dimerization. Five dimer predictions from Rosetta FARFAR2 with different combinations of fold topologies (unthreaded/unthreaded, unthreaded/5′-threaded, unthreaded/3′-threaded, 5′-threaded/3′-threaded, and 5′-threaded/5′-threaded) as well as one dimer constructed from the 5′-threaded prediction of SimRNA (Fig 2E) were simulated for 1 μs both with and without Mg2+, as for the monomers. All the models containing 3′-end threading showed significant unfolding of S1 and/or S2 in the MD simulations and were therefore rejected as inconsistent with the known base-pairing; so too was the unthreaded/unthreaded dimer with Mg2+, in which almost half of S2 was unpaired (S1 Table). However, the other combinations of folds were all stable with and/or without Mg2+ present (Fig 7).
Secondary structures and representative 3D structures of the most populated clusters for the models consistent with experimental data. (A) Unthreaded/unthreaded dimer without Mg2+. (B) Unthreaded/5′-threaded dimer with Mg2+. (C) Unthreaded/5′-threaded dimer without Mg2+. (D, E) 5′-threaded/5′-threaded dimer from FARFAR2 with (D) and without (E) Mg2+. (F, G) 5′-threaded/5′-threaded dimer from SimRNA with (F) and without (G) Mg2+. In each case, the interactions in the dimerization domain are shown in the red box.
Considering the unthreaded/unthreaded dimer, only the structure without Mg2+ was stable (Fig 7A). Domain (i) in this dimer had more extensive tertiary contacts in L1, S2, and U60 from L3 than domain (ii) (or than the comparable monomeric structure, Fig 3D–3E), leading to tighter packing of S1/S2, but none of the helices were stacked. The unthreaded/5′-threaded dimer was stable both with and without Mg2+ (Fig 7B and 7C). However, the tertiary interactions in L1/S2 were significantly more extensive in the absence of Mg2+: these interactions were much reduced in the presence of Mg2+ for both the unthreaded and threaded domain, compared to the analogous monomeric models (respectively Figs 3A and 4A), whereas they were notably increased for the unthreaded domain in the absence of Mg2+, compared to the analogous monomeric model (Fig 3B). The G31:U58 wobble pair in the unthreaded domain was also broken, in both conditions. Helix stacking was the same in both conditions: S1 and S2 were not stacked in either domain, whereas S1 and S3 were stacked in the threaded domain and U0:G31 stacked on S1 in the unthreaded domain. Turning to the 5′-threaded/5′-threaded dimer from FARFAR2 (Fig 7D and 7E), it was stable in both ionic conditions, but this time the tertiary interactions in L1/S2 and interactions of U60 with S1 were significantly more extensive with Mg2+ present. Indeed, the interactions in the presence of Mg2+ were more extensive than in the analogous monomer model with Mg2+ (Fig 4A), whereas the interactions in the absence of Mg2+ were considerably less extensive than in the analogous monomer model without Mg2+ (Fig 4G). The stacking was also Mg2+-dependent: with Mg2+, S1 and S2 were stacked but S1 and S3 were not, whereas S1 and S2 were unstacked but U0:G31 stacked on S1 without Mg2+. Finally, the 5′-threaded/5′-threaded dimer from SimRNA (Fig 7F and 7G) gave quite similar results both with and without Mg2+, with tertiary interactions that were not much changed from the analogous monomer model without Mg2+ (Fig 4), although the dimer lacked any interactions with U60 as well as S1/S2 stacking, showing S1 and S3 stacked instead. Note that in all the dimer models, almost none of the tertiary interactions involving L2 seen in the monomer models were present, because many of the bases in L2 were engaged in dimerization contacts (Fig 7, red boxes). The dimerization of L2 led to only minor changes in S3 and the S1/S3 junction when compared to the analogous monomer domains, however, most notably the breaking of the G31:U58 wobble pair in several of the dimer domains, the addition or removal of one or two tertiary interactions stabilizing the S1/S3 junction, and in some cases changes to tertiary interactions involving the bulged A52.
Perhaps the most notable aspect of this work is that we found at least two distinct fold topologies that were persistent in long MD simulations under at least one ionic condition. Such a result contrasts with previous work simulating the pseudoknot from SARS-CoV-1, which reported only a single structure that was somewhat similar to the models with 5′-end threading, although the additional nucleotides completing the threading were not included in that model . This difference can be explained by the fact that only a single initial structure was explored in that work: the distinct fold topologies we observed are sufficiently different that they cannot interconvert without substantial unfolding of the S1/S2 region, and they are sufficiently stable that such unfolding is unlikely (with the exception of the L3-threaded fold). Furthermore, the existence of multiple structures has been seen previously in various frameshift signals [32–37]. Indeed, it is entirely consistent with the hypothesis  that high-efficiency stimulatory structures such as that from SARS-CoV-2—which induces −1 PRF at a rate of ~20–35% [6,38]—have high conformational heterogeneity and hence form more than one structure.
The models presented above are generally consistent with previous experimental characterizations of the SARS-CoV-1 pseudoknot. Single-molecule force spectroscopy of pseudoknot unfolding for SARS-CoV-1 found a broad unfolding force distribution with high forces in the range 20–60 pN, indicating the presence of significant tertiary contact formation . Consistent with this observation, all of the models feature tertiary contacts stabilizing the 3D structure, although some of the unthreaded models showed fewer such contacts, especially in S2. The unfolding of the L3-threaded fold in several of the simulations, however, suggests that this model may not be able to support the high unfolding forces observed. Turning to the results from nuclease-protection and mutation experiments , the different models contain features that may be matched to protected residues, such as triples, hydrogen-bond networks, or steric protection; the participation of A63 in S2 in triples is also consistent with work showing that mutating A63 can dramatically lower the −1 PRF efficiency . Some of the unthreaded models without Mg2+ are most lacking in these features, suggesting that they may be less consistent with the experimental data. We note, however, that none of the structural models provided an obvious explanation for a few of the protected nucleotides, such as G54 and U55.
It is still unknown if Mg2+ is essential for the folding of this pseudoknot. Our modelling, however, suggests that Mg2+ helps to stabilize the structures. In every case, the fluctuations were reduced with Mg2+ present, and in most cases the presence of Mg2+ stimulated a denser network of hydrogen bonds. The role of Mg2+ was particularly important for threading L3 through the S1/S2 junction (Fig 5): Mg2+ was found to be essential for maintaining the integrity of S1, as the tight packing of the backbone that was needed could not be accommodated without the countervailing charge from the ions. Even with Mg2+, however, S1 and/or S2 still unfolded partially in many of the simulations, suggesting that the L3-threaded fold is not stable except possibly at high Mg2+ concentration.
The threaded fold topologies are particularly interesting: although they have been observed in exoribonuclease-resistant RNAs, where the 5′ end is threaded through a ring closed by a pseudoknot , no such fold has been seen in other frameshift-stimulatory structures. We note that threading of either the 5′ end (as in Fig 4) or L3 (as in Fig 5) requires that the different parts of the structure fold in a specific order. For example, S2 would likely need to form last for 5′-end threading, else the RNA upstream of the pseudoknot would be unable to thread through the S1/S3 junction, whereas S1 would likely need to form last for L3 threading, else the downstream RNA would be unable to thread through the S1/S2 junction. As a result, the pseudoknot would be expected to populate multiple conformers, dictated by the kinetic partitioning between the different possible pathways . Different fold topologies at the 5′ end may also have important functional implications: 5′-end threading reduces the effective length of the spacer between the slippery site and the pseudoknot from 6 nt to 5 nt by sequestering U0 in the pseudoknot structure, and −1 PRF levels are quite sensitive to the spacer length [7,8,40]. The different folds might thus be expected to have different effectiveness at stimulating −1 PRF owing to the change in spacer length, in addition to any differences arising from the fold.
Although reliably evaluating the relative stabilities of the different structural models from the simulations is difficult, it is clear that the L3-threaded structure is least stable, as it is the most likely to unfold in μs-long simulations. We therefore expect that it is the least likely fold to be formed by the pseudoknot. The models that are most like other frameshift-stimulatory pseudoknots are those with 5′-end threading: despite the unusual fold topology, these are the only models that consistently show both S1/S2 stacking—which is known to play an important role in helping pseudoknots stimulate −1 PRF —and dense networks of tertiary contacts, whether in monomers or dimers. The unthreaded models, in contrast, tend to have fewer tertiary contacts and little or no tendency to stack S1 and S2. As a result, we expect that the 5′-threaded fold is likely the most stable and hence dominant form of this pseudoknot. In fact, recent cryo-electron microscopy imaging of the SARS-CoV-2 pseudoknot on and off the ribosome shows evidence for 5′-end threading [13,14], although the structural models inferred from these studies differ from each other and from the models presented here, likely because of the different measurement conditions (high Mg2+ concentration or presence of ribosomes). Some images also suggest the presence of other conformers, consistent with the heterogeneity found in the simulations. Notably, we would expect that even if the 5′-threaded fold is the dominant conformer, unthreaded conformers will always be present, too, because the kinetic partitioning of S2 folding before or after the 5′ end is threaded into the S1/S3 junction will inevitably lead to a population of unthreaded conformers. Such an effect has been seen directly in the folding of an exoribonuclease-resistant RNA from Zika virus , where both threaded and unthreaded conformers were observed .
The structural models described above will be helpful for future experimental analyses of the SARS-CoV-2 pseudoknot. X-ray scattering profiles can be predicted from these models and used to analyze small- and wide-angle x-ray scattering measurements, to confirm which (if any) of these conformations are populated and in what kind of mixture [6,43,44]. The models could also be compared to single-molecule measurements of pseudoknot folding, which could detect heterogeneous populations of different conformers and characterize the sequence of intermediate states formed during the folding of each one [45,46], again verifying which of the conformers fold and quantifying their relative abundance and stability. These models should also prove useful for drug discovery efforts, facilitating structure-based searches for compounds that attenuate the virus by altering −1 PRF.
S1 Table. Pseudoknot models studied.
Models in red were rejected after MD simulations because of unfolded base-pairs that were inconsistent with the expected secondary structure. Models in blue were rejected before MD simulation because of a topologically knotted fold that is inconsistent with RNA. Models in green were analyzed in the main text.
S1 Fig. Selected structures from simulations with significant secondary structure disruption.
Representative structures of the most populated cluster from simulations of (A) Fig 2A, (B) Fig 2B, (C) Fig 2C, and (D) Fig 2D show significant disruption of the secondary structure. In each panel, the figure on the right is from simulations with Mg2+, that on the left is from simulations without Mg2+.
S2 Fig. MD simulations of unthreaded models.
(A) Overlay of the 3D structure of the 3 most populated clusters from simulations of Fig 2F with Mg2+ (ions not shown for clarity). Top inset: RMSD vs time, showing when each of the 3 most populated clusters was occupied during the last 500 ns of the simulation (blue: top cluster, orange: second cluster, green: third cluster). Bottom inset: RMSF for each residue. (B) The same for simulations of Fig 2F without Mg2+. (C) Same for simulations of Fig 2I with Mg2+.
S3 Fig. MD simulations of models with 5′-end threading.
(A) Overlay of the 3D structure of the 3 most populated clusters from simulations of Fig 2G with Mg2+ (ions not shown for clarity). Top inset: RMSD vs time, showing when each of the 3 most populated clusters was occupied during the last 500 ns of the simulation (blue: top cluster, orange: second cluster, green: third cluster). Bottom inset: RMSF for each residue. (B) The same for simulations of Fig 2E without Mg2+. (C) The same for simulations of Fig 2G without Mg2+.
S4 Fig. MD simulations of model with L3 threading.
(A) Overlay of the 3D structure of the 3 most populated clusters from simulations of Fig 2H with Mg2+ (ions not shown for clarity). Top inset: RMSD vs time, showing when each of the 3 most populated clusters was occupied during the last 500 ns of the simulation (blue: top cluster, orange: second cluster, green: third cluster). Bottom inset: RMSF for each residue.
S1 Data. PDB files for structural models.
Files in.pdb format for each of the model structures presented in the manuscript are included in an archived.zip folder. The files corresponding to the structures shown in Figs 3B, 3E, 3H, 4B, 4E, 4H, 5B, 6A–6F, and 7A–7G are named FigNX_1.pdb, where N is the figure number and X is the letter for the figure panel. These structures represent the most populated of the clusters found in the corresponding MD simulations. The structures of the second- and third-most populated clusters from each of the simulations are in the files named FigNX_2.pdb and FigNX_3.pdb, respectively.
We thank Compute Canada for providing exceptional access to computational resources for this project, and we acknowledge the use of computational resources from SOSCIP under the COVID-19 Response Program (Application 3–043) and the assistance of IBM CAS in Edmonton with this work.
- 1. Coronavirus disease (COVID-19) Situation Report—129. World Health Organization. 2020 May 28 [Cited 2020 June 7]. Available from: https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200528-covid-19-sitrep-129.pdf?sfvrsn=5b154880_2
- 2. Kissler SM, Tedijanto C, Goldstein E, Grad YH, Lipsitch M. Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period. Science. 2020; 368: eabb5793. pmid:32291278
- 3. Dong L, Hu S, Gao J. Discovering drugs to treat coronavirus disease 2019 (COVID-19). Drug Disc. Therapeutics. 2020; 14: 58–60.
- 4. Wu A, Peng Y, Huang B, Ding X, Wang X, Niu P, et al. Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China. Cell Host Microbe. 2020; 27: 325–8. pmid:32035028
- 5. Plant EP, Dinman JD. The role of programmed-1 ribosomal frameshifting in coronavirus propagation. Front. Biosci. 2008; 13: 4873–81. pmid:18508552
- 6. Kelly JA, Olson AN, Neupane K, Munshi S, San Emeterio J, Pollack L, et al. Structural and functional conservation of the programmed −1 ribosomal frameshift signal of SARS coronavirus 2 (SARS-CoV-2). J. Biol. Chem. 2020; 295: 10741–8. pmid:32571880
- 7. Brierley I, Gilbert RJC, Pennell S. Pseudoknot-Dependent Programmed—1 Ribosomal Frameshifting: Structures, Mechanisms and Models. In: Atkins J, Gesteland R, editors. Recoding: Expansion of Decoding Rules Enriches Gene Expression Nucleic Acids and Molecular Biology. Springer, New York, NY; 2010. pp. 149–74.
- 8. Atkins JF, Loughran G, Bhatt PR, Firth AE, Baranov P. Ribosomal frameshifting and transcriptional slippage: From genetic steganography and cryptography to adventitious use. Nucleic Acids Res. 2016; 44: 7007–78. pmid:27436286
- 9. Plant E, Sims A, Baric R, Dinman J, Taylor D. Altering SARS Coronavirus Frameshift Efficiency Affects Genomic and Subgenomic RNA Production. Viruses. 2013; 5: 279–94. pmid:23334702
- 10. Plant EP, Rakauskaitė R, Taylor DR, Dinman JD. Achieving a Golden Mean: Mechanisms by Which Coronaviruses Ensure Synthesis of the Correct Stoichiometric Ratios of Viral Proteins. J. Virol. 2010; 84: 4330–40. pmid:20164235
- 11. Baranov P, Henderson CM, Anderson CB, Gesteland RF, Atkins JF, Howard MT. Programmed ribosomal frameshifting in decoding the SARS-CoV genome. Virology. 2005; 332: 498–510. pmid:15680415
- 12. Plant EP, Pérez-Alvarado GC, Jacobs JL, Mukhopadhyay B, Hennig M, Dinman JD. A three-stemmed mRNA pseudoknot in the SARS coronavirus frameshift signal. PLoS Biol. 2005; 3: 1012–23. pmid:15884978
- 13. Zhang K, Zheludev IN, Hagey RJ, Wu MT-P, Haslecker R, Hou YJ, et al. Cryo-electron Microscopy and Exploratory Antisense Targeting of the 28-kDa Frameshift Stimulation Element from the SARS-CoV-2 RNA Genome. BioRxiv [Preprint]. 2020 bioRxiv 2020.07.18.20927 [posted 2020 July 20; cited 2020 Nov 3]: [54 p.]. Available from: https://www.biorxiv.org/content/10.1101/2020.07.18.209270v1 pmid:32743589
- 14. Bhatt PR, Scaiola A, Loughran G, Leibundgut M, Kratzel A, McMillan A, et al. Structural basis of ribosomal frameshifting during translation of the SARS-CoV-2 RNA genome. BioRxiv [Preprint]. 2020 bioRxiv 2020.10.26.355099 [posted 2020 Oct 26; cited 2020 Nov 3]: [22 p.]. Available from: https://www.biorxiv.org/content/10.1101/2020.10.26.355099v1
- 15. Park SJ, Kim YG, Park HJ. Identification of rna pseudoknot-binding ligand that inhibits the −1 ribosomal frameshifting of SARS-coronavirus by structure-based virtual screening. J. Am. Chem. Soc. 2011; 133: 10094–100. pmid:21591761
- 16. Lyskov S, Chou F-C, Conchúir SÓ, Der BS, Drew K, Kuroda D, et al. Serverification of Molecular Modeling Applications: The Rosetta Online Server That Includes Everyone (ROSIE). PLoS ONE. 2013; 8: e63906. pmid:23717507
- 17. Watkins AM, Rangan R and Das R. FARFAR2: Improved De Novo Rosetta Prediction of Complex Global RNA Folds. Structure. 2020; 28: 963–76 pmid:32531203
- 18. Rangan R, Watkins AM, Kladwang W, Das R. De novo 3D models of SARS-CoV-2 RNA elements and small-molecule-binding RNAs to guide drug discovery. BioRxiv [Preprint] bioRxiv 2020.04.14.041962. [posted 2020 April 15; revised 2020 Dec 18; cited 2020 Nov 3]: [31 p.]. Available from: https://www.biorxiv.org/content/10.1101/2020.04.14.041962v2
- 19. Boniecki MJ, Lach G, Dawson WK, Tomala K, Lukasz P, Soltysinski T, et al. SimRNA: a coarse-grained method for RNA folding simulations and 3D structure prediction. Nucleic Acids Res. 2015; 44: e63. pmid:26687716
- 20. Biesiada M, Purzycka KJ, Szachniuk M, Blazewicz J, Adamiak RW. Automated RNA 3D structure prediction with RNA Composer. Methods Mol. Biol. 2016; 1490: 199–215. pmid:27665601
- 21. Antczak M, Zablocki M, Zok T, Rybarczyk A, Blazewicz J, Szachniuk M. RNAvista: a webserver to assess RNA secondary structures with non-canonical base pairs. Bioinformatics. 2019; 35: 152–5. pmid:29985979
- 22. Parisien M and Major F. The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data. Nature. 2008; 452, 51–5. pmid:18322526
- 23. Martinez HM, Maizel J, Shapiro BA. RNA2D3D: A program for generating, viewing, and comparing 3-dimensional models of RNA. J. Biomol. Struct. Dyn. 2008; 25: 669–83. pmid:18399701
- 24. Xu XJ, Chen SJ. Hierarchical assembly of RNA three-dimensional structures based on loopctemplates. J. Phys. Chem. B. 2018; 122: 5327–35. pmid:29258305
- 25. Case DA, Ben-Shalom IY, Brozell SR, Cerutti DS, Cheatham TE III, et al. AMBER 2018. University of California, San Francisco. 2018.
- 26. Joung IS and Cheatham TE. Molecular Dynamics Simulations of the Dynamic and Energetic Properties of Alkali and Halide Ions Using Water-Model-Specific Ion Parameters. J. Phys. Chem. B. 2009; 113, 13279–90. pmid:19757835
- 27. Akiyama BM, Laurence HM, Massey AR, Costantino DA, Xie X, Yang Y, et al. Zika virus produces noncoding RNAs using a multi-pseudoknot structure that confounds a cellular exonuclease. Science. 2016; 354: 1148–52. pmid:27934765
- 28. Egli M, Minasov G, Su L, Rich A. Metal ions and flexibility in a viral RNA pseudoknot at atomic resolution. Proc. Natl. Acad. Sci. U.S.A. 2002; 99: 4302–7. pmid:11904368
- 29. Pallan PS, Marshall WS, Harp J, Jewett FC, Wawrzak Z, Brown BA, et al. Crystal structure of a luteoviral RNA pseudoknot and model for a minimal ribosomal frameshifting motif. Biochemistry. 2005; 44: 11315–22. pmid:16114868
- 30. Leontis NB, Stombaugh J and Westhof E. (2002) The non-Watson-Crick base pairs and their associated isostericity matrices. Nucleic Acids Res. 2002; 30, 3497–531. pmid:12177293
- 31. Ishimaru D, Plant EP, Sims AC, Yount BL, Roth BM, Eldho Nv, Pérez-Alvarado GC, et al. RNA dimerization plays a role in ribosomal frameshifting of the SARS coronavirus. Nucleic Acids Res. 2013; 41, 2594–608. pmid:23275571
- 32. de Messieres M, Chang JC, Belew AT, Meskauskas A, Dinman JD, La Porta A. Single-molecule measurements of the CCR5 mRNA unfolding pathways. Biophys. J. 2014; 106: 244–52. pmid:24411256
- 33. Moomau C, Musalgaonkar S, Khan YA, Jones JE, Dinman JD. Structural and functional characterization of programmed ribosomal frameshift signals in West Nile virus strains reveals high structural plasticity among cis-acting RNA elements. J. Biol. Chem. 2016; 291: 15788–95. pmid:27226636
- 34. Halma MTJ, Ritchie DB, Cappellano TR, Neupane K, Woodside MT. Complex dynamics under tension in a high-efficiency frameshift stimulatory structure. Proc. Natl. Acad. Sci. U.S.A. 2019; 116: 19500–5. pmid:31409714
- 35. Ritchie DB, Foster DAN, Woodside MT. Programmed −1 frameshifting efficiency correlates with RNA pseudoknot conformational plasticity, not resistance to mechanical unfolding. Proc. Natl. Acad. Sci. U.S.A. 2012; 109: 16167–72. pmid:22988073
- 36. Ritchie DB, Soong J, Sikkema WKA, Woodside MT. Anti-frameshifting ligand reduces the conformational plasticity of the SARS virus pseudoknot. J. Am. Chem. Soc. 2014; 136: 2196–9. pmid:24446874
- 37. Ritchie DB, Cappellano TR, Tittle C, Rezajooei N, Rouleau L, Sikkema WKA, et al. Conformational dynamics of the frameshift stimulatory structure in HIV-1. RNA. 2017; 23: 1376–84. pmid:28522581
- 38. Neupane K, Munshi S, Zhao M, Ritchie DB, Ileperuma SM and Woodside MT. Anti-Frameshifting Ligand Active against SARS Coronavirus-2 Is Resistant to Natural Mutations of the Frameshift-Stimulatory Pseudoknot. J. Mol. Biol. 2020; 432: 5843–7. pmid:32920049
- 39. Lin JC, Hyeon C, Thirumalai D. RNA under tension: Folding landscapes, kinetic partitioning mechanism, and molecular tensegrity. J. Phys. Chem. Lett. 2012; 3: 3616–25. pmid:23336034
- 40. Brierley I, Jenner AJ and Inglis SC. Mutational analysis of the “slippery-sequence” component of a coronavirus ribosomal frameshifting signal. J. Mol. Biol. 1992; 227: 463–79. pmid:1404364
- 41. Cornish PV, Stammler SN and Giedroc DP. The global structures of a wild-type and poorly functional plant luteoviral mRNA pseudoknot are essentially identical. RNA. 2006; 12: 1959–69. pmid:17000902
- 42. Zhao M and Woodside MT. Mechanically stable knot formed by strand threading in Zika virus RNA confers RNase resistance. BioRxiv [Preprint]. 2020 bioRxiv 2020.07.01.183129 [posted 2020 July 2; cited 2020 Nov 3]: [19 p.]. Available from: https://www.biorxiv.org/content/10.1101/2020.07.01.183129v1
- 43. Chen Y, Pollack L. SAXS studies of RNA: structures, dynamics, and interactions with partners. Wiley Interdiscip. Rev RNA. 2016; 7: 512–26. pmid:27071649
- 44. Pabit SA, Katz AM, Tolokh IS, Drozdetski A, Baker N, Onufriev A, et al. Understanding nucleic acid structural changes by comparing wide-angle x-ray scattering (WAXS) experiments to molecular dynamics simulations. J. Chem. Phys. 2016; 144: 205102. pmid:27250330
- 45. Chen G, Wen JD, Tinoco I. Single-molecule mechanical unfolding and folding of a pseudoknot in human telomerase RNA. RNA. 2007; 13: 2175–88. pmid:17959928
- 46. Neupane K, Yu H, Foster DAN, Wang F, Woodside MT. Single-molecule force spectroscopy of the add adenine riboswitch relates folding to regulatory mechanism. Nucleic Acids Res. 2011; 39: 7677–87. pmid:21653559