
Free Energy Landscape and Multiple Folding Pathways of an H-Type RNA Pseudoknot

  • Yunqiang Bian,

    Affiliations Collaborative Innovation Center of Advanced Microstructures and Department of Physics, Nanjing University, Nanjing 210093, China, Shandong Provincial Key Laboratory of Functional Macromolecular Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China

  • Jian Zhang,

    Affiliation Collaborative Innovation Center of Advanced Microstructures and Department of Physics, Nanjing University, Nanjing 210093, China

  • Jun Wang,

    Affiliation Collaborative Innovation Center of Advanced Microstructures and Department of Physics, Nanjing University, Nanjing 210093, China

  • Jihua Wang,

    Affiliation Shandong Provincial Key Laboratory of Functional Macromolecular Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China

  • Wei Wang

    Affiliation Collaborative Innovation Center of Advanced Microstructures and Department of Physics, Nanjing University, Nanjing 210093, China



Abstract

How RNA sequences fold to specific tertiary structures is one of the key problems for understanding their dynamics and functions. Here, we study the folding process of an H-type RNA pseudoknot by performing large-scale all-atom MD simulations with bias-exchange metadynamics. The folding free energy landscapes are obtained and several folding intermediates are identified. The results suggest that folding occurs via multiple mechanisms, including a step-wise mechanism starting from either the first helix or the second, and a cooperative mechanism with both helices forming simultaneously. Despite the multiplicity of mechanisms, the ensemble folding kinetics estimated from a Markov state model is single-exponential. It is also found that the correlation between folding and the binding of metal ions is significant, and that the bound ions mediate long-range interactions in the intermediate structures. Non-native interactions are found to be dominant in the unfolded state and are also present in some intermediates, possibly hindering the folding process of the RNA.


Introduction

As a major type of macromolecule essential for life, RNAs carry out numerous biological functions, including translating genetic information into proteins, regulating gene expression, and catalyzing biochemical processes. For a better understanding of their functions, knowledge of how they achieve functional structures through folding is necessary. Furthermore, comparison of the folding mechanisms of two distinct biopolymers, RNA and proteins, may reveal the different physicochemical interactions governing folding and deepen our understanding of the structure formation process of biomolecules.

RNA pseudoknots are examples of minimal structural motifs in RNA with tertiary interactions. They have been found to play important roles in self-splicing, stimulating ribosomal frameshifting, forming the catalytic core, etc. [1, 2]. In addition to their functional importance, RNA pseudoknots also provide excellent models for studying the folding mechanism of RNAs. This is because they contain many types of interactions commonly seen in RNAs, including canonical and non-canonical base pairs, tertiary interactions such as the A-minor interactions often seen in loop-stem triplexes, coaxial stacking, and particularly metal ion-nucleotide interactions. Moreover, the study of pseudoknots can also be beneficial for developing new approaches for RNA structure prediction, since the non-nested topology, non-canonical interactions and loop entropy of pseudoknots cause significant difficulties in developing efficient sampling algorithms, as well as in determining energy rules [3–8].

There have been many experimental and theoretical studies of the folding and unfolding mechanisms of RNA pseudoknots [9–21], the conformational switching between metastable structures [22–24], the roles of metal ions in structural stability and the folding process [25–30], etc. For instance, in an early study, Draper and colleagues investigated the effects of mono- and divalent ions on the folding of an mRNA pseudoknot and discussed the different effects of metal ions of different size and valence [15]. Chen et al. studied the mechanical folding and unfolding of a pseudoknot in human telomerase RNA (hTR) with optical tweezers and discovered a stepwise folding mechanism as well as a one-step unfolding mechanism [9]. They also detected nonnative intermediates and provided evidence that the folding of both the hairpin and the pseudoknot takes complex pathways. Also using optical tweezers, Green et al. studied the frameshifting pseudoknot from infectious bronchitis virus (IBV) and its mutants, reporting their thermodynamics and kinetics as a function of force, as well as their dependence on Mg²⁺ ions [25]. Wu et al. studied the operator of the rpsO gene transcript and discovered that it interchanges spontaneously between a pseudoknot conformation and a double-hairpin conformation [22]. On the theoretical side, in an early all-atom molecular dynamics (MD) simulation of a pseudoknot from beet western yellow virus, Csaszar et al. revealed several early unfolding events at an elevated temperature (400 K) [20]. Chen's group developed a series of coarse-grained RNA models, which enabled them to predict the conformational entropy, loop-helix tertiary contacts, as well as the structures of loops [10–13]. Cho and colleagues performed systematic simulations of different pseudoknots based on a coarse-grained model [14]. They found that the folding landscapes of RNAs with a similar topology are significantly sequence-dependent. Their work also showed that the folding mechanism of a pseudoknot is mostly determined by the relative stabilities of its secondary structures. With all-atom simulations, we studied the thermal unfolding kinetics of an RNA pseudoknot within the gene32 mRNA of bacteriophage T2 [19]. We detected multiple intermediates and found that the transition states are rather diverse, concluding that the unfolding of this RNA follows multiple pathways.

Although there are many excellent experimental as well as theoretical works, an atomistic picture of the folding process of RNA pseudoknots is still lacking. All-atom molecular dynamics simulations can help. However, such studies are still rare for RNA pseudoknots, mostly due to the slow kinetics of the molecules and the resulting difficulty in achieving sufficient sampling of the relevant phase space [18]. Here we study the folding free energy landscape (FEL) of the pseudoknot within the gene32 mRNA of bacteriophage T2 (Fig 1A and 1B), with both RNA and water molecules explicitly modeled. To accelerate the sampling of phase space, we adopt an advanced sampling technique named bias-exchange metadynamics (BEMD) [31]. This technique enables us to explore the free energy landscape and identify potential folding intermediates. The FEL, combined with the kinetic results for the same RNA obtained in our previous work [19], provides a comprehensive view of the folding process. Based on the simulation results we propose an atomistic picture of the folding process and discuss the relevance of our results to previous experimental and theoretical findings. We also discuss the roles played by native and non-native interactions and by metal ions in the folding process.

Fig 1. Native structure of the RNA pseudoknot within gene32 mRNA of bacteriophage T2.

(A) Secondary and tertiary structures of the RNA pseudoknot (PDB code: 2TPK). The two helices are labeled Helix 1 (H1) and Helix 2 (H2) and are colored red and blue, respectively. The two loops are labeled Loop 1 (L1) and Loop 2 (L2) and are colored orange and green, respectively. The same color code is used in all the figures unless otherwise indicated. (B) Surface model of a typical structure, viewed from two perpendicular directions, taken from a 200 ns MD simulation started from the native structure. Cations are plotted as yellow spheres. (C) Hydrogen bond (HB) map averaged over all the conformations in the same MD run. The formation probabilities are indicated by different colors, as quantified by the color scale beside the figure. The red labels indicate the HBs within the two helices, and the yellow ones indicate those between H1 and L2, i.e., the tertiary interactions. (D) The fraction of formed native HBs as a function of time in the same MD run.

Materials and Methods

Preparation of the system

The native structure of the RNA pseudoknot (PDB code: 2TPK) was solvated in a periodic box with 12525 TIP3P water molecules. Na⁺ and Cl⁻ ions were added to neutralize the system and to maintain a salt concentration of 100 mM. The simulations were performed with Gromacs (version 4.5.5) [32] and the amber99sb_parmbsc0 force field [33], which combines the amber99sb force field with the parmbsc0 nucleic acid parameters. All bond lengths were constrained with the LINCS algorithm and the time step was set to 2 fs. The PME method with a cutoff of 1.0 nm was used to treat the electrostatic interactions. The cutoff for nonbonded van der Waals (VDW) interactions was also 1.0 nm. The system was first subjected to an energy minimization of 1000 steps, followed by gradual heating to 300 K. After that, a 2 ns equilibration run was performed in the NPT ensemble at 1 atm and 300 K. The final conformation of the 2 ns run was used as the initial structure for further simulations. To test the stability of the native structure, we performed a 200 ns conventional MD simulation at 300 K and 1 atm in the NPT ensemble.

Bias-exchange metadynamics

Metadynamics is a powerful sampling method that works in the space of a few chosen collective variables (CVs) and periodically adds repulsive Gaussian potentials at the visited CV values to accelerate barrier-crossing events. Bias-exchange metadynamics (BEMD) further enhances the sampling efficiency by employing multiple replicas, each biased on a different CV, and periodically exchanging their configurations and velocities according to a Metropolis-like criterion [31, 34, 35].

In our BEMD simulation, six replicas were used: one replica was unbiased (the neutral replica) and the remaining five were each biased on a different CV. The CVs were the number of native hydrogen bonds (Nhb) in helix H1, the Nhb in helix H2, the Nhb between H1 and the second loop L2, the radius of gyration (Rg) of the pseudoknot, and the energy of the system. The height of the Gaussian potential was set to 0.1 kJ/mol and the widths were chosen as 0.3, 0.4, 0.25, 0.2 nm and 100 kJ/mol for the five CVs described above, respectively. The time interval for depositing the Gaussian potentials was 1 ps, while exchanges between replicas were attempted every 30 ps. The simulation time of each replica was 500 ns, resulting in a total of 3 microseconds. The program PLUMED (version 1.3), used as a Gromacs plugin, was employed for the BEMD simulation [36].
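The two ingredients described above, a history-dependent Gaussian bias and a Metropolis-like swap between replicas, can be sketched as follows. This is a minimal illustration and not the PLUMED implementation: the function names are hypothetical, the Gaussian form exp(-(s - c)²/2σ²) and the acceptance expression min{1, exp(β[V_a(x_a) + V_b(x_b) - V_a(x_b) - V_b(x_a)])} are the standard textbook forms and are assumed here, and the default height and width are simply the values quoted for the first CV.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def bias_potential(s, centers, height=0.1, width=0.3):
    """History-dependent bias at CV value s: a sum of Gaussians of the given
    height (kJ/mol) and width, one per previously visited CV value."""
    return sum(height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
               for c in centers)

def exchange_probability(va_xa, va_xb, vb_xa, vb_xb, temperature=300.0):
    """Metropolis-like probability of swapping configurations x_a and x_b
    between replicas a and b. Each argument is the bias of one replica
    evaluated on one configuration (kJ/mol), e.g. va_xb = V_a(x_b)."""
    delta = va_xa + vb_xb - va_xb - vb_xa
    if delta >= 0.0:  # swap lowers the total bias energy: always accept
        return 1.0
    return math.exp(delta / (KB * temperature))

# Bookkeeping implied by the stated protocol (all times in ps).
RUN_PS, DEPOSIT_PS, EXCHANGE_PS, N_REPLICAS = 500_000, 1, 30, 6
gaussians_per_replica = RUN_PS // DEPOSIT_PS   # Gaussians added per biased replica
exchange_attempts = RUN_PS // EXCHANGE_PS      # swap attempts over the run
total_time_us = N_REPLICAS * RUN_PS / 1e6      # aggregate simulation time
```

For the neutral replica the bias is identically zero, so a swap with a biased replica is always accepted when it moves the biased replica's configuration out of a region where its own bias is high.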

Free energy landscape (FEL) calculation and intermediates identification

Only the data from the neutral replica were used for further analysis, in order to avoid the problems that may exist in the biased replicas [37]. A detailed discussion of the reliability of the data from metadynamics can be found in our previous work [38]. The free energy landscape, taking the one-dimensional case as an example, is calculated as

$$F(s) = -k_B T \ln P(s) \tag{1}$$

where s is the chosen CV, P(s) is the probability distribution along the CV, T is the simulation temperature and k_B is the Boltzmann constant.
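As an illustration of Eq (1), the sketch below histograms a list of CV values from an unbiased trajectory and converts the bin populations to free energies. The function name, the binning scheme and the shift of the minimum to zero are assumptions made for the example; the unit (kcal/mol) follows the color scale of Fig 2.

```python
import math
from collections import Counter

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def free_energy_profile(cv_values, bin_width, temperature=300.0):
    """Return {lower bin edge: F} with F = -kB*T*ln P(s), shifted so that
    the most populated bin has F = 0. Empty bins are simply absent."""
    counts = Counter(math.floor(v / bin_width) for v in cv_values)
    total = sum(counts.values())
    kt = KB_KCAL * temperature
    raw = {b: -kt * math.log(n / total) for b, n in counts.items()}
    f_min = min(raw.values())
    return {b * bin_width: f - f_min for b, f in sorted(raw.items())}

# Toy data: 90 frames with 0 native HBs and 10 frames with 5 native HBs.
profile = free_energy_profile([0] * 90 + [5] * 10, bin_width=1.0)
```

In this toy case the less populated state lies kB·T·ln 9 above the more populated one, which is the usual two-state relation between population ratio and free energy difference. A two-dimensional landscape follows in the same way from a two-dimensional histogram.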

From the FEL we identified the basins of attraction visually and then obtained their representative structures as follows. First, the conformations were assigned to different basins according to their CVs. Second, the conformations within the same basin were subjected to a clustering analysis. In brief, the i-th conformation was compared to each of the representative structures of the clusters obtained previously; if an RMSD smaller than a threshold was found, the i-th conformation was regarded as belonging to that cluster; otherwise, if the i-th conformation could not be assigned to any existing cluster, it was taken as the representative structure of a new cluster. The representative structure of the largest cluster in a basin was used to represent the basin. The RMSD threshold in the clustering analysis was set to 0.1 nm.
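The clustering procedure described above can be sketched as follows. This is an illustrative reading rather than the authors' code: the function names are hypothetical, a conformation is assigned to the first cluster whose representative lies within the threshold (the text does not say how ties are resolved), and the toy `rmsd` assumes the structures have already been superimposed.

```python
import math

def rmsd(a, b):
    """RMSD (nm) between two conformations given as equal-length lists of
    (x, y, z) tuples. Assumes the structures are already superimposed."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def leader_cluster(conformations, threshold=0.1, distance=rmsd):
    """Assign each conformation to the first existing cluster whose
    representative is closer than `threshold`; otherwise start a new
    cluster with this conformation as its representative."""
    representatives = []  # index of the representative of each cluster
    members = []          # members[k] lists the indices in cluster k
    for i, conf in enumerate(conformations):
        for k, rep in enumerate(representatives):
            if distance(conf, conformations[rep]) < threshold:
                members[k].append(i)
                break
        else:  # no existing cluster matched
            representatives.append(i)
            members.append([i])
    return representatives, members

def basin_representative(conformations, threshold=0.1):
    """Index of the representative of the largest cluster in a basin."""
    reps, members = leader_cluster(conformations, threshold)
    largest = max(range(len(reps)), key=lambda k: len(members[k]))
    return reps[largest]

# Toy basin of single-atom "conformations" (coordinates in nm).
toy = [[(0.0, 0.0, 0.0)], [(0.05, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(0.02, 0.0, 0.0)]]
reps, members = leader_cluster(toy)
```

Because each conformation is compared only with cluster representatives, the cost grows with the number of clusters rather than with the number of frames squared, at the price of a result that depends on the order of the frames.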

Bound ions and ion-mediated structures

A metal ion is defined as a "bound ion" if it is within a distance of 0.4 nm of any nucleotide. To characterize the effect of metal ions in mediating long-range interactions between different structural elements, as opposed to a simple electrostatic screening effect, a “bridging ion” is defined as a metal ion that simultaneously binds two or more nucleotides separated by at least 4 nucleotides along the sequence. If such an ion is detected in a structure, the structure is defined as an “ion-mediated structure”. For a given structural ensemble, the ratio of the population of ion-mediated structures to that of non-ion-mediated structures is calculated to evaluate the importance of the mediating effect of cations in this ensemble:

$$R_p = \frac{P_{\text{ion-mediated}}}{P_{\text{non-ion-mediated}}} \tag{2}$$

If the value of Rp for an ensemble is greater than unity, the majority of the conformations therein are ion-mediated structures, indicating that the mediating effect of metal ions plays a significant role in the stability of the ensemble.
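The definitions above translate into a short calculation. The sketch below is a hedged illustration: the data layout (each frame as a list of ion positions plus a list of residue-labeled atom positions), the function names, the use of atom-to-ion distances, and the reading of "separated by at least 4 nucleotides" as a difference of at least 4 in residue index are all assumptions.

```python
import math

def bound_residues(ion, atoms, cutoff=0.4):
    """Residue indices having at least one atom within `cutoff` nm of the ion.
    `atoms` is a list of (residue_index, (x, y, z)) pairs."""
    return {res for res, xyz in atoms if math.dist(ion, xyz) <= cutoff}

def is_bridging(residues, min_separation=4):
    """True if the ion binds two residues at least `min_separation` apart
    along the sequence (assumed meaning of the separation criterion)."""
    return bool(residues) and max(residues) - min(residues) >= min_separation

def is_ion_mediated(ions, atoms):
    """A structure is ion-mediated if it contains at least one bridging ion."""
    return any(is_bridging(bound_residues(ion, atoms)) for ion in ions)

def population_ratio(frames):
    """Rp: number of ion-mediated frames divided by the number of
    non-ion-mediated frames; infinite if every frame is ion-mediated."""
    mediated = sum(1 for ions, atoms in frames if is_ion_mediated(ions, atoms))
    other = len(frames) - mediated
    return math.inf if other == 0 else mediated / other

# Toy ensemble: one atom each from residues 6, 26 and 10 (coordinates in nm).
atoms = [(6, (0.0, 0.0, 0.0)), (26, (0.5, 0.0, 0.0)), (10, (3.0, 0.0, 0.0))]
bridging_frame = ([(0.25, 0.0, 0.0)], atoms)  # ion sits between residues 6 and 26
plain_frame = ([(3.1, 0.0, 0.0)], atoms)      # ion touches residue 10 only
rp = population_ratio([bridging_frame, bridging_frame, plain_frame])
```

With two ion-mediated frames and one other frame the ratio is 2, which by the criterion above would mark the ensemble as dominated by ion-mediated structures.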


Results

Stability of the native structure

The stability of the RNA pseudoknot is mostly attributed to three groups of interactions: the canonical base pairing within each of the two helices and the non-canonical interactions between loop L2 and H1 (Fig 1A and 1C). The last of these is a tertiary interaction. The native structure is very stable under the current force field, according to the 200 ns conventional MD simulation started from the native structure. For most of the time, the fraction of formed hydrogen bonds (HBs) exceeds 0.9 and the RMSD with respect to the starting structure is smaller than 0.4 nm (Fig 1 and S1 Fig). However, the tertiary interactions are comparatively less stable, with a formation probability of around 0.4 (Fig 1C).

It is commonly believed that metal ions play indispensable roles in RNA stability. In Fig 1B we show the structure that best represents the average cation binding pattern of the native state (S1 Fig). It can be seen that some cations are partly or deeply buried inside the RNA. For example, Na1+ binds to A8 and U28, and Na2+ interacts with both A8 and U11. By simultaneously interacting with nucleotides far away from each other along the sequence, these metal ions bridge tertiary interactions and help stabilize the RNA structure. Presumably, they play more active roles than simply acting as electrostatic screening agents.

The free energy landscapes (FELs) and intermediates from the BEMD simulation

Based on the BEMD simulation data, we calculated the FEL on a two-dimensional surface defined by the number of HBs in H1 and that in H2 and monitored its change as a function of simulation time. It was found that the change was almost undetectable during the last 60ns of the 500ns BEMD run, suggesting a good quality of convergence (S2 Fig). Therefore the data collected until 500ns are used for further analysis.

In Fig 2 we show two FELs calculated solely from the neutral replica of the BEMD simulation. The FEL projected on the collective variables (CVs) Nhb and Rg has an L-like shape, demonstrating that the RNA undergoes a structural collapse at its early folding stage, consistent with previous findings [39]. Meanwhile, the FEL projected on the number of HBs in H1 and that in H2 reveals the existence of multiple basins of attraction, labeled N, U, and I1–I6. Basin N and basin U correspond to the native and unfolded states, respectively, as supported by their positions in the FEL and by the structural analysis described in the following sections. The remaining basins correspond to folding intermediates and are discussed later.

Fig 2. Free energy landscapes (FELs) calculated from the 500ns BEMD run.

The magnitude of the free energy is represented by different colors, as quantified by the color scale on the top of the figure; the unit is kcal/mol. (A) The FEL projected on the radius of gyration (Rg) and the number of native hydrogen bonds (Nhb). The inset is a zoom-in of the left region. (B) The FEL projected on the number of native hydrogen bonds in H1 and that in H2. The labels from I1 to I6, U and N denote six intermediate states, the unfolded basin and the native basin, respectively.

Structure of the unfolded states

We performed a cluster analysis on the denatured states. The six largest clusters are denoted U1 to U6 and are shown in Fig 3. Their relative populations are about 26%, 18%, 8%, 5%, 5% and 4%, respectively. It can be seen that the unfolded states are structurally heterogeneous, with U1 and U2 compact while the other four are partially or fully extended. At first sight, the backbones of U1 and U2 roughly resemble that of the native structure. However, a further calculation of the hydrogen bond map shows that most hydrogen bonds are non-native; therefore both U1 and U2 are non-specific collapsed states. U3 is partially compact, with the 3'-end forming a triplex-like structure; however, most of the formed hydrogen bonds are still non-native, and its 5'-end is flipped out into solution. Compared with the previous structures, U4 is much more extended. It forms a hairpin between L2 and H1 through the base pairs G16:G27 and C19:U25, which are, again, non-native. U5 and U6 are fully extended and no stable base pairs are observed.

Fig 3. Structures of the unfolded states.

(A) Representative structures of the largest six clusters in the unfolded states. (B) HB map averaged over all the structures in the unfolded states. The formation probabilities are indicated by different colors, as quantified by the color scale beside the figure. The labels inside the figure are also colored, with yellow indicating tertiary interactions between H1 and L2 and white non-native HBs. (C) The number of bound metal ions and Rp as a function of Rg plotted for the largest six clusters.

The formation probabilities of HBs averaged over all the conformations in the unfolded basin are shown in Fig 3B. It can be seen that most HBs are non-native. Interestingly, tertiary HBs between L2 and H1 are also observed, for example G27:G16 and G26:G17, although they are relatively weak even in the native structure.

Fig 3C shows the number of bound cations and the value of Rp as a function of Rg. It is clear that both increase rapidly as the molecule collapses. For the more extended states U3 to U6, the average number of bound ions is less than 3 and Rp is less than 0.4. In contrast, for the compact states U1 and U2, Rp increases rapidly to about 2.0, indicating that the mediating effect of cations becomes more important in these structures. For example, in the structure of U1 shown in Fig 3A, Na1+ and Na2+ are trapped deep inside the RNA and simultaneously bind to C6, C19 and C26, thereby drawing L2 and H1 close to each other. A similar role of metal ions is observed in U2.

Structures of the intermediates

The representative structure of intermediate I1 is a triplex in which helix H1 and the L2-H1 interaction have formed and the nucleotides of the original H2 form a nonnative hairpin, as shown in Figs 4A and 5A. The L2-H1 interaction is established through A24:G4 and A22:G2. Intermediate I2 differs from I1 in that the nonnative hairpin is disrupted and the bases rearrange to form several native HBs (Fig 4B); at the same time the L2-H1 interactions in the triplex also rearrange to a new pattern. Intermediate I3 is characterized by partly formed H1 and H2 (Figs 4C and 5C). The L2-H1 triplex has also formed, stabilized by A22:U3 and C26:G17. Intermediate I4 is very different from I1, I2, and I3 in that H2 is almost fully formed while H1 is not (Figs 4D and 5D); meanwhile, the two strands of H1, U3–C7 and G16–A20, interact with L2 through the tertiary contacts U3:A22, G17:C26 and G16:G27. Intermediate I5 has a well-formed H2 and a partly formed H1 (Figs 4E and 5E). In addition, the L2-H1 triplex is formed through A22:G4. Intermediate I6 is a native-like state; most of the native base pairs have formed except A20:U3 in H1 (Figs 4F and 5F). In this intermediate, L2 and H1 interact with each other through C26:C6 and G27:G17.

Fig 4. Representative structures of the intermediates.

(A)-(F) are for the intermediates from I1 to I6, respectively. The nucleotides and metal ions are represented in the same way as in Fig 1.

Fig 5. Average HB maps of the intermediates.

(A)-(F) are for the intermediates from I1 to I6. Different colors of the HBs indicate different formation probabilities, as quantified by the color scale on the top of the figure. The labels of the hydrogen bonds are also colored, with red, white and yellow indicating native, non-native, and tertiary HBs, respectively.

The evolution of the number of bound cations and of Rp as folding proceeds through the above intermediates is given in Fig 6. It can be seen that Rp is larger than 2.0 for all six intermediate states, implying that the mediating effect of metal ions is important for their stability. Taking intermediate I1 as an example, Na1+ binds to both C6 and C26 and pulls L2 and H1 together, hence stabilizing the triplex structure. Similar roles are observed for metal ions in the other intermediates, as can be seen from the representative structures in Fig 4. Moreover, Fig 6 shows that Rp increases as the molecule folds, indicating that the mediating effect of cations becomes more important as the structure becomes more similar to the native one.

Fig 6. Binding of metal ions during folding process.

The evolution of the number of bound metal ions (the upper number in the circle) and Rp (the lower number in the circle) during the folding process. The basins are depicted by circles and their positions roughly correspond to the CVs.


Discussion

The FEL obtained from our simulations shows multiple intermediates and hence suggests that the RNA pseudoknot folds via multiple mechanisms/pathways, summarized in Fig 7. Note that here the term pathway is used to describe the thermodynamic aspect of the FEL and does not necessarily reflect the underlying kinetics. The RNA may first form helix H1 and then H2, or the reverse, corresponding to pathway-I and pathway-IV, respectively. It may also form the two helices simultaneously, corresponding to pathway-III. Pathways I and IV reflect a step-wise folding mechanism, i.e., one helix after the other, while pathway-III corresponds to a cooperative mechanism. Folding may also proceed via a hybrid mechanism. For example, the RNA may go via pathway-III to I3 first and then to I2 or I5, from where it reaches the native basin via pathway-I or IV, respectively. Among all the pathways, pathway-I is dominant, since the relevant intermediates (I1 and I2) have lower free energy and larger entropy, as estimated qualitatively from the size of the basins. The overall folding picture is consistent with the result for the same RNA obtained in our previous work [19], which suggested that the unfolding of the pseudoknot follows multiple pathways and that the dominant one is a sequential unfolding of H2 and then H1, corresponding to pathway-I in this study. It is worth mentioning that in our previous work the conclusion was drawn from a large number of unfolding MD simulations and therefore reflected the kinetic nature of the underlying FEL. In contrast, the BEMD simulation in this work sacrifices the kinetics and concentrates on the thermodynamic aspect of the FEL. The consistency between the FELs and pathways obtained from two very different approaches is striking and supports the present findings.

Fig 7. Multiple folding mechanisms proposed for the RNA pseudoknot.

The typical pathways are labeled pathway-I, pathway-II, pathway-III and pathway-IV, and are colored red, violet, blue and wine, respectively.

Multiple-pathway mechanisms have been frequently observed for RNAs in experiments. For example, Chen et al. studied the mechanical folding and unfolding of an H-type pseudoknot in hTR RNA with optical tweezers and found that at low forces the folding occurs step by step, with the 5'-stem forming first followed by the 3'-stem, in contrast to one-step unfolding at high forces of ~46 pN [9]. According to the authors, this is direct experimental evidence that the folding of the pseudoknot takes complex pathways. Also using optical tweezers, Wu et al. studied the structural dynamics and rearrangements of the rpsO operator RNA and suggested that the molecule folds to the high-stability pseudoknot via a double-hairpin structure, or via a low-stability pseudoknot as an intermediate [22]. The double-hairpin pathway is reminiscent of pathway-III in this study. Messieres et al. investigated the unfolding pathways of a -1 PRF sequence in CCR5 mRNA with optical trapping techniques; they found that the RNA exhibits several distinct unfolding pathways when subjected to an end-to-end force [40]. In fact, even for a simple RNA hairpin, large-scale simulations revealed multiple intermediates and suggested that folding can start either from the closing base pair or from the end base pair [41].

Multiple folding pathways of RNAs and the underlying mechanism have been discussed by Cho et al., who studied the folding of three H-type RNA pseudoknots with a structure-based coarse-grained model [14]. They found that the pseudoknots fold through hierarchical or cooperative mechanisms, or via multiple pathways. They further proposed that the folding order is mostly determined by the stabilities of the isolated secondary structures. Following this suggestion, we calculated the stabilities of H1 and H2 based on Turner’s free-energy rules [42] and found values of -10 and -12.4 kcal/mol, respectively. The slightly higher stability of H2 would suggest that H2 folds first, followed by H1, contrary to the observation in our simulations. However, we notice that the tertiary interaction between H1 and L2 plays an indispensable role; according to our simulation, the tertiary HBs in this region are always observed in the intermediates I1, I2 and I3, and they essentially make the RNA a triplex. We therefore hypothesize that the tertiary interactions, although rather weak themselves, stabilize helix H1 and bias the folding flux toward pathway-I. The stability rule for the folding order proposed by Cho et al. thus still holds, but the contribution from tertiary contacts needs to be taken into account. Interestingly, in the intermediate I4, the tertiary interactions form earlier than H1 and are particularly important for maintaining the overall structure. Presumably, they also affect the folding flux through pathway-IV. A similar role of tertiary interactions has been discussed by Cho et al. [14]. They found that for the hTR pseudoknot, the tertiary contacts can form before the complete assembly of the secondary structures, suggesting that they are significant in determining the folding cooperativity and pathways in some cases. An experimental study of the folding of a bacterial group I ribozyme also demonstrated that the tertiary interactions between helices bias the structural ensemble toward native-like conformations and are therefore important in determining the folding process.

It is of particular interest to discuss the folding rates of RNA pseudoknots. Experimentally, the measured folding rates vary greatly. In the optical tweezer study of the hTR pseudoknot, Chen et al. observed two-state folding transitions at ~10 and ~5 pN with ensemble rate constants of ~0.1 s⁻¹; an extrapolation to zero force yields an apparent time constant of ~60 ms [9]. In a T-jump perturbation experiment on VPK, a variant of the MMTV pseudoknot designed to avoid kinetic traps, Narayanan et al. measured folding times of 1–6 ms, which are at least 100-fold faster than previous observations of very slowly folding pseudoknots that were trapped in misfolded conformations [18]. Green et al. studied the folding/unfolding kinetics of an H-type pseudoknot from IBV with optical tweezers over a force range of 15–20 pN; the apparent folding time extrapolated to zero force is ~1.8 μs [25]. We also estimated the transition times between basins for the gene32 mRNA pseudoknot by constructing a Markov state model based on the calculated FEL, following the procedure used by Marinelli and colleagues [35] (S1 Text and S5–S7 Figs). The transition times are mostly of the order of microseconds, and there are no obvious kinetic traps that can hold the RNA for a long time (S7 Fig). An interesting finding is that, despite the existence of multiple intermediates and folding pathways, the ensemble folding kinetics is single-exponential, as fitted from the time dependence of the fraction of folded trajectories (S6 Fig). The fitted rate is ~(23 μs)⁻¹, i.e., a folding time of ~23 μs, which is two orders of magnitude faster than that of VPK [18] but about ten times slower than that of the pseudoknot from IBV [25]. Since a direct experimental measurement of the folding rate of this RNA is currently lacking, we cannot draw firm conclusions on this issue. However, we do not expect the RNA to fold at such a fast rate in experiments, for the following reasons. First, the pre-defined CV space and its subsequent coarse-graining in the BEMD analysis may conceal some kinetic traps. Second, the transition times were calculated under the assumption that the diffusion constant is independent of the position on the FEL, which is a very rough approximation and may actually fail for this RNA. Therefore, the folding rate estimated here may represent an upper limit and gives a hint of how fast this RNA may ultimately fold if potential traps, should they exist, can be avoided by a careful design of experiments. It would be interesting for experimentalists to measure the actual folding rate of this RNA so that a comparison can be made.
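To make the single-exponential test concrete, the sketch below fits f(t) = 1 - exp(-k t) to the fraction of folded trajectories by a least-squares line through the origin on -ln(1 - f) against t, then reports the largest deviation of the data from the fitted curve. The fitting procedure actually used is described in S1 Text and may differ; the function names are hypothetical and the data are synthetic, generated with an arbitrary time constant rather than taken from the simulations.

```python
import math

def fit_rate(times, folded_fraction):
    """Rate k in f(t) = 1 - exp(-k t), obtained as the least-squares slope
    through the origin of -ln(1 - f) against t. Points with f >= 1 are
    skipped because the logarithm is undefined there."""
    num = den = 0.0
    for t, f in zip(times, folded_fraction):
        if f < 1.0:
            num += t * -math.log(1.0 - f)
            den += t * t
    return num / den

def max_residual(times, folded_fraction, rate):
    """Largest absolute deviation of the data from the fitted curve; a small
    value supports a single-exponential description."""
    return max(abs(f - (1.0 - math.exp(-rate * t)))
               for t, f in zip(times, folded_fraction))

# Synthetic example with an arbitrary time constant (microseconds).
TAU = 20.0
times = [5.0 * i for i in range(1, 21)]
fractions = [1.0 - math.exp(-t / TAU) for t in times]
rate = fit_rate(times, fractions)
folding_time = 1.0 / rate
```

Systematic residuals, for example a fast initial rise followed by a slow tail, would instead indicate multi-exponential kinetics and hence kinetically distinct populations.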

It is worth discussing the role of non-native interactions in the folding process. In the unfolded states, the majority of HBs are non-native, while native HBs are hardly seen in our simulation. This feature is in contrast to some cases in protein folding, where a significant number of native contacts can be observed in the denatured states, as in the villin headpiece [43, 44]. Whether this feature is unique to the specific RNA studied here or is a general principle is not known, and answering this question will require a significant amount of future work. Non-native interactions are also frequently observed in the early folding intermediates such as I1. These non-native interactions have to break for the RNA to fold to the next stage and hence essentially hinder the folding process. It is interesting to note that the opposite role of non-native interactions has been observed in other cases, such as in the folding of a DNA quadruplex [45] and of the intrinsically disordered protein inhibitor IA3 [46], where non-native interactions were found to facilitate the folding process by reducing the phase space to be searched.

Metal ions are crucial for the initial collapse of the denatured state as well as for the later folding stages. According to our simulation, even in the unfolded basin, the correlation between the structural collapse and the number of bound ions is obvious (Fig 3A). This correlation between folding and ion binding also holds for the later stages of folding, from the intermediates to the native state (Fig 6). To be more specific, within the unfolded state the number of bound cations increases from 2 for the extended structures to about 6 for the most compact structures, and the average increases from 5.3 for the unfolded state to 9.1 for the native state. The correlation suggests that the binding of cations is indispensable for the folding process of the RNA. In addition, there is also a correlation between folding and the value of Rp, which measures the significance of cations in bridging long-range interactions. According to Fig 6, Rp increases from 1.6 for the unfolded state to 8.4 for the native state. The continuous increase of Rp suggests that the mediating effect of cations becomes increasingly important during folding. This picture is in agreement with that given by previous studies [15, 47, 48], in which metal ions were shown to be essential partners for the folding of RNAs, neutralizing the negatively charged backbone as well as mediating the formation of intermediate structures.

We note that the bound ions in our simulations may correspond to the tightly bound ions proposed in the TBI theory [30, 49]. The benefit of our simulation is that it provides atomistic binding information for metal ions on the intermediates, which may be combined with such models to give a better description of cation-RNA interactions. Such models are important because it has been shown that the intermediates have to be considered when assessing the effect of cations on RNA stability [27, 50].

Caution is warranted regarding the accuracy of the force field. In a recent study of the folding of RNA tetraloops, Chen and Garcia suggested that the AMBER99 force field for nucleic acids results in bloated nucleobases that do not accurately reflect the physicochemical properties of aqueous heterocycles [51]. We argue that this deficiency may not seriously affect the folding mechanism, since the latter is mostly determined by the balance between enthalpy and entropy in searching for the native basin of attraction. In fact, a previous work from the same group [52] studied the pressure and temperature folding/unfolding equilibrium of a small RNA hairpin with replica exchange molecular dynamics simulations and found that the AMBER99 force field is able to fold the molecule from an extended conformation to structures with RMSD within 0.4–0.6 nm of the crystal structure. However, a quantitative estimate of the influence of this deficiency on the folding mechanism obtained here is not yet available and will require substantial further work.

In summary, we studied the free energy landscape of an H-type RNA pseudoknot using an advanced sampling technique and all-atom molecular dynamics simulation with both RNA and water explicitly modeled. Multiple intermediates were detected and their structures were analyzed. The results suggest that the folding follows multiple mechanisms, including a step-wise mechanism starting either from the first helix or the second, and a cooperative mechanism with both helices forming simultaneously. Despite the existence of multiple intermediates and pathways, the ensemble folding kinetics estimated from a Markov state model is single-exponential, with a folding rate of ~23 μs⁻¹. This value may represent an upper limit and gives a hint of how fast this RNA may ultimately fold if potential traps can be avoided. The roles of metal ions were also analyzed. The correlation between folding and ion binding is significant, and the bound ions mediate long-range interactions, stabilizing the intermediate structures. We believe this study represents a step forward in understanding the folding process of RNA pseudoknots.
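To illustrate how parallel pathways can still produce nearly single-exponential ensemble kinetics, the following is a minimal kinetic Monte Carlo sketch on a toy four-state network. It is not the Markov state model built in this study: the state names, the rate constants (in arbitrary inverse-time units), and the function names are all hypothetical, and the toy network assumes that escape from the unfolded state is rate-limiting.

```python
import math
import random

# Hypothetical rate constants (arbitrary inverse-time units) for a toy network
# with two parallel routes from the unfolded state U to the native state N.
RATES = {
    "U": {"I_A": 1.0, "I_B": 1.0},
    "I_A": {"N": 50.0},
    "I_B": {"N": 50.0},
}

def first_passage_time(rng, start="U", target="N"):
    """Run one kinetic Monte Carlo trajectory; return the time to reach `target`."""
    state, t = start, 0.0
    while state != target:
        moves = RATES[state]
        total = sum(moves.values())
        t += rng.expovariate(total)  # exponentially distributed waiting time
        # Pick the next state with probability proportional to its rate.
        state = rng.choices(list(moves), weights=list(moves.values()))[0]
    return t

def folding_times(n_traj=5000, seed=0):
    """Collect first-passage times from independent trajectories."""
    rng = random.Random(seed)
    return [first_passage_time(rng) for _ in range(n_traj)]

def fit_single_exponential(times):
    """Maximum-likelihood rate k for a survival curve exp(-k t): k = 1 / mean time."""
    return len(times) / sum(times)

def max_deviation_from_fit(times, k):
    """Largest gap between the empirical unfolded fraction and exp(-k t)."""
    ordered = sorted(times)
    n = len(ordered)
    return max(
        abs((1 - (i + 1) / n) - math.exp(-k * t))
        for i, t in enumerate(ordered)
    )
```

Calling `fit_single_exponential(folding_times())` gives an effective folding rate for the toy network, and `max_deviation_from_fit` gives a rough measure of how well a single exponential describes the simulated decay of the unfolded population.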

Supporting Information

S1 Fig. Results from the MD simulation for the native structure.

The simulation was started from the native structure and lasted for 200 ns. It was designed to test the stability of the native structure under the current force field. The details are described in the main text. (A) The distribution of RMSD of the structures collected from the trajectory. (B) The RMSF of each nucleotide calculated from the trajectory. (C) The Na+ ion binding probabilities of the nucleotides. The labels on the x-axis are the sequence indices of the RNA nucleotides.


S2 Fig. Convergence test for the BEMD run.

(A) The FELs calculated from the first 400 ns of the simulation. (B) The FELs calculated from the last 60 ns.


S3 Fig. The relative population of the largest 20 clusters in the unfolded states.


S4 Fig. The Na+ ion binding probabilities for the nucleotides of the six intermediates.


S5 Fig. The diffusion constant as a function of the lag time.

The results were calculated from the 200 ns MD simulation of the native structure.


S6 Fig. The ensemble folding kinetics calculated from KMC simulations.

The black curve is the raw data and the red one is from a single-exponential fitting.


S7 Fig. The transition times between different basins.

The results were estimated from the Markov state model and kinetic Monte Carlo simulations.


S1 Text. The construction of Markov state model.



The authors acknowledge Shanghai Supercomputer Center and HPCC of Nanjing University for the computational support.

Author Contributions

Conceived and designed the experiments: WW JZ. Performed the experiments: YB. Analyzed the data: YB JZ WW. Contributed reagents/materials/analysis tools: Jun Wang Jihua Wang. Wrote the paper: YB JZ WW.


  1. Staple DW, Butcher SE. Pseudoknots: RNA Structures with Diverse Functions. PLoS Biol. 2005; 3: e213. pmid:15941360
  2. Giedroc DP, Theimer CA, Nixon PL. Structure, stability and function of RNA pseudoknots involved in stimulating ribosomal frameshifting. J. Mol. Biol. 2000; 298: 167–185. pmid:10764589
  3. Zhang J, Dundas J, Lin M, Chen R, Wang W, Liang J. Prediction of geometrically feasible three-dimensional structures of pseudoknotted RNA through free energy estimation. RNA 2009; 15: 2248–2263. pmid:19864433
  4. Zhang J, Lin M, Chen R, Wang W, Liang J. Discrete state model and accurate estimation of loop entropy of RNA secondary structures. J. Chem. Phys. 2008; 128: 125107. pmid:18376982
  5. Solomatin SV, Greenfeld M, Chu S, Herschlag D. Multiple native states reveal persistent ruggedness of an RNA folding landscape. Nature 2010; 463: 681–684. pmid:20130651
  6. Parisien M, Major F. The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data. Nature 2008; 452: 51–55. pmid:18322526
  7. Zhao Y, Huang Y, Gong Z, Wang Y, Man J, Xiao Y. Automated and fast building of three-dimensional RNA structures. Sci. Rep. 2012; 2: 734. pmid:23071898
  8. Shi YZ, Wang FH, Wu YY, Tan ZJ. A coarse-grained model with implicit salt for RNAs: Predicting 3D structure, stability and salt effect. J. Chem. Phys. 2014; 141: 105102. pmid:25217954
  9. Chen G, Wen JD, Tinoco I. Single-molecule mechanical unfolding and folding of a pseudoknot in human telomerase RNA. RNA 2007; 13: 2175–2188. pmid:17959928
  10. Cao S, Chen SJ. Predicting RNA pseudoknot folding thermodynamics. Nucleic Acids Res. 2006; 34: 2634–2652. pmid:16709732
  11. Cao S, Giedroc DP, Chen SJ. Predicting loop-helix tertiary structural contacts in RNA pseudoknots. RNA 2010; 16: 538–552. pmid:20100813
  12. Liu L, Chen SJ. Computing the conformational entropy for RNA folds. J. Chem. Phys. 2010; 132: 235104. pmid:20572741
  13. Liu L, Chen SJ. Coarse-Grained Prediction of RNA Loop Structures. PLoS ONE 2012; 7: e48460. pmid:23144887
  14. Cho SS, Pincus DL, Thirumalai D. Assembly mechanisms of RNA pseudoknots are determined by the stabilities of constituent secondary structures. Proc. Natl. Acad. Sci. U. S. A. 2009; 106: 17349–17354. pmid:19805055
  15. Gluick TC, Wills NM, Gesteland RF, Draper DE. Folding of an mRNA Pseudoknot Required for Stop Codon Readthrough: Effects of Mono- and Divalent Ions on Stability. Biochemistry 1997; 36: 16173–16186. pmid:9405051
  16. Cao S, Chen SJ. Biphasic Folding Kinetics of RNA Pseudoknots and Telomerase RNA Activity. J. Mol. Biol. 2007; 367: 909–924. pmid:17276459
  17. Isambert H, Siggia ED. Modeling RNA folding paths with pseudoknots: Application to hepatitis delta virus ribozyme. Proc. Natl. Acad. Sci. U. S. A. 2000; 97: 6515–6520. pmid:10823910
  18. Narayanan R, Velmurugu Y, Kuznetsov SV, Ansari A. Fast Folding of RNA Pseudoknots Initiated by Laser Temperature-Jump. J. Am. Chem. Soc. 2011; 133: 18767–18774. pmid:21958201
  19. Zhang Y, Zhang J, Wang W. Atomistic Analysis of Pseudoknotted RNA Unfolding. J. Am. Chem. Soc. 2011; 133: 6882–6885. pmid:21500824
  20. Csaszar K, Špačková N, Štefl R, Šponer J, Leontis NB. Molecular dynamics of the frame-shifting pseudoknot from beet western yellows virus: the role of non-Watson-Crick base-pairing, ordered hydration, cation binding and base mutations on stability and unfolding. J. Mol. Biol. 2001; 313: 1073–1091. pmid:11700064
  21. Santner T, Rieder U, Kreutz C, Micura R. Pseudoknot Preorganization of the PreQ1 Class I Riboswitch. J. Am. Chem. Soc. 2012; 134: 11928–11931. pmid:22775200
  22. Wu YJ, Wu CH, Yeh AYC, Wen JD. Folding a stable RNA pseudoknot through rearrangement of two hairpin structures. Nucleic Acids Res. 2014; 42: 4505–4515. pmid:24459133
  23. Denesyuk NA, Thirumalai D. Crowding Promotes the Switch from Hairpin to Pseudoknot Conformation in Human Telomerase RNA. J. Am. Chem. Soc. 2011; 133: 11858–11861. pmid:21736319
  24. Xu X, Chen SJ. Kinetic Mechanism of Conformational Switch between Bistable RNA Hairpins. J. Am. Chem. Soc. 2012; 134: 12499–12507. pmid:22765263
  25. Green L, Kim CH, Bustamante C, Tinoco I Jr. Characterization of the Mechanical Unfolding of RNA Pseudoknots. J. Mol. Biol. 2008; 375: 511–528. pmid:18021801
  26. Hengesbach M, Kim NK, Feigon J, Stone MD. Single-Molecule FRET Reveals the Folding Dynamics of the Human Telomerase RNA Pseudoknot Domain. Angew. Chem. Int. Ed. Engl. 2012; 51: 5876–5879. pmid:22544760
  27. Chen SJ. RNA Folding: Conformational Statistics, Folding Kinetics, and Ion Electrostatics. Annu. Rev. Biophys. 2008; 37: 197–214. pmid:18573079
  28. Draper DE, Grilley D, Soto AM. Ions and RNA Folding. Annu. Rev. Biophys. Biomol. Struct. 2005; 34: 221–243. pmid:15869389
  29. Tan ZJ, Chen SJ. RNA Helix Stability in Mixed Na(+)/Mg(2+) Solution. Biophys. J. 2007; 92: 3615–3632. pmid:17325014
  30. Tan ZJ, Chen SJ. Electrostatic correlations and fluctuations for ion binding to a finite length polyelectrolyte. J. Chem. Phys. 2005; 122: 044903.
  31. Piana S, Laio A. A Bias-Exchange Approach to Protein Folding. J. Phys. Chem. B 2007; 111: 4553–4559. pmid:17419610
  32. Hess B, Kutzner C, van der Spoel D, Lindahl E. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 2008; 4: 435–447.
  33. Guy AT, Piggot TJ, Khalid S. Single-Stranded DNA within Nanopores: Conformational Dynamics and Implications for Sequencing; a Molecular Dynamics Simulation Study. Biophys. J. 2012; 103: 1028–1036. pmid:23009852
  34. Cossio P, Marinelli F, Laio A, Pietrucci F. Optimizing the Performance of Bias-Exchange Metadynamics: Folding a 48-Residue LysM Domain Using a Coarse-Grained Model. J. Phys. Chem. B 2010; 114: 3259–3265. pmid:20163137
  35. Marinelli F, Pietrucci F, Laio A, Piana S. A Kinetic Model of Trp-Cage Folding from Multiple Biased Molecular Dynamics Simulations. PLoS Comput. Biol. 2009; 5: e1000452. pmid:19662155
  36. Bonomi M, Branduardi D, Bussi G, Camilloni C, Provasi D, Raiteri P, et al. PLUMED: A portable plugin for free-energy calculations with molecular dynamics. Comput. Phys. Commun. 2009; 180: 1961–1972.
  37. Laio A, Gervasio FL. Metadynamics: a method to simulate rare events and reconstruct the free energy in biophysics, chemistry and material science. Rep. Prog. Phys. 2008; 71: 126601.
  38. Bian Y, Zhang J, Wang J, Wang W. On the accuracy of metadynamics and its variations in a protein folding process. Mol. Simulat. 2014;
  39. Thirumalai D, Woodson SA. Kinetics of Folding of Proteins and RNA. Acc. Chem. Res. 1996; 29: 433–439.
  40. de Messieres M, Chang JC, Belew AT, Meskauskas A, Dinman JD, La Porta A. Single-molecule measurements of the CCR5 mRNA unfolding pathways. Biophys. J. 2014; 106: 244–252. pmid:24411256
  41. Bowman GR, Huang X, Yao Y, Sun J, Carlsson G, Guibas LJ, et al. Structural Insight into RNA Hairpin Folding Intermediates. J. Am. Chem. Soc. 2008; 130: 9676–9678. pmid:18593120
  42. Xia T, SantaLucia J, Burkard ME, Kierzek R, Schroeder SJ, Jiao X, et al. Thermodynamic Parameters for an Expanded Nearest-Neighbor Model for Formation of RNA Duplexes with Watson-Crick Base Pairs. Biochemistry 1998; 37: 14719–14735. pmid:9778347
  43. Zagrovic B, Snow CD, Khaliq S, Shirts MR, Pande VS. Native-like Mean Structure in the Unfolded Ensemble of Small Proteins. J. Mol. Biol. 2002; 323: 153–164. pmid:12368107
  44. Smith AW, Chung HS, Ganim Z, Tokmakoff A. Residual native structure in a thermally denatured beta-hairpin. J. Phys. Chem. B 2005; 109: 17025–17027. pmid:16853169
  45. Bian Y, Tan C, Wang J, Sheng Y, Zhang J, et al. Atomistic Picture for the Folding Pathway of a Hybrid-1 Type Human Telomeric DNA G-quadruplex. PLoS Comput. Biol. 2014; 10: e1003562. pmid:24722458
  46. Wang J, Wang Y, Chu X, Hagen SJ, Han W, et al. Multi-scaled explorations of binding induced folding of intrinsically disordered protein inhibitor IA3 to its target enzyme. PLoS Comput. Biol. 2011; 7: e1001118. pmid:21490720
  47. Draper DE. RNA Folding: Thermodynamic and Molecular Descriptions of the Roles of Ions. Biophys. J. 2008; 95: 5489–5495. pmid:18835912
  48. Woodson SA. Metal ions and RNA folding: a highly charged topic with a dynamic future. Curr. Opin. Chem. Biol. 2005; 9: 104–109. pmid:15811793
  49. Tan ZJ, Chen SJ. Ion-Mediated Nucleic Acid Helix-Helix Interactions. Biophys. J. 2006; 91: 518–536. pmid:16648172
  50. Soto AM, Misra V, Draper DE. Tertiary structure of an RNA pseudoknot is stabilized by "diffuse" Mg2+ ions. Biochemistry 2007; 46: 2973–2983. pmid:17315982
  51. Chen AA, García AE. High-resolution reversible folding of hyperstable RNA tetraloops using molecular dynamics simulations. Proc. Natl. Acad. Sci. U. S. A. 2014; 110: 16820–16825.
  52. Garcia AE, Paschek D. Simulation of the Pressure and Temperature Folding/Unfolding Equilibrium of a Small RNA Hairpin. J. Am. Chem. Soc. 2008; 130: 815–817. pmid:18154332