Computational Study of Unfolding and Regulation Mechanism of preQ1 Riboswitches

Riboswitches are novel RNA regulatory elements. Each riboswitch molecule consists of two domains: aptamer and express platform. The three-dimensional (3D) structure of the aptamer domain, depending on ligand binding or not, controls that of the express platform, which then switches on or off transcriptional or translational process. Here we study the two types of preQ1 riboswitch aptamers from T. Tengcongensis (denoted as Tte preQ1 riboswitch for short below) and Bacillus subtilis (denoted as Bsu preQ1 riboswitch for short below), respectively. The free-state 3D structure of the Tte preQ1 riboswitch is the same as its bound state but the Bsu preQ1 riboswitch is not. Therefore, it is very interesting to investigate how these riboswitches realize their different regulation functions. We simulated the unfolding of these two aptamers through all-atom molecular dynamic simulation and found that they have similar unfolding or folding pathways and ligand-binding processes. The main difference between them is the folding intermediate states. The similarity and difference of their unfolding or folding dynamics may suggest their similar regulation mechanisms and account for their different functions, respectively. These results are also useful to understand the regulation mechanism of other riboswitches with free-state 3D structures similar to their bound states.


Introduction
Riboswitches are RNA regulatory elements that perform regulation functions through conformational switch between ligand-free and ligand-bound states [1][2][3]. Therefore, the threedimensional (3D) conformations of free and bound states of riboswitches are crucial to understand their regulation function and mechanism. Up to now, the 3D structures of the ligand-bound states of many riboswitch aptamers have already been solved. Recently the ligand-free 3D structures of some aptamers have also been solved [4][5][6]. However, it is found that nearly all of these free-state 3D structures adopt bound-like structures [7,8]. This proposes a question: How do these riboswitches perform regulation functions in free form?
Recently, a series of experiments have focused on two types of preQ 1 riboswitches. One of them is from Bacillus subtilis [9,10]. The other one comes from T. Tengcongensis [11]. These two types of riboswitches have different functions (Figure 1). The former regulates transcription while the latter translation. In ligand-bound state, the aptamers of both preQ 1 riboswitches adopt similar Htype pseudoknot structure. While in ligand-free state, Tte preQ 1 riboswitch aptamers has almost the same 3D structure as its ligand-bound state [11] (see Figure 2). Though the structure of Bsu preQ 1 riboswitch aptamer in free state is not available, the free state structure is thought to be different as their bound structure [9,12]. The regulation of the Bsu preQ 1 riboswitch was considered to be kinetically controlled [9]. Jenkins et al [11] suggested that Tte preQ 1 riboswitch would have a different regulation mecha-nism. In this paper we show that the two preQ 1 riboswitches may use similar physical mechanism to control regulation processes.
The kinetic control mechanism depends on folding pathways of the aptamers. At present micro to millisecond timescale simulations of RNA folding using conventional all-atom molecular dynamics (MD) methods are still very difficult. Some simplifications are needed. For example, Feng et al. [13] used an all-atom Gō-model to investigate the folding mechanism of the Bsu preQ 1 riboswitch aptamer. They found that the preQ 1 riboswitch aptamer may fold along a directional pathway: the P1 stem-loop folds first, then the A-tract region establishes tertiary contacts with the P1 stem, and finally the pseudoknot P2 forms, i.e., the folding is along a single route from the 59 to 39 end and in the same direction as the transcription proceeds. However, as pointed by the authors [13], the Gō-model smoothed out the folding free energy landscape and cannot localize the possible intermediate states with non-native interactions.
Another MD approach to study RNA folding is hightemperature unfolding simulation, which has served as a powerful tool for investigating dynamics involving large conformational changes on computationally tractable timescales [14][15][16]. Previous studies suggested that the unfolding events for proteins could reflect the main attributes of the folding progress [17][18][19]. Recent studies of RNA hairpins have also shown that construction of folding pathways, including transition state ensembles, is possible using high temperature unfolding [20]. Csaszar [22]. Here we will perform a series of unfolding simulations on hightemperature to investigate the folding/unfolding pathways and mechanisms of preQ1 riboswitch. Figure 3A describes the time evolution of the mean fraction of native contacts (NC) in Bsu and Tte preQ 1 riboswitch aptamers as well as their components over the ten simulated unfolding trajectories at 400 K in the presence of ligands. It shows that the unfolding processes of both aptamers are similar: the P2 helix first loses its native contacts, then the A-tract L3 loses its tertiary contacts with P1 region and finally the P1 stem unfolds. This is essentially along a single route from the 39 to the 59 end, just the reverse direction of transcription and the folding pathway of the Bsu aptamer suggested by Feng et al [13]. Figure 3B is the two-dimensional unfolding free energy landscapes of the two aptamers built from the ten simulated trajectories in the presence of ligands, respectively. The order parameters are fractions of native contacts in the triplex P1-L3 and binding pocket region adding pseudoknot part P2-L1-L2. These free-energy landscapes further confirm the hierarchical unfolding pathways of the two aptamers. Furthermore, both free-energy landscapes show two intermediate states: one has a flexible structure but with stable hairpin P1 (denoted as I1) and another is different for the two aptamers: for Bsu aptamer the intermediate state (denoted as I2 Bsu ) has a structure with opening P2 and binding pocket but stable triplex P1-L3; on the other hand, the intermediate state of Tte aptamer (denoted as I2 Tte ) has a structure that is very close to the ligand-bound one but the P2 helix only has one of the base pairs G11:C30 remained and the base pairs C9:G33 has broken (see discussions in the following).

Results and Discussion
It is also shown on Figure 3A that the release of ligand occurs at the early stage of unfolding progress during the P2 helix breaks and the binding pocket (L1 and L2 loops) opens but before the P1-L3 triplex unfolds. The pathways that the ligands released from the two aptamers are also similar and along two directions: from the major or minor groove of P1 ( Figure 4). This suggests that the ligand binds to the aptamer in the later stage of the folding process, especially after the formation of the P1-L3 triplex. The very recent experimental result described such a picture of ligand binding to the preQ 1 aptamer domain [23]. Other experimental results also suggested that the P1 hairpin might already form before the ligand binding [9,10].
In the absence of ligands, we find that the global behaviors of the unfolding process of the two aptamers are also similar with each other as well as those in the presence of ligands ( Figure 5): the P2 helix first loses its native contacts, then the A-tract L3 loses its tertiary contacts with P1 region and finally the P1 stem unfolds. The two-dimensional free energy landscapes of the Bsu and Tte aptamers ( Figure 5B) are also similar to those in the presence of ligands. has not be reduced significantly due to the absence of the ligand. This is different from the Bsu preQ 1 riboswitch aptamer.
Since 400 K is higher than the transition temperatures of both aptamers, their starting structures are unstable and unfold quickly. So the two-dimensional free energy landscapes at 400 K cannot reflect the stability of the starting structures at the normal temperature. We also performed simulations at 300 K and the results are presented in the supplementary Figures S1 and S2. At 300 K the ligand-bound experimental structure of Tte preQ 1 riboswitch aptamer is very stable but the ligand-free experimental structure is unstable and the more stable structure is just I2 Tte . This can also be seen clearly from Supplementary Figure S3, which shows the variations of the distances between the nucleotides of the two base pairs in P2 region during the simulations. For the Bsu preQ 1 riboswitch aptamer, the situation is similar: the ligand-bound experimental structure is stable, although one of the base pairs in P2 has broken, while the ligandfree experimental structure is not stable and the more stable structure is I2 Bsu . These indicate that in the presence of ligands the stable structure of the two aptamers at the normal temperature are their experimental structures while in the absence of ligands their stable structure at the normal temperature are I2 Bsu and I2 Tte , respectively. In I2 Bsu all the base pairs in P2 have broken while in I2 Tte only one of the base pairs in P2 was broken (see Supplementary Figures S3 and S4).
The results above show that the two aptamers have similar unfolding pathways no matter the ligands present or not. The main difference between them is that for Tte preQ 1 riboswitch aptamer the structure of the intermediate state I2 Tte can keep part (one base pair G11:C30) of P2 helix but the Bsu preQ 1 riboswitch aptamer cannot. Such different unfolding behaviors of these two aptamers come from different conformational features of them. In Tte preQ 1 riboswitch aptamer all the nucleotides in pseudoknot part have formed stable base-paring or non-nearest-neighbor stacking interactions (bases from different sides of pseudoknot stack with each other), especially the C9 and G33 form the strongest G:C Watson-Crick base pair at the end of the pseudoknot, which make terminal of riboswitch remain stable. By contrast, there are no non-nearest-neighbor base-stacking interactions between two chains of the pseudoknot part in Bsu preQ 1 riboswitch aptamer, in addition to unpaired terminal base. These different structural features make the Tte preQ 1 riboswitch aptamer have more stable pseudoknot part than Bsu preQ 1 riboswitch aptamer.
There is one more factor that makes the binding pocket and P2 of the Tte preQ 1 riboswitch aptamer more stable in the ligand free form. In this case, the nucleotide A14 in L2 region inserts into the binding pocket and forms hydrogen bonds with A29 instead of the ligand preQ 1 , which makes the binding pocket keep stable and holds the tail of the aptamer to form pseudoknot shape. This may explain why part of the pseudoknot structure can still maintain in ligand-free state. By contrast, the L2 region of Bsu preQ 1 riboswitch is 6-nt long, which has two more nucleotides than that in Tte preQ 1 riboswitch aptamer. The extra-long length of the L2 region increases its flexibility and makes the nucleotides in L2 region difficult to turn back into the binding pocket to form effective interactions with the tail part in the absence of the ligand.
Such similarity and difference between the two aptamers in unfolding or folding behaviors may be related to their different functions. In the presence of ligands both Bsu and Tte preQ 1 riboswitch aptamers form pseudoknot structures due to ligand binding and make their express platforms switch to ''OFF'' conformations to stop transcription or translation (Figure 1). For Bsu preQ 1 riboswitch aptamer, the formation of the pseudoknot structure makes the express platform form a terminator that halts transcription. For Tte preQ 1 riboswitch aptamer, the formation of the pseudoknot structure makes the two 59-end nucleotides AG of ribosome binding sequence closed in P2 helix and avoid exposure to ribosome. In the absence of the ligands, both Bsu and Tte preQ 1 riboswitch aptamers should form structures that make their express platforms switch to ''ON'' conformations to allow transcription or translation to proceed. For Bsu preQ 1 riboswitch aptamer, it needs forming an anti-terminator and this requires the four nucleotides CUAA at the 39 end of the aptamer do not form pseudoknot base pairs with the hairpin P1 as in the ligand-bound structure (Figure 1). This just corresponds to the stable state I2 Bsu at the normal temperature in the absence of the ligand. This may be why the Bsu preQ 1 riboswitch aptamer needs a long L2 loop that makes it difficult to form interactions with the tail part and forbid the formation of the pseudoknot in the absent of ligand. For Tte preQ 1 riboswitch aptamer, in order to allow the translation to proceed, it only needs the two 59-end nucleotides AG of ribosome binding sequence apart from the hairpin P1 and the nucleotide C30 can still form pseudoknot base pair (Figure 1). This is just the stable state I2 Tte at the normal temperature in the absence of the ligand. This indicates that the kinetic behaviors of the two aptamers can satisfy the requirement of their functions.

Structural Analysis
The preQ 1 riboswitch aptamer domain adopts an H-type pseudoknot with two stems (P1 and P2) and three loops (L1, L2 and L3) in the ligand-bound state. Figure 2 plot the secondary structure and tertiary structure of preQ 1 riboswitch. The overall structures of these two riboswitches are very similar, with backbone RMSD of 4.0 Å . It shows that the stem P2 stacks over the P1 and L1 and the ligand preQ 1 locates in a closed binding pocket around by several nucleotides from different part of aptamer. Previous studies [9,10] showed that in the Bsu preQ 1 riboswitch the P2 helix forms only when the ligand preQ 1 binds to the aptamer while in the Tte preQ 1 riboswitch the pseudoknot P2 can form in both ligand-bound and ligand-free states and they adopts the similar H-type pseudoknot [11]. Furthermore, in Bsu riboswitch the adenine-tract (A-tract) L3 of the tail form a stretch of ''A-amino kissing motif'' [24] contacting in the minor groove of the P1 stem and the L2 loop is unusual 6-nt long which lies in the minor groove of P2 [9], while in the Tte preQ 1 riboswitch the L3 region has much less contacts with P1 and the L2 loop have only 4-nt long.

Simulations Strategy
All minimization and MD simulations are carried out using the sander program in AMBER 11 software package [25]. Previous studies have shown that the Amber ff98 as well as modified ff99 force fields can stabilize the RNA structures during the GB and explicit solvation simulations [26][27][28][29][30]. Here, we choose the appropriate Amber ff98 force field and HCT model (Tsui-Case parameters) [31] GB/SA implicit solvation model for the simulations. We simulated two sets of unfolding trajectories of the Bsu preQ 1 riboswitch aptamer starting from the NMR structure (the first structure of PDB 2L1V) with and without the ligand preQ 1 , respectively. For the Tte preQ 1 riboswitch, we also simulated two sets of unfolding trajectories starting from the bound-state (PDB ID: 3Q50) and free-state (PDB ID: 3Q51) experimental structures, respectively. All of the structures are simulated at the temperatures 300 K and 400 K. The canonical (constant T) ensemble is chosen for the simulation and the Langevin thermostat is used to control the temperature using a collision frequency of 1.0/ps [32]. The effective ion concentration is 0.2 M, and the electrostatic interactions is treated as default Amber parameters for GB simulation. The non-bond cutoff is 10 Å , and the time step is 1fs. Before we perform MD simulation, the entire system is first minimized for 3000 steps by the steepest descent method and then 6000 steps by the conjugate gradient method. Simulations at each temperature contain ten 10nstrajectories with different initial atomic velocities. The total simulation time is up to 1.6 ms.

Trajectory and Structure Analysis
We analyzed the trajectories using the Ptraj program in AMBER 11 for calculating the root mean square deviation (RMSD) as well as the native contacts. All the RMSD values are calculated against the experimental structures by using all heavy atoms. We employ the fraction of native contacts (NC for short below) as a measure of secondary and tertiary structure formation during unfolding process. We define the native contact using the distance cutoff 7 Å and calculate the fraction of native contacts by averaging all ten simulation trajectories. Such representation of native contact contains the base-pairing interactions as well as base stacking between neighboring bases. We give the corresponding smoothed curves of native contacts to better visualize their evolutions. The free energy landscape is calculated using the algorithm proposed by Pande et al. [16], in which the free energy is defined as {k B T ln NR=NT ð Þ , where NR is the number of structures in each region and NT is the total number of the structures. Although our simulated data are taken from the nonequilibrium unfolding trajectories, the free energy landscape also provides us the mostly populated structures, the unfolding intermediates, and possible unfolding pathway. The trajectory visualization and figures are generated using VMD [33], UCSF Chimera packages [34] and the PyMOL Molecular Graphics System, Version 1.3, Schrödinger, LLC.
To determine the unfolding temperature of our simulation system, we first did short thermodynamic simulations in which the aptamer was simulated over a wide temperature range (280 K to 440 K) to capture the RNA unfolding transition. The heat capacity C V is calculated to monitor aptamer folding and unfolding, following the equation where s E is the energy fluctuation at a given temperature T and k B is the Boltzmann constant. We found that the melting temperature for Bsu preQ 1 riboswitch is 380 K in both presence and absence of the ligand. The Tte preQ 1 riboswitch has a little higher melting temperature in both states (390 K), which indicates that the Tte preQ 1 riboswitch is more stable during the unfolding progress (see Figures S5 and S6). Therefore, we mainly focus on the simulations of unfolding processes at 400 K, which is slightly higher than the melting temperature and enables the RNA molecules unfold within a practically feasible time.

Conclusion
In summary, we have performed 400 K all-atom unfolding molecular dynamics simulations of the unfolding of Bsu and Tte preQ 1 riboswitch aptamers with and without ligands. The results indicate that the global behaviors of their unfolding are very similar and along a single directional pathway from the 39 to 59 end. Our results also show that there exist two intermediate states during the unfolding progress in the presence of ligand. In the presence of ligand the experimental ligand-bound structures are their stable states and also the functional ''OFF'' states. In the absence of ligands their stable states are not the experimental ligand-free structures (or the ligand-bound one by moving the ligand) but just one of the intermediate states in the presence of ligands. These ligand-free stable states seem to be the functional ''ON'' state. Our unfolding studies suggest that the regulation mechanisms of Bsu and Tte preQ 1 riboswitch may be similar: the aptamers will form the functional ''OFF'' or ''ON'' states depending on ligand binding or not after the formation of the triplex P1-L3. Although the ligand-free state of the Tte preQ 1 riboswitch aptamer has the same structure as its ligand-bound state, it also regulates the translational process through kinetic control as the Bsu preQ 1 riboswitch. These results will also help the understanding of regulation mechanism of different preQ 1 riboswitches as well as other types of riboswitches. Figure S1 Dynamics of the two types of preQ1 riboswitch aptamer domain with (top) and without (bottom) ligand at 300 K. The different curves describe the time evolution of the average fraction of native contacts of the aptamer and its various segments during the unfolding simulations. (TIF) Figure S2 The two-dimensional free energy landscape of the two types of preQ1 aptamer domain with (left) and without (right) the ligand at 300 K. The order parameters are the fractions of native contacts for P1-L3 and P2-L1-L2, respectively. (TIF) Figure S3 The variation of RMSD as well as the distance among nucleotides in pseudoknot of the two types of preQ1 aptamer domain with (left) and without (right) the ligand at 300 K. (TIF) Figure S4 The variation of RMSD as well as the distance among nucleotides in pseudoknot of the two types of preQ1 aptamer domain with (left) and without (right) the ligand at 400 K. Only the first 2 ns are plotted in order to view more clearly.