The assessment of molecular dynamics results of three-dimensional RNA aptamer structure prediction

Abstract

Aptamers are single-stranded DNA or RNA molecules that bind to specific targets such as proteins, and thus share characteristics with antibodies. They can be synthesized at a lower cost and with no batch-to-batch variation, and are easier to modify chemically than antibodies, which makes them promising therapeutic and biosensing agents. RNA aptamers are currently identified in vitro using the SELEX method, which is considered inefficient because of its complex process. Computational models of aptamers have been used to predict and study the molecular interactions of modified aptamers in order to improve affinity. In this study, we generated three-dimensional models of five RNA aptamers from their sequences using mFold, the RNAComposer web server, and molecular dynamics (MD) simulation. The model structures were then evaluated and compared with the experimentally determined structures. This study showed that the combination of mFold, RNAComposer, and MD simulation could generate 3D models of RNA aptamers 14–16, 28, or 29 nucleotides long with geometry and topology similar to those of the experimentally determined structures. The non-canonical base-pair structure of the aptamer loop formed during the MD simulation, which also improved the three-dimensional RNA aptamer models. Clustering analysis is recommended for choosing a more representative model.

Introduction

Aptamers are single-stranded DNA or RNA molecules that bind to specific targets such as proteins or other biomolecules with high affinity. The 3D architecture of an aptamer influences its binding affinity, similar to the characteristics of monoclonal antibodies [1–3]. Compared to monoclonal antibodies, aptamers are less immunogenic, easier to mass-produce at lower cost, and have lower batch-to-batch variation [4]. In addition, aptamers have a wider spectrum of target ligands than monoclonal antibodies, including ions and organic dyes. Therefore, aptamers have wide applications as biosensors as well as therapeutic agents. Another advantage of aptamers is that development takes around 2–8 weeks, much shorter than for antibodies, which take more than 6 months [4].

In 1990, a selection process for aptamers named Systematic Evolution of Ligands by Exponential Enrichment (SELEX) was developed, which remains the gold standard for aptamer selection [5, 6]. In this method, aptamers are selected in vitro from a library that can reach 10^14 to 10^15 variants. The method is highly iterative because it involves several cycles of exposing the oligonucleotides to the target and modifying the aptamer in vitro. Therefore, it is considered inefficient because it requires considerable time and effort. A great deal of research has been carried out to develop innovative techniques to produce aptamers with consistently high performance, efficiency, resource savings, and a greater chance of success [7, 8]. One way to improve the efficiency of the aptamer selection and analysis steps is through bioinformatics. In principle, this approach performs the selection process on a virtual library of oligonucleotide candidates (10^14 to 10^15 variants). Each candidate is selected based on its predicted affinity for the target obtained in silico. Such methods have been developed by Chushak and Stone and by Shcherbinin et al. [2, 9]. Several platforms for three-dimensional RNA structure prediction have been developed. RNAComposer, JAR3D, and iFoldRNA are web-based servers, while platforms that require local installation include Rosetta, RMdetect, and RAGTOP [10]. Among those platforms, RNAComposer offers several advantages for generating 3D RNA structures, including a user-friendly interface, high accuracy with an average RMSD of 1.7 Å, fast computation, the ability to handle large RNAs, the ability to incorporate experimental data, and availability as a web-based application for easier access.

In this study, we used the RNAComposer web server to predict the structures of RNA aptamers whose experimentally determined structures are available in the Protein Data Bank (PDB) database. The framework includes two-dimensional structure prediction using mFold, 3D structure generation using RNAComposer, and structure refinement using molecular dynamics (MD) simulation. We also used publicly available software for structure evaluation, analysis, and visualization, including RNApdbee and UCSF Chimera. The MD simulation improved the structural prediction of the RNA aptamers.

Methods

RNA aptamer search and three-dimensional structure prediction

Experimentally determined three-dimensional structures of RNA aptamers were searched in the PDB database using “Aptamer” as the keyword. In this study, we focused specifically on single RNA aptamers. The exclusion criteria were structures with ligands, proteins, or modified nucleotides, quadruplex, triplex, or duplex structures, and structures belonging to the DNA aptamer class. QGRS Mapper was used to predict the presence of quadruplex-forming G-rich sequences in the aptamer sequences. The RNA aptamers that fulfilled the inclusion and exclusion criteria were used to generate the 3D structure models. The mFold web server was used to predict the secondary structure of each RNA aptamer. The dot-bracket result from mFold and the corresponding sequence were used as the input for 3D structure generation using RNAComposer.
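To make the dot-bracket input concrete, the following is a minimal Python sketch of how a sequence and its dot-bracket string can be checked and converted into a list of base pairs before submission. The function name and the 14-nucleotide hairpin are hypothetical illustrations and are not taken from Table 1; only simple nested structures written with `(`, `)`, and `.` are handled.

```python
def dot_bracket_pairs(sequence, structure):
    """Return the base pairs encoded by a dot-bracket string.

    Positions are 1-based. Only '(', ')' and '.' are accepted, so
    pseudoknot symbols such as '[' and ']' are rejected (a simplification
    made for this sketch).
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    open_positions = []  # stack of positions holding an unmatched '('
    pairs = []
    for position, symbol in enumerate(structure, start=1):
        if symbol == "(":
            open_positions.append(position)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {position}")
            pairs.append((open_positions.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {position}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# Hypothetical 14-nt hairpin: a 4-bp stem closing a 6-nt loop.
example_sequence = "GGCGAAAAAACGCC"
example_structure = "((((......))))"
for i, j in dot_bracket_pairs(example_sequence, example_structure):
    print(f"{example_sequence[i - 1]}{i}:{example_sequence[j - 1]}{j}")
```

A check of this kind catches length mismatches and unbalanced brackets early. Note that plain dot-bracket notation of this form carries no information about non-canonical pairing geometry.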

Molecular dynamics simulation

MD simulations were performed using GROMACS [11] with the CHARMM36 force field [12]. Each system consisted of an aptamer solvated in TIP3P water and was neutralized by adding the appropriate number of ions (sodium or chloride). Periodic boundary conditions (PBCs) were applied to the system in all spatial directions. The LINCS algorithm was used to constrain all bonds involving hydrogen atoms. A 1.2 nm distance cutoff was used for the short-range electrostatic and van der Waals interactions. The Particle Mesh Ewald (PME) algorithm was used to calculate the long-range electrostatic forces. The steepest descent algorithm was used to minimize the energy of the system. The system was then equilibrated in the NVT ensemble using the V-rescale thermostat at 300 K, followed by the NPT ensemble using the Parrinello-Rahman barostat at 1 atm. The production simulation was run for 100 ns.
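As an illustration of how the settings described above map onto a GROMACS parameter (`.mdp`) file for the production run, the short Python sketch below assembles such a file as text. The 2 fs time step, the coupling time constants, and the compressibility are assumptions, since the text does not report them. The constraint setting assumes that the LINCS constraints apply to bonds involving hydrogen atoms. The function names are hypothetical.

```python
SIMULATION_NS = 100    # production length stated in the text
TIME_STEP_PS = 0.002   # ASSUMPTION: 2 fs time step, not reported in the text


def steps_for(simulation_ns, time_step_ps):
    """Number of integration steps needed to cover simulation_ns nanoseconds."""
    return round(simulation_ns * 1000 / time_step_ps)


def production_mdp(simulation_ns=SIMULATION_NS, time_step_ps=TIME_STEP_PS):
    """Return .mdp text for an NPT production run under the stated assumptions."""
    params = {
        "integrator": "md",
        "dt": time_step_ps,
        "nsteps": steps_for(simulation_ns, time_step_ps),
        "pbc": "xyz",                      # periodic in all spatial directions
        "constraints": "h-bonds",          # ASSUMPTION: bonds involving hydrogen
        "constraint-algorithm": "lincs",
        "coulombtype": "PME",              # long-range electrostatics
        "rcoulomb": 1.2,                   # nm, short-range electrostatic cutoff
        "rvdw": 1.2,                       # nm, van der Waals cutoff
        "tcoupl": "V-rescale",
        "tc-grps": "System",               # ASSUMPTION: single coupling group
        "tau-t": 0.1,                      # ps, ASSUMPTION
        "ref-t": 300,                      # K
        "pcoupl": "Parrinello-Rahman",
        "tau-p": 2.0,                      # ps, ASSUMPTION
        "ref-p": 1.01325,                  # bar, equal to 1 atm
        "compressibility": 4.5e-5,         # bar^-1, ASSUMPTION (value for water)
    }
    return "\n".join(f"{key:<22} = {value}" for key, value in params.items())


print(production_mdp())
```

Under the assumed 2 fs time step, 100 ns corresponds to 50,000,000 integration steps. Energy minimization and the NVT and NPT equilibration stages would each need their own parameter files, which are not sketched here.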

Results and discussion

Three-dimensional structure prediction

Our search for RNA aptamers using “Aptamer” as the keyword in the PDB database returned 351 structures, of which five aptamers matched our inclusion and exclusion criteria. The aptamers are listed in Table 1, and each aptamer has a different target protein. The aptamers were 14, 15, 16, 28, or 29 nucleotides long. Each aptamer has one or more non-canonical base pairs at the end of the internal loop. All structures were solved by NMR spectroscopy. The aptamers were derived or obtained from modified structures of in vitro selection (SELEX), except for 2EVY, which is a fragment of the poliovirus 5′-cloverleaf ssRNA [3, 13–16]. The aptamer sequences in Table 1 were used to construct the 3D models using the mFold web server and RNAComposer. For a prediction to be considered successful, the constructed 3D RNA aptamer model must be geometrically and topologically as close as possible to the experimentally determined structure. It was assumed that the crystal or NMR structure is correct within the limitations of the experimental method. The accuracy of each model was assessed by aligning the predicted structure with the corresponding ssRNA aptamer structure downloaded from the PDB database. The degree of similarity was measured as the Root Mean Square Deviation (RMSD) of the whole aptamer between each pair of structures [17]. Fig 1 shows the overlay of each 3D structure predicted using mFold and RNAComposer with the corresponding NMR structure from the PDB database, including the RMSD values. RNAComposer uses the machine translation principle and operates on RNA FRABASE, a database with a search engine for three-dimensional fragments within 3D RNA structures, using the sequence and the dot-bracket notation as input [18, 19]. The RMSD calculation, visualization, and hydrogen bond analysis were performed using UCSF Chimera.
The RMSD values of the predicted structures compared to the corresponding references ranged from 2.19 Å (PDB ID: 1XWP) for the shortest sequences to 12.942 Å (PDB ID: 2LUN), with an average value of 6.292 Å. Overall, the 3D aptamer models were similar to their corresponding reference structures, except for 2LUN, which had an RMSD value of 12.942 Å. Hereafter, 3D structures predicted by RNAComposer are named (PDB-ID)-RC (for example, 1XWP-RC), while structures obtained from RCSB are named (PDB-ID)-Ref (for instance, 1XWP-Ref). Atomistic MD simulations were conducted to improve and refine the structural predictions for the five aptamers.
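For readers unfamiliar with the metric, the RMSD between two structures with N matched atoms is the square root of the mean squared distance between corresponding atoms. The Python sketch below computes it for coordinates that have already been superposed. It does not perform the alignment step itself, which in this study was done in UCSF Chimera. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    The two sets must already be superposed and listed in the same atom
    order; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example (not real aptamer data): every atom is displaced by 1 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, reference))
```

Because the squared distances are averaged over all atoms, a single badly placed region such as a misfolded loop can dominate the value for a short aptamer.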

Fig 1. Alignment of 3D predicted structure (colored blue) and the corresponding experimentally solved structures from PDB database (colored red) for the 5 ssRNA aptamers used in this study.

Each structure was labeled by its PDB ID, nucleotides (nt) number, and the calculated RMSD values (in Angstrom).

https://doi.org/10.1371/journal.pone.0288684.g001

Table 1. The results for the selection of ssRNA candidates from Protein Data Bank database.

https://doi.org/10.1371/journal.pone.0288684.t001

Three-dimensional structure refinement using MD simulations

We performed MD simulations using GROMACS to improve the structural predictions for the five aptamers and to study the dynamics of the systems. We conducted individual MD simulations of the five aptamer models and the five reference structures in a water solution model neutralized by sodium ions. As a representative example, the starting configuration of the 1XWP simulation, with 3142 water molecules and 14 sodium ions, is illustrated in Fig 2.

Fig 2. Starting configuration of the MD simulation of the 1XWP model.

The aptamer molecule (colored blue) was solvated in water (colored red) and the system was neutralized with 14 sodium ions (colored yellow).

https://doi.org/10.1371/journal.pone.0288684.g002

The RMSD of the sugar-phosphate backbone was calculated to explore the deviation of each structure with respect to its first frame (Fig 3). The RMSD provides information on the conformational flexibility and structural deviation of each RNA aptamer model or reference structure. Fig 3 shows that the predicted and reference structures had similar time-dependent behavior, except for 2EVY. During the 100 ns MD simulation, all aptamer models had RMSD values similar to those of the corresponding reference structures, except for 2EVY, which started to fluctuate at 80 ns. Therefore, we extended the 2EVY simulation to 200 ns for further investigation (Fig 3F). 2EVY was stable for the last 50 ns of the 200 ns MD simulation. Snapshots were taken at 0 ns, 20 ns, 40 ns, 60 ns, 80 ns, and 100 ns to visually analyze the conformational change of each aptamer model compared to the reference structure during the MD simulation. Fig 4 shows that all aptamer models evolved during the 100 ns MD simulation, particularly 2EVY, which was characterized by a higher RMSD value and the opening of the hairpin loop at 100 ns.

Fig 3.

(A-E) RMSD as a function of time for the aptamer structures with respect to the first structure (time 0). (F) RMSD of 2EVY-RC after the simulation was extended to 200 ns. The RMSD levels off after 170 ns.

https://doi.org/10.1371/journal.pone.0288684.g003

Fig 4. The overlays of 3D predicted structure (colored blue) and the corresponding experimentally solved structures from the PDB database (colored red) for the five aptamers at 0, 20, 40, 60, 80, and 100 ns during the MD simulation.

https://doi.org/10.1371/journal.pone.0288684.g004

Visual inspection of the 2EVY structure in Fig 4 showed that the RMSD fluctuation observed in 2EVY at 80–150 ns (Fig 3F) was due to the opening of the hairpin loop structure. By the end of the 200 ns simulation, the 2EVY structure had become more similar to the reference (Fig A1 in S1 Appendix), with an RMSD of 6.801 Å. In Fig 3, the RMSD of 2EVY-RC began to fluctuate towards the end of the 100 ns simulation, indicating that a longer simulation was required to stabilize the structure. The other aptamers had stable structures, as shown by their RMSD values.

To quantify how the aptamer models differed from the experimentally determined structures during the 100 ns simulation, we calculated the RMSD of each aptamer model with respect to its reference structure at several time points (Table 2). Based on Table 2, 1XWP became more similar to its reference structure after the 100 ns MD simulation, while 1XWU, 2LUN, and 2JWV had a deviation under 2 Å at each time point. The deviation of 2EVY became higher after 100 ns due to the opening of the hairpin loop. We extracted the representative structure of each aptamer from the 100 ns MD simulation for further investigation.

Table 2. Summary of the RMSD values of the aptamer models relative to their corresponding structures from the PDB database at different time points throughout the MD simulation.

https://doi.org/10.1371/journal.pone.0288684.t002

GROMACS can calculate a representative structure for the MD simulation period. All frames from each 100 ns MD simulation, 10001 structures per simulation, were clustered. For both the aptamer model and the reference simulations, the cluster with the most members was analyzed and compared. The RMSD between the representative structures of these clusters was calculated using UCSF Chimera and is summarized and illustrated in Fig 5.
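The idea of picking a representative structure from the largest cluster can be sketched as follows. The text does not state which clustering method or RMSD cutoff was used, so this Python example assumes a GROMOS-style neighbor-count scheme, one of the methods offered by the GROMACS `gmx cluster` tool. The function name, the cutoff, and the 4-frame distance matrix are hypothetical.

```python
def gromos_clusters(rmsd_matrix, cutoff):
    """Cluster frames from a symmetric pairwise RMSD matrix.

    Repeatedly picks the frame with the most neighbors within `cutoff`
    as a cluster center, removes it together with its neighbors, and
    continues until no frames remain. Returns a list of
    (center_index, member_indices) tuples, largest cluster first.
    """
    remaining = set(range(len(rmsd_matrix)))
    clusters = []
    while remaining:
        # Neighbor lists among the frames not yet assigned (each frame
        # counts as its own neighbor because its self-distance is zero).
        neighbors = {
            i: [j for j in remaining if rmsd_matrix[i][j] <= cutoff]
            for i in remaining
        }
        # Ties are resolved in favor of the lowest frame index.
        center = max(sorted(remaining), key=lambda i: len(neighbors[i]))
        members = sorted(neighbors[center])
        clusters.append((center, members))
        remaining -= set(members)
    return clusters


# Toy matrix (not simulation data): frames 0-2 are mutually close, frame 3 is an outlier.
toy_matrix = [
    [0.0, 0.1, 0.1, 1.0],
    [0.1, 0.0, 0.1, 1.0],
    [0.1, 0.1, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]
center, members = gromos_clusters(toy_matrix, cutoff=0.2)[0]
print(center, members)
```

The center of the first cluster returned is the frame that would serve as the representative structure. For 10001 frames the full pairwise matrix is large, so a production analysis would normally rely on the optimized implementation in the simulation package rather than pure Python.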

Fig 5. Comparison of the representative structure of aptamer model (RC-Cl, light blue) with their corresponding structure before simulation (RC, blue), after 100ns simulation (RC-last, dark green), representative of reference structure (Ref-Cl, light green), crystal structure (Ref, red), and structure of reference structure after 100ns simulation (Ref-last, orange).

The RMSD of each alignment was written at the bottom of the structure (in Angstrom).

https://doi.org/10.1371/journal.pone.0288684.g005

The RMSD distribution plot (Fig A2 in S1 Appendix) showed that all aptamer models had RMSD distributions similar to those of the references during the 100 ns MD simulation, except for 2EVY. The representative (major-cluster) structure of each aptamer model was compared with the model before simulation, the model after 100 ns of simulation, the experimentally determined structure, the representative structure of the reference simulation, and the reference structure after 100 ns of simulation. The results show that the simulation improved the predicted structures and that most conformations of the predicted and reference structures were similar during the simulation. Compared with the final conformations, the major-cluster structures had lower RMSD values relative to both the reference structure and the major cluster of the reference simulation. The same holds when they are compared with the RNAComposer-derived (RC) structures. Therefore, structures from the major clusters can be used to represent the predicted structures.

For further analysis, we compared the secondary structures of the conformations derived from the clusters, the reference structures, and RNAComposer. The secondary structures were generated using the web-based software RNApdbee (http://rnapdbee.cs.put.poznan.pl/). RNApdbee derives secondary structure topology from the tertiary structure of RNA and/or from a list of base pairs. The process is based on an algorithm that iteratively unknots the RNA structure and saves partial information about the knotting order, then merges the intermediate results and encodes the RNA topology [20, 21]. The secondary structures of the major clusters from the reference and model simulations were compared and are summarized in Fig A3 in S1 Appendix. All five aptamers contain non-canonical base pairs, which were also described in the original publication of each structure. For 1XWU, 1XWP, and 2EVY, the non-canonical base pairs were in the loop, and all of them were found in the aptamer models except for A5:A11 in 1XWU. In this study, the initial structures of 2EVY-RC and 2EVY-Ref differed in the conformation of the A9:G6 base pair, which resulted in different initial loop conformations (Fig A4 in S1 Appendix). The A9:G6 base pair had a trans sugar/Hoogsteen geometry in 2EVY-RC but a cis sugar/Hoogsteen geometry in 2EVY-Ref (Fig A3 in S1 Appendix). Both trans and cis sugar/Hoogsteen pairs are commonly found in nucleotide structures [22–24]. At 60 ns of the 100 ns MD simulation, we observed a distinct steric difference at A9 between the predicted and reference structures, which affected the neighboring U5:G10 wobble base pair. The difference in loop conformation (bases 5–10) consequently affected base-pair formation in the stem (bases 2–4 and 11–13). Based on visual inspection of the simulation results, the difference in stem conformation was accompanied by a difference in the persistence of the syn glycosidic bond angle at C14, which was first observed at 60 ns (Fig A4(B) in S1 Appendix).
The syn conformation at C14 persisted throughout the 2EVY-RC simulation, and C14 did not form a base pair after 70 ns. Meanwhile, C14 in 2EVY-Ref formed a base pair with G1, resulting in an intact stem structure after 70 ns (Fig A4(C) in S1 Appendix). Therefore, although the non-canonical base pair (A9:G6) was predicted in 2EVY-RC, the steric structure of the aptamer loop differed from that of 2EVY-Ref, which then affected the overall conformational changes during the simulation (Fig A4(D) in S1 Appendix). Neither of the two non-canonical base pairs in 2LUN was predicted in 2LUN-RC, while 2JWV-RC predicted 2 out of 3 non-canonical base pairs. This might explain why 2JWV-RC was more similar to its reference structure than 2LUN-RC was.

During the simulation, both 2LUN-RC-Cl and 2LUN-Ref-Cl formed a non-canonical base pair in the end loop, resulting in a loop structure similar to that of the reference structure. Hence, the RMSD of 2LUN-RC-Cl relative to 2LUN-Ref-Cl is lower than that of 2LUN-RC. At the beginning of the simulation, the RMSD values of 2JWV-RC and 2JWV-Ref indicated similarity, but the loops were notably different on visual inspection. The comparison of 2JWV-RC-Cl and 2JWV-Ref-Cl showed that both the internal and end loops were visually similar, which was reflected in the improved RMSD value. Hence, it is essential to improve the precision of loop and non-canonical base pair prediction when generating 3D structures, because new non-canonical base pairs are not always observed in short MD simulations. Inaccuracies in the three-dimensional structure of the loop and the non-canonical base pairs could cause instabilities to be overlooked during the simulation. The accuracy of computational 3D structure prediction for RNA molecules may be affected by various factors, such as the length of the RNA sequence and the presence of non-canonical base pairs. Longer RNA sequences have more nucleotides, which can increase the complexity of the structure and the likelihood of errors in the prediction. Furthermore, some RNA molecules contain non-canonical base pairs, which are formed by interactions between nucleotides that do not pair according to the Watson-Crick base pairing rules. Non-canonical base pairs can be more difficult to predict and may not be handled by standard computational tools.
To improve the accuracy of 3D structure prediction for RNA aptamers, we suggest incorporating non-canonical base pair prediction into the computational process.

Conclusions

We predicted the three-dimensional structures of five RNA aptamers using mFold, RNAComposer, and MD simulation, and the resulting models were geometrically similar to the experimentally determined structures available in the Protein Data Bank database. A limitation of mFold and RNAComposer was their inability to predict all of the non-canonical base pairs present in the aptamer structures. This study showed that atomistic MD simulation can be used to improve a predicted aptamer model by extracting the representative structure of the trajectory. This approach worked for the five RNA aptamers studied, which contain stem-loop and internal loop motifs and are 14–16, 28, or 29 nucleotides long.

References

  1. Carothers J., Oestreich S. & Szostak J. Aptamers selected for higher-affinity binding are not more specific for the target ligand. Journal Of The American Chemical Society. 128, 7929–7937 (2006) pmid:16771507
  2. Chushak Y. & Stone M. In silico selection of RNA aptamers. Nucleic Acids Research. 37, e87–e87 (2009) pmid:19465396
  3. Sakamoto T., Oguro A., Kawai G., Ohtsu T. & Nakamura Y. NMR structures of double loops of an RNA aptamer against mammalian initiation factor 4A. Nucleic Acids Research. 33, 745–754 (2005) pmid:15687383
  4. Ni S., Zhuo Z., Pan Y., Yu Y., Li F., Liu J., et al. Recent progress in aptamer discoveries and modifications for therapeutic applications. ACS Applied Materials & Interfaces. 13, 9500–9519 (2020) pmid:32603135
  5. Ellington A. & Szostak J. In vitro selection of RNA molecules that bind specific ligands. Nature. 346, 818–822 (1990) pmid:1697402
  6. Tuerk C. & Gold L. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science. 249, 505–510 (1990) pmid:2200121
  7. Ali M., Elsherbiny M. & Emara M. Updates on aptamer research. International Journal Of Molecular Sciences. 20, 2511 (2019) pmid:31117311
  8. Thiviyanathan V. & Gorenstein D. Aptamers and the next generation of diagnostic reagents. PROTEOMICS–Clinical Applications. 6, 563–573 (2012) pmid:23090891
  9. Shcherbinin D., Gnedenko O., Khmeleva S., Usanov S., Gilep A., Yantsevich A., et al. Computer-aided design of aptamers for cytochrome p450. Journal Of Structural Biology. 191, 112–119 (2015) pmid:26166326
  10. Gong S., Wang Y., Wang Z. & Zhang W. Computational methods for modeling aptamers and designing riboswitches. International Journal Of Molecular Sciences. 18, 2442 (2017) pmid:29149090
  11. Lindahl E., Abraham M., Hess B. & Van Der Spoel D. GROMACS 2021.4 Manual. GROMACS Development Team: Stockholm, Sweden. (2021)
  12. Huang J. & MacKerell A Jr. CHARMM36 all-atom additive protein force field: Validation based on comparison to NMR data. Journal Of Computational Chemistry. 34, 2135–2145 (2013) pmid:23832629
  13. Davlieva M., Donarski J., Wang J., Shamoo Y. & Nikonowicz E. Structure analysis of free and bound states of an RNA aptamer against ribosomal protein S8 from Bacillus anthracis. Nucleic Acids Research. 42, 10795–10808 (2014) pmid:25140011
  14. Lebruska L. & Maher L. Selection and characterization of an RNA decoy for transcription factor NF-κB. Biochemistry. 38, 3168–3174 (1999) pmid:10074372
  15. Melchers W., Zoll J., Tessari M., Bakhmutov D., Gmyl A., Agol V. & Heus H. A GCUA tetranucleotide loop found in the poliovirus oriL by in vivo SELEX (un) expectedly forms a YNMG-like structure: extending the YNMG family with GYYA. RNA. 12, 1671–1682 (2006) pmid:16894217
  16. Reiter N., Maher L III. & Butcher S. DNA mimicry by a high-affinity anti-NF-κB RNA aptamer. Nucleic Acids Research. 36, 1227–1236 (2008) pmid:18160411
  17. Cruz J., Blanchet M., Boniecki M., Bujnicki J., Chen S., Cao S., et al. RNA-Puzzles: a CASP-like evaluation of RNA three-dimensional structure prediction. RNA. 18, 610–625 (2012) pmid:22361291
  18. Antczak M., Popenda M., Zok T., Sarzynska J., Ratajczak T., Tomczyk K., et al. New functionality of RNAComposer: an application to shape the axis of miR160 precursor structure. Acta Biochimica Polonica. 63, 737–744 (2016) pmid:27741327
  19. Popenda M., Szachniuk M., Antczak M., Purzycka K., Lukasiak P., Bartol N., et al. Automated 3D structure composition for large RNAs. Nucleic Acids Research. 40, e112–e112 (2012) pmid:22539264
  20. Antczak M., Zok T., Popenda M., Lukasiak P., Adamiak R., Blazewicz J., et al. RNApdbee – a webserver to derive secondary structures from pdb files of knotted and unknotted RNAs. Nucleic Acids Research. 42, W368–W372 (2014) pmid:24771339
  21. Zok T., Antczak M., Zurkowski M., Popenda M., Blazewicz J., Adamiak R., et al. RNApdbee 2.0: multifunctional tool for RNA structure annotation. Nucleic Acids Research. 46, W30–W35 (2018) pmid:29718468
  22. Sharma P., Chawla M., Sharma S. & Mitra A. On the role of Hoogsteen: Hoogsteen interactions in RNA: Ab initio investigations of structures and energies. RNA. 16, 942–957 (2010) pmid:20354152
  23. Mládek A., Sharma P., Mitra A., Bhattacharyya D., Šponer J. & Šponer J. Trans Hoogsteen/sugar edge base pairing in RNA. Structures, energies, and stabilities from quantum chemical calculations. The Journal Of Physical Chemistry B. 113, 1743–1755 (2009) pmid:19152254
  24. Sharma P., Sponer J., Sponer J., Sharma S., Bhattacharyya D. & Mitra A. On the Role of the cis Hoogsteen: Sugar-Edge Family of Base Pairs in Platforms and Triplets-Quantum Chemical Insights into RNA Structural Biology. The Journal Of Physical Chemistry B. 114, 3307–3320 (2010) pmid:20163171