CHSalign: A Web Server That Builds upon Junction-Explorer and RNAJAG for Pairwise Alignment of RNA Secondary Structures with Coaxial Helical Stacking

Lei Hua; Yang Song; Namhee Kim; Christian Laing; Jason T. L. Wang; Tamar Schlick

doi:10.1371/journal.pone.0147097

Abstract

RNA junctions are important structural elements of RNA molecules. They are formed when three or more helices come together in three-dimensional space. Recent studies have focused on the annotation and prediction of coaxial helical stacking (CHS) motifs within junctions. Here we exploit such predictions to develop an efficient alignment tool to handle RNA secondary structures with CHS motifs. Specifically, we build upon our Junction-Explorer software for predicting coaxial stacking and RNAJAG for modelling junction topologies as tree graphs to incorporate constrained tree matching and dynamic programming algorithms into a new method, called CHSalign, for aligning the secondary structures of RNA molecules containing CHS motifs. Thus, CHSalign is intended to be an efficient alignment tool for RNAs containing similar junctions. Experimental results based on thousands of alignments demonstrate that CHSalign can align two RNA secondary structures containing CHS motifs more accurately than other RNA secondary structure alignment tools. CHSalign yields a high score when aligning two RNA secondary structures with similar CHS motifs or helical arrangement patterns, and a low score otherwise. This new method has been implemented in a web server, and the program is also made freely available, at http://bioinformatics.njit.edu/CHSalign/.

Citation: Hua L, Song Y, Kim N, Laing C, Wang JTL, Schlick T (2016) CHSalign: A Web Server That Builds upon Junction-Explorer and RNAJAG for Pairwise Alignment of RNA Secondary Structures with Coaxial Helical Stacking. PLoS ONE 11(1): e0147097. https://doi.org/10.1371/journal.pone.0147097

Editor: Emanuele Paci, University of Leeds, UNITED KINGDOM

Received: July 6, 2015; Accepted: December 29, 2015; Published: January 20, 2016

Copyright: © 2016 Hua et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: This work was supported by National Science Foundation [grant IIS-0707571 to J.W.], and National Institute of General Medical Sciences [grants GM100469 and GM081410 to T.S.]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

RNA secondary structures are composed of double-stranded segments such as helices connected to single-stranded regions such as junctions and hairpin loops. These structural elements serve as building blocks in the design of diverse RNA molecules with various functions in the cell [1–3]. In particular, RNA junctions are important structural elements due to their ability to orient many parts of the RNA molecule [4].

An RNA junction, also known as a multi-branch loop, forms when more than two helical segments are brought together [5–10]. RNA junctions exist in numerous RNA molecules; they play important roles in a wide variety of biochemical activities such as self-cleavage of the hammerhead ribozyme [11], the recognition of the binding pocket domain by purine riboswitches [12] and the translation initiation of the hepatitis C virus at the internal ribosome entry site [13]. Recent studies have classified RNA junctions with three and four branches into three and nine families, respectively [14,15]. Experiments have verified that a three-way junction in Arabidopsis has an important functional role [16]. A junction database, called RNAJunction, has been established, which contains junctions of all known degrees of branching [5].

A common tertiary motif within junctions of an RNA molecule is the coaxial stacking of helices [17–19], which occurs when two separate helical segments are aligned on a common axis to form a pseudocontiguous helix [20]. Coaxial stacking configurations have been observed in all large RNAs for which crystal structures are available, including tRNA, group I and II introns, RNase P, riboswitches and large ribosomal subunits. Coaxial helical stacking (CHS) provides thermodynamic stability to the RNA molecule as a whole [21] and reduces the separation between loop regions within junctions [22]. Moreover, coaxial stacking configurations form cooperatively with long-range interactions in many RNAs [14,17,23], and are therefore crucial for correct tertiary structure formation as well as for the formation of different junction topologies [15,17,24]. Since junctions are major architectural components in RNA, it is important to understand their structural properties. For example, the function of RNA molecules may be inferred if their junction components are similar in structure to other well-studied junction domains.

In this paper we build upon our previously developed Junction-Explorer tool [25] for predicting coaxial stacking and RNAJAG [4] for modelling junction topologies as tree graphs, and present a method, CHSalign, for aligning two RNA secondary (2D) structures that possess CHS motifs within the junctions of the two RNA structures. Coaxial stacking interactions in junctions are part of tertiary (3D) motifs [24]. Thus, CHSalign differs from both RNA 2D and 3D structure alignment tools. Existing secondary (2D) structure alignment tools focus on sequences and base pairs without considering tertiary motifs. Existing tertiary (3D) structure alignment tools accept as input two RNA 3D structures including all types of tertiary motifs in the Protein Data Bank (PDB) [26] and align the 3D structures by considering their geometric properties, torsion angles, and base pairs.

For 3D structure alignment, Ferre et al. [27] developed a dynamic programming algorithm based on nucleotide, dihedral angle, and base pairing similarities. Capriotti and Marti-Renom [28] developed a program to align two RNA 3D structures based on a unit-vector root-mean-square approach. Chang et al. [29] and Wang et al. [30] employed a structural alphabet of different nucleotide conformations to align RNA 3D structures. Hoksza and Svozil [31] developed a pairwise comparison method based on 3D similarity of generalized secondary structure units. Sarver et al. [32] designed the FR3D tool for finding local and composite recurrent structural motifs in RNA 3D structures. Dror et al. [33] described the RNA 3D structure alignment program, ARTS, and its use in the analysis and classification of RNA 3D structures [34]. Rahrig et al. [35] presented the R3D Align tool for performing global pairwise alignment of RNA 3D structures using local superpositions. He et al. [36] developed the RASS web server for comparing RNA 3D structures using both sequence and 3D structure information.

On the other hand, a well-adopted strategy for RNA 2D structure alignment is to use a tree transformation technique and perform RNA alignment through tree matching [37–39]. For instance, RNAforester [39] aligns two RNA 2D structures by calculating the edit-distance between tree structures symbolizing RNAs. By utilizing tree models to capture the structural particularities in RNA, RSmatch [37] aligns two RNA 2D structures effectively. Additional methods are described in [38,40].

In contrast to these methods for aligning two RNAs when their 2D structures are available, another group of closely related methods achieved RNA folding and alignment simultaneously. For instance, FOLDALIGN [41] uses a lightweight energy model and sequence similarity to simultaneously fold and align RNA sequences. Dynalign [42] finds a secondary structure common to two sequences without requiring any sequence identity. DAFS [43] simultaneously aligns and folds RNA sequences based on maximizing the expected accuracy of a predicted common secondary structure of the sequences. Similar techniques are implemented in CentroidAlign [44] and SimulFold [45]. SCARNA [46] employs a method of comparing RNA sequences based on the structural alignment of the fixed-length fragments of the stem candidates in the RNAs.

While many methods have been developed for RNA structure alignment, as surveyed above, few are tailored to junctions, especially junctions with coaxial stacking interactions. Junctions and coaxial stacking patterns are common in many RNA molecules and, as mentioned above, are involved in a wide range of functions. Furthermore, experimental probing techniques, such as RNA SHAPE chemistry, SAXS, NMR, and fluorescence resonance energy transfer (FRET), often provide sufficient information to determine coaxial stacking configurations [2,47–49]. Thus, a junction-tailored tool capable of comparing RNA structures on the basis of coaxial stacking patterns in their junctions could be particularly valuable. To this end, we present CHSalign, which performs RNA alignment by applying a constrained tree matching algorithm and dynamic programming techniques to ordered labeled trees symbolizing RNA structures with coaxial stacking patterns. Experimental results on different data sets demonstrate the effectiveness of this newly developed tool. The CHSalign web server is freely available at http://bioinformatics.njit.edu/CHSalign/.

Materials and Methods

CHSalign accepts as input two RNA 2D structures which contain manually annotated coaxial stacking of helices, and produces as output an alignment between the two input structures. When manually annotated coaxial stacking patterns are not available, CHSalign invokes our previously developed Junction-Explorer tool [25] to predict the coaxial stacking configurations of the input structures.

Our approach is to transform each input RNA 2D structure with coaxial stacking patterns into an ordered labeled tree. Tree graphs are popular models for representing RNA structures [4,23,39,50–52]. We extend RNAJAG [4] to obtain an ordered tree model, in which each tree node represents a secondary structure element such as a helix (stem), junction or hairpin loop. When comparing two tree nodes, we use a dynamic programming algorithm [37,38] to align the 2D structural elements in the tree nodes, obtaining a score between the two nodes. We then use a constrained tree matching algorithm to find an optimal alignment between the two input RNA 2D structures, taking into account their coaxial stacking configurations. Below, we detail our tree model and the constrained tree matching algorithm.

Tree model formalism

Let R_seq be an RNA sequence containing nucleotides or bases A, C, G, U. R_seq[i] denotes the base at position i of R_seq ordered from the 5’ to 3’ ends. R_seq[i, j], i < j, is the subsequence starting at position i and ending at position j. Let R be the 2D structure of R_seq with at least one base pair. A helix in R is a double-stranded segment composed of contiguous base pairs. A base pair connecting position i and position j is denoted by (i, j) and its enclosed subsequence is R_seq[i, j]. If all nucleotides in R_seq[i, j] except R_seq[i] and R_seq[j] are unpaired single bases, and (i, j) is a base pair in R, we call R_seq[i+1, j-1] a hairpin loop.

A junction, or a multi-branch loop, is an enclosed area connecting different helices [7]. An n-way junction in R has n branches. This junction connects n helices where there are n base pairs (i₁, j₁), …, (i_n, j_n) (one base pair for each helix), and n subsequences participating in the junction. The n subsequences are denoted by R_seq[i₁+1, i₂-1], R_seq[j₂+1, i₃-1], R_seq[j₃+1, i₄-1], …, R_seq[j_{n-1}+1, i_n-1], and R_seq[j_n+1, j₁-1]. All the unpaired bases on the n subsequences comprise the n-way junction, and the subsequences are called the loop regions of the junction. Internal loops or bulges can be considered as special cases of “two-way” junctions [5]. However, for the purpose of this work, n must be greater than 2. Thus, internal loops or bulges are not considered as junctions in our work; instead, they are considered as part of the helices in R.
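The loop regions defined above can be enumerated mechanically from the closing base pairs of the helices. The following is a minimal sketch, not the authors' implementation; the function name `junction_loop_regions` and the example coordinates are hypothetical and chosen only for illustration.

```python
def junction_loop_regions(pairs):
    """Return the n loop regions (start, end), inclusive, of an n-way junction.

    `pairs` lists one closing base pair (i_k, j_k) per helix in 5' to 3'
    order; the first pair (i_1, j_1) encloses all the others.
    Hypothetical helper, written only to illustrate the definition.
    """
    n = len(pairs)
    if n < 3:
        raise ValueError("in this model a junction connects at least three helices")
    i1, j1 = pairs[0]
    # First region: from i_1 + 1 up to i_2 - 1.
    regions = [(i1 + 1, pairs[1][0] - 1)]
    # Middle regions: from j_k + 1 up to i_{k+1} - 1.
    for k in range(1, n - 1):
        regions.append((pairs[k][1] + 1, pairs[k + 1][0] - 1))
    # Last region: from j_n + 1 up to j_1 - 1.
    regions.append((pairs[-1][1] + 1, j1 - 1))
    return regions


# Made-up three-way junction, positions are 1-based.
example = junction_loop_regions([(1, 30), (5, 12), (15, 25)])
print(example)  # [(2, 4), (13, 14), (26, 29)]
```

A region whose start exceeds its end (for example, when two helices are directly adjacent) is simply an empty loop region in this representation.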

We transform the 2D structure R into an ordered labeled tree T in which each node has a label and the left-to-right order among sibling nodes is important. Each node of T represents a 2D structural element of R, belonging to one of three types: helix, junction, and hairpin loop. With this tree model, pseudoknots are excluded.

Fig 1 illustrates the transformation process. Fig 1A shows the 3D crystal structure of the adenine riboswitch molecule (PDB code: 1Y26) obtained from the Protein Data Bank (PDB) [26] and drawn by PyMOL (http://www.pymol.org/). The first helix according to the 5′ to 3′ orientation is labeled by H₁ and highlighted in blue. The second helix is labeled by H₂ and highlighted in green. The third helix is labeled by H₃ and highlighted in red. The junction labeled by J₁ and hairpin loops labeled by P₁ and P₂ respectively are highlighted in light grey. J₁ is a multi-branch loop where the three helices H₁, H₂ and H₃ connect. P₁ and P₂ are hairpin loops connected to helices H₂ and H₃, respectively.

Fig 1. Transformation of an RNA 3D molecule into an ordered labeled tree.

(A) The 3D crystal structure of the adenine riboswitch molecule (PDB code: 1Y26) obtained from the Protein Data Bank (PDB) and drawn by PyMOL. The first helix according to the 5′ to 3′ orientation is labeled by H₁ and highlighted in blue. The second helix is labeled by H₂ and highlighted in green. The third helix is labeled by H₃ and highlighted in red. The junction labeled by J₁ and hairpin loops labeled by P₁ and P₂ respectively are highlighted in light grey. J₁ is a multi-branch loop where the three helices H₁, H₂ and H₃ connect. P₁ and P₂ are hairpin loops connected to helices H₂ and H₃, respectively. (B) The corresponding secondary (2D) structure, obtained from RNAView. Each 2D structural element in (B) is highlighted as in (A). The yellow bar across H₁, J₁ and H₃ denotes a coaxial helical stacking H₁H₃ in the molecule 1Y26. (C) The ordered labeled tree, T, used to represent the 2D structure R in (B). Each node of T corresponds to a 2D structural element of R where the octagon (squares, triangles respectively) in T represents the junction (helices, hairpin loops respectively) in R.

https://doi.org/10.1371/journal.pone.0147097.g001

Fig 1B shows the corresponding 2D structure, obtained from RNAView [53]. Each 2D structural element in Fig 1B is highlighted as in Fig 1A. Notice that there is a yellow bar across H₁, J₁ and H₃, symbolizing a coaxial helical stacking H₁H₃ in the molecule 1Y26, as described in [17,24]. In general, the coaxial helical stacking status of a three-way junction such as J₁ in Fig 1B is described as one of four possibilities: H₁H₂, H₂H₃, H₁H₃, or none, where H_xH_y indicates that H_x and H_y are coaxially stacked, i.e., helix H_x shares a common axis with helix H_y. The locations of the junctions and the coaxial helical stacking status of each junction in a given 2D structure can be determined using the methods described in [25].

Fig 1C shows the tree, T, used to represent the 2D structure R in Fig 1B. Each node of T corresponds to a 2D structural element of R where the octagon (squares, triangles respectively) in T represents the junction (helices, hairpin loops respectively) in R. Thus, like the 2D structural elements, each tree node belongs to one of three types, namely helix, junction, and hairpin loop. Tree nodes of different types are prohibited to be aligned with each other, and hence the term “constrained tree matching” is used in our work (reminiscent of structural constraints in RNA described in [54]).

We use t[i] to represent the node of tree T whose position in the left-to-right post-order traversal of T is i. The post-order procedure works by first traversing the left subtree, then traversing the right subtree, and finally visiting the root. In Fig 1C, the post-order position number of each node is shown next to the node. By construction, the tree node corresponding to an n-way junction has n − 1 children. The first helix according to the 5′ to 3′ orientation is the parent node of the junction node. The other n − 1 helices are the children of that junction node. The number of children of node t[i] is the degree of t[i]. In Fig 1C, H₁ is the parent node of J₁, which has two children, H₂ and H₃. The degree of the junction node J₁ is 2. In general, the degree of an n-way junction node is n − 1.
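As an illustration of the tree model and the post-order numbering, the sketch below builds the tree described for Fig 1C (H₁ is the parent of J₁, whose children H₂ and H₃ carry the hairpin loops P₁ and P₂) and numbers its nodes. The `Node` class and `number_postorder` function are hypothetical names introduced here; they are not part of CHSalign.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                      # "helix", "junction", or "hairpin"
    label: str
    children: List["Node"] = field(default_factory=list)
    index: int = 0                 # post-order position, assigned below


def number_postorder(root):
    """Assign 1-based post-order positions; return the nodes in that order."""
    order = []

    def visit(node):
        for child in node.children:    # children left to right first
            visit(child)
        order.append(node)             # then the node itself
        node.index = len(order)

    visit(root)
    return order


# Tree for the adenine riboswitch as described in the text (Fig 1C).
p1, p2 = Node("hairpin", "P1"), Node("hairpin", "P2")
h2, h3 = Node("helix", "H2", [p1]), Node("helix", "H3", [p2])
j1 = Node("junction", "J1", [h2, h3])
h1 = Node("helix", "H1", [j1])

order = number_postorder(h1)
print([(n.index, n.label) for n in order])
print("degree of J1:", len(j1.children))
```

Because a helix has exactly one child, that child always receives the position immediately before the helix in this numbering, which matches the T[i−1] notation used for the subtree below a helix node t[i].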

Consider two RNA 2D structures R₁ and R₂ and their tree representations T₁ and T₂ respectively. Let t₁[i] (t₂[j], respectively) be the node of T₁ (T₂, respectively) whose position in the post-order traversal of T₁ (T₂, respectively) is i (j, respectively). Let T₁[i] be the subtree rooted at t₁[i], and T₂[j] be the subtree rooted at t₂[j]. F₁[i] represents the forest obtained by removing the root t₁[i] from subtree T₁[i]. F₂[j] represents the forest obtained by removing the root t₂[j] from subtree T₂[j]. Suppose the degree of t₁[i] is m_i (i.e., t₁[i] has m_i children t₁[i₁], …, t₁[i_{m_i}]) and the degree of t₂[j] is n_j (i.e., t₂[j] has n_j children t₂[j₁], …, t₂[j_{n_j}]). We use S(T₁[i], T₂[j]) to represent the alignment score of subtree T₁[i] and subtree T₂[j], and use γ(t₁[i], t₂[j]) to represent the alignment score of node t₁[i] and node t₂[j]. We use ∅ to represent an empty node; matching a tree node with ∅ amounts to aligning all nucleotides in the tree node to gaps.

Alignment scheme

We employ a dynamic programming algorithm to align two RNA 2D structures with coaxial stacking patterns. Our approach is to transform each RNA 2D structure into an ordered labeled tree as explained in the previous subsection. We then apply the dynamic programming algorithm to the ordered labeled trees representing the two RNA 2D structures. Based on the alignment of the trees, we obtain the alignment of the corresponding RNA 2D structures. As noted above, each tree node belongs to one of three types: helix, junction, and hairpin loop. Different types of tree nodes are prohibited to be aligned with each other. When aligning two subtrees T₁[i] and T₂[j] and calculating the score S(T₁[i], T₂[j]), there are nine cases to be considered.

Case 1. Both t₁[i] and t₂[j] are junctions.

One constraint we impose on pairwise alignment is that when aligning a p-way junction node v₁ with a q-way junction node v₂, p must be equal to q. Furthermore, the coaxial helical stacking status of v₁ must be the same as the coaxial stacking status of v₂. Thus, a three-way junction must be aligned with a three-way junction, which is not allowed to align with a four-way junction. Furthermore, a three-way junction whose coaxial helical stacking status is H₁H₂ must be aligned with a three-way junction having the same H₁H₂ status, which is not allowed to align with a three-way junction whose coaxial helical stacking status is H₂H₃. In general, junctions with different branches and different coaxial stacking configurations have different biological properties. This constraint is established to ensure a biologically meaningful alignment is obtained, and to avoid introducing too many gaps in the alignment.

According to our tree model, if a tree node is a junction, it must have at least two children and the children must be helix nodes. A junction contains loop regions with single bases whereas helices are double-stranded regions with base pairs. A junction node is thus prohibited to be aligned with a helix node. Hence, t₁[i] must be aligned with t₂[j] provided they have the same number of branches and the same coaxial helical stacking status, denoted by Ψ(t₁[i]) = Ψ(t₂[j]). Their children are trees, which together form forests F₁[i] and F₂[j] respectively. F₁[i] must be aligned with F₂[j]. Thus the alignment score of T₁[i] and T₂[j] can be calculated as: (1)

If Ψ(t₁[i]) = Ψ(t₂[j]), t₁[i] and t₂[j] must have the same number of children, and the order among the sibling nodes is important. If Ψ(t₁[i]) ≠ Ψ(t₂[j]), i.e., t₁[i] and t₂[j] have different numbers of children (branches) or they have different coaxial helical stacking statuses, they are prohibited to be aligned together. Thus, the score of matching F₁[i] with F₂[j] can be calculated as: (2) where m is the number of children of t₁[i] and t₂[j] respectively. We use Π(t₁[i]) to represent the coaxial helical stacking status of t₁[i]; Π(t₁[i]) = 1 (2, 3, 0, respectively) if the coaxial helical stacking status of t₁[i] is H₁H₂ (H₂H₃, H₁H₃, none, respectively). The score of matching t₁[i] with t₂[j] is (3) Here, s is the score obtained by aligning the junction in t₁[i] with the junction in t₂[j]. We use a dynamic programming algorithm [37,38] to calculate the alignment score s, and adopt the RIBOSUM85-60 matrix [55] to calculate the score of aligning two bases or base pairs in RNA 2D structures. (The default gap penalty is −1.) With this scoring matrix, CHSalign can handle non-canonical base pairs. The addition of a parameter w to the alignment score is a computational device to enforce the right alignment of the RNAs when the junction patterns match. Thus, if t₁[i] and t₂[j] have the same number of branches, their CHS patterns are alike, and Π(t₁[i]) ≠ 0, Π(t₂[j]) ≠ 0, we use s+w as the modified alignment score. When t₁[i] and t₂[j] have the same number of branches and Π(t₁[i]) = Π(t₂[j]) = 0, we use s+(w/2) as the modified score. The value of w required experimentation, as we discuss later, but a value of 100 seems to work well in practice.
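The junction scoring rule described in this paragraph can be sketched as follows for three-way junctions. This is an illustrative reading of the text, not the authors' code: the names `chs_code` and `junction_score` are hypothetical, and returning `None` for a prohibited pairing is an assumption, since the text states only that such junctions may not be aligned.

```python
def chs_code(status):
    """Map a three-way CHS status to the code described in the text:
    H1H2 -> 1, H2H3 -> 2, H1H3 -> 3, no stacking (None) -> 0."""
    return {"H1H2": 1, "H2H3": 2, "H1H3": 3, None: 0}[status]


def junction_score(s, branches1, status1, branches2, status2, w=100):
    """Return the modified junction score, or None if alignment is prohibited.

    s is the raw alignment score of the two junction loops; w is the bonus
    parameter (100 is the value the text reports as working well).
    Using None for the prohibited case is an assumption of this sketch.
    """
    if branches1 != branches2 or chs_code(status1) != chs_code(status2):
        return None                  # different branching or CHS status
    if chs_code(status1) != 0:
        return s + w                 # matching, non-empty CHS pattern
    return s + w / 2                 # same branching, no stacking in either


print(junction_score(10, 3, "H1H3", 3, "H1H3"))  # 110
print(junction_score(10, 3, None, 3, None))      # 60.0
print(junction_score(10, 3, "H1H2", 3, "H2H3"))  # None
```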

Case 2. Both t₁[i] and t₂[j] are helices.

Due to the nature of RNA 2D structures and based on our tree model, a helix has only one child, which is either a junction or a hairpin loop. The subtree rooted at the child of t₁[i] is denoted by T₁[i−1] and the subtree rooted at the child of t₂[j] is denoted by T₂[j−1]. We have to match helix nodes t₁[i] and t₂[j] first, and then add the alignment score of their subtrees T₁[i−1] and T₂[j−1] if the alignment score of the subtrees is greater than or equal to zero, or simply match t₁[i] with t₂[j] if the alignment score of their subtrees is negative (i.e., the subtrees are not aligned). Therefore, the alignment score of T₁[i] and T₂[j] can be calculated as:

$$S(T_1[i], T_2[j]) = \max\bigl\{\gamma(t_1[i], t_2[j]) + S(T_1[i-1], T_2[j-1]),\ \gamma(t_1[i], t_2[j]),\ 0\bigr\} \tag{4}$$

The score γ(t₁[i], t₂[j]) is obtained by aligning the helix in t₁[i] with the helix in t₂[j] using a dynamic programming algorithm [37,38]. The value 0 is used if the other entries in Eq (4) yield negative scores.

Case 3. Both t₁[i] and t₂[j] are hairpin loops.

Due to the nature of RNA 2D structures and based on our tree model, a hairpin does not have any child. Therefore hairpin nodes are always leaves in the tree representation of an RNA 2D structure. When both t₁[i] and t₂[j] are hairpin loops, matching T₁[i] with T₂[j] amounts to matching t₁[i] with t₂[j]. Thus, the alignment score becomes:

$$S(T_1[i], T_2[j]) = \gamma(t_1[i], t_2[j]) \tag{5}$$

The score γ(t₁[i], t₂[j]) is obtained by aligning the hairpin loop in t₁[i] with the hairpin loop in t₂[j] using a dynamic programming algorithm [37,38].
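To make the node-level score γ concrete for unpaired regions such as hairpin loops, here is a minimal dynamic programming sketch. It is a plain global (Needleman-Wunsch style) alignment with toy match and mismatch scores; the choice of global alignment and these scores are assumptions made for illustration, whereas CHSalign itself uses the algorithm of [37,38] with the RIBOSUM85-60 matrix and also scores base pairs. Only the gap penalty of −1 is taken from the text.

```python
def align_score(a, b, match=2, mismatch=-1, gap=-1):
    """Global alignment score of two base strings.

    match/mismatch are toy values standing in for a real substitution
    matrix such as RIBOSUM85-60 (assumption); gap=-1 follows the default
    gap penalty quoted in the text.
    """
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):           # a[:i] aligned entirely to gaps
        dp[i][0] = i * gap
    for j in range(1, cols):           # b[:j] aligned entirely to gaps
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,   # align two bases
                           dp[i - 1][j] + gap,       # gap in b
                           dp[i][j - 1] + gap)       # gap in a
    return dp[-1][-1]


print(align_score("GAAA", "GAAA"))  # 8
print(align_score("GAAA", "GAA"))   # 5
```

The table has (|a|+1)×(|b|+1) cells, each filled in constant time, which is the O(|t₁[i]|×|t₂[j]|) cost per node pair used in the complexity analysis.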

Case 4. t₁[i] is a junction and t₂[j] is a helix.

Since t₁[i] and t₂[j] have different types, they cannot be aligned with each other. There are two subcases.

Subcase 1. t₂[j] is aligned to gaps. Then T₁[i] must be aligned with T₂[j−1], which is the subtree rooted at the child of t₂[j].

Subcase 2. t₁[i] is aligned to gaps. Suppose t₁[i] has m_i children t₁[i₁], …, t₁[i_{m_i}]. The subtrees rooted at these children are denoted by T₁[i₁], …, T₁[i_{m_i}], respectively. Then, one of these subtrees must be aligned with T₂[j]; specifically the subtree yielding the maximum alignment score is aligned with T₂[j].

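We take the maximum of the above two subcases. Thus, the score of matching T₁[i] with T₂[j] can be calculated as:

$$S(T_1[i], T_2[j]) = \max\Bigl\{S(T_1[i], T_2[j-1]),\ \max_{1 \le k \le m_i} S(T_1[i_k], T_2[j]),\ 0\Bigr\} \tag{6}$$

where T₁[i_k] denotes the subtree rooted at the kth child of t₁[i] and m_i is the number of children of t₁[i].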

The value 0 is used if both of the two subcases yield negative scores.

Fig 2 illustrates this case where two PDB molecules, A-riboswitch (PDB code: 1Y26) and the Alu domain of the mammalian signal recognition particle (SRP) (PDB code: 1E8O), are considered. Fig 2A shows the 3D crystal structure of the adenine riboswitch molecule and its tree representation T₁. Fig 2B shows the 3D crystal structure of the Alu domain of the mammalian SRP molecule and its tree representation T₂. When matching T₁[i] with T₂[j], since t₁[i] and t₂[j] have different types where t₁[i] is a junction and t₂[j] is a helix, there are two subcases to be considered, as detailed above. Fig 2C-i illustrates subcase 1, in which t₂[j] is aligned to gaps and T₁[i] is aligned with T₂[j−1]. Fig 2C-ii illustrates subcase 2, in which t₁[i] is aligned to gaps, and the subtree rooted at one of the children of t₁[i] is aligned with T₂[j]. In our example here, t₁[i] has two children, t₁[i₁] and t₁[i₂]. Thus, either the subtree rooted at t₁[i₁], denoted by T₁[i₁], is aligned with T₂[j] as illustrated in Fig 2C-iia, or the subtree rooted at t₁[i₂], denoted by T₁[i₂], is aligned with T₂[j] as illustrated in Fig 2C-iib. The maximum alignment score obtained from Fig 2C-iia and 2C-iib is used. Then S(T₁[i], T₂[j]) is calculated by taking the maximum of the two subcases illustrated in Fig 2C-i and 2C-ii respectively.

Fig 2. Illustration of an alignment between two RNA molecules.

(A) The 3D crystal structure of the adenine riboswitch (PDB code: 1Y26) and its tree representation T₁. (B) The 3D crystal structure of the Alu domain of the mammalian signal recognition particle (SRP) (PDB code: 1E8O) and its tree representation T₂. (C) When matching T₁[i] with T₂[j], since t₁[i] and t₂[j] have different types where t₁[i] is a junction and t₂[j] is a helix, there are two subcases to be considered. Subcase 1 is illustrated in (i) where t₂[j] is aligned to gaps and T₁[i] is aligned with T₂[j−1]. Subcase 2 is illustrated in (ii) where t₁[i] is aligned to gaps, and the subtree rooted at one of the children of t₁[i] is aligned with T₂[j]. In this example, t₁[i] has two children, t₁[i₁] and t₁[i₂]. Thus, either the subtree rooted at t₁[i₁], denoted by T₁[i₁], is aligned with T₂[j] as illustrated in (iia), or the subtree rooted at t₁[i₂], denoted by T₁[i₂], is aligned with T₂[j] as illustrated in (iib).

https://doi.org/10.1371/journal.pone.0147097.g002

Case 5. t₁[i] is a junction and t₂[j] is a hairpin loop.

Since t₁[i] and t₂[j] have different types, the two nodes cannot be aligned together. Furthermore, t₂[j] is a hairpin loop, which does not have any child. Thus t₁[i] must be aligned to gaps, and the subtree rooted at one of the children of t₁[i] is aligned with T₂[j]; specifically the subtree yielding the maximum alignment score is aligned with T₂[j]. Therefore, the alignment score of T₁[i] and T₂[j] can be calculated as:

$$S(T_1[i], T_2[j]) = \max_{1 \le k \le m_i} S(T_1[i_k], T_2[j]) \tag{7}$$

where T₁[i_k] denotes the subtree rooted at the kth child of t₁[i] and m_i is the number of children of t₁[i].

Case 6. t₁[i] is a helix and t₂[j] is a junction.

Similar to Case 4, there are two subcases.

Subcase 1. t₁[i] is aligned to gaps. Thus, the subtree rooted at the child of t₁[i], denoted by T₁[i−1], must be aligned with T₂[j].

Subcase 2. t₂[j] is aligned to gaps. Suppose t₂[j] has n_j children t₂[j₁], …, t₂[j_{n_j}]. The subtrees rooted at these children are T₂[j₁], …, T₂[j_{n_j}], respectively. Then T₁[i] must be aligned with one of these subtrees.

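Taking the maximum of these two subcases, we calculate the score of matching T₁[i] with T₂[j] as:

$$S(T_1[i], T_2[j]) = \max\Bigl\{S(T_1[i-1], T_2[j]),\ \max_{1 \le k \le n_j} S(T_1[i], T_2[j_k]),\ 0\Bigr\} \tag{8}$$

where T₂[j_k] denotes the subtree rooted at the kth child of t₂[j] and n_j is the number of children of t₂[j].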

Case 7. t₁[i] is a helix and t₂[j] is a hairpin loop.

Because t₁[i] and t₂[j] have different types, the two nodes cannot be aligned together. Furthermore, since t₁[i] is a helix, it has only one child; t₂[j] is a hairpin loop with no children. Therefore, t₁[i] must be aligned to gaps and the subtree rooted at the child of t₁[i], denoted by T₁[i−1], must be aligned with T₂[j], or if the alignment yields a negative score, we use the value 0. Thus, the alignment score is

$$S(T_1[i], T_2[j]) = \max\bigl\{S(T_1[i-1], T_2[j]),\ 0\bigr\} \tag{9}$$

Case 8. t₁[i] is a hairpin loop and t₂[j] is a junction.

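This is similar to Case 5. Thus, we can calculate the score of matching T₁[i] with T₂[j] as:

$$S(T_1[i], T_2[j]) = \max_{1 \le k \le n_j} S(T_1[i], T_2[j_k]) \tag{10}$$

where T₂[j_k] denotes the subtree rooted at the kth child of t₂[j] and n_j is the number of children of t₂[j].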

Case 9. t₁[i] is a hairpin loop and t₂[j] is a helix.

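This is similar to Case 7, with the alignment score:

$$S(T_1[i], T_2[j]) = \max\bigl\{S(T_1[i], T_2[j-1]),\ 0\bigr\} \tag{11}$$

where T₂[j−1] is the subtree rooted at the child of the helix node t₂[j].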
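The nine cases can be combined into one memoized recursion over pairs of subtrees. The sketch below follows the prose descriptions of the cases and is not the authors' implementation. Several points are assumptions and are marked as such in the comments: the `Node` class and `subtree_score` function are hypothetical names, two junctions with different branching or CHS status receive a score of −∞, children of matching junctions are paired in sibling order, and the node score `gamma` is supplied by the caller.

```python
from functools import reduce  # not strictly needed; kept minimal below


class Node:
    def __init__(self, kind, children=(), status=None):
        self.kind = kind                  # "junction", "helix", or "hairpin"
        self.children = list(children)
        self.status = status              # CHS status (junction nodes only)


def subtree_score(a, b, gamma):
    """Score of aligning the subtrees rooted at a and b.

    gamma(x, y) is a caller-supplied node score for two nodes of the same
    kind (for junctions it should already include the bonus w).
    """
    memo = {}

    def S(x, y):
        key = (id(x), id(y))
        if key not in memo:
            memo[key] = compute(x, y)
        return memo[key]

    def compute(x, y):
        kx, ky = x.kind, y.kind
        if kx == ky == "junction":                       # Case 1
            if len(x.children) != len(y.children) or x.status != y.status:
                return float("-inf")     # ASSUMPTION: prohibited pairing
            # ASSUMPTION: children are matched pairwise in sibling order.
            return gamma(x, y) + sum(S(c, d) for c, d in zip(x.children, y.children))
        if kx == ky == "helix":                          # Case 2
            g = gamma(x, y)
            return max(g + S(x.children[0], y.children[0]), g, 0)
        if kx == ky == "hairpin":                        # Case 3
            return gamma(x, y)
        if kx == "junction" and ky == "helix":           # Case 4
            return max(S(x, y.children[0]), max(S(c, y) for c in x.children), 0)
        if kx == "junction":                             # Case 5 (y is a hairpin)
            return max(S(c, y) for c in x.children)
        if kx == "helix" and ky == "junction":           # Case 6
            return max(S(x.children[0], y), max(S(x, d) for d in y.children), 0)
        if kx == "helix":                                # Case 7 (y is a hairpin)
            return max(S(x.children[0], y), 0)
        if ky == "junction":                             # Case 8 (x is a hairpin)
            return max(S(x, d) for d in y.children)
        return max(S(x, y.children[0]), 0)               # Case 9

    return S(a, b)


def riboswitch_like(status):
    """Helix -> three-way junction -> two helices, each ending in a hairpin."""
    arms = [Node("helix", [Node("hairpin")]) for _ in range(2)]
    return Node("helix", [Node("junction", arms, status)])


# Toy node scores (made up): helix 5, hairpin 3, junction 4 + w with w = 100.
toy_gamma = lambda x, y: {"helix": 5, "hairpin": 3, "junction": 104}[x.kind]

same = subtree_score(riboswitch_like("H1H3"), riboswitch_like("H1H3"), toy_gamma)
diff = subtree_score(riboswitch_like("H1H3"), riboswitch_like("H1H2"), toy_gamma)
print(same, diff)  # 125 5
```

With identical junction patterns every node is matched (5 + 104 + 2×(5 + 3) = 125). With differing CHS status the junctions cannot be paired, so in this sketch only the root helices are aligned, which illustrates the high-versus-low score behaviour the method is designed to produce.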

Time and space complexity

Let |T₁| (|T₂| respectively) denote the number of nodes in tree T₁ (T₂ respectively) that represents RNA structure R₁ (R₂ respectively). CHSalign maintains a two-dimensional table in which c(i, j) represents the cell located at the intersection of the ith row and the jth column of the table. The value stored in the cell c(i, j), 1 ≤ i ≤ |T₁|, 1 ≤ j ≤ |T₂|, is S(T₁[i], T₂[j]). The dynamic programming algorithm employed by CHSalign calculates the values in the table by traversing the trees T₁ and T₂ in a bottom-up manner. After all the values in the table are computed, the algorithm locates the cell c with the maximum value. A backtrack procedure starting with the cell c and terminating when encountering a zero identifies the alignment lines of an optimal alignment and calculates the alignment score between T₁ and T₂.

Let |R₁| (|R₂| respectively) denote the number of nucleotides, i.e., the length, of RNA structure R₁ (R₂ respectively). Let |t₁[i]| (|t₂[j]| respectively) be the number of nucleotides in node t₁[i] (t₂[j] respectively). Let d₁ (d₂, respectively) be the maximum degree of any node in tree T₁ (T₂ respectively). The time complexity of computing γ(t₁[i], t₂[j]) is O(|t₁[i]|×|t₂[j]|) [37]. Thus, the time complexity of computing S(T₁[i],T₂[j]) is O(max(d₁,d₂)+|t₁[i]|×|t₂[j]|). Here max(d₁, d₂) is a constant because a junction has at most twelve branches in solved RNA crystal structures [4,25,52]. Furthermore, $\sum_{i=1}^{|T_1|} |t_1[i]| \le |R_1|$ and $\sum_{j=1}^{|T_2|} |t_2[j]| \le |R_2|$. Therefore the time complexity of calculating all the values in the two-dimensional table is

$$\sum_{i=1}^{|T_1|}\sum_{j=1}^{|T_2|} O\bigl(\max(d_1, d_2) + |t_1[i]| \times |t_2[j]|\bigr) = O(|R_1| \times |R_2|) \tag{12}$$

Locating the cell c with the maximum value in the two-dimensional table and executing the backtrack procedure require computational time. Therefore the time complexity of CHSalign is O(|R₁|×|R₂|). Since only a two-dimensional table is used, the space complexity of CHSalign is O(|T₁|×|T₂|) = O(|R₁|×|R₂|).
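The bound on the cost of filling the table can be spelled out in one line of algebra. The derivation below is a sketch under two assumptions that the text implies but does not state in this form: every nucleotide belongs to at most one tree node, so the node sizes sum to at most the sequence length, and every node contains at least one nucleotide, so |T₁| ≤ |R₁| and |T₂| ≤ |R₂|. The symbol c is introduced here for the constant bound on max(d₁, d₂).

```latex
\begin{align*}
\sum_{i=1}^{|T_1|}\sum_{j=1}^{|T_2|} \bigl(c + |t_1[i]|\,|t_2[j]|\bigr)
  &= c\,|T_1|\,|T_2|
     + \Bigl(\sum_{i=1}^{|T_1|} |t_1[i]|\Bigr)\Bigl(\sum_{j=1}^{|T_2|} |t_2[j]|\Bigr) \\
  &\le c\,|R_1|\,|R_2| + |R_1|\,|R_2| \\
  &= O(|R_1| \times |R_2|).
\end{align*}
```

The double sum factorizes because the product |t₁[i]|·|t₂[j]| separates into a term depending only on i and a term depending only on j.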

Data sets

Popular benchmark datasets such as BRAliBase [56] and Rfam [57] are not suitable for testing CHSalign, since they do not contain coaxial helical stacking information. As a consequence, we manually created two datasets for testing CHSalign and comparing it with related methods. The first dataset, Dataset1, contains 24 RNA 3D structures from the Protein Data Bank (PDB) [26] (see Table 1). This dataset was studied and published in [4,25,52], in which all annotations for junctions and coaxial helical stacking were taken from crystallographic structures. Each 3D structure in Dataset1 contains at least one three-way junction, and the lengths of the 3D structures range from 40 nt to 2,958 nt. Some 3D structures contain higher-order junctions such as ten-way junctions with coaxial stacking patterns. The 2D structure of each 3D structure in Dataset1 is obtained with RNAView retrieved from RNA STRAND [58]. The pseudoknots in these structures are removed using the K2N tool [59].

Table 1. The 24 RNA full structures in Dataset1 selected from the Protein Data Bank (PDB) to evaluate the performance of the alignment methods studied in this paper.

https://doi.org/10.1371/journal.pone.0147097.t001

The second dataset, Dataset2, contains 76 three-way junctions extracted from the 24 3D structures in Dataset1. (Some 3D structures in Dataset1 contain more than one three-way junction and all those three-way junctions in a 3D structure are extracted.) The lengths of the three-way junctions range from 28 nt to 153 nt. The coaxial helical stacking status of each three-way junction in Dataset2 is described as one of three possibilities: H₁H₂, H₂H₃, H₁H₃. Thus, every three-way junction in Dataset2 contains a coaxial stacking pattern. In the RNA literature, most research efforts have been focused on three-way and four-way junctions [6,15,60–62] partly due to the fact that higher-order junctions are rare. In particular, three-way junctions are the most abundant type of junctions, accounting for over 50% of the available crystal data. We also performed experiments on four-way junctions; results obtained from the four-way junctions were similar to those for the three-way junctions reported here, and hence omitted.

Results and Discussion

Two CHSalign web server versions

We have implemented two programs in Java: a standalone version, denoted CHSalign_u, and a pipeline, denoted CHSalign_p. CHSalign_u requires the user to manually annotate the coaxial stacking patterns within the junctions of the two input RNA 2D structures, and produces an optimal alignment between the two input structures.

By contrast, CHSalign_p accepts as input two unannotated RNA 2D structures and produces as output an optimal alignment between the two input structures while taking into account their junctions and coaxial stacking configurations within the junctions. This pipeline invokes our previously developed Junction-Explorer tool [25] to automatically predict and identify the junctions and coaxial stacking patterns within the junctions in the input structures, and then aligns the input structures containing the predicted coaxial stacking patterns. Both CHSalign_u and CHSalign_p are available on the web.

Performance evaluation using RMSD

We conducted a series of experiments to evaluate the performance of our algorithms. In the first experiment, we divided Dataset2 into three disjoint subsets Dataset2-1, Dataset2-2 and Dataset2-3, with 35, 18, and 23 junctions, respectively. These three subsets contain, respectively, three-way junctions whose coaxial helical stacking status is H₁H₂, H₂H₃, or H₁H₃. We performed pairwise alignment of junctions in each subset. There are (35×34/2+18×17/2+23×22/2 = 1,001) pairwise alignments produced by CHSalign. Commonly used ways for evaluating the accuracy of these structural alignments include the calculation of distance matrices or RMSD (root-mean-square deviation) [4,29,32,63–66]. We adopt the RMSD measure [4,29] to evaluate the performance of our algorithms; specifically we use the method for computing RMSDs of tree graphs [4]. It has been shown that RMSDs of tree graphs and RMSDs of atomic models are positively correlated and indicate similar trends [4]. The average of the RMSD values of the 1,001 pairwise alignments was calculated and plotted.
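The count of 1,001 alignments can be checked directly. The short Python sketch below uses only the subset sizes quoted above; the variable names are illustrative and are not part of CHSalign.

```python
from math import comb

# Subset sizes reported in the text for Dataset2-1, Dataset2-2 and Dataset2-3.
subset_sizes = {"H1H2": 35, "H2H3": 18, "H1H3": 23}

# Junctions are only aligned within their own subset, so a subset of size n
# contributes C(n, 2) = n * (n - 1) / 2 unordered pairs.
pairs_per_subset = {status: comb(n, 2) for status, n in subset_sizes.items()}
total_pairs = sum(pairs_per_subset.values())

print(pairs_per_subset)  # {'H1H2': 595, 'H2H3': 153, 'H1H3': 253}
print(total_pairs)       # 1001
```

The three subset sizes also sum to 76, the number of three-way junctions in Dataset2.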

One important parameter in our algorithms is the weight w used in Eq (3) for calculating the alignment score of two junction nodes. This parameter is introduced to favor the alignment between two junctions with the same number of branches and the same coaxial helical stacking status. Experimental results show that when w is sufficiently large (e.g., w > 50), our algorithms work well. In subsequent experiments, we fixed the weight w in Eq (3) at 100.
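The role of w can be illustrated with a small Python sketch. This is not Eq (3), which is defined in Materials and Methods and is not reproduced here. It only shows the gating behavior described in this paper: two junctions earn the weight w when they have the same number of branches and the same coaxial helical stacking status, and the match is prohibited otherwise. The `Junction` class, the `junction_bonus` function, and the use of negative infinity to represent a prohibited match are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """Hypothetical junction record; not a CHSalign data structure."""
    branches: int          # number of helices meeting at the junction
    stacking: frozenset    # coaxially stacked helix pairs, e.g. {("H1", "H3")}


def junction_bonus(j1: Junction, j2: Junction, w: float = 100.0) -> float:
    """Return w for compatible junctions and -inf for a prohibited match.

    This only illustrates the gating behavior; it is not Eq (3).
    """
    same_branches = j1.branches == j2.branches
    same_stacking = j1.stacking == j2.stacking
    return w if (same_branches and same_stacking) else -math.inf


# Stacking patterns taken from the riboswitch examples in this paper.
purine = Junction(3, frozenset({("H1", "H3")}))
guanine = Junction(3, frozenset({("H1", "H3")}))
tpp = Junction(3, frozenset({("H1", "H2")}))
sam = Junction(4, frozenset({("H1", "H4"), ("H2", "H3")}))

print(junction_bonus(purine, guanine))  # 100.0
print(junction_bonus(purine, tpp))      # -inf
print(junction_bonus(purine, sam))      # -inf
```

In the actual method the junction term is only one contribution to the alignment score of two whole structures, so a pair of structures with incompatible junctions can still receive a finite, low score.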

Fig 3 compares CHSalign_u and CHSalign_p with three other alignment programs: RNAforester [39], RSmatch [37] and FOLDALIGN [41]. Like CHSalign, both RNAforester and RSmatch produce an alignment between two input RNA 2D structures. FOLDALIGN differs from the other programs in Fig 3 in that it performs 2D structure prediction and alignment simultaneously. When running the FOLDALIGN tool, the structure information in the datasets was ignored and only the sequence data was used as the input of the tool. In addition, when experimenting with CHSalign_u, the coaxial stacking patterns were provided along with the input RNA 2D structures. When running the other programs including CHSalign_p, RNAforester, RSmatch and FOLDALIGN, these coaxial stacking patterns were absent in the input. CHSalign_p automatically predicts the coaxial stacking patterns and then aligns the predicted structures.

Fig 3. Comparison of the RMSD values obtained by CHSalign_u, CHSalign_p, RSmatch, RNAforester and FOLDALIGN.

The RMSD values of CHSalign_u, CHSalign_p, RSmatch, RNAforester and FOLDALIGN are 1.78 Å, 1.83 Å, 4.41 Å, 6.13 Å and 8.26 Å, respectively. The proposed CHSalign method performs better than the existing alignment tools in terms of RMSD values.

https://doi.org/10.1371/journal.pone.0147097.g003

Fig 3 shows that CHSalign_u performs the best, achieving an RMSD of 1.78 Å. The drawback of CHSalign_u, however, is that it requires the user to annotate the input RNA structures with coaxial stacking patterns manually. Manually annotating coaxial stacking patterns on RNA structures requires domain-related expertise. On the other hand, CHSalign_p does not require any manual processing and achieves a reasonably good RMSD of 1.83 Å. Since the predicted coaxial stacking patterns may be imperfect, the RMSD of CHSalign_p is larger than that of CHSalign_u. RSmatch and RNAforester have even larger RMSDs of 4.41 Å and 6.13 Å, respectively. This happens because RSmatch and RNAforester ignore coaxial stacking configurations when aligning RNA 2D structures. FOLDALIGN has the largest RMSD of 8.26 Å, partly because it does not consider coaxial helical stacking either, and partly because there are errors in its predicted 2D structures.

Performance evaluation using precision

In the next experiment, we adopt precision as the performance measure, defined below, to evaluate how junctions and coaxial stacking patterns are aligned by different programs using the 24 structures in Dataset1. We say a junction J₁ in structure R₁ is aligned with a junction J₂ in structure R₂, or more precisely there is a junction alignment between J₁ and J₂, if there exist a nucleotide n₁ on a loop region of J₁ and a nucleotide n₂ on a loop region of J₂ such that n₁ is aligned with n₂. A junction alignment between J₁ and J₂ is a true positive if J₁ and J₂ have the same number of branches and the same coaxial helical stacking status. A junction alignment between J₁ and J₂ is a false positive if J₁ and J₂ have different numbers of branches or different coaxial helical stacking statuses. The precision (PR) of an alignment between R₁ and R₂ is defined as

$$\mathrm{PR} = \frac{TP}{TP + FP} \tag{13}$$

where TP equals the number of true positives and FP equals the number of false positives in the alignment. The higher the PR value of a program, the more precise the alignments it produces. In the experiment, we also included a closely related RNA 3D alignment tool (SETTER) [31].
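As a minimal sketch of this measure, the Python function below computes PR as TP / (TP + FP) from a list of junction alignments. Representing each junction as a (number of branches, stacking status) tuple, the function name `precision`, and returning `None` when no junctions are aligned are all assumptions made for illustration; the example data are invented and are not results from this paper.

```python
def precision(junction_alignments):
    """Compute PR = TP / (TP + FP) for one structure alignment.

    junction_alignments is a list of (j1, j2) pairs, where each junction is a
    (number_of_branches, stacking_status) tuple (a hypothetical encoding).
    """
    # True positive: same number of branches and same stacking status.
    tp = sum(1 for j1, j2 in junction_alignments if j1 == j2)
    # Every other junction alignment is a false positive.
    fp = len(junction_alignments) - tp
    if tp + fp == 0:
        return None  # assumed convention: undefined when nothing is aligned
    return tp / (tp + fp)


# Invented example: four junction alignments, two of them compatible.
example = [
    ((3, "H1H3"), (3, "H1H3")),            # true positive
    ((3, "H1H2"), (3, "H1H3")),            # false positive: different status
    ((4, "H1H4,H2H3"), (4, "H1H4,H2H3")),  # true positive
    ((3, "H1H2"), (4, "H1H4,H2H3")),       # false positive: different branches
]
print(precision(example))  # 0.5
```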

We calculated the precision of each alignment produced by a program, took the average of the precision values of the pairwise alignments of the 24 structures in Dataset1, and plotted the average values. Fig 4 shows the result. We see that CHSalign_u performs the best, achieving a PR value of 1. CHSalign_p achieves a PR value of 0.85, not 1, because some coaxial stacking patterns were not predicted correctly by Junction-Explorer [25] used in CHSalign_p. The other programs in Fig 4 did not consider coaxial helical stacking while performing pairwise alignments, and hence achieved low PR values. Specifically, the PR values of RNAforester, SETTER, RSmatch, and FOLDALIGN were 0.54, 0.42, 0.33, and 0.31 respectively. Unlike the CHSalign method, these programs occasionally align two junctions with different numbers of branches or different coaxial helical stacking statuses, hence yielding false positives. However, SETTER is a general-purpose structure alignment tool capable of comparing two RNA 3D molecules with diverse tertiary motifs, while CHSalign can only deal with the 2D structures of the 3D molecules that contain coaxial helical stacking motifs.

Fig 4. Comparison of the PR values obtained by CHSalign_u, CHSalign_p, RNAforester, SETTER, RSmatch and FOLDALIGN.

The PR values, as defined in Eq (13), of CHSalign_u, CHSalign_p, RNAforester, SETTER, RSmatch and FOLDALIGN are 1, 0.85, 0.54, 0.42, 0.33 and 0.31, respectively. The proposed CHSalign method performs better than the existing alignment tools in terms of PR values.

https://doi.org/10.1371/journal.pone.0147097.g004

Potential application of CHSalign

To demonstrate the utility of the CHSalign tool, we applied CHSalign to the analysis of riboswitches that regulate gene expression by selectively binding metabolites [67]. Table 2 lists six riboswitches that bind to different metabolites (purine, guanine, thiamine pyrophosphate [TPP], and S-Adenosyl methionine [SAM]) found in different organisms. Since such binding and gene regulation activities are correlated to junction structures, the results of junction alignments could help suggest structural similarity (and thus possibly function) of these riboswitches. For each riboswitch, Table 2 also lists the junction type and coaxial helical stacking status within the junction in that riboswitch. Fig 5 illustrates the coaxial stacking patterns in the six riboswitches. We tested several combinations of junctions in these six riboswitches to determine whether the CHSalign results confirm known structural and functional similarity in existing RNAs. Table 3 summarizes the test results. Details of these results, including the input and output of each test, can be found in S1 and S2 Files.

Table 2. The six riboswitches selected from the Protein Data Bank (PDB) to demonstrate the utility of our web server.

https://doi.org/10.1371/journal.pone.0147097.t002

Table 3. Results obtained by aligning seven pairs of riboswitches from Table 2.

https://doi.org/10.1371/journal.pone.0147097.t003

Fig 5. Illustration of the coaxial stacking patterns in the six riboswitches used to demonstrate the utility of our web server.

(A) Artificial purine riboswitch (PDB code: 2G9C) with a three-way junction and a CHS motif of type H₁H₃ in the junction. (B) Artificial guanine riboswitch (PDB code: 3RKF) with a three-way junction and a CHS motif of type H₁H₃ in the junction. (C) A. thaliana TPP riboswitch (PDB code: 3D2G) with a three-way junction and a CHS motif of type H₁H₂ in the junction. (D) E. coli TPP riboswitch (PDB code: 2GDI) with a three-way junction and a CHS motif of type H₁H₂ in the junction. (E) T. tengcongensis SAM-I riboswitch (PDB code: 2GIS) with a four-way junction and a CHS motif of type H₁H₄, H₂H₃ in the junction. (F) H. marismortui SAM-I riboswitch (PDB code: 4B5R) with a four-way junction and a CHS motif of type H₁H₄, H₂H₃ in the junction.

https://doi.org/10.1371/journal.pone.0147097.g005

Without knowledge of junction helical arrangements, we first tested the following cases using CHSalign_p, where the two aligned junctions had the same coaxial stacking patterns. We used SAM riboswitches in different organisms (PDB codes 2GIS and 4B5R in Table 2) as input. CHSalign_p predicted that the two riboswitches had helical arrangements of four-way junctions both with coaxial stacking helices 1 and 4 and helices 2 and 3, and produced a very high alignment score of 252.61, as calculated by the equations in the subsection ‘Alignment scheme’ in the section ‘Materials and Methods’. This high score implies that the two riboswitches have highly similar helical arrangements. This corroborates our expectations, because the two tested riboswitches have similar structures and functionality, binding to SAM. Next, when we used purine and guanine riboswitches (PDB codes 2G9C and 3RKF), we obtained a high alignment score of 179.68 for three-way junction alignment of the two riboswitches with predicted coaxial stacking of helices 1 and 3 in both riboswitches, indicating high similarities of their three-way junction structures. We also tested two TPP riboswitches with three-way junctions in different organisms (PDB codes 2GDI and 3D2G), which produced a high alignment score of 191.06, again indicating that these two TPP riboswitches have similar three-way junction structures.

We next compared very different junction structures using CHSalign_p. When we aligned two riboswitches with different junction types, a SAM riboswitch with a four-way junction and a purine riboswitch with a three-way junction (PDB codes 2GIS and 2G9C, respectively), we obtained a low alignment score of 20.40. We also tested a pair of purine and TPP riboswitches (PDB codes 2G9C and 2GDI), which are in different riboswitch classes and have different coaxial stacking patterns in their three-way junctions. We obtained a low alignment score of 13.65. These experiments suggest that CHSalign_p, based only on secondary structural information, is useful for inferring tertiary structural features regarding helical arrangements.

Finally, we tested CHSalign_u, which requires prior information about junction arrangement and produces a structural similarity score for two given RNAs. Here, we tested two cases. First, we considered the same RNA structure (purine riboswitch with PDB code 2G9C) but annotated it with two different helical arrangement patterns, one with coaxially stacked helices 1 and 3 (H₁H₃) and the other with coaxially stacked helices 1 and 2 (H₁H₂). Second, we considered two RNAs with different structures (purine riboswitch with PDB code 2G9C and guanine riboswitch with PDB code 3RKF) but annotated them with the same helical arrangement pattern, namely coaxially stacked helices 1 and 2 (H₁H₂). Note that this manually annotated H₁H₂ pattern is different from the H₁H₃ pattern that naturally occurs, and is also predicted by CHSalign_p, in the purine and guanine riboswitches.

In the first case, the score produced by CHSalign_u was very low (36.69), due to the different helical arrangements. This result shows the large conformational range of structural arrangements that the purine riboswitch can have, from naturally preferable arrangements (H₁H₃, as predicted by CHSalign_p) to unnatural arrangements (H₁H₂, as manually set by us). In the second case, CHSalign_u produced a high score of 179.68, which indicates the possibility that two different RNA structures can have very similar helical arrangements when we manually set these arrangements. Thus, CHSalign_u could help investigate the structural diversity of all possible helical arrangements, including natural or hypothetical conformations for two RNA 2D structures.
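The seven scores quoted in this subsection can be collected to make the separation between the two groups explicit. The Python sketch below is only a tabulation of the numbers in the text; the tuple layout and the `same_pattern` flag are illustrative conventions, and no score threshold is implied by the paper.

```python
# (structure 1, structure 2, program, same_pattern, alignment score)
# same_pattern is True when the two junctions being aligned carry the same
# number of branches and the same coaxial stacking pattern.
results = [
    ("2GIS", "4B5R", "CHSalign_p", True, 252.61),
    ("2G9C", "3RKF", "CHSalign_p", True, 179.68),
    ("2GDI", "3D2G", "CHSalign_p", True, 191.06),
    ("2GIS", "2G9C", "CHSalign_p", False, 20.40),
    ("2G9C", "2GDI", "CHSalign_p", False, 13.65),
    ("2G9C", "2G9C", "CHSalign_u", False, 36.69),  # H1H3 vs. H1H2 annotation
    ("2G9C", "3RKF", "CHSalign_u", True, 179.68),  # both annotated H1H2
]

same_scores = [score for *_, same, score in results if same]
diff_scores = [score for *_, same, score in results if not same]

lowest_same = min(same_scores)
highest_diff = max(diff_scores)
print(lowest_same)   # 179.68
print(highest_diff)  # 36.69
```

In these seven tests, the lowest score among pairs with matching patterns is well above the highest score among pairs with differing patterns.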

Conclusions

We have presented a novel method (CHSalign) capable of producing an optimal alignment between two input RNA secondary (2D) structures with coaxial helical stacking (CHS), based on our previously developed Junction-Explorer [25] and RNAJAG [4]. The method is junction-aware, CHS-favored in the sense that it assigns a weight to the alignment of two RNA junctions with the same number of branches and the same coaxial helical stacking status while prohibiting the alignment of two junctions that do not have the same number of branches or the same coaxial helical stacking status. The method transforms each input RNA 2D structure to an ordered labeled tree, and employs dynamic programming techniques and a constrained tree matching algorithm to align the two input RNA 2D structures. CHSalign has two versions; CHSalign_u requires the user to manually annotate the coaxial stacking patterns in the input structures while CHSalign_p automatically predicts the coaxial stacking patterns in the input structures. Experimental results demonstrate that both versions outperform the existing alignment programs that do not take into account coaxial stacking configurations in the input RNA structures.

It has been observed that several functional RNA families such as tRNA, RNase P, and large ribosomal subunits have conserved structural features while having very diverse sequence patterns. RNA structure alignment tools such as CHSalign can help measure the structural similarity between these RNAs, even without sequence relevance in the RNAs. Similar RNA structural motifs are encountered on a variety of RNAs. While these motifs exist in different contexts, their functions are related. For instance, sarcin-ricin motifs often bind to proteins, and GNRA tetraloops act as receptors for RNA-RNA long-range interactions. Furthermore, examples of larger structure-function similarity are observed in the tRNA-like structure found in the transfer-messenger RNA (tmRNA), whose structure similarity with tRNA helps identify the functional role of tmRNAs to aid in translation via stalled ribosome rescue. Other tRNA-like structures found in viruses such as HIV and internal ribosome entry sites (IRES) mimic the 3D “L-shape” of tRNAs to take control of the host ribosome.

As our knowledge of RNA structure progresses, more sophisticated secondary structure alignment tools are required that allow for comparison of tertiary motifs such as coaxial stacking patterns. Indeed, experimental probing techniques such as RNA SHAPE chemistry, SAXS, NMR, and fluorescence resonance energy transfer (FRET) can often provide sufficient information to determine coaxial helical stacking [47,68,69]. Because the structure and function of RNA are highly interrelated, a tool that addresses coaxial stacking patterns can assist the comparison of structures with high functional relevance.

CHSalign is the first tool that can compute an RNA secondary structure alignment in the presence of coaxial helical stacking. When coaxial stacking configurations are available from experimental data such as FRET, NMR or SAXS data, the user can input such information to aid in the alignment. However, if no knowledge of coaxial stacking configurations is available, CHSalign can infer this information by employing Junction-Explorer [25], which predicts coaxial helical stacking with 81% accuracy.

Existing RNA secondary structure alignment tools [37,39] do not distinguish between structural elements such as helices, junctions and hairpin loops. However, each element type has its own properties and function. In contrast, CHSalign only matches structural elements of the same type. Furthermore, the tool imposes a constraint that a junction of RNA1 can be aligned with a junction of RNA2 only if they have the same number of branches and the same coaxial helical stacking status. We also implemented an extension of CHSalign that relaxes this constraint. This extension is able to align two junctions with different numbers of branches and simply requires that coaxially stacked helices be aligned with coaxially stacked helices when matching a p-way junction with a q-way junction, where p ≠ q. The source code of both CHSalign and its extension can be downloaded from the web server site.

Supporting Information

S1 File. Results obtained by aligning five pairs of riboswitches from Table 2 using CHSalign_p.

For each pair of riboswitches, the input and output of the CHSalign_p program are displayed. The input includes two riboswitches in bpseq format. CHSalign_p invokes Junction-Explorer to predict coaxial helical stacking (CHS) motifs in the input molecules, and aligns the predicted structures. The output includes the CPU time spent in performing the alignment, the alignment score and alignment details.

https://doi.org/10.1371/journal.pone.0147097.s001

(TXT)

S2 File. Results obtained by aligning two pairs of riboswitches from Table 2 using CHSalign_u.

For each pair of riboswitches, the input and output of the CHSalign_u program are displayed. The input includes two riboswitches in bpseq format along with CHS motifs annotated manually by the user. The output includes the CPU time spent in performing the alignment, the alignment score and alignment details.

https://doi.org/10.1371/journal.pone.0147097.s002

(TXT)

Acknowledgments

We would like to acknowledge Drs. Bruce Shapiro and Kaizhong Zhang for helpful conversations. We also thank Dongrong Wen and Akhila Nagula for their contributions in the early stage of this work. J.W. acknowledges support of this work by the National Science Foundation Grant IIS-0707571. T.S. acknowledges support of this work by the National Institute of General Medical Sciences Grants GM100469 and GM081410. Computing resources, utilized by the NYU team, of the Computational Center for Nanotechnology Innovations and Empire State Development's Division of Science, Technology and Innovation [through National Science Foundation (NSF) Group Award TG-MCB080036N] and the New York Center for Computational Sciences at Stony Brook University/Brookhaven National Laboratory (supported by Department of Energy Grant DE-AC02-98CH10886 and the State of New York) are gratefully acknowledged.

Author Contributions

Conceived and designed the experiments: TS JW. Performed the experiments: LH YS NK. Analyzed the data: CL NK. Contributed reagents/materials/analysis tools: LH YS CL. Wrote the paper: LH YS NK CL JW TS.

References

1. Brimacombe R, Stiege W (1985) Structure and function of ribosomal RNA. Biochem J 229: 1–17. pmid:3899100
2. Woychik NA, Hampsey M (2002) The RNA polymerase II machinery: structure illuminates function. Cell 108: 453–463. pmid:11909517
3. Zhong X, Tao X, Stombaugh J, Leontis N, Ding B (2007) Tertiary structure and function of an RNA motif required for plant vascular entry to initiate systemic trafficking. EMBO J 26: 3836–3846. pmid:17660743
4. Laing C, Jung S, Kim N, Elmetwaly S, Zahran M, Schlick T (2013) Predicting helical topologies in RNA junctions as tree graphs. PLoS One 8: e71947. pmid:23991010
5. Bindewald E, Hayes R, Yingling YG, Kasprzak W, Shapiro BA (2008) RNAJunction: a database of RNA junctions and kissing loops for three-dimensional structural analysis and nanodesign. Nucleic Acids Res 36: D392–397. pmid:17947325
6. Ouellet J, Melcher S, Iqbal A, Ding Y, Lilley DM (2010) Structure of the three-way helical junction of the hepatitis C virus IRES element. RNA 16: 1597–1609. pmid:20581129
7. Lilley DM, Clegg RM, Diekmann S, Seeman NC, Von Kitzing E, Hagerman PJ (1995) A nomenclature of junctions and branchpoints in nucleic acids. Nucleic Acids Res 23: 3363–3364. pmid:16617514
8. Liu L, Chen SJ (2012) Coarse-grained prediction of RNA loop structures. PLoS One 7: e48460. pmid:23144887
9. Popovic M, Nelson JD, Schroeder KT, Greenbaum NL (2012) Impact of base pair identity 5' to the spliceosomal branch site adenosine on branch site conformation. RNA 18: 2093–2103. pmid:23002123
10. Yuan F, Griffin L, Phelps L, Buschmann V, Weston K, Greenbaum NL (2007) Use of a novel Forster resonance energy transfer method to identify locations of site-bound metal ions in the U2-U6 snRNA complex. Nucleic Acids Res 35: 2833–2845. pmid:17430967
11. Scott WG, Murray JB, Arnold JR, Stoddard BL, Klug A (1996) Capturing the structure of a catalytic RNA intermediate: the hammerhead ribozyme. Science 274: 2065–2069. pmid:8953035
12. Batey RT, Gilbert SD, Montange RK (2004) Structure of a natural guanine-responsive riboswitch complexed with the metabolite hypoxanthine. Nature 432: 411–415. pmid:15549109
13. Kieft JS, Zhou K, Grech A, Jubin R, Doudna JA (2002) Crystal structure of an RNA tertiary domain essential to HCV IRES-mediated translation initiation. Nat Struct Biol 9: 370–374. pmid:11927953
14. Holbrook SR (2008) Structural principles from large RNAs. Annu Rev Biophys 37: 445–464. pmid:18573090
15. Laing C, Schlick T (2009) Analysis of four-way junctions in RNA structures. J Mol Biol 390: 547–559. pmid:19445952
16. Cohen A, Bocobza S, Veksler I, Gabdank I, Barash D, Aharoni A, et al. (2008) Computational identification of three-way junctions in folded RNAs: a case study in Arabidopsis. In Silico Biol 8: 105–120. pmid:18928199
17. Xin Y, Laing C, Leontis NB, Schlick T (2008) Annotation of tertiary interactions in RNA structures reveals variations and correlations. RNA 14: 2465–2477. pmid:18957492
18. Kim SH, Sussman JL, Suddath FL, Quigley GJ, McPherson A, Wang AH, et al. (1974) The general structure of transfer RNA molecules. Proc Natl Acad Sci U S A 71: 4970–4974. pmid:4612535
19. Butcher SE, Pyle AM (2011) The molecular interactions that stabilize RNA tertiary structure: RNA motifs, patterns, and networks. Acc Chem Res 44: 1302–1311. pmid:21899297
20. Byron K, Laing C, Wen D, Wang JTL (2013) A computational approach to finding RNA tertiary motifs in genomic sequences: a case study. Recent Pat DNA Gene Seq 7: 115–122. pmid:22974261
21. Kim J, Walter AE, Turner DH (1996) Thermodynamics of coaxially stacked helixes with GA and CC mismatches. Biochemistry 35: 13753–13761. pmid:8901517
22. Aalberts DP, Nandagopal N (2010) A two-length-scale polymer theory for RNA loop free energies and helix stacking. RNA 16: 1350–1355. pmid:20504955
23. Shapiro BA, Zhang K (1990) Comparing multiple RNA secondary structures using tree comparisons. Comput Appl Biosci 6: 309–318. pmid:1701685
24. Laing C, Jung S, Iqbal A, Schlick T (2009) Tertiary motifs revealed in analyses of higher-order RNA junctions. J Mol Biol 393: 67–82. pmid:19660472
25. Laing C, Wen D, Wang JTL, Schlick T (2012) Predicting coaxial helical stacking in RNA junctions. Nucleic Acids Res 40: 487–498. pmid:21917853
26. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. (2000) The Protein Data Bank. Nucleic Acids Res 28: 235–242. pmid:10592235
27. Ferre F, Ponty Y, Lorenz WA, Clote P (2007) DIAL: a web server for the pairwise alignment of two RNA three-dimensional structures using nucleotide, dihedral angle and base-pairing similarities. Nucleic Acids Res 35: W659–668. pmid:17567620
28. Capriotti E, Marti-Renom MA (2009) SARA: a server for function annotation of RNA structures. Nucleic Acids Res 37: W260–265. pmid:19483098
29. Chang YF, Huang YL, Lu CL (2008) SARSA: a web tool for structural alignment of RNA using a structural alphabet. Nucleic Acids Res 36: W19–24. pmid:18502774
30. Wang CW, Chen KT, Lu CL (2010) iPARTS: an improved tool of pairwise alignment of RNA tertiary structures. Nucleic Acids Res 38: W340–347. pmid:20507908
31. Hoksza D, Svozil D (2012) Efficient RNA pairwise structure comparison by SETTER method. Bioinformatics 28: 1858–1864. pmid:22611129
32. Sarver M, Zirbel CL, Stombaugh J, Mokdad A, Leontis NB (2008) FR3D: finding local and composite recurrent structural motifs in RNA 3D structures. J Math Biol 56: 215–252. pmid:17694311
33. Dror O, Nussinov R, Wolfson H (2005) ARTS: alignment of RNA tertiary structures. Bioinformatics 21 Suppl 2: ii47–53. pmid:16204124
34. Abraham M, Dror O, Nussinov R, Wolfson HJ (2008) Analysis and classification of RNA tertiary structures. RNA 14: 2274–2289. pmid:18824509
35. Rahrig RR, Leontis NB, Zirbel CL (2010) R3D Align: global pairwise alignment of RNA 3D structures using local superpositions. Bioinformatics 26: 2689–2697. pmid:20929913
36. He G, Steppi A, Laborde J, Srivastava A, Zhao P, Zhang J (2014) RASS: a web server for RNA alignment in the joint sequence-structure space. Nucleic Acids Res 42: W377–381. pmid:24831547
37. Liu J, Wang JTL, Hu J, Tian B (2005) A method for aligning RNA secondary structures and its application to RNA motif detection. BMC Bioinformatics 6: 89. pmid:15817128
38. Jiang T, Lin G, Ma B, Zhang K (2002) A general edit distance between RNA structures. J Comput Biol 9: 371–388. pmid:12015887
39. Hochsmann M, Voss B, Giegerich R (2004) Pure multiple RNA secondary structure alignments: a progressive profile approach. IEEE/ACM Trans Comput Biol Bioinform 1: 53–62. pmid:17048408
40. Chen S, Zhang K (2014) An improved algorithm for tree edit distance with applications for RNA secondary structure comparison. J Comb Optim 27: 778–797.
41. Havgaard JH, Torarinsson E, Gorodkin J (2007) Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix. PLoS Comput Biol 3: 1896–1908. pmid:17937495
42. Mathews DH, Turner DH (2002) Dynalign: an algorithm for finding the secondary structure common to two RNA sequences. J Mol Biol 317: 191–203. pmid:11902836
43. Sato K, Kato Y, Akutsu T, Asai K, Sakakibara Y (2012) DAFS: simultaneous aligning and folding of RNA sequences via dual decomposition. Bioinformatics 28: 3218–3224. pmid:23060618
44. Hamada M, Sato K, Kiryu H, Mituyama T, Asai K (2009) CentroidAlign: fast and accurate aligner for structured RNAs by maximizing expected sum-of-pairs score. Bioinformatics 25: 3236–3243. pmid:19808876
45. Meyer IM, Miklos I (2007) SimulFold: simultaneously inferring RNA structures including pseudoknots, alignments, and trees using a Bayesian MCMC framework. PLoS Comput Biol 3: e149. pmid:17696604
46. Tabei Y, Tsuda K, Kin T, Asai K (2006) SCARNA: fast and accurate structural alignment of RNA sequences by matching fixed-length stem fragments. Bioinformatics 22: 1723–1729. pmid:16690634
47. Walter F, Murchie AI, Duckett DR, Lilley DM (1998) Global structure of four-way RNA junctions studied using fluorescence resonance energy transfer. RNA 4: 719–728. pmid:9622130
48. Hernandez-Verdun D, Roussel P, Thiry M, Sirri V, Lafontaine DL (2010) The nucleolus: structure/function relationship in RNA metabolism. Wiley Interdiscip Rev RNA 1: 415–431. pmid:21956940
49. Bindewald E, Wendeler M, Legiewicz M, Bona MK, Wang Y, Pritt MJ, et al. (2011) Correlating SHAPE signatures with three-dimensional RNA structures. RNA 17: 1688–1696. pmid:21752927
50. Jiang T, Wang L, Zhang K (1995) Alignment of trees—an alternative to tree edit. Theoretical Computer Science 143: 137–148.
51. Wang L, Zhao J (2003) Parametric alignment of ordered trees. Bioinformatics 19: 2237–2245. pmid:14630652
52. Kim N, Laing C, Elmetwaly S, Jung S, Curuksu J, Schlick T (2014) Graph-based sampling for approximating global helical topologies of RNA. Proc Natl Acad Sci U S A 111: 4079–4084. pmid:24591615
53. Yang H, Jossinet F, Leontis N, Chen L, Westbrook J, Berman H, et al. (2003) Tools for the automatic identification and classification of RNA base pairs. Nucleic Acids Research 31: 3450–3460. pmid:12824344
54. Shang L, Xu W, Ozer S, Gutell RR (2012) Structural constraints identified with covariation analysis in ribosomal RNA. PLoS One 7: e39383. pmid:22724009
55. Klein RJ, Eddy SR (2003) RSEARCH: finding homologs of single structured RNA sequences. BMC Bioinformatics 4: 44. pmid:14499004
56. Gardner PP, Wilm A, Washietl S (2005) A benchmark of multiple sequence alignment programs upon structural RNAs. Nucleic Acids Res 33: 2433–2439. pmid:15860779
57. Burge SW, Daub J, Eberhardt R, Tate J, Barquist L, Nawrocki EP, et al. (2013) Rfam 11.0: 10 years of RNA families. Nucleic Acids Res 41: D226–232. pmid:23125362
58. Andronescu M, Bereg V, Hoos HH, Condon A (2008) RNA STRAND: the RNA secondary structure and statistical analysis database. BMC Bioinformatics 9: 340. pmid:18700982
59. Smit S, Rother K, Heringa J, Knight R (2008) From knotted to nested RNA structures: a variety of computational methods for pseudoknot removal. RNA 14: 410–416. pmid:18230758
60. Goody TA, Lilley DM, Norman DG (2004) The chirality of a four-way helical junction in RNA. J Am Chem Soc 126: 4126–4127. pmid:15053600
61. Lafontaine DA, Norman DG, Lilley DM (2001) Structure, folding and activity of the VS ribozyme: importance of the 2-3-6 helical junction. EMBO J 20: 1415–1424. pmid:11250907
62. Lescoute A, Westhof E (2006) Topology of three-way junctions in folded RNAs. RNA 12: 83–93. pmid:16373494
63. Zirbel CL, Roll J, Sweeney BA, Petrov AI, Pirrung M, Leontis NB (2015) Identifying novel sequence variants of RNA 3D motifs. Nucleic Acids Res 43: 7504–7520. pmid:26130723
64. Petrov AI, Zirbel CL, Leontis NB (2013) Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas. RNA 19: 1327–1340. pmid:23970545
65. Petrov AI, Zirbel CL, Leontis NB (2011) WebFR3D—a server for finding, aligning and analyzing recurrent RNA 3D motifs. Nucleic Acids Res 39: W50–55. pmid:21515634
66. Zirbel CL, Sponer JE, Sponer J, Stombaugh J, Leontis NB (2009) Classification and energetics of the base-phosphate interactions in RNA. Nucleic Acids Res 37: 4898–4918. pmid:19528080
67. Kim N, Zahran M, Schlick T (2015) Computational prediction of riboswitch tertiary structures including pseudoknots by RAGTOP: a hierarchical graph sampling approach. Methods Enzymol 553: 115–135. pmid:25726463
68. McGinnis JL, Dunkle JA, Cate JH, Weeks KM (2012) The mechanisms of RNA SHAPE chemistry. J Am Chem Soc 134: 6617–6624. pmid:22475022
69. Yang S, Parisien M, Major F, Roux B (2010) RNA structure determination using SAXS data. J Phys Chem B 114: 10039–10048. pmid:20684627

