Fig 1.

Secondary structure and module.

In (1) we show an RNA and its secondary structure with non-canonical interactions. Base pair interactions in blue are local (both nucleotides involved are in the same or in adjacent SSEs) while the ones in red are long range interactions (between two distant SSEs). The canonical base pair interactions are represented with double lines. We highlighted the loops in the structure with green dotted lines. Loops A and C are hairpins, loops D and E are interior loops, and loop B is a multi-loop. In (2) we show an instance of a module found in the RNA secondary structure in (1). On the right is the base pair pattern that characterizes this module and on the left is the sequence profile of this module (i.e. the nucleotide sequences of the corresponding parts of RNAs this module has been observed in). The first sequence in the profile, for instance, corresponds to the RNA displayed in (1).

Fig 2.

From 3D structure to directed edge-labelled graph.

In this figure we illustrate the transition from the 3D structure (a) to the RNA 2D structure graph (b) and finally to the directed edge-labelled graph (c), using a simple RNA structure. Each edge label of the directed edge-labelled graph is a pair whose first element represents the type of interaction (using the same symbols as in the RNA 2D structure graph) and whose second element denotes the local (blue) vs. long-range (red) property of the interaction (using the same colors as in the RNA 2D structure graph). Moreover, the set of edge labels forms a directed proper edge-coloring, as illustrated in the last panel (d), where each geometric type of interaction has been assigned a distinct color. Note that panel (d) only illustrates that the edge labels form a proper edge-coloring; our method does not actually replace the labels with colors.
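The labelling described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the node ids, the label strings (`B53`, `CWW`, `TSH`, ...) and the function name `is_proper_edge_coloring` are hypothetical, the toy graph is not meant to be biologically meaningful, and the check assumes that "directed proper edge-coloring" means no node has two outgoing edges, or two incoming edges, carrying the same label.

```python
# Toy directed edge-labelled graph (hypothetical data).
# Each edge is (source, target, (interaction_type, range_kind)).
EDGES = [
    (0, 1, ("B53", "local")),       # backbone link, 5' to 3'
    (1, 0, ("B35", "local")),       # same link seen from the other end
    (2, 3, ("B53", "local")),
    (3, 2, ("B35", "local")),
    (0, 3, ("CWW", "local")),       # symmetric interaction: same label both ways
    (3, 0, ("CWW", "local")),
    (1, 2, ("TSH", "long-range")),  # asymmetric interaction: mirrored label
    (2, 1, ("THS", "long-range")),
]


def is_proper_edge_coloring(edges):
    """Return True if no node has two outgoing edges, or two incoming
    edges, with the same label (the assumed definition, see lead-in)."""
    seen_out = set()  # (source, label) pairs already used
    seen_in = set()   # (target, label) pairs already used
    for src, dst, label in edges:
        if (src, label) in seen_out or (dst, label) in seen_in:
            return False
        seen_out.add((src, label))
        seen_in.add((dst, label))
    return True


print(is_proper_edge_coloring(EDGES))
```

Adding a second `("B53", "local")` edge leaving node 0 would make the check fail, which is the situation the labelling is meant to exclude.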

Fig 3.

Impact of proper edge-coloring on graph-matching.

This figure displays a portion of two graphs (G on the left and H on the right) in which the nodes 0 and a are already matched together. The next step is to match their neighbours. In the generic case, all permutations of the neighbours have to be tested. By contrast, in the example displayed, the colors of the edges leave only a single option to consider.
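The effect described here can be sketched as follows. The snippet is a hedged illustration only: the three-neighbour toy graphs, the color names and the helper `match_neighbours` are assumptions made for the example, and each graph is stored as a hypothetical dictionary mapping a node to `{edge label: neighbour}`, which is only possible because a proper edge-coloring makes each label unique around a node.

```python
from math import factorial

# Hypothetical adjacency: node -> {edge label: neighbour}.
G = {0: {"red": 1, "green": 2, "blue": 3}}
H = {"a": {"blue": "d", "red": "b", "green": "c"}}


def match_neighbours(g, h, g_node, h_node):
    """Pair the neighbours of two already-matched nodes by edge label."""
    pairs = []
    for label, g_nb in g[g_node].items():
        h_nb = h[h_node].get(label)  # at most one candidate per label
        if h_nb is not None:
            pairs.append((g_nb, h_nb))
    return pairs


pairs = match_neighbours(G, H, 0, "a")
# Without labels, every assignment of 3 neighbours to 3 neighbours
# would be a candidate.
generic_candidates = factorial(len(G[0]))
print(pairs, generic_candidates)
```

With labels, the lookup is a single pass over the edges of one node, whereas the unlabelled case would have to consider every permutation of the neighbours.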

Fig 4.

Illustration of the extension process.

This figure illustrates the extension process from a “starting point” (here ((g0, h0), (g0, h0)), in blue). We first consider the neighbors of g0 and h0 (in purple). Thanks to the proper edge-coloring (PEC), there is only one way to match them. We then consider the neighbors of g1 and h1 (in green). We match g5 and h5 but discover that their neighborhoods are not compatible. At this point the behaviours of the three algorithms differ. This discovery implies that the matching cannot be extended to cover all of G, so the Graph Isomorphism and Subgraph Isomorphism algorithms abandon it and move on to another “starting point”. The All Maximal Common Subgraphs algorithm, by contrast, takes note of this discrepancy and keeps extending the matching nevertheless. This extension outputs a maximal common subgraph of G and H, and a new branch is created to explore the alternative solution suggested by the discrepancy found.
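A rough sketch of this extension step is given below. It is a simplification under stated assumptions, not the published algorithm: graphs are hypothetical dictionaries `node -> {edge label: neighbour}` holding outgoing edges only, the function name `extend` and the flag `stop_on_conflict` are invented for the example, and the creation of new branches from recorded conflicts is deliberately left out.

```python
from collections import deque

# Hypothetical toy graphs: node -> {edge label: neighbour}.
G = {"g0": {"a": "g1"}, "g1": {"b": "g5"}, "g5": {"c": "g6"}, "g6": {}}
H = {"h0": {"a": "h1"}, "h1": {"b": "h5"}, "h5": {"d": "h7"}, "h7": {}}


def extend(g, h, start, stop_on_conflict):
    """Grow a matching from `start`, following equally labelled edges.

    stop_on_conflict=True mimics the (sub)graph isomorphism behaviour
    (abandon the starting point); False mimics the maximal common
    subgraph behaviour (record the conflict and keep extending).
    """
    g_to_h = {start[0]: start[1]}
    h_to_g = {start[1]: start[0]}
    conflicts = []
    queue = deque([start])
    while queue:
        g_node, h_node = queue.popleft()
        for label, g_nb in g.get(g_node, {}).items():
            h_nb = h.get(h_node, {}).get(label)
            consistent = (
                h_nb is not None
                and g_to_h.get(g_nb, h_nb) == h_nb
                and h_to_g.get(h_nb, g_nb) == g_nb
            )
            if not consistent:
                conflicts.append((g_node, h_node, label))
                if stop_on_conflict:
                    return None, conflicts
                continue
            if g_nb not in g_to_h:  # new pair: add it and explore later
                g_to_h[g_nb] = h_nb
                h_to_g[h_nb] = g_nb
                queue.append((g_nb, h_nb))
    return g_to_h, conflicts


matching, conflicts = extend(G, H, ("g0", "h0"), stop_on_conflict=False)
strict_matching, _ = extend(G, H, ("g0", "h0"), stop_on_conflict=True)
print(matching, conflicts, strict_matching)
```

In this toy case both modes match g5 with h5 and then find that the edge labelled `c` has no counterpart in H; the strict mode gives up, while the permissive mode keeps the three matched pairs and reports the conflict.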

Fig 5.

Exploration tree with backtracking.

This figure displays the exploration tree, which represents a posteriori the relation between the different branches created. In this tree, the root is a starting point (i.e. the nodes that are already matched at the start of an exploration) and each leaf is a different maximal common subgraph. Each path from the root to a leaf describes an exploration. For instance, the node (14,20) of the exploration tree corresponds to the action of matching node 14 of G to node 20 of H. All the leaves in the right subtree have matched 14 to 20 and all the ones in the left subtree have not. Note that only the nodes with a left child are represented; all other nodes have been collapsed since they bear no information about the exploration process. The first exploration always produces the rightmost maximal common subgraph. In this example, the first exploration encountered two conflicts, and the algorithm thus produced two new branches, which were instructed not to add (24,26) and not to add (14,20), respectively. The first of the two produced another maximal common subgraph without any trouble, but the second encountered another conflict, and so on.

Fig 6.

Simplified display of the full pipeline.

The RNA 2D structure graphs given as input are pre-processed for the sake of optimization. Each pair of graphs in the pre-processed data is then given to the maximal common subgraphs algorithm as input, and the output is post-processed into partial sets of RINs. All partial sets of RINs are finally merged into the complete set of RINs, which is the output of the whole pipeline.
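The overall flow can be sketched as a loop over pairs of graphs followed by a merge. Everything concrete in the snippet is a placeholder: `run_pipeline` and its three callables are hypothetical names, graphs are reduced to sets of edge names, and a plain set intersection stands in for the maximal common subgraphs algorithm. Whether a graph is also compared with itself is not stated in the caption; the sketch assumes only distinct pairs.

```python
from itertools import combinations


def run_pipeline(graphs, preprocess, common_subgraphs, postprocess):
    """Pre-process, compare every pair, post-process, then merge."""
    prepared = [preprocess(g) for g in graphs]
    complete = set()
    for g, h in combinations(prepared, 2):  # each unordered pair once
        partial = postprocess(common_subgraphs(g, h))
        complete |= partial  # merge the partial set into the result
    return complete


# Toy stand-ins (assumptions, not the real steps).
def preprocess(graph):
    return frozenset(graph)


def common_subgraphs(g, h):
    return [g & h]  # placeholder for the real algorithm


def postprocess(subgraphs):
    return {s for s in subgraphs if s}  # drop empty results


graphs = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "d", "e"}]
result = run_pipeline(graphs, preprocess, common_subgraphs, postprocess)
print(sorted(sorted(s) for s in result))
```

Storing the partial results in sets makes the final merge a union, so a structure found in several pairs of graphs is kept only once.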

Fig 7.

Examples of structures to illustrate the three RIN classes.

The three graphs displayed inside the Venn diagram are subgraphs of the structure in Fig 1, with the same SSE annotations (SSEs D, C and E shown as colored areas). Graph #1 is valid for all three classes. Graph #2 spans 3 SSEs and so cannot be a valid RINabc. Graph #3 does not contain long-range interactions and is thus only valid for class RINa.

Table 1.

Rules and RIN classes.

Summary of the relation between the rules and the three RIN classes.

Fig 8.

Distribution of RINs in Dataset 2.92.

Numbers of distinct RINs (in blue) and of all their occurrences (in green), by the number of SSEs they span, in Dataset 2.92.

Table 2.

RINab and variation in SSE span.

For each RINab we compute how the number of SSEs covered varies between its occurrences. A value of 0 means that all occurrences span the same number of SSEs, while ±1 (resp. ±2) means that the RINab can span two (resp. three) different numbers of SSEs.
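One way to read this definition is sketched below. The function name `sse_span_variation` and the sample counts are hypothetical, and the snippet assumes that the reported value is the number of distinct SSE counts observed minus one; the caption would also be compatible with other readings, such as the difference between the largest and smallest count.

```python
def sse_span_variation(sse_counts):
    """Variation of the SSE span over the occurrences of one RIN.

    `sse_counts` holds the number of SSEs spanned by each occurrence.
    Assumed reading: number of distinct values, minus one.
    """
    return len(set(sse_counts)) - 1


same_span = sse_span_variation([2, 2, 2])   # all occurrences span 2 SSEs
two_spans = sse_span_variation([2, 3, 2])   # occurrences span 2 or 3 SSEs
three_spans = sse_span_variation([2, 3, 4])
print(same_span, two_spans, three_spans)
```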

Fig 9.

Distribution of RINs in Dataset 2.92.

Numbers of distinct RINs (in red) and of all their occurrences (in rose), by the number of long-range interactions they contain, in Dataset 2.92.

Fig 10.

Distribution of RINs.

Numbers of distinct RINs (in blue) and of all their occurrences (in green), by the number of SSEs they span, in Dataset 2.92.

Table 3.

Variation in the number of SSEs over the occurrences of the same RINa.

(Cf. Table 2.) These numbers show that variation in the number of SSEs among the occurrences of a given RINa is both uncommon and limited, slightly more so than for RINab (82% of RINa with no variation vs 78% of RINab).

Table 4.

Summary of numbers of unique RINs found in the different classes with the total numbers of occurrences.

Please note that this table also displays the numbers for the RINa class in Dataset 3.137 that we will present in section 3.4.

Table 5.

Runtimes over 20 CPUs.

This table displays the runtime of the previous method (CaRNAval) and of our method (other rows) for the different classes of structures extracted. The values were measured with the Linux time command and are real times, i.e. the wall-clock time elapsed between the start and the end of the execution. All runs were performed on the same machine.

Table 6.

Number of occurrences found in Dataset 2.92 and Dataset 3.137 for 5 structures of interest.

The 5 structures of interest are denoted using both their name in the literature (first column) and their ID in our database (second column). Note that this is the same ID as the one displayed in CaRNAval.

Fig 11.

Distribution of RINa in Dataset 3.137.

Numbers of distinct RINa (in red) and of all their occurrences (in rose), by the number of long-range interactions they contain, in Dataset 3.137.

Fig 12.

Distribution of RINa in Dataset 3.137.

Numbers of distinct RINa (in blue) and of all their occurrences (in green), by the number of SSEs they span, in Dataset 3.137.

Fig 13.

The 3 largest RINa in their contexts.

The figure displays three 3D structures of ribosomal RNAs: 4Y4O (chain: 2A), 5J7L (chain: DA) and 6SPB (chain: A). The colored parts correspond to the 3 largest RINa found in Dataset 3.137: RINa#1984 in red, RINa#1983 in cyan and RINa#1982 in lime green. The overlap of two RINa is colored in indigo. Additional information about these RINa and their overlap is provided in Table 7.

Table 7.

Additional information on the 3 largest RINa found in Dataset 3.137.

The colors correspond to the ones used in Fig 13. The values for the overlaps correspond to the number of nodes shared between the RINa. The RNA chains are denoted using the name of the file (e.g. 4Y4O) plus the name of the chain (e.g. 2A).

Fig 14.

Kink-turn found in Dataset 3.137 with our method.
