On the Normalization of the Minimum Free Energy of RNAs by Sequence Length

doi:10.1371/journal.pone.0113380

Figure 1.

MFE and AMFE versus sequence length.

For each sequence length, 1000 randomly shuffled sequences containing exactly equal frequencies of the four nucleotides were simulated. The mean values of the MFE (open circles) and AMFE (closed circles) of the shuffled sequences are plotted versus the sequence length. Vertical bars indicate standard deviations (N = 1000). MFE was computed by RNAfold using default parameters.
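
The sampling described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the length is assumed to be a multiple of four so that exact equimolarity is possible, and AMFE is assumed to follow the common definition of MFE per 100 nt. The MFE itself would come from RNAfold, which is not invoked here.

```python
import random


def shuffled_equimolar_sequence(length, rng):
    """Return a shuffled RNA sequence with exactly length/4 of each of A, C, G, U.

    Hypothetical helper; assumes `length` is a multiple of 4.
    """
    if length % 4 != 0:
        raise ValueError("length must be a multiple of 4 for exact equimolar composition")
    bases = list("ACGU" * (length // 4))
    rng.shuffle(bases)
    return "".join(bases)


def amfe(mfe, length):
    """Adjusted MFE, assumed here to mean MFE normalized to 100 nt."""
    return 100.0 * mfe / length


# 1000 shuffled sequences of one length, as in the caption (seed chosen arbitrarily).
rng = random.Random(0)
sequences = [shuffled_equimolar_sequence(100, rng) for _ in range(1000)]
```

Each sequence would then be folded to obtain its MFE, for example with `RNA.fold(seq)` from the ViennaRNA Python bindings if they are installed, and the mean and standard deviation taken per length.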

Figure 2.

Free energy contributions of RNA structural elements.

The free energy contributions of the different structural elements calculated by Quickfold are plotted versus sequence length: external loop (closed diamonds), hairpin loop (open circles), helix (closed circles), bulged loop (X), multi-loop (open squares), and interior loop (plus). The upper panel shows the contributions of structural elements to MFE and the lower panel the contributions to AMFE.

Figure 3.

Minimum folding energy of randomly shuffled sequences.

MFE (panel A) and AMFE (panel C) versus length at different GC-content: 20%, 40%, 50%, 60%, and 80%. MFE (panel B) and AMFE (panel D) versus GC-content for different sequence lengths: 20 nt, 100 nt, 200 nt, 300 nt, 400 nt, 500 nt, and 600 nt. Vertical bars indicate standard deviations (N = 100).
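
Shuffled sequences at a fixed GC-content, as used in this figure, could be generated along these lines. This is a sketch under stated assumptions: the helper name is hypothetical, and G and C (and likewise A and U) are assumed to be split as evenly as possible, which the caption does not specify.

```python
import random


def shuffled_sequence_with_gc(length, gc_fraction, rng):
    """Return a shuffled RNA sequence whose GC-content is round(length * gc_fraction) / length.

    Assumption: G/C and A/U are each divided as evenly as possible.
    """
    n_gc = round(length * gc_fraction)
    n_au = length - n_gc
    n_g = n_gc // 2
    n_c = n_gc - n_g
    n_a = n_au // 2
    n_u = n_au - n_a
    bases = list("G" * n_g + "C" * n_c + "A" * n_a + "U" * n_u)
    rng.shuffle(bases)
    return "".join(bases)


def gc_content(seq):
    """Fraction of G and C residues in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# 100 shufflings per (length, GC-content) combination, matching N = 100 in the caption.
rng = random.Random(0)
batch = [shuffled_sequence_with_gc(200, 0.40, rng) for _ in range(100)]
```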

Figure 4.

Residual plot from the linear fit of MFE versus length.

Residual plot of the linear regression analysis of MFE versus sequence length. The MFE assigned to each length corresponds to the mean value of 1000 shuffled sequences with exact equimolar ratios of A, C, G, and U. Residuals are the differences between the computed MFEs and the corresponding values that are predicted by a linear regression analysis of MFEs with length.
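
The residuals defined in this caption can be computed with an ordinary least-squares fit. The sketch below uses plain Python and placeholder numbers that are not taken from the paper; the function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_residuals(xs, ys):
    """Observed value minus the value predicted by the linear fit, per point."""
    slope, intercept = linear_fit(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Placeholder data for illustration only (not values from the paper):
# sequence lengths in nt and made-up mean MFEs in kcal/mol.
lengths = [40, 60, 80, 100, 120]
mean_mfes = [-5.0, -10.5, -16.5, -22.0, -28.5]
residual_values = fit_residuals(lengths, mean_mfes)
```

A systematic trend in the residuals against length, rather than scatter around zero, would indicate that a straight line does not fully describe the dependence of MFE on length.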

Figure 5.

MFEden versus length.

Plot of the mean MFEden versus length for shuffled sequences with GC-content of 20%, 40%, 50%, 60% and 80%, and for sequence lengths ranging from 40 to 600 nt in steps of 20 nt. Each point corresponds to the mean value of 100 shufflings. The lines connect MFEden values with the same GC-content. Vertical bars indicate standard deviations (N = 100).

Figure 6.

MFEden and AMFE versus length.

Comparison of MFEden (black points) and AMFE (grey open circles) for shuffled sequences with GC-content of 20%, 40%, 50%, 60% and 80%, and for sequence lengths ranging from 40 to 600 nt in steps of 20 nt. Each point corresponds to the mean value of 100 shufflings. The lines connect values with the same GC-content.

Figure 7.

MFEden of human CDSs and pre-miRNA.

MFEden values of CDSs (red circles) and pre-miRNAs (blue circles) are plotted versus sequence length (panel A) and GC-content (panel B). Black symbols indicate the mean MFEden values computed from shuffled sequences with GC-content of 20% (circle), 40% (plus), 50% (square), 60% (×), and 80% (triangle). A horizontal broken line indicates the MFEden level expected for the genomic GC-content.

Figure 8.

MFEden of human RNA families.

The MFEden of the functional RNA families SRP RNAs (black points), U6 snRNAs (blue squares), RSV RNAs (red Xs), and H/ACA box RNAs (green triangles) is plotted versus the sequence length (panel A) and the GC-content (panel B). Orange points indicate the 17 human SRP seed sequences of the Rfam database.

Figure 9.

MFEden of 14 human functional RNA families.

Bar plot showing the mean MFEden of 14 human functional RNA families (grey bars) compared with the mean MFEden of shuffled sequences with GC-content of 20%, 40%, 50%, 60% and 80% (white bars), and with the mean MFEden of 2400 randomly selected 100-nt genomic sequences and the MFEden expected for the genomic GC-content (black bars). The vertical bars indicate the standard errors of the means.
