Fig 1.
(A) Schematic representation of the P-site offset. Two offsets can be defined, one for each end of the read. (B) Flowchart of the basic steps of riboWaltz, with its input requirements and outputs. (C) Example of a ribosome occupancy profile obtained by aligning the 5’ and 3’ ends of reads around the start codon (read length, 28 nucleotides), superimposed on schematic representations of a transcript, a ribosome positioned on the translation initiation site (TIS) and the set of reads used to generate the profiles.
Fig 2.
(A) Distribution of read lengths. (B) Left, percentage of P-sites in the 5’ UTR, CDS and 3’ UTR of mRNAs from ribosome profiling data. Right, percentage of mRNA sequence length covered by each region. (C) Percentage of P-sites in the three frames along the 5’ UTR, CDS and 3’ UTR, stratified by read length. (D) Example of a meta-gene heatmap reporting the signal associated with the 5’ end (upper panel) and 3’ end (lower panel) of reads aligning around the start and stop codons, for different read lengths. (E) Codon usage analysis based on in-frame P-sites. The codon usage index is calculated as the frequency of in-frame P-sites along the coding sequence associated with each codon, normalized by the codon frequency in the sequences. The amino acids corresponding to the codons are displayed above each bar. All panels were obtained from ribosome profiling of whole mouse brain (GSE102318).
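The codon usage index described for panel (E) can be illustrated with a minimal Python sketch. This is not the riboWaltz implementation: the function name `codon_usage_index`, its inputs, and the exact normalization (P-site frequency divided by background codon frequency, both computed over the supplied sequences) are assumptions made for illustration.

```python
from collections import Counter


def codon_usage_index(psite_codons, cds_sequences):
    """Hypothetical sketch of a codon usage index.

    psite_codons:  codons on which in-frame P-sites fall (one entry per P-site).
    cds_sequences: coding sequences used to estimate background codon frequency.
    Returns {codon: P-site frequency / background codon frequency}.
    """
    psite_counts = Counter(psite_codons)

    # Count background codons by reading each CDS in frame 0,
    # ignoring any trailing nucleotides that do not form a full codon.
    background = Counter()
    for seq in cds_sequences:
        usable = len(seq) - len(seq) % 3
        for i in range(0, usable, 3):
            background[seq[i:i + 3]] += 1

    total_psites = sum(psite_counts.values())
    total_codons = sum(background.values())
    if total_psites == 0 or total_codons == 0:
        raise ValueError("need at least one P-site and one full codon")

    index = {}
    for codon, n in background.items():
        psite_freq = psite_counts.get(codon, 0) / total_psites
        codon_freq = n / total_codons
        index[codon] = psite_freq / codon_freq
    return index
```

Under this assumed normalization, a value above 1 means a codon carries more in-frame P-sites than its abundance in the sequences alone would predict, and a value below 1 means fewer.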
Fig 3.
(A) Percentage of P-sites in the three frames (Periodicity score) along the 5’ UTR, CDS and 3’ UTR from ribosome profiling performed in mouse brain (GSE102318). Statistical significance from two-tailed Wilcoxon–Mann–Whitney tests comparing RiboProfiling and Plastid with riboWaltz is reported (P-value: ** < 0.01, *** < 0.001). (B) Meta-profiles showing the periodicity of ribosomes along transcripts at the genome-wide scale. The three meta-profiles are based on the P-site identification obtained with riboWaltz, RiboProfiling and Plastid. The shaded areas to the left of the start codon highlight a shift of the periodicity toward the 5’ UTR that is absent from the data analysed with riboWaltz. (C) Comparison between the codon usage index based on in-frame P-sites from riboWaltz and RiboProfiling (left panel) and between the codon usage index based on in-frame P-sites from riboWaltz and Plastid (right panel). Read lengths range from 19 to 38 nucleotides (see Table 1); the optimal P-site offset (PO) used in the correction step of riboWaltz is 16 nucleotides from the 3’ end.
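The percentage of P-sites in the three frames reported in panel (A) can be sketched as follows. This is an illustrative Python example, not the riboWaltz code: the function name `frame_percentages` and the convention that frame 0 is the frame of the first nucleotide of the start codon are assumptions.

```python
def frame_percentages(psite_positions, cds_start):
    """Hypothetical sketch: percentage of P-sites falling in frames 0, 1 and 2.

    psite_positions: transcript coordinates of the first P-site nucleotide.
    cds_start:       transcript coordinate of the first start-codon nucleotide.
    The frame is the distance from cds_start modulo 3 (assumed convention).
    """
    counts = [0, 0, 0]
    for pos in psite_positions:
        # Python's % always returns a non-negative value for a positive
        # modulus, so positions upstream of the start codon (5' UTR)
        # are assigned a frame consistently.
        counts[(pos - cds_start) % 3] += 1

    total = sum(counts)
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [100.0 * c / total for c in counts]
```

With this convention, the first value returned corresponds to the percentage of P-sites in frame 0, which is the quantity used as the Periodicity score along the coding sequence.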
Table 1.
Comparison of the P-site offsets identified for each read length by riboWaltz, RiboProfiling and Plastid in mouse (GSE102318).
Table 2.
Comparison between temporary and corrected P-site offsets identified by riboWaltz in mouse (GSE102318).
Fig 4.
(A) Comparison of the percentage of P-sites in frame 0 (Periodicity score) along the coding sequence and (B) comparison of the average TIS accuracy score, based on P-site identification by riboWaltz, RiboProfiling and Plastid. Both panels display results from 7 datasets (2 yeast, 3 mouse and 2 human), each represented by a dot. Statistical significance from paired one-tailed Wilcoxon–Mann–Whitney tests is shown (* P<0.05, ** P<0.01).
Table 3.
Summary and comparison of the percentage of P-sites in frame 0 along the coding sequence (Periodicity score) based on P-site identification by riboWaltz, RiboProfiling and Plastid.
Table 4.
Summary and comparison of the average TIS accuracy score based on P-site identification by riboWaltz, RiboProfiling and Plastid.