A Genome-Wide Identification Analysis of Small Regulatory RNAs in Mycobacterium tuberculosis by RNA-Seq and Conservation Analysis

doi:10.1371/journal.pone.0032723

Figure 1.

Outline of the bioinformatic pipeline.

(a) Construction of the effective target genome (ETG), in terms of both sequences and coordinates. (b) Construction of the two strand-specific reads maps. (c) Construction of the two strand-specific conservation maps. (d) Combination of the reads and conservation maps to identify putative sRNA-encoding regions. (e) Annotation of the putative sRNAs to assess their reliability.

Figure 2.

Reads maps construction.

The reads map (blue curve) is obtained by assembling all reads (sequences in red) that map uniquely and completely within the same IGR or AS region (sequence in black). The implemented BioPerl procedure merges the output of the NGS mappers with the T_IGRAScoord files.
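The construction of a reads map can be illustrated with a minimal Python sketch. It is not the BioPerl procedure described in the caption. It assumes that reads have already been filtered for unique mapping, that coordinates are 1-based and inclusive, and that the map is a per-base count of overlapping reads; the function name `build_reads_map` and its arguments are hypothetical.

```python
def build_reads_map(region_start, region_end, reads):
    """Per-base read coverage of one IGR or AS region.

    Assumptions (not taken from the paper): coordinates are 1-based and
    inclusive, and `reads` holds (start, end) pairs of uniquely mapped reads.
    """
    length = region_end - region_start + 1
    # Difference array: +1 where a read starts, -1 just after it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        # Keep only reads lying completely within the region.
        if start < region_start or end > region_end:
            continue
        diff[start - region_start] += 1
        diff[end - region_start + 1] -= 1
    # The running sum of the difference array gives the depth at each base.
    coverage = []
    depth = 0
    for delta in diff[:length]:
        depth += delta
        coverage.append(depth)
    return coverage


# Toy region 100..109 with four reads; two of them cross the region borders.
example_map = build_reads_map(100, 109, [(100, 104), (103, 107), (98, 102), (108, 112)])
```

In this toy example the reads (98, 102) and (108, 112) are discarded because they extend beyond the region, so only the two internal reads contribute to the map.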

Figure 3.

sRNA identification process.

For each IGR (sequence in black), the reads map (blue curve) and the conservation map (green curve) are superimposed. First, Type A candidates (highlighted in blue) are identified and extracted by testing the length constraints (conditions I and II) and requiring reads coverage above ExprT1 (dotted blue line). On the remaining portions of the IGRs, Type B candidates (highlighted in yellow) are identified and extracted by testing the length constraints (conditions I and II) and requiring, simultaneously, reads coverage above ExprT2 and conservation depth above ConsT2 (dotted and dashed yellow lines). Finally, Type C candidates (highlighted in green) are identified in the remaining IGRs on the basis of high sequence conservation (above the ConsT1 threshold, shown as a dotted green line).
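The three-pass selection described in this caption can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation. Conditions I and II are not defined in the caption, so they are represented here by a hypothetical minimum and maximum run length (`min_len`, `max_len`). The sketch also assumes that Type C candidates are subject to the same length test, and that a run failing the length test stays available to later passes. All function and parameter names are hypothetical.

```python
def find_runs(mask, min_len, max_len):
    """Return (start, end) index pairs of runs of True values in `mask`
    whose length lies within [min_len, max_len]."""
    runs = []
    start = None
    # A trailing False closes a run that reaches the end of the mask.
    for i, flag in enumerate(list(mask) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if min_len <= i - start <= max_len:
                runs.append((start, i - 1))
            start = None
    return runs


def classify_candidates(reads, cons, expr_t1, expr_t2, cons_t1, cons_t2,
                        min_len, max_len):
    """Label candidate regions of one IGR as Type A, B or C (0-based indices)."""
    n = len(reads)
    free = [True] * n  # positions not yet claimed by an earlier pass
    passes = [
        ("A", lambda i: reads[i] > expr_t1),
        ("B", lambda i: reads[i] > expr_t2 and cons[i] > cons_t2),
        ("C", lambda i: cons[i] > cons_t1),
    ]
    candidates = []
    for label, test in passes:
        mask = [free[i] and test(i) for i in range(n)]
        for start, end in find_runs(mask, min_len, max_len):
            candidates.append((label, start, end))
            # Extract the region so that later passes cannot reuse it.
            for i in range(start, end + 1):
                free[i] = False
    return candidates


# Toy maps for an 18-base IGR (values are invented for illustration).
toy_reads = [0, 0, 9, 9, 9, 9, 0, 0, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0]
toy_cons = [0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 5, 5, 0, 0, 8, 8, 8, 0]
toy_candidates = classify_candidates(toy_reads, toy_cons,
                                     expr_t1=8, expr_t2=3,
                                     cons_t1=7, cons_t2=4,
                                     min_len=3, max_len=10)
```

Running the passes in a fixed order and masking extracted positions mirrors the "on the remaining portions" wording of the caption: a position claimed as Type A can never be reported again as Type B or Type C.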
