Fig 1.
Sketch of the computation of the three parameters (step height, step factor, and enrichment factor) required for TSS prediction.
The yellow line shows the enriched profile, with an increased expression value upstream of the annotated gene.
Fig 2.
Classification of TSS based on the distance to annotated genes as defined by [1].
Primary and secondary TSS are located upstream of annotated genes, where secondary TSS show a lower enriched expression signal compared to the respective primary TSS. Internal TSS are located within the genes themselves, while antisense TSS are located on the antisense strand close to a gene (within 150 bp) or within the gene itself. Lastly, all other identified TSS are called orphan TSS.
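The classification rules in this caption can be sketched as a small function. This is a minimal illustration, not the TSSpredator implementation: the `Gene` record, the function names, and the `upstream_window` of 300 bp are assumptions made for the example (the caption only states the 150 bp antisense distance), and the rule that the upstream TSS with the strongest enriched signal becomes the primary one is inferred from the caption's wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    start: int   # 1-based, inclusive, start <= end
    end: int
    strand: str  # "+" or "-"


def classify_tss(pos, strand, genes, upstream_window=300, antisense_window=150):
    """Return the set of labels for one TSS relative to annotated genes.

    upstream_window is a hypothetical value; antisense_window follows the
    150 bp stated in the caption.
    """
    labels = set()
    for g in genes:
        if g.strand == strand:
            if g.start <= pos <= g.end:
                labels.add("internal")
            else:
                # Distance from the TSS to the gene start, measured in the
                # direction of transcription.
                dist = g.start - pos if strand == "+" else pos - g.end
                if 0 < dist <= upstream_window:
                    labels.add("upstream")
        elif g.start - antisense_window <= pos <= g.end + antisense_window:
            labels.add("antisense")
    return labels or {"orphan"}


def split_primary_secondary(upstream_tss):
    """Label the upstream TSS of a single gene as primary or secondary.

    upstream_tss: list of (position, enriched_signal) tuples. The TSS with
    the strongest signal is treated as primary (assumption of this sketch).
    """
    if not upstream_tss:
        return {}
    primary_pos = max(upstream_tss, key=lambda t: t[1])[0]
    return {
        pos: ("primary" if pos == primary_pos else "secondary")
        for pos, _ in upstream_tss
    }
```

A TSS can carry more than one label in this sketch, for example internal to one gene and antisense to another; only a TSS without any label is reported as orphan.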
Fig 3.
Requirements identified for the improvement of the TSS prediction workflow.
Each of them is addressed by the implementation of TSSpredator-Web.
Fig 4.
Sketch of different views of the genome browser of TSSpredator-Web.
The top section (aggregated view) displays the aggregated stacked bar chart. Each bar shows the count of TSS of a specific class in the respective genomic region bin. Below, the gene track provides hints for the location of annotated genes. The bottom section illustrates how the plot changes once the visible region falls below the 50,000 bp threshold (detailed view). Individual TSS locations are represented by colored glyphs, and genes are shown in their entirety with their corresponding name or locus_tag. The visualization also includes expression data from both control (in gray) and enriched libraries (in yellow), along with a track that illustrates the upstream region of a TSS. For simplicity, only 10 bp of the upstream region are shown in this representation instead of the actual 50 bp.
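The switch between the two views and the per-bin class counts of the aggregated view can be sketched as follows. This is an illustrative approximation only: the function names, the `bin_size` parameter, and the choice of fixed-width bins are assumptions, while the 50,000 bp threshold is taken from the caption.

```python
from collections import Counter, defaultdict

DETAIL_THRESHOLD_BP = 50_000  # threshold stated in the caption


def view_mode(visible_start, visible_end):
    """Choose the view from the width of the visible genomic region."""
    width = visible_end - visible_start
    return "detailed" if width < DETAIL_THRESHOLD_BP else "aggregated"


def bin_tss_counts(tss, bin_size):
    """Count TSS per class in fixed-width bins (hypothetical binning scheme).

    tss: iterable of (position, tss_class) tuples.
    Returns {bin_start: {tss_class: count}}, ordered by bin start; these are
    the values a stacked bar chart would display.
    """
    bins = defaultdict(Counter)
    for pos, tss_class in tss:
        bin_start = (pos // bin_size) * bin_size
        bins[bin_start][tss_class] += 1
    return {start: dict(counts) for start, counts in sorted(bins.items())}
```

Each inner dictionary corresponds to one stacked bar, with one segment per TSS class.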
Fig 5.
Visual representation of the two modes of the genome browser to show data of multiple experiments.
Each experiment consists of one component per strand, differing in the orientation of the data (for example, the reverse strand is flipped vertically). The single view mode (top section) groups the components by experiment, allowing a direct exploration within each experiment. In contrast, the aligned view (bottom section) groups the visualization components vertically by strand, which makes comparisons across experiments easier. Regardless of the chosen view, a synced crossline is shown on hover. For simplicity, this line is only shown in the single view in this figure.
Fig 6.
Analysis of the overall distribution of TSS across conditions, classes, and locations in the genome for data collected for E. coli under four different conditions (one control and three antibiotic treatments).
(a) UpSet plot showing the distribution of enriched TSS aggregated only by their location (i.e., position and strand). (b) UpSet plot showing the distribution of enriched TSS aggregated by their location and the condition in which they occur. For this UpSet plot, each TSS is counted for each condition separately. (c) Aggregated view of the genome browser showing the distribution of TSS colored by class and binned by their position in the genome for the control condition.
Table 1.
Excerpt of the MasterTable showing the top 10 enriched primary TSS, ranked by step height, and their associated genes. These TSS are enriched in all antibiotic treatment conditions for E. coli.
Fig 7.
Analysis of primary TSS for E. coli across conditions, especially those TSS occurring only under the treatment with each antibiotic.
(a) UpSet plot showing the distribution of primary enriched TSS across conditions, aggregated only by their location. The highlighted set refers to those TSS positions enriched only under the treatment with each antibiotic. (b) Aligned mode of the genome browser showing a primary TSS position on the reverse strand (indicated by the direction of the glyphs and the bars of the expression profiles) occurring in all conditions with antibiotic treatment, but not in the control condition. The TSS is located upstream of the gene udp (locus_tag: b2028). The signal of the enriched libraries (orange bars) is increased in all conditions, with the novobiocin treatment showing the highest expression value. For simplicity, the empty genome track has been removed from the figure.
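The aggregation by location behind the UpSet plot in panel (a) can be sketched as follows: each TSS location (position and strand) is mapped to the set of conditions in which it is enriched, and the number of locations per exact condition combination is counted. This is a minimal sketch with a hypothetical function name and toy records; the label `antibiotic_3` is a placeholder, not a condition named in the source.

```python
from collections import Counter


def upset_memberships(records):
    """Count TSS locations per exact combination of conditions.

    records: iterable of (position, strand, condition) tuples, one per
    enriched TSS call. Returns a Counter keyed by frozenset of conditions.
    """
    by_location = {}
    for pos, strand, condition in records:
        by_location.setdefault((pos, strand), set()).add(condition)
    return Counter(frozenset(conds) for conds in by_location.values())


# Toy records for illustration only (not data from the study).
records = [
    (100, "-", "novobiocin"),
    (100, "-", "rifampicin"),
    (100, "-", "antibiotic_3"),
    (250, "+", "control"),
    (250, "+", "novobiocin"),
    (400, "+", "control"),
]
counts = upset_memberships(records)
treated_only = frozenset({"novobiocin", "rifampicin", "antibiotic_3"})
print(counts[treated_only])
```

In this sketch, the highlighted set of the caption corresponds to the key `treated_only`, that is, locations enriched under every antibiotic treatment but not in the control.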
Fig 8.
Usage of the genome browser for the exploration and characterization of orphan TSS in E. coli.
For simplicity, empty visualization tracks have been removed from the figures. (a) Genome browser in single view mode visualizing a region with two orphan TSS that occur in all conditions, shown here only for the rifampicin treatment. On the reverse strand (bottom), the TSS position shows a prominent step height, rising to an expression value of around 350. Interestingly, on the forward strand at position
a further orphan TSS position was identified. (b) Zoomed view of TSS at position
of the reverse strand with an expression value of the enriched library (orange bars) around 350. The upstream region shows a putative Pribnow box (TATAAA) starting at −9 nt upstream of the TSS. (c) Zoomed view of TSS at position
with an expression value of the enriched library (orange bars) around 25. The upstream region shows again a clear putative Pribnow box (TAATATAA) starting at −13 nt upstream of the TSS.