Improved transcriptome assembly using a hybrid of long and short reads with StringTie
Fig 1
A) Artifacts present in the long-read alignments: i) retained introns; ii) disagreement around the splice sites; iii) spurious extra exons; iv) falsely skipped exons; v) false alternative splice sites. B) Example of a human transcript that can only be correctly assembled using both the long and short reads. This is human transcript ENST00000361722.7 from the TBKBP1 gene. Blue lines in the middle of the reads (gray boxes) indicate a spliced alignment. Purple lines within the reads indicate mismatches in the alignment. The long-read alignments do not cover exons 1–3 and contain a retained intron. The short-read alignments lack adequate splice-site support across the 4th and 7th introns and do not completely cover exons 5 and 8.