Long read single cell RNA sequencing reveals the isoform diversity of Plasmodium vivax transcripts
Fig 2
Examples of full-length isoform sequencing using PacBio and resulting transcript predictions.
(A) Example of protein-coding transcripts matching the current gene annotations. The figure shows 5.7 kb of chromosome 13 containing two annotated P. vivax genes, the ubiquitin fusion degradation protein 1 (PVP01_1330900) and an ATP synthase-associated protein (PVP01_1331000). The blue horizontal bars at the bottom show the annotations for these genes in PlasmoDB v54, while the top panel shows the PacBio reads mapping to this locus (each red horizontal line is a unique read mapped to the positive strand, with the grey lines indicating spliced introns). Note that, while the PacBio reads support a shorter 3’-UTR than annotated for PVP01_1331000, the predicted protein-coding sequences are identical to the annotated ones. (B) Example of a protein-coding transcript differing from the current gene annotation. The figure shows 5 kb of chromosome 13 surrounding the rhoptry-associated protein 1 (RAP1, PVP01_1338500). The PacBio reads (mapped to the negative strand and displayed in blue) support the presence of an unannotated intron (red box), leading to additional predicted coding sequence upstream of this intron and a different protein than the one annotated in the genome (thick blue bars at the bottom). (C) Example of two isoforms with different predicted protein-coding sequences. The top panel shows that blood-stage parasites express a transcript for the cytochrome b5-like heme/steroid binding protein (PVP01_0716500) identical to the annotated protein-coding sequence (although with a shorter 5’-UTR). The middle panel shows that P. vivax sporozoites express this gene from a different start site (red box), resulting in a shorter transcript and a different predicted protein. Note also the presence of an unannotated, alternatively spliced intron in the 3’-UTR.