Fig 1.
Overview of the experimental design.
The figure shows, on the left, the standard 10X scRNA-seq protocol based on Illumina sequencing and, on the right, the protocol used for sequencing full-length transcripts using Pacific Biosciences chemistry. Abbreviations: GEMs, Gel Beads in Emulsion; UMI, Unique Molecular Identifier; TSO, template switch oligo.
Table 1.
Illumina and PacBio sequencing results and transcript predictions.
Fig 2.
Examples of full-length isoform sequencing using PacBio and resulting transcript predictions.
(A) Example of protein-coding transcripts matching the current gene annotations. The figure shows 5.7 kb of chromosome 13 containing two annotated P. vivax genes, the ubiquitin fusion degradation protein 1 (PVP01_1330900) and an ATP synthase-associated protein (PVP01_1331000). The blue horizontal bars at the bottom show the annotations for these genes in PlasmoDB v54, while the top panel shows the PacBio reads mapping to this locus (each red horizontal line is a unique read mapped to the positive strand, with the grey lines indicating spliced introns). Note that, while the PacBio reads support a shorter 3’-UTR than annotated for PVP01_1331000, the predicted protein-coding sequences are identical to the ones annotated. (B) Example of a protein-coding transcript differing from the current gene annotation. The figure shows 5 kb of chromosome 13 surrounding the rhoptry-associated protein 1 (RAP1, PVP01_1338500). The PacBio reads (mapped to the negative strand and displayed in blue) support the presence of an unannotated intron (red box), leading to additional predicted coding sequences upstream of this intron and to a different protein than the one annotated in the genome (thick blue bars at the bottom). (C) Example of two isoforms with different predicted protein-coding sequences. The top panel shows that blood-stage parasites express a transcript for cytochrome b5-like heme/steroid binding protein (PVP01_0716500) identical to the annotated protein-coding sequence (although with a shorter 5’-UTR). The middle panel shows that P. vivax sporozoites express this gene from a different start site (red box), resulting in a shorter transcript and a different predicted protein. (Note also the presence of an unannotated, alternatively spliced intron in the 3’-UTR.)
Fig 3.
Distribution of UTR lengths (x-axis, in bp) for transcripts expressed by blood-stage parasites (top) and sporozoites (bottom).
Table 2.
Summary of isoform types from PacBio predictions.
Fig 4.
Example of isoforms expressed in a stage-specific manner.
(A) Each panel shows the PacBio reads mapped to the glutaredoxin 1 gene (PVP01_0833900), split into four groups according to the stage of the parasites they were derived from: early trophozoites, late trophozoites, schizonts and female gametocytes (the terminology used for each group reflects developmental categories based on pseudotime analysis and might not exactly correspond to stages determined by microscopy). Female gametocytes express glutaredoxin 1 from a more upstream TSS than asexual parasites, and the resulting transcripts have an additional intron in the 5’-UTR (red box). (Note also the presence of an alternatively spliced intron in the 3’-UTR of some transcripts.) (B) PCA plots showing that the short (blue) and long (red) isoforms of glutaredoxin 1 are expressed at different stages of parasite development, with the long isoform almost exclusively expressed in female gametocytes.