Fig 1.
Annotated spliced pangenome example.
(a) Reference genome with a purple box representing a gene locus and diamonds representing alternate alleles of variations. In this example, we consider two haplotypes: H1, containing the reference alleles, and H2, containing the alternate alleles. (b) The gene has two transcripts, T1 and T2, expressing an exon skipping event. Since each transcript can occur on either haplotype, there is a total of 4 haplotype-aware transcripts. The alternate alleles are color-coded depending on the exon they fall in. (c) Spliced pangenome built from the reference genome, the gene annotation, and the set of variations. Colored vertices and edges belong to the transcript walks and represent portions of exons and splice junctions. White vertices are genomic portions coming from intronic and intergenic regions that do not belong to the transcript walks. The two haplotypes are represented as colored bars above and below the vertices. (d) Annotated spliced pangenome, where each colored vertex (exon portion) is annotated with the transcripts and exons it belongs to and each colored edge (junction) is annotated with the junction information. (e) Spliced alignments of two RNA-Seq reads to the annotated spliced pangenome. Read A aligns over the reference allele of the first variation and supports the annotated splice junction between the first and second exons (blue edge). Read B, instead, aligns over the alternate allele of the variations and supports a novel splice junction between the two exons (the novel junction induces an alternative donor event).
Fig 2.
Annotated events in a spliced pangenome.
All the annotated event types expressed within the annotated spliced pangenome, showing the different tags used: exon skipping (a), alternative donor splicing site (b), alternative acceptor splicing site (c), and intron retention (d). Haplotype information and weights are omitted for readability. Blue squares represent exons with their tags; green and purple edges are the annotated junctions with their tags; grey vertices are exonic and white vertices are intronic. Exons are labeled with the transcript walks that are represented in the figure.
Fig 3.
Results on simulated data from Drosophila melanogaster (annotated events).
Precision and recall are computed by comparing the asimulator truth (filtered by the minimum number of reads supporting an event) with the output of each tool. Results are broken down by event type (ES: Exon Skipping, IR: Intron Retention, A3: Alternative 3’, A5: Alternative 5’).
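The comparison described in this caption can be illustrated with a minimal sketch. It assumes, purely for illustration, that each event is a hashable `(event_type, event_id)` pair and that the truth comes with a per-event read support count; the function names, the data layout, and the threshold value are hypothetical and not taken from any tool's actual interface.

```python
def filter_truth(truth_support, min_reads):
    """Keep only true events supported by at least `min_reads` reads.

    `truth_support` maps (event_type, event_id) -> number of supporting reads
    (a hypothetical layout chosen for this sketch).
    """
    return {event for event, n in truth_support.items() if n >= min_reads}


def precision_recall_by_type(truth, predicted):
    """Return {event_type: (precision, recall)} for two sets of events."""
    types = {t for t, _ in truth} | {t for t, _ in predicted}
    results = {}
    for etype in sorted(types):
        true_events = {e for e in truth if e[0] == etype}
        pred_events = {e for e in predicted if e[0] == etype}
        tp = len(true_events & pred_events)  # events found in both sets
        # By convention in this sketch, an empty denominator yields 0.0.
        precision = tp / len(pred_events) if pred_events else 0.0
        recall = tp / len(true_events) if true_events else 0.0
        results[etype] = (precision, recall)
    return results


# Toy data, invented for illustration only.
truth_support = {("ES", "e1"): 10, ("ES", "e2"): 2, ("IR", "i1"): 5}
predicted = {("ES", "e1"), ("ES", "e3"), ("A3", "a1")}

truth = filter_truth(truth_support, min_reads=3)
scores = precision_recall_by_type(truth, predicted)
```

In this toy case the low-support event `e2` is dropped from the truth before scoring, so it does not penalize recall for ES.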
Fig 4.
Results on real data from Drosophila melanogaster.
(a) Venn diagram showing the number of AS events reported by each tool. (b) All-vs-all correlation plots of the Δψ reported by the considered tools for the 164 events shared among the tools. A detailed version, in which each point is color-coded by event type, is available in Fig E in S1 Text.
Fig 5.
Results on real data from human.
(a) Venn diagram showing the number of statistically significant differential AS events reported by each tool. The legend reports the total number of events reported by each tool. (b) Boxplot showing the distribution of the difference between the Δψ predicted by each tool and the Δψ provided by RT-PCR. The x-axis labels report the tool name and the Pearson correlation (r) between the predicted Δψ and the RT-PCR Δψ. Note that, since pantas does not compute statistical significance, we filtered its events only by the reported Δψ.
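The two quantities shown in panel (b) can be sketched as follows. This is a minimal illustration, assuming that Δψ values are stored in dictionaries keyed by a hypothetical event identifier; the helper names and the toy values are invented for this example and do not reproduce the analysis code used for the figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def delta_psi_differences(predicted, reference):
    """Per-event difference (predicted - reference) over shared events.

    Both arguments map a hypothetical event identifier to a delta-psi value.
    """
    shared = sorted(predicted.keys() & reference.keys())
    return [predicted[e] - reference[e] for e in shared]


# Toy values, invented for illustration only.
tool_dpsi = {"e1": 0.5, "e2": -0.2, "e3": 0.1}
rtpcr_dpsi = {"e1": 0.25, "e2": -0.2}

diffs = delta_psi_differences(tool_dpsi, rtpcr_dpsi)  # values for the boxplot
r = pearson_r([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])       # perfectly linear toy case
```

Only events present in both dictionaries contribute, which mirrors the idea that a tool can be compared with RT-PCR only on the events it actually reports.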