Fig 1.
Flow chart of the VAP workflow.
FastQ files are quality-checked using FastQC and mapped using three aligners. BAM files are pre-processed with Picard and GATK, then merged, annotated and filtered to obtain high-confidence SNPs.
Table 1.
Criteria used in the VAP filtering workflow.
Table 2.
Summary of results from the multiple aligners: read mapping statistics and variant calls.
Fig 2.
Comparison of RNA-seq SNPs identified by the different mapping tools.
Fig 3.
Comparison of RNA-seq SNPs found in either dbSNP or WGS.
Fig 4.
The mutational profile of RNA-seq variants.
Fig 5.
Comparison of SNPs identified as homozygous and heterozygous in RNA-seq.
Table 3.
SNPs belonging to different annotation categories.
Fig 6.
Overlap of SNPs found in coding regions from RNA-seq and WGS.
66% of the coding variants identified in WGS data were found in RNA-seq. The remaining WGS coding variants were not detected for one of four reasons: the position was not expressed/transcribed (“no transcription”); the position was homozygous in RNA (“no variation”); the position was detected but removed by one of our filtering steps (“found but filtered”); or the position was heterozygous but did not meet the default parameters for variant detection (“filtered”).
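The four categories in this caption can be read as a simple decision sequence. The sketch below is illustrative only and is not the VAP implementation: the `RnaSite` fields, the `classify` function, and the reading of "homozygous in RNA" as "no alternate-allele reads observed" are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class RnaSite:
    """RNA-seq evidence at a WGS coding-variant position (hypothetical fields)."""
    depth: int            # total RNA-seq reads covering the position
    alt_reads: int        # reads supporting the alternate allele
    called: bool          # the variant caller emitted a call at this position
    passed_filters: bool  # the call survived the downstream filtering steps


def classify(site: RnaSite) -> str:
    """Assign a WGS coding variant to one of the caption's categories."""
    if site.depth == 0:
        # No reads at all: the position is not expressed.
        return "no transcription"
    if site.alt_reads == 0:
        # Assumption: only the reference allele is seen in RNA.
        return "no variation"
    if not site.called:
        # Alternate allele present, but the default detection
        # parameters were not met, so no call was made.
        return "filtered"
    if not site.passed_filters:
        # A call was made but later removed by a filtering step.
        return "found but filtered"
    return "found in RNA-seq"


if __name__ == "__main__":
    example = RnaSite(depth=12, alt_reads=5, called=True, passed_filters=False)
    print(classify(example))
```

The order of the checks matters in this sketch: expression is tested first, because a position with no coverage cannot be assessed for variation or filtering.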
Fig 7.
Specificity and number of RNA-seq SNPs detected in relation to the expression level of genes (FPKM values).
Fig 8.
Distribution of expression levels for genes with RNA-seq SNPs.
Fig 9.
Comparison of SNP calls between the 600k genotyping panel, RNA-seq SNPs, WGS SNPs and dbSNP v150.
(a) All autosomal SNPs; (b) autosomal SNPs found in exons.
Table 4.
Explanation for the 14,147 RNA SNPs not found in WGS data.
Table 5.
Potentially functional RDD candidates found in Fayoumi.