Figure 1.
A model of the effect of APA-SNPs in the 3′UTR of a gene.
(A) For the C allele, the second cleavage site (CS) is used because the first polyA signal (PAS) is not functional. For the A allele, the first PAS is functional, so the pre-mRNA can be cleaved at the first CS. This removes the functional miRNA target sites downstream (indicated by loss of Argonaute (AGO) binding) and increases gene expression (B). (C) EST sequences make it possible to identify APA-SNP alleles and 3′UTR length. (D) RNA-seq reads make it possible to genotype APA-SNPs and quantify expression patterns.
Figure 2.
Panels (A) and (B) show the 3′ ends of the MIER1 and PNN genes as annotated in PolyA_Db (3′ ends of the horizontal lines), and their candidate APA-SNPs. The four other graphs show the inverse cumulative distribution of EST end positions for APA alleles (triangles) and non-APA alleles (circles). The dashed vertical line marks the threshold separating short and long transcripts. For APA alleles, the transcript proportion drops before the threshold relative to non-APA alleles, indicating that APA alleles are more likely to produce shorter transcripts. Panels (A), (C) and (E) show the MIER1 gene; panels (B), (D) and (F) show the PNN gene. Several unknown alleles could be imputed from haplotypes (included in panels (C) and (D)).
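The inverse cumulative distribution plotted in this figure can be illustrated with a minimal sketch. It computes, for each position, the fraction of EST sequences that end at or beyond that position. The function name, the end positions and the grid are hypothetical values chosen for illustration, not data from the study.

```python
from bisect import bisect_left


def inverse_cumulative(end_positions, grid):
    """Fraction of sequences whose 3' end lies at or beyond each grid position."""
    if not end_positions:
        raise ValueError("at least one end position is required")
    ordered = sorted(end_positions)
    n = len(ordered)
    # bisect_left counts the ends strictly before x; the rest reach x or further.
    return [(n - bisect_left(ordered, x)) / n for x in grid]


# Hypothetical EST end positions (distance from the stop codon, in nt).
apa_ends = [120, 130, 135, 140, 600]
non_apa_ends = [125, 580, 590, 600, 610]
grid = [100, 300, 700]

apa_curve = inverse_cumulative(apa_ends, grid)
non_apa_curve = inverse_cumulative(non_apa_ends, grid)
```

In this toy example the APA curve falls to 0.2 at position 300 while the non-APA curve is still at 0.8, which is the pattern the legend describes as a drop before the threshold.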
Table 1.
Significant genes in the EST analysis.
Figure 3.
Increased allelic imbalance correlates with signal strength and depends on downstream GU-content.
Distribution of the log allelic ratio (APA allele over non-APA allele) for each polyA signal, ordered by strength. Panel (A): log allelic ratio is negatively correlated with signal rank for all APA-SNPs. APA-SNPs with a GU-rich region (Panel (B)) show a stronger negative correlation between log allelic ratio and signal rank than all APA-SNPs taken together. For APA-SNPs without a GU-rich region (Panel (C)), there is no significant correlation between signal rank and log allelic ratio. The graphs include data from the 19 non-mixed cell lines and tissues. The line in each panel is the linear regression fit; the corresponding Pearson correlation coefficient is shown in the panel's upper left corner.
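As a rough illustration of the quantities in this figure, the sketch below computes a log allelic ratio from allele-specific read counts and the Pearson correlation with signal rank. The base-2 logarithm, the pseudocount, the convention that rank 1 is the strongest signal, and all read counts are assumptions made for the example; the legend does not specify them.

```python
import math


def log_allelic_ratio(apa_reads, non_apa_reads, pseudocount=1.0):
    """log2 of APA over non-APA read counts (pseudocount is an assumption)."""
    return math.log2((apa_reads + pseudocount) / (non_apa_reads + pseudocount))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: signal rank (1 = strongest, assumed) and read counts.
signal_ranks = [1, 2, 3, 4]
read_counts = [(31, 7), (15, 7), (7, 7), (3, 7)]  # (APA, non-APA)

ratios = [log_allelic_ratio(a, b) for a, b in read_counts]
r = pearson(signal_ranks, ratios)
```

With these made-up counts the ratio decreases steadily as the rank increases, so `r` is negative, matching the direction of the correlation reported in Panel (A).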
Figure 4.
Allelic imbalance distributions according to signal strength and downstream GU levels.
Allelic imbalance is shifted towards APA alleles for APA-SNPs in strong (S) signals with high downstream GU levels. The graph shows a box plot of the log allelic ratio distribution of APA-SNPs grouped by signal strength (weak (W) and strong (S)) and downstream GU levels.
Figure 5.
SNP expression difference between SNPs with positive and negative log allelic ratios.
Median difference in the logarithm of SNP expression between SNPs with positive log allelic ratios and SNPs with negative log allelic ratios, in several groups (low and high GU level, low (LMS) and high (HMS) miRNA score, and weak (W) and strong (S) signal). Crosses show median differences; 95% CIs were obtained by bootstrapping the median differences. Only one CI excludes zero, that of the group with high GU, HMS and S. This indicates that positive allelic imbalance for SNPs that lie in strong polyA sites and affect miRNA target sites is associated with increased SNP expression, and therefore with increased gene expression.
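A bootstrap confidence interval for a difference in medians, as used in this figure, can be sketched as follows. This is a percentile bootstrap under stated assumptions: the legend does not say which bootstrap variant or how many resamples were used, and the function name, the number of resamples, the seed and the input values are all hypothetical.

```python
import random
import statistics


def bootstrap_median_diff_ci(pos, neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for median(pos) - median(neg)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        pos_sample = rng.choices(pos, k=len(pos))
        neg_sample = rng.choices(neg, k=len(neg))
        diffs.append(statistics.median(pos_sample) - statistics.median(neg_sample))
    diffs.sort()
    tail = int(n_boot * alpha / 2)
    return diffs[tail], diffs[n_boot - 1 - tail]


# Hypothetical log SNP expression values for the two groups.
positive_lar = [5.0, 5.2, 5.4, 5.6, 5.8, 6.0]
negative_lar = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]

low, high = bootstrap_median_diff_ci(positive_lar, negative_lar)
excludes_zero = low > 0 or high < 0
```

A CI that excludes zero, as in this toy example, corresponds to the single group singled out in the legend; a CI that spans zero gives no evidence of a difference in medians.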
Figure 6.
SNP expression distributions according to allelic imbalance direction.
For SNPs in strong APA signals with a high GU level and a high miRNA score, the logarithm of SNP expression is significantly higher when the imbalance is towards the APA allele (positive (P) log allelic ratio) than when it is towards the non-APA allele (negative (N) log allelic ratio).
Figure 7.
APA homozygotes have increased gene expression for strong polyA signals and high miRNA score.
Median differences in gene expression are shown for several groups: between APA homozygotes and non-APA homozygotes (rhombus), and between heterozygotes and non-APA homozygotes (cross), each with its 95% CI. Expression is generally highest in APA homozygotes, followed by heterozygotes and then non-APA homozygotes. (A) Genes where alternative polyadenylation does not affect miRNA targeting (low miRNA score). Strong signals (S) have a slightly higher median difference than weak signals (W). (B) Genes where alternative polyadenylation affects miRNA targeting (high miRNA score). Strong signals have a significantly higher median difference.
Table 2.
Potentially functional APA alleles are positively correlated with risk alleles from GWAS SNPs.