Figure 1.
Comparison of hESC differentiation and adult tissue array profiles.
Human Affymetrix exon array data were compared for REX+ hESCs and derived CPs; Cythera, HUES6 hESCs and derived NPs; fetal human central nervous system stem cells (hCNS-scns); and 11 adult tissues, processed by RMA together. (A) Relative changes in gene expression (log2 fold, relative to the global expression mean) for all samples were clustered by array (rather than genes) for any Ensembl gene with a relative change in gene expression >2. Biological triplicates are indicated for each tissue or cell line. (B) Gene expression profiles for this combined dataset and for specific markers of CP-specification (columns 1 and 2) and for pluripotency (column 3).
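As a rough illustration of the filtering described for panel A, the sketch below centers each gene's log2 expression on its mean and keeps genes whose relative change exceeds a threshold. It makes two assumptions not stated in the legend: that the "global expression mean" is each gene's mean across all arrays, and that ">2" means a 2-fold change (an absolute log2 difference above 1). The function names are hypothetical, and the clustering step is not shown.

```python
from statistics import mean

def relative_log2(expr):
    """Center each gene's log2 values on that gene's mean across all arrays.

    expr: dict mapping gene ID -> list of log2 expression values (one per array).
    Assumption: the 'global expression mean' is the per-gene mean over all arrays.
    """
    return {gene: [v - mean(vals) for v in vals] for gene, vals in expr.items()}

def filter_changed(rel, log2_threshold=1.0):
    """Keep genes with at least one array beyond the threshold.

    Assumption: '>2' is read as a 2-fold change, i.e. |log2 difference| > 1.
    """
    return {
        gene: vals
        for gene, vals in rel.items()
        if max(abs(v) for v in vals) > log2_threshold
    }

# Toy data: two genes measured on three arrays (values are made up).
expr = {"G1": [8.0, 8.0, 11.0], "G2": [5.0, 5.1, 4.9]}
rel = relative_log2(expr)
changed = filter_changed(rel)
```

The resulting per-array columns of `changed` would then be the input to whatever array-wise clustering method is used.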
Figure 2.
Assigning AltAnalyze mRNA and protein annotations.
Theoretical transcripts with distinct exon compositions are shown. (A) Distinct alternative (Alt.) exon annotations for five mRNA transcripts, where the filled boxes are sequences retained in the processed mRNA transcript. Black filled boxes are exons common to all isoforms (constitutive). AltAnalyze considers all alternative exon annotations as AS except for alternative-N-terminal exons (expressed through alternative promoter selection). (B) All pairs of mRNA transcripts that do or do not align to an exon array probe set are compared to identify a single pair of competitive isoforms that minimally differ in exon composition. Curved arrows indicate all possible competitive transcript comparisons. The top selected competitive isoforms (dashed box) have the fewest exon differences and the most exons in common. AltAnalyze selects this transcript pair for analysis of downstream protein domain/motif composition, after corresponding protein sequences are selected. (C) Protein domains and motifs differing between competitive isoforms. Exons for the two transcripts are labeled in order, 5′ to 3′, with protein sequence and UniProt features (UPF) or InterPro regions (IPR) corresponding to each exon displayed above or below them. Yellow filled boxes indicate domains and motifs differing between the competitive isoforms. (D) Domains and motifs directly aligning to a probe set's genomic position. A theoretical probe set aligning to the intron of a gene is shown. InterPro domains/motifs whose genomic position (genomic exon start and exon end position) overlaps with a given probe set (genomic start and end position) are shown with a yellow filled box. Unlike the competitive isoform analysis, which compares two protein sequences, the direct genomic alignment method requires only a single protein sequence.
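The selection rule in panel B and the overlap test in panel D can be sketched as follows. This is a minimal illustration, not AltAnalyze's actual code: the function names are hypothetical, transcripts are modeled simply as sets of exon IDs, "fewest exon differences" is assumed to take priority over "most exons in common" when ranking pairs, and genomic coordinates are assumed to be inclusive.

```python
from itertools import product

def pick_competitive_pair(transcripts, probe_exon):
    """Pick one transcript containing the probed exon and one lacking it.

    transcripts: dict mapping transcript name -> set of exon IDs.
    Ranking (assumed order): fewest differing exons, then most shared exons.
    Returns a (with_exon, without_exon) name tuple, or None if no pair exists.
    """
    with_exon = [(n, e) for n, e in transcripts.items() if probe_exon in e]
    without_exon = [(n, e) for n, e in transcripts.items() if probe_exon not in e]
    best_key, best_pair = None, None
    for (n1, e1), (n2, e2) in product(with_exon, without_exon):
        key = (len(e1 ^ e2), -len(e1 & e2))  # (differences, -shared exons)
        if best_key is None or key < best_key:
            best_key, best_pair = key, (n1, n2)
    return best_pair

def overlaps(a_start, a_end, b_start, b_end):
    """True if two genomic intervals share at least one position (inclusive ends)."""
    return a_start <= b_end and b_start <= a_end

# Toy example: exon E2 is interrogated by the probe set.
transcripts = {
    "T1": {"E1", "E2", "E3", "E4"},
    "T2": {"E1", "E3", "E4"},
    "T3": {"E1", "E4"},
    "T4": {"E1", "E2", "E4"},
}
pair = pick_competitive_pair(transcripts, "E2")
```

In the toy example, T1 and T2 differ only by E2 and share three exons, so they are ranked ahead of T4 and T3, which also differ by one exon but share only two.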
Table 1.
Alternative gene regulation during CP differentiation.
Table 2.
Regulation of miRNA binding sites during CP differentiation.
Figure 3.
Analysis of verified AS events identifies novel functional associations.
(A) Expression of splice isoforms confirmed by RT-PCR of genes with prior evidence of AS. ANXA7, SLK, NF1, and VCL were confirmed with flanking primers, and PKM2 and ATP2A2 with isoform-specific primers. DNA agarose gel images are shown, with REX+ hESC RNA on the left side of the gel and CP RNA on the right. (B–C) Exon structure (top graphic) and expression profiles (bottom graphic) for ANXA7 and ATP2A2. (B) SI fold changes are shown for probe sets aligning to exons and introns in the prototype Cytoscape plugin SubgeneViewer. Light red boxes indicate upregulation for CP versus hESC; blue boxes, downregulation; gray boxes, no significant change; white boxes, no probe set detected above expression thresholds. Probe set expression values (log2) are displayed for both CP (top graphs) and NP differentiation (bottom graph), ranked in order of genomic position on the x-axis. Blue data points indicate hESC expression; red data points indicate CP expression; green data points indicate NP expression. (D) Domain/motif annotations for each PKM2 alternative isoform (M1 and M2). The two mutually exclusive isoforms produce proteins differing in the predicted inclusion of an FBP binding region and intersubunit contact (ISC) sequence as defined by UniProt. Yellow and green mutually exclusive exons are shown relative to the translated position of these exons in resulting proteins. (E) miRNA binding sites that overlap with the last intron of ATP2A2. Exons for ATP2A2 transcripts (solid dark blue, red, and black boxes) are displayed 5′ to 3′ (forward strand) along with UCSC splicing annotations (purple box) and aligning probe sets that are downregulated in CPs versus hESCs (blue boxes). These downregulated probe sets correspond to those shown in panel C. ATP2A2 isoforms 2a and 2b are indicated. The term “multiple algorithms” indicates that two or more miRNA binding site prediction algorithms (PicTar, miRanda, miRBase or TargetScan) predicted a binding site in aligning probe sets.
Figure 4.
Genes with common CP-NP or CP-specific AS patterns associate with distinct pathways.
AS predictions with evidence of (A) a common CP-NP differentiation or (B) a CP-specific expression pattern, relative to undifferentiated hESCs. Adjacent to each heatmap are alternative exons, ranked according to the ANOVA false-discovery rate (FDR) p value. Next to this p value are the SI fold changes reported by AltAnalyze (negative values indicate increased alternative exon expression in CPs and vice versa). Gene names in blue have prior evidence of AS during hESC differentiation; genes in red have prior evidence of AS during cardiac differentiation. Genes associated with GO terms and WikiPathways that are overrepresented among genes with a (C) common CP-NP or (D) CP-specific AS pattern are graphed.
Figure 5.
Genes with confirmed AS events have distinct domain-level changes.
RT-PCR results for a panel of predicted CP differentiation-splicing events with both a common CP-NP differentiation and CP-specific ANOVA pattern. Genes are categorized based on predicted domain/motif changes: truncation, disruption, modification, exchange or no associated predictions. The higher band in each gel image is the longer isoform with exon inclusion (in); the lower band is the shorter isoform with exon exclusion (ex), unless indicated as a constitutive (cs), mutually exclusive (mx), or miRNA (miR)-containing exon. Additional confirmed genes are shown in Figures 3 and 7 and are further described in Table 3.
Figure 6.
Comparison of differentiation and tissue AS patterns.
(A) For the genes KIF13A and CAPZB, log2 expression values for exon-aligning probe sets are shown; probe sets are ranked in order of genomic position on the x-axis and expression values are plotted on the y-axis. (B) For both genes, relative exon inclusion is assessed for CP and NP differentiation conditions and 11 adult tissue conditions by plotting the mean constitutive gene expression (y-axis) against the expression of the interrogated alternative exon (x-axis). Each diamond represents a distinct tissue. For KIF13A, exon 41 (E41) is most highly expressed in the hESC lines (H9 and Cythera), suggesting E41 inclusion is greatest in hESCs. For CAPZB, exon 12 (E12) is most highly expressed in muscle (CP, heart and muscle). H9 = REX+ hESCs, Cy = Cythera hESCs, Mus = muscle, Hrt = heart.
Table 3.
AltAnalyze functional predictions for confirmed CP differentiation AS events.
Figure 7.
Both miRNAs and miRNA binding sites are regulated with hESC differentiation.
(A) Expression profiles of two previously characterized miRNAs, mir-302a and mir-133-1, from combined tissue/cell-line gene expression data. (B) RT-PCR isoform expression of genes with putative miRNA binding sites within the regulated probe set. The presence of one or more putative miRNAs is indicated by the notation miR. (C–E) The 3′ regions of three genes are shown, where the regulated isoforms are displayed from the UCSC genome browser along with regulated probe sets and putative miRNA binding site locations. Exons are indicated by thin boxes, UTR regions by thinner boxes and introns by a line with overlapping arrows. Each gene (MAFB, SEPT6, and CDC42) represents a distinct possible mode of exon regulation that leads to altered miRNA binding site inclusion: shorter 3′UTR, alternate cassette exon inclusion, and alternate C-terminal exon, respectively. Both MAFB and SEPT6 are on the reverse genomic strand, where orientation is 3′ to 5′. The term “multiple algorithms” indicates that two or more miRNA binding site prediction algorithms (PicTar, miRanda, miRBase or TargetScan) predicted a binding site in aligning probe sets.