Alternative splicing (AS) is pervasive in human multi-exon genes and is a major contributor to expansion of the transcriptome and proteome diversity. The accurate recognition of alternative splice sites is regulated by information contained in networks of protein-protein and protein-RNA interactions. However, the mechanisms leading to splice site selection are not fully understood. Although numerous databases have been built to describe AS, molecular interaction databases associated with AS have only recently emerged. In this study, we present a new database, MiasDB, that provides a description of molecular interactions associated with human AS events. This database covers 938 interactions between human splicing factors, RNA elements, transcription factors, kinases and modified histones for 173 human AS events. Every entry includes the interaction partners, interaction type, experimental methods, AS type, tissue specificity or disease-relevant information, a simple description of the functionally tested interaction in the AS event and references. The database can be queried easily using a web server (http://18.104.22.168/Miasdb). We display some interaction figures for several genes. With this database, users can view the regulation network describing AS events for 12 given genes.
Citation: Xing Y, Zhao X, Yu T, Liang D, Li J, Wei G, et al. (2016) MiasDB: A Database of Molecular Interactions Associated with Alternative Splicing of Human Pre-mRNAs. PLoS ONE 11(5): e0155443. https://doi.org/10.1371/journal.pone.0155443
Editor: Ruben Artero, University of Valencia, SPAIN
Received: November 19, 2015; Accepted: April 28, 2016; Published: May 11, 2016
Copyright: © 2016 Xing et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are available via Figshare (https://dx.doi.org/10.6084/m9.figshare.3103057.v1).
Funding: This work was supported by the Natural Science Foundation of China (61271448, 11547150, 61361014, 61102162, 31260274), the Inner Mongolia Science & Technology Plan (20140401), and the Natural Science Foundation of Inner Mongolia (2014MS0312, 2014MS0311, 2013MS0514, 2015MS0335).
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: AS, alternative splicing; SF, splicing factor; MiasDB, the database of molecular interactions associated with alternative splicing
Since the discovery that the number of genes in a genome is not linearly correlated with the complexity and functional diversity of an organism, alternative splicing (AS) has increasingly attracted the interest of researchers. AS, which is widespread in the human genome, has been investigated intensively for many genes, and according to recent estimates, over 95% of human multi-exon genes undergo this process [1–3]. AS significantly complicates the processing of pre-mRNA. In higher eukaryotes, AS of pre-mRNAs is essential for regulating gene expression, as it alters the function of a gene in different tissues and developmental stages by generating various mRNA isoforms composed of different combinations of exons. Indeed, AS plays an important role in numerous processes, including cell proliferation, apoptosis, development, and differentiation [4–5], and dysregulation of AS leads to a number of human genetic diseases [6–8].
The process of removing introns and joining exons to form mature mRNAs occurs in the nucleus and is accomplished by five small nuclear ribonucleoproteins (U1, U2, U4, U5 and U6 snRNPs) and more than two hundred proteins through the step-by-step assembly of the spliceosome [9]. Recognition of a 5’ splice site involves a base-pairing interaction between the 5’ splice site sequence and the snRNA component of the U1 snRNP. The first step in the recognition of the 3’ splice site is the binding of splicing factor 1 (SF1) to the branch point sequence (BPS). Then, the 65 kDa U2AF subunit binds the polypyrimidine tract (PPT), while the 35 kDa subunit contacts the AG at the end of the intron. Next, the U2 snRNP displaces SF1 and interacts with the BPS through base-pairing. The U4/U6 and U5 snRNPs are then recruited as a preassembled U4/U6.U5 tri-snRNP and, after rearrangement, form the catalytically active complex to perform the chemical reactions of splicing [10]. Although U2-type introns coexist with U12-type introns in most eukaryotes, the latter account for less than 0.5% of all introns in any given genome. U12-type introns are processed by a specific U12-dependent spliceosome, which is similar to, but distinct from, the major U2-dependent spliceosome [11–13]. AS events can be categorized into seven major types: (i) exon skipping; (ii) alternative 3’ splice site; (iii) alternative 5’ splice site; (iv) intron retention; (v) mutually exclusive exon; (vi) alternative first exon; (vii) alternative last exon [14].
The well understood mechanisms of AS regulation involve interactions between splicing factors (SFs) and their target RNA elements [15–17]. Strong splice sites are more efficiently selected than weak, or sub-optimal, splice sites, and alternative exons are frequently associated with the latter. The recognition of weak splice sites depends on the binding of specific trans-factors to cis-elements of the pre-mRNA. Trans-factors include serine-arginine rich (SR) proteins and heterogeneous nuclear ribonucleoproteins (hnRNPs), etc. The cis-elements include exonic splicing enhancers (ESEs), intronic splicing enhancers (ISEs), exonic splicing silencers (ESSs) and intronic splicing silencers (ISSs). Unlike enhancers, silencer sequences such as ESSs and ISSs negatively regulate the inclusion of AS exons by interacting with SFs. Additional proteins that do not directly bind RNA, such as transcription factors (TFs), kinases, and histone-modifying enzymes, have also been shown to regulate AS [17, 18–19].
The construction of AS databases is helpful for the identification, classification, functional annotation, and expression profiling of alternative transcripts and for elucidating the regulatory mechanism of AS. Several AS databases have been constructed, and these resources are currently available to the public on the Internet. Most were developed to identify AS events based on either automated large-scale comparisons of expressed sequence tags (ESTs) extracted from publicly available databanks, such as GenBank, EMBL, or DDBJ, or from mining experimental databases. For example, Hollywood [20], ASD [21], ECGENE [22], ASAP [23], PALS db [24], EASED [25], SPLICEINFO [26], Fast DB [27] and HEXEvent [28] were constructed based on ESTs, and AsMamDB [29], ASDB [30] and SpliceDB [31] were constructed by searching experimental databases. However, the alignment algorithms are different among these databases due to the differences in primary sequences. Furthermore, most of these AS databases are incomplete because they are largely based on partially and imprecisely sequenced cDNAs (ESTs) or on computationally derived exon information. Other databases that depict AS-induced alterations in protein structures or interactions between RNA and SFs are available. AS-ALPS provides spatial relationships between protein regions altered by AS and the protein’s hydrophobic core and sites of inter-molecular interactions [32].
SpliceAid-F was established by screening the literature; it is currently the only database describing interactions between SFs and their RNA-binding sites [33]. This database includes many artificially mutated RNA elements and does not include any records related to proteins other than those that bind to RNA elements. Furthermore, SpliceAid-F contains only a small number of SFs and focuses on their RNA-binding specificity. Although a large number of molecular interactions associated with AS have been identified through experimental analysis, AS databases do not generally include this information. Thus, it is increasingly important to create comprehensive databases that include the molecular interactions involved in AS regulation.
By manually screening the literature, we retrieved experimentally validated interactions that regulate human AS events and assembled them into an online database called the database of molecular interactions associated with alternative splicing (MiasDB) (http://22.214.171.124/Miasdb). The database contains 938 human interactions between RNA elements, SFs, TFs, splicing-associated kinases and modified histones for 173 human AS events. We then built a web server that allows the database to be browsed freely. MiasDB complements other available AS databases, and together they are valuable resources for computational and molecular biologists. Data-based inferences of the regulatory network describing AS events for a given gene are necessary to extrapolate connections between splicing factors and other signaling pathways, and as a proof of principle, we built interaction figures for 12 genes based on MiasDB. Overall, MiasDB provides a comprehensive resource of AS interactions in humans, and this database will aid in uncovering the regulatory principles of splicing processes.
Results and Discussion
MiasDB data statistics
MiasDB, launched in January 2016, provides AS interaction information for the human genome and includes a total of 938 interactions of AS in humans, of which 29 are specific for the minor splicing pathway (S1 Table). For each interaction, the database provides basic information including interaction partners (defined as interactors on the webpage), interaction type, experimental technology, AS type, tissue specificity and disease-related information, a brief description of the function of the interaction in the AS event, and references (PubMed ID). Hyperlinks to PubMed are also provided. Although some of the experiments were carried out in non-human mammalian cell models, the interactions also occur in humans; these interactions are flagged as human genes in this database.
MiasDB includes 538 experimentally validated interactions regulating specific AS events for 131 genes. The names of the 131 genes, annotated using approved symbols according to the HUGO Gene Nomenclature Committee (HGNC), are shown in Table 1. A user can link to the HGNC by clicking on ‘gene name’. MiasDB also includes 400 interactions for which the gene name has not been determined.
The 938 interactions were classified into two groups: (1) interactions between RNA-binding proteins, also called splicing factors, and RNA elements (SF-RNA); and (2) protein-protein interactions (PPIs), in which the proteins may be RNA-binding proteins or other proteins that do not physically interact with the RNA elements. These other proteins may include TFs, kinases and modified histones. There are 525 SF-RNA interactions in the first group and 413 PPIs in the second group. Increasing evidence suggests that histone modifications play important roles in modulating AS [34–37]. In MiasDB, PPIs include 21 entries describing interactions between splicing factors and histone modifications, including H3K4me3, H3K9me, H3K9me3, H3K36me3, H3K79me, and H3S10P. Furthermore, 909 physical and 29 functional interactions are included in the current database. Overall, the 342 protein factors included in the database have been shown to regulate AS (Table 2). Protein kinases are important regulators of AS, as reflected by the fact that the database includes 75 interactions involving kinases. The major splicing types involved in these interactions are exon skipping (411), mutually exclusive exons (65), intron retention (22), alternative 5’ splice site (17), alternative 3’ splice site (8), alternative first exons (5) and alternative last exons (11). The interactions in MiasDB are associated with 22 specific tissues and 22 diseases (S2 Table).
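The two classifications described above (SF-RNA versus PPI, and physical versus functional) can be illustrated with a small record type. The following Python sketch is only an illustration: the class name, field names, and the three example records are hypothetical placeholders and are not taken from MiasDB. The only values drawn from the text are the reported totals, which are checked for internal consistency.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interaction:
    """One curated interaction (hypothetical layout, not the MiasDB schema)."""
    partner_a: str
    partner_b: str
    group: str      # "SF-RNA" or "PPI"
    evidence: str   # "physical" or "functional"
    as_type: Optional[str] = None
    gene: Optional[str] = None


def tally(records):
    """Count records by group and by evidence type."""
    by_group = Counter(r.group for r in records)
    by_evidence = Counter(r.evidence for r in records)
    return by_group, by_evidence


# Hypothetical placeholder records, for illustration only.
records = [
    Interaction("SF_A", "ESE_1", "SF-RNA", "physical", "exon skipping", "GENE_X"),
    Interaction("SF_A", "KINASE_B", "PPI", "physical", "exon skipping", "GENE_X"),
    Interaction("SF_C", "TF_D", "PPI", "functional"),
]
by_group, by_evidence = tally(records)

# Totals reported in the text: each partition should sum to 938.
REPORTED = {"SF-RNA": 525, "PPI": 413, "physical": 909, "functional": 29}
assert REPORTED["SF-RNA"] + REPORTED["PPI"] == 938
assert REPORTED["physical"] + REPORTED["functional"] == 938
```

Keeping the group and the evidence type as separate fields reflects that the two classifications are independent partitions of the same 938 entries.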
Access to the database
MiasDB is a comprehensive information resource describing SF-RNA and protein-protein interactions associated with AS. The data in MiasDB are freely accessible through the web interface, which allows users to access and intuitively browse through the information. The search entry allows users to retrieve interaction information using one of three features: the name of a gene with an AS event, the name of an SF or RNA element, or the AS type (see S1 Fig). The output for each selected feature is displayed in a table. Detailed instructions for using the database can be found under the help entry on the webpage.
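The three search features can be sketched as parameterized queries against a single relational table. The example below is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database rather than the MySQL and PHP stack of the actual server, and the table name, column names, and rows are hypothetical and do not reproduce the MiasDB schema or its data.

```python
import sqlite3

# Hypothetical placeholder rows: (gene, interactor_a, interactor_b, as_type).
DEMO_ROWS = [
    ("GENE_X", "SF_A", "ESE_1", "exon skipping"),
    ("GENE_X", "SF_A", "KINASE_B", "exon skipping"),
    ("GENE_Y", "SF_C", "ISS_2", "intron retention"),
]

# One WHERE clause per search feature described in the text.
WHERE_CLAUSES = {
    "gene": "gene = ?",
    "interactor": "interactor_a = ? OR interactor_b = ?",
    "as_type": "as_type = ?",
}


def build_demo_db():
    """Create an in-memory database with an assumed one-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interaction ("
        "id INTEGER PRIMARY KEY, gene TEXT, "
        "interactor_a TEXT NOT NULL, interactor_b TEXT NOT NULL, as_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO interaction (gene, interactor_a, interactor_b, as_type) "
        "VALUES (?, ?, ?, ?)",
        DEMO_ROWS,
    )
    return conn


def search(conn, feature, term):
    """Return matching rows for one of the three search features."""
    if feature not in WHERE_CLAUSES:
        raise ValueError(f"unknown search feature: {feature}")
    where = WHERE_CLAUSES[feature]
    params = (term,) * where.count("?")  # bind the term once per placeholder
    sql = (
        "SELECT gene, interactor_a, interactor_b, as_type "
        f"FROM interaction WHERE {where} ORDER BY id"
    )
    return conn.execute(sql, params).fetchall()


conn = build_demo_db()
for row in search(conn, "interactor", "SF_A"):
    print(row)
```

The search term is always passed as a bound parameter, and only the fixed clause chosen from `WHERE_CLAUSES` is interpolated into the SQL string, so user input never becomes part of the query text.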
Comparison with other AS databases
MiasDB has features that clearly distinguish it from other AS databases. Most existing AS databases are aimed at collecting AS events but do not provide information regarding the regulatory mechanisms. SpliceAid-F is the only database that shares some features with MiasDB (Table 3) [33]. However, SpliceAid-F only focuses on the RNA-binding specificity of trans-acting SFs and contains 71 human RNA-binding splicing regulatory proteins, whereas MiasDB includes interactions of SF-RNA and PPIs and contains 342 human RNA-binding splicing regulatory proteins. In addition, SpliceAid-F includes information for multiple organisms, including human, mouse, and HIV-1. In total, 655 human RNA-binding sequences and 111 genes are included in SpliceAid-F. These RNA-binding sequences can be divided into 456 natural binding sites and 199 mutant binding sites; deleting some repeated binding sites results in 331 binding sequences consisting of 236 natural binding sites and 95 mutant binding sites. In 2009, Chen and Manley collected binding sequence information for 18 SR proteins, 14 hnRNPs and 17 tissue-specific AS factors [38]. In summary, the advantages of MiasDB over other databases are as follows. (1) MiasDB is mainly concerned with human genome AS. (2) MiasDB not only contains interactions between cis-acting elements and trans-acting factors but also interactions among trans-acting factors. (3) MiasDB stores a larger number of AS interactions and genes than other databases.
Applications of the database
MiasDB has many potential applications. One important application is constructing regulatory networks for AS events that involve multiple RNA elements, SFs and other proteins. Examples of networks for specific genes such as BCL2L1, CSK, CD44, PTPRC, CFTR, FAS, FGFR2, FN1, INSR, NF1, SMN2, and MAPT are presented in the current version of MiasDB. A user can observe the regulatory network by searching for the gene name. Here, the regulatory network for fibronectin 1 (FN1) is shown as an example to illustrate the application of MiasDB (see Fig 1). The gene has three alternatively spliced regions: extra domain A (EDA, also known as EDI or EIIIA), extra domain B (EDB, also known as EDII or EIIIB) and type III connecting segment (IIICS or V region). The AS type for EIIIA and EIIIB is exon skipping. These two exons tend to be excluded in most adult tissues and included during events that involve tissue growth or regeneration, such as embryogenesis and wound healing [39]. The explanation of the AS regulatory network for FN1 can be downloaded by searching for ‘FN1’ in MiasDB.
In Fig 1, the boxes represent exons separated by introns, which are shown as lines. The cis-acting elements and trans-acting factors regulating FN1 exon selection are indicated. The blue box in the intron downstream of EIIIB denotes an intronic splicing enhancer (UGCAUG). The ‘+’ symbol denotes promoting inclusion of the exon; the ‘-’ symbol denotes repressing inclusion of the exon. Direct physical interactions are depicted as solid lines, whereas functional interactions are shown as dotted lines. The black oval denotes the boundary between the nucleus and cytoplasm. Other regulatory networks can be queried by searching for the gene name in MiasDB.
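The conventions used in the figure (a sign for promoting or repressing exon inclusion, and a solid or dotted line for physical or functional interactions) map naturally onto a signed, typed edge list. The Python sketch below shows one possible way to assemble such a network from interaction records. It is an illustration only: the function names are hypothetical, and the regulators and targets are placeholders rather than the FN1 regulators curated in MiasDB.

```python
from collections import defaultdict


def build_network(edges):
    """Group edges by target.

    Each edge is (source, target, effect, kind), where effect is "+"
    (promotes inclusion) or "-" (represses inclusion) and kind is
    "physical" or "functional".
    """
    network = defaultdict(list)
    for source, target, effect, kind in edges:
        if effect not in {"+", "-"}:
            raise ValueError(f"unknown effect: {effect}")
        if kind not in {"physical", "functional"}:
            raise ValueError(f"unknown interaction kind: {kind}")
        # Drawing convention from the figure legend.
        style = "solid" if kind == "physical" else "dotted"
        network[target].append((source, effect, style))
    return dict(network)


def regulators(network, target, effect):
    """Return the sorted sources that act on a target with the given effect."""
    return sorted(src for src, eff, _ in network.get(target, []) if eff == effect)


# Hypothetical placeholder edges, for illustration only.
edges = [
    ("SF_A", "EXON_1", "+", "physical"),
    ("SF_B", "EXON_1", "-", "physical"),
    ("KINASE_C", "SF_A", "+", "functional"),
]
network = build_network(edges)
print(regulators(network, "EXON_1", "+"))
```

Indexing the edges by target makes it straightforward to list everything that acts on a given exon or factor, which is the question the network figures are designed to answer.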
On-going developments and future directions
Although many AS databases have been developed over the past few years, most were constructed by comparing the EST content of transcripts from the same gene. Databases including information regarding AS regulation remain scarce. In this regard, MiasDB provides a comprehensive database that describes the interactions among RNA, SFs and other protein factors in AS regulation. Accordingly, MiasDB is helpful for constructing AS regulatory networks and provides a guide for experimental investigations of the mechanisms that regulate AS.
MiasDB release 1.0 will serve as a central resource for AS factor interactions. Updates, improvements and further developments will be made annually. We will continue to update the interaction information related to human AS events, and in the future, we expect to add interaction information for other organisms via carefully curated screenings of the literature. In addition, a linkage between MiasDB and other databases, such as KEGG (Kyoto Encyclopedia of Genes and Genomes), will also be built. The existing network of AS mechanisms and the analytical capabilities of the web interface will be expanded with further novel data-mining and visualization tools. Due to the cotranscriptional nature of splicing, splicing factors and transcription factors can influence each other; thus, we will also include information regarding interactions between SFs and TFs. By integrating information on splicing pathways in MiasDB release 1.0 and other related databases, we will also develop theoretical models to infer new nodes and edges in the network.
The database is freely accessible through the web server at http://126.96.36.199/Miasdb. Furthermore, all metadata records, statistics and supporting information for MiasDB have also been uploaded to Figshare and can be accessed at https://dx.doi.org/10.6084/m9.figshare.3103057.v1.
Materials and Methods
In MiasDB, all of the interaction information associated with AS was obtained from literature in which the interactions were experimentally validated. We searched PubMed using the term ‘alternative splicing’. Several thousand papers published before January 2016 were screened, and 330 publications containing experimentally validated interaction information among RNA, SFs and other protein factors on AS events were used to populate the database.
Design of the MiasDB interface
The web frontend of MiasDB was created using HTML and PHP. The database was developed under a relational database framework using MySQL. The interface consists of five different sections (see S2 Fig): a ‘home page’ to introduce the database, ‘Database Statistics’, a ‘search’ entry to query the database and present the results of a query, ‘Help’ to provide instructions on how to use the database, and ‘Contact Us’ to show the contact information for our group.
S1 Table. Interactions in the minor splicing pathway.
We thank Hua Lou (Case Western Reserve University) for helpful discussions and critical reading of the manuscript.
Conceived and designed the experiments: LC YX. Performed the experiments: YX XZ JL GW GL XC HZ LC. Analyzed the data: YX GL XZ LC. Contributed reagents/materials/analysis tools: TY DL. Wrote the paper: YX.
- 1. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 2008; 40: 1413–1415. pmid:18978789
- 2. Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, et al. Alternative isoform regulation in human tissue transcriptomes. Nature. 2008; 456: 470–476. pmid:18978772
- 3. Merkin J, Russell C, Chen P, Burge CB. Evolutionary dynamics of gene and isoform regulation in mammalian tissues. Science. 2012; 338: 1593–1599. pmid:23258891
- 4. Kornblihtt AR, Schor IE, Allo M, Blencowe BJ. When chromatin meets splicing. Nat Struct Mol Biol. 2009; 16: 902–903. pmid:19739285
- 5. Kalsotra A, Cooper TA. Functional consequences of developmentally regulated alternative splicing. Nat Rev Genet. 2011; 12: 715–729. pmid:21921927
- 6. Singh RK, Cooper TA. Pre-mRNA splicing in disease and therapeutics. Trends Mol Med. 2012; 18(8): 472–482. pmid:22819011
- 7. Xiong HY, Alipanahi B, Lee LJ, Bretschneider H, Merico D, Yuen RK, et al. RNA splicing. The human splicing code reveals new insights into the genetic determinants of disease. Science. 2015; 347(6218): 1254806. pmid:25525159
- 8. Scotti MM, Swanson MS. RNA mis-splicing in disease. Nat Rev Genet. 2016; 17(1): 19–32. pmid:26593421
- 9. Lee Y, Rio DC. Mechanisms and regulation of alternative pre-mRNA splicing. Annu Rev Biochem. 2015; 84: 291–323. pmid:25784052
- 10. Pérez-Valle J, Vilardell J. Intronic features that determine the selection of the 3’ splice site. WIREs RNA. 2012; 3: 707–717. pmid:22807288
- 11. Niemelä EH, Verbeeren J, Singha P, Nurmi V, Frilander MJ. Evolutionarily conserved exon definition interactions with U11 snRNP mediate alternative splicing regulation on U11-48K and U11/U12-65K genes. RNA Biol. 2015; 12: 1256–1264. pmid:26479860
- 12. Turunen JJ, Verma B, Nyman TA, Frilander MJ. HnRNPH1/H2, U1 snRNP, and U11 snRNP cooperate to regulate the stability of the U11-48K pre-mRNA. RNA. 2013; 19: 380–389. pmid:23335637
- 13. Turunen JJ, Niemela EH, Verma B, Frilander MJ. The significant other: splicing by the minor spliceosome. WIREs RNA. 2013; 4: 61–76. pmid:23074130
- 14. Matlin AJ, Clark F, Smith CW. Understanding alternative splicing: towards a cellular code. Nat Rev Mol Cell Biol. 2005; 6: 386–398. pmid:15956978
- 15. Barash Y, Calarco JA, Gao W, Pan Q, Wang X, Shai O, et al. Deciphering the splicing code. Nature. 2010; 465: 53–59. pmid:20445623
- 16. Luco RF, Allo M, Schor IE, Kornblihtt AR, Misteli T. Epigenetics in alternative pre-mRNA splicing. Cell. 2011; 144: 16–26. pmid:21215366
- 17. Gómez-Acuña LI, Fiszbein A, Alló M, Schor IE, Kornblihtt AR. Connections between chromatin signatures and splicing. Wiley Interdiscip Rev RNA. 2013; 4: 77–91. pmid:23074139
- 18. Shin C, Manley JL. Cell signaling and the control of pre-mRNA splicing. Nat Rev Mol Cell Biol. 2004; 5: 727–738. pmid:15340380
- 19. Martinez NM, Lynch KW. Control of alternative splicing in immune responses: many regulators, many predictions, much still to learn. Immunol Rev. 2013; 253: 216–236. pmid:23550649
- 20. Holste D, Huo G, Tung V, Burge CB. HOLLYWOOD: a comparative relational database of alternative splicing. Nucleic Acids Res. 2006; 34: D56–D62. pmid:16381932
- 21. Stamm S, Riethoven JJ, Le Texier V, Gopalakrishnan C, Kumanduri V, Tang Y, et al. ASD: a bioinformatics resource on alternative Splicing. Nucleic Acids Res. 2006; 34: D46–D55. pmid:16381912
- 22. Lee Y, Lee Y, Kim B, Shin Y, Nam S, Kim P, et al. ECgene: an alternative splicing database update. Nucleic Acids Res. 2007; 35: D99–D103. pmid:17132829
- 23. Namshin K, Alekseyenko AV, Roy M, Lee C. The ASAP II database: analysis and comparative genomics of alternative splicing in 15 animal species. Nucleic Acids Res. 2007; 35: D93–D98. pmid:17108355
- 24. Huang YH, Chen YT, Lai JJ, Yang ST, Yang UC. PALS db: Putative Alternative Splicing database. Nucleic Acids Res. 2002; 30: 186–190. pmid:11752289
- 25. Pospisil H, Herrmann A, Bortfeldt RH, Reich JG. EASED: Extended Alternatively Spliced EST Database. Nucleic Acids Res. 2004; 32: D70–D74. pmid:14681361
- 26. Huang HD, Horng JT, Lin FM, Chang YC, Huang CC. SpliceInfo: an information repository for mRNA alternative splicing in human genome. Nucleic Acids Res. 2005; 33: D80–D85. pmid:15608290
- 27. de la Grange P, Dutertre M, Correa M, Auboeuf D. A new advance in alternative splicing databases: from catalogue to detailed analysis of regulation of expression and function of human alternative splicing variants. BMC Bioinformatics. 2007; 8: 180.
- 28. Busch A, Hertel KJ. HEXEvent: a database of human exon splicing events. Nucleic Acids Res. 2013; 41: D118–D124. pmid:23118488
- 29. Ji HK, Zhou Q, Wen F, Xia HY, Lu X, Li YD. AsMamDB: an alternative splice database of mammals. Nucleic Acids Res. 2001; 29: 260–263. pmid:11125106
- 30. Dralyuk I, Brudno M, Gelfand MS, Zorn M, Dubchak I. ASDB: database of alternatively spliced genes. Nucleic Acids Res. 2000; 28: 296–297. pmid:10592252
- 31. Burset M, Seledtsov IA, Solovyev VV. SpliceDB: database of canonical and non-canonical mammalian splice sites. Nucleic Acids Res. 2001; 29: 255–259. pmid:11125105
- 32. Shionyu M, Yamaguchi A, Shinoda K, Takahashi K, Go M. AS-ALPS: a database for analyzing the effects of alternative splicing on protein structure, interaction and network in human and mouse. Nucleic Acids Res. 2009; 37: D305–D309. pmid:19015123
- 33. Giulietti M, Piva F, D'Antonio M, D'Onorio De Meo P, Paoletti D, Castrignanò T, et al. SpliceAid-F: a database of human splicing factors and their RNA-binding sites. Nucleic Acids Res. 2013; 41: D125–D131. pmid:23118479
- 34. Zhou HL, Luo G, Wise JA, Lou H. Regulation of alternative splicing by local histone modifications: potential roles for RNA-guided mechanisms. Nucleic Acids Res. 2014; 42: 701–713. pmid:24081581
- 35. Luco RF, Pan Q, Tominaga K, Blencowe BJ, Pereira-Smith OM, Misteli T. Regulation of alternative splicing by histone modifications. Science. 2010; 327: 996–1000. pmid:20133523
- 36. Shukla S, Oberdoerffer S. Co-transcriptional regulation of alternative pre-mRNA splicing. Biochim Biophys Acta. 2012; 19: 673–683.
- 37. Hnilicová J, Hozeifi S, Duskova E, Icha J, Tomankova T, Stanek D. Histone deacetylase activity modulates alternative splicing. PLoS One. 2011; 6: e16727. pmid:21311748
- 38. Chen M, Manley JL. Mechanisms of alternative splicing regulation: insights from molecular and genomics approaches. Nat Rev Mol Cell Biol. 2009; 10: 741–754. pmid:19773805
- 39. Blaustein M, Pelisch F, Tanos T, Muñoz MJ, Wengier D, Quadrana L, et al. Concerted regulation of nuclear and cytoplasmic activities of SR proteins by AKT. Nat Struct Mol Biol. 2005; 12: 1037–1044. pmid:16299516