Alternative Splicing Regulated by Butyrate in Bovine Epithelial Cells

As a signaling molecule and an inhibitor of histone deacetylases (HDACs), butyrate exerts its impact on a broad range of biological processes, such as apoptosis and cell proliferation, in addition to its critical role in energy metabolism in ruminants. This study examined the effect of butyrate on alternative splicing in bovine epithelial cells using RNA-seq technology. Junction reads account for 11.28 and 12.32% of total mapped reads between the butyrate-treated (BT) and control (CT) groups. 201,326 potential splicing junctions detected were supported by ≥3 junction reads. Approximately 94% of these junctions conformed to the consensus sequence (GT/AG) while ∼3% were GC/AG junctions. No AT/AC junctions were observed. A total of 2,834 exon skipping events, supported by a minimum of 3 junction reads, were detected. At least 7 genes, their mRNA expression significantly affected by butyrate, also had exon skipping events differentially regulated by butyrate. Furthermore, COL5A3, which was induced 310-fold by butyrate (FDR <0.001) at the gene level, had a significantly higher number of junction reads mapped to Exon#8 (Donor) and Exon#11 (Acceptor) in BT. This event had the potential to result in the formation of a COL5A3 mRNA isoform with 2 of the 69 exons missing. In addition, 216 differentially expressed transcript isoforms regulated by butyrate were detected. For example, Isoform 1 of ORC1 was strongly repressed by butyrate while Isoform 2 remained unchanged. Butyrate physically binds to and inhibits all zinc-dependent HDACs except HDAC6 and HDAC10. Our results provided evidence that butyrate also regulated deacetylase activities of classical HDACs via its transcriptional control. Moreover, thirteen gene fusion events differentially affected by butyrate were identified. Our results provided a snapshot into complex transcriptome dynamics regulated by butyrate, which will facilitate our understanding of the biological effects of butyrate and other HDAC inhibitors.


Introduction
Alternative splicing (AS) is an important driving force behind vast protein diversity and the evolution of phenotypic complexity in mammals. AS generates various protein isoforms from single genes with different biological properties [1] and plays a critical role in development and disease [2][3][4]. Furthermore, AS is under strict regulatory control, including an accurate recognition of the splice junction that defines intron and exon boundary [5]. AS is mediated by elaborate molecular machinery, the spliceosome, which consists of at least 5 small nuclear RNAs (snRNAs) and over 100 accessory proteins [6]. Mutations in the core element of the spliceosome cause disease [7]. Up to 94% of human genes are estimated to undergo AS [4]; hence, aberrant AS is frequently associated with numerous human diseases [1]. Therefore, splicing modulation has been touted as a therapeutic means for treating genetic diseases caused by splicing mutations [8].
Butyrate is a preferred substrate for gut epithelial cells. In ruminants, butyrate contributes to 70% of energy metabolism. In monogastric species, butyrate also plays an important role in energy metabolism in the hindgut [9]. Moreover, butyrate acts as a histone deacetylase inhibitor and can induce apoptosis and inhibit cell proliferation in vitro [10]. As a signaling molecule, butyrate induces a profound change in gene expression and results in a significant change in transcript abundance of ,50% genes transcribed in the transcriptome of epithelial cells [11]. The effect of butyrate on growth rate of ovarian carcinoma cells in vitro is associated with a significant decrease of protooncogene MYC. Down-regulation of MYC is accomplished at least partially by accelerated MYC mRNA degradation and by inhibiting splicing [12]. Furthermore, butyrate is known to have an impact on AS of vascular endothelial growth factor (VEGF) gene [13]. Butyrate significantly increases the expression levels of both mRNA and protein of transcript variants that are antiangiogenic in human lung endothelial cells while the proangiogenic isoform is not detectable [13]. However, the effect of butyrate on AS has not been systematically investigated in cattle. In this study, we presented evidence that butyrate can influence AS using high-throughput transcriptome technology and bioinformatics.

Splice Junction Inference
The total number of reads mapped to the bovine genome (UMD3.1) using the Genomic Short-read Nucleotide Alignment Program (GSNAP) [14] were 52. 25 and 47.73 million for the butyrate-treated (BT) and control (CT) groups, respectively (Table 1). For the sake of simplicity, only reads uniquely mapped to the genome (i.e., hitting only one locus) were counted. Among them, junction reads-those spanning exon-exon junctions with at least 8 bp of overhangs in each exon-accounted for 11.28 and 12.32% for the butyrate and control groups, respectively (P,0.001). These junction coverage results were in agreement with a published report [15], which shows 10-16% reads containing splice junctions in breast tumor samples. A total of 799,589 potential splicing junction sites were detected. Of them, 201,326 splice junction sites were supported by a minimum of 3 junction reads (mean; N = 8). Approximately 94% of all splicing sites conformed to the consensus sequence, GT/AG while ,3% of junction sites belonged to the most common non-consensus splice junction GC/AG. The percentage of GC/AG junction observed in bovine epithelial cells was higher than that observed in humans and other species [16] where the GT-GC conversation represents only a small fraction at the donor sites (,1%). No AT/AC splice junction sites were observed in the bovine epithelial cell in this study (Table 1).

Exon Skipping
A total of 2,834 potentially consistent exon skipping events were detected using GSNAP. A minimum of three junction reads (mean across all eight samples tested) supported each of these events. These junction reads were mapped to the same chromosome and the same strand in the correct order within a close proximity of 20,000 bp as defined (consistent), as compared to other common events, such as scrambles, where the two halves of junction reads were mapped to the same chromosome and on the same strand but in the wrong order. The 2,504 events were involved in 2,113 genes. Table 2 lists select exon skipping events with the difference in mean junction read counts greater than 20.0 between butyratetreated and control groups. The gene expression level of all genes involved in these exon skipping events was significantly regulated by butyrate treatment at a stringent false discovery rate (FDR ,0.0001) [11]. For example, collagen, type V, alpha 3 (COL5A3) was significantly up-regulated (,310 fold) by butyrate (P,0.0001 and FDR ,0.0001). The mean number of junction reads mapped to exon #8 (donor) and exon#11 (acceptor) in butyrate-treated cells was 16.00 (65.10, sd) while no junction reads spanning this region (0.0060.00) were detected in control cells. This event had the potential to result in the formation of an mRNA isoform with 2 of the COL5A3 69 exons missing (exon#9 and exon#10). Quantitative RT-PCR confirmed this event (data not shown). Similarly, Exon#2 of S100A1 gene may be skipped, supported by multiple junction reads in BT, and the resultant new isoform was also differentially regulated by butyrate (Table 2).

Gene Fusion Events Significantly Affected by Butyrate
Only consistent splice junctions-both halves of junction reads mapped to the same chromosome strand in the correct order within a default distance, 20,000 bp-were considered in this study. Translocation (two halves mapped to different chromosomes), inversion (two halves mapped to the same chromosome but on opposite strands), scramble (on the same chromosome strand but in an incorrect order) as well as the genomic deletion [17] were excluded. The local gene fusion events or read-through in the bovine epithelial cells were readily detected by GSNAP. Of these, 13 events were significantly affected by butyrate (P,0.05; Table 3). The genes involved in these events were within the genomic interval set by the default parameter, -w = 20kb. As the table shows, about half of the events are involved in the members of gene families. For example, the gene fusion event involving keratin 8 (donor gene) and keratin 4 (acceptor gene) was supported by 236 and 165 junction reads for BT and CT groups, respectively (P,0.0001). This event was verified independently using RT-PCR. Similarly, a gene fusion between homeobox A6 (HOXA6) and homeobox A5 (HOXA5) was supported by 3.47 and 7.19 junction reads for BT and CT, respectively (P,0.05). Interestingly, this gene fusion also started from an alternative 59 splice junction (donor site), changing the 39 boundary of the upstream exon.

Differentially Expressed Isoforms Induced by Butyrate
Mixture-of-Isoforms algorithm (MISO), a probabilistic framework that enables quantification of the expression level of alternatively spliced genes and identification of differentially regulated isoforms from RNA-Seq data [18], was used to detect differentially expressed isoforms induced by butyrate in bovine epithelial cells. In this study, MISO detected a total of 216 isoforms from 98 genes that were differentially regulated by butyrate using a combined cutoff of DY .0.20 (i.e., the posterior distribution over the change in Psi for each event) and Bayes factor .10. The Bayes factor represents the weight of the evidence in the data in favor of differential expression. For example, Bayes factor = 2 means that the isoform is two times more likely to be differentially expressed. Of these events, 50 isoforms from 36 genes were supported by at least 3 pair-wise comparisons ( Table 4); 29 of the 36 genes had two mRNA isoforms while the rest had at least 3 isoforms. The mRNA-level expression of defensin, beta 1 gene (DEFB1) was not detected in control cells but significantly upregulated by butyrate (FDR ,0.001). This gene had two isoforms. Isoform #1 had 3 exons and was not detectable in control cells but barely detectable in butyrate-treated cells. However, Isoform#2 had 2 exons and was strongly induced by butyrate (Fig. 1). Similarly, IL-18 (ENSBTAG00000000277) had 2 isoforms with the same number of exons. These 2 isoforms differed in the exon structure, nevertheless. Isoform#1 had a similar number of assigned sequence reads while Isoform#2 had approximately 10 times more assigned reads in butyrate-treated cells (252.08630.51, mean 6 sd) than in control cells (25.94613.69) (P,0.0001, Fig. 2). Isoform #1 of origin recognition complex, subunit 1 (ORC1, The mean length of input reads was 49 bp and two mismatches were allowed by GSNAP. Reads hitting more than one locus in the genome were excluded. Each exon-exon junction was supported by at least one junction (spliced) read with a minimum of eight bases aligned to each exon. Read counts were normalized. Numbers denote mean 6 sd (N = 4). doi:10.1371/journal.pone.0039182.t001 transcript# ENSBTAT00000052505) had mean read counts of 168.26652.75 (6 sd) and 1239.75648.02 in butyrate and control groups, respectively, and was strongly repressed by butyrate. Isoform#2 of ORC1 remained unchanged. Histone deacetylase 10 (HDAC10) had 2 isoforms with the read count ratio between dominant (ENSBTAT00000014602) and minor (EN-SBTAT00000042583) isoforms greater than 150 in the control cells. Similarly, the dominant isoform was significantly repressed (P,0.00001), while the minor isoform was not affected by butyrate. However, the expression of both isoforms of HDAC5 was strongly enhanced by butyrate treatment (data not shown). For genes with $3 isoforms, one or 2 isoforms were often differentially regulated by butyrate. As Figure 3 shows, the expression level of Isoforms# 1 and 3 of aminoacylase 1gene (ACY1) was unchanged but that of Isoform#2 was strongly repressed by butyrate (Fig. 3). On the other hand, Isoform#2 of coiled-coil domain containing 24 (CCDC24) was little changed in response to butyrate treatment while Isoforms#1 and 3 were significantly induced by butyrate (Fig. 4). Moreover, differential expression of 6 selected transcript isoforms induced by butyrate, including transcripts such as ENSBTAT00000009005 (LGALS9), ENSBTAT00000010687 (CLSTN3), ENSBTAT00000012035 (MX1), ENSBTAT00000020401 (NASP), EN-SBTAT00000036545 (GUK1), and ENSBTAT00000061158 (TACC2), were confirmed using RT-PCR.

Discussion
Butyrate is one of important short-chain fatty acids (SCFAs). Produced in the rumen and hindgut by gut microorganisms, butyrate is rapidly absorbed and utilized by the rumen and colon epithelium, contributing to 75% of the total energy requirement in ruminants and ,10% for humans [9,19]. Additionally, butyrate is shown to reinforce intestinal barriers and modulate motility and visceral sensitivity of the intestine [20]. As a signaling molecule, butyrate induces apoptosis and inhibits cell proliferation, and therefore, it has antitumorigenic properties. Butyrate is an inhibitor of histone deacetylases (HDACs), one of three classes of enzymes epigenetically modifying chromatin histones. Mounting evidence suggests histone acetylation plays a major role in controlling transcriptional activities of genes [21,22]. HDAC inhibitors such as butyrate induce hyperacetylation of histones, and therefore increase transcriptional activities [2]. This was supported by our observation that the total number of genes transcribed in butyrate-treated cells (mean 6 sd = 19,3226155)    [11,19]. However, these studies focus on the transcription at a gene level. The effect of butyrate on individual transcript isoforms and alternative splicing has been systematically studied only recently in human cells [27]. In this study, we examined the regulation of alternative splicing by butyrate in bovine epithelial cells. Our results should facilitate a better understanding of alternative splicing in the development of epithelial cells-derived diseases.
Of four classes of histone deacetylases, butyrate inhibits enzymatic activities of most HDACs in Class I, II, and IV, which are zinc-dependent, except HDAC6 and HDAC10 [28]. Class III  The number under Butyrate and Control denotes mean counts of junction reads (6 sd; N = 4). Exon denotes the exons in the transcript. Exon 1-17 indicates that this transcript contains exon 1 to exon 17. DY denotes the difference in mean posterior distribution between two samples. Bayes factor represents the odds of differential expression over no differential expression (mean of all significant comparisons). P values were calculated from junction reads between butyrate-treated and untreated control groups using a modified t-test. doi:10.1371/journal.pone.0039182.t004  HDACs (also called sirtuins or SIRTs) depend on nicotinamide adenine dinucleotide for their catalytic activity [29]. SIRTs are associated with chromtain regulation and affect genome stability in yeast and may represent pivotal regulators of lifespan and aging [30]. SIRTs catalyze two major biochemical reactions: deacetylation on lysine residues of target proteins by altering cellular [NAD + ]/[NADH] ratios (SIRT1, SIRT2, SIRT3, SIRT5, and SIRT7) and ADP-ribosylation (SIRT4 and SIRT6) [31]. In neuronal cells, SIRT1, SIRT5, and SIRT6 are down-regulated, whereas SIRT2, SIRT4, and SIRT7 up-regulated by butyrate [32]. Our RNA-seq data suggest that butyrate regulated the transcript abundance or gene expression of the majority of HDACs (Table 5). Butyrate significantly increased the expression of HDAC3, HDAC5, and HDAC11. On the other hand, the expression level of HDACs7-10 was significantly down-regulated. The mRNA levels of SIRT4 and SIRT6 were strongly upregulated while SIRT1 was significantly down-regulated by butyrate (Table 5). However, the relative abundance of HDAC1, HDAC2, HDAC4, and HDAC6 remained unchanged by butyrate. In addition to its effect on the expression at the whole gene level, butyrate selectively regulated the transcript abundance of different mRNA isoforms. While the abundance of both short and long isoforms of HDAC5 was significantly enhanced by butyrate, only the long and dominant isoform of HDAC10 was significantly down-regulated. Intriguingly, butyrate is unable to inhibit the catalytic activity of HDAC10 [28]. Nevertheless, butyrate may still exert its control on the deacetylase activity of HDAC10 via transcriptional regulation at the mRNA level. Our future work will focus on the biological relevance of various HDAC isoforms induced by butyrate, especially various SIRTs and their roles in cell senescence and aging. Distant gene fusion events are well known in tumors, often resulting from genomic abnormalities such as chromosomal translocation. These events, such as BCL-ABL, lead to the formation of a novel chimeric protein with different functions and are one of the common mechanisms for oncogene activation [33]. Recently, a new type of fusion involving two adjacent genes in the same orientation on the same chromosome has been described [34,35]. Adjacent genes are normally transcripted independently. However, a single transcript can be occasionally formed to include at least part of one exon from each of two or more distinct genes [36]. This phenomenon, Transcription Induced Chimeras (TICs) [35] or Conjoined Genes (CGs) [36], is widespread in mammalian genomes. It is estimated that at least 4%-5% of the tandem gene pairs in the human genome can be transcripted into TICs. Moreover, these TICs may possess novel functions because .70% of them are conserved in other vertebrate genomes [36]. In this  study, we detected 13 TICs that were supported by multiple junction reads. Intriguingly, these TICs were also differentially regulated by butyrate in bovine epithelial cells. Approximately 46% of these fusion events were involved in the members of gene families, which is much higher than 11% as previously reported [36]. For example, TICs were formed between 2 homeobox genes, HOXA6 and HOXA5 and between 2 keratin genes, KRT8 and KRT4. In addition, the fusion between zinc finger proteins ZNF865 and ZNF524 was also supported by multiple junction reads, and a significantly higher number of reads was detected in untreated control cells than in butyrate-treated cells. A similar fusion event between ZNF649 and ZNF577 was identified in prostate tumors [37]. A relatively higher percentage of TICs between genes with related functions identified in this study should be further examined. Most importantly, the functional significance of these fusion events, especially their possible role in transcription regulation, should be experimentally determined.
One advantage of GSNAP algorithm is its potential to identify novel splice sites and therefore possibly novel transcript isoforms. The algorithm relies on a maximum entropy model and uses frequencies of nucleotides neighboring a donor and acceptor splice site to discriminate between true and false splicing sites [14]. The power of this approach was exemplified by a case study involving the prohibitin gene (PHB). The prohibitin protein complex, located in mitochondrial inner membrane, is formed by heteromeric binding of both PHB and PHB2 [38] and is involved in transcription regulation and cell cycle progression by blocking the G1/S transition of the cell cycle [39]. Prohibitin induces apoptosis by interacting with the retinoblastoma protein as well as being involved in the p53 pathway. Its 39 untranslated region (UTR) acts as a novel class of non-coding regulatory RNAs. Additionally, PHB expression is up-regulated in the retina in aging and diabetic models and may serve as an oxidative marker [38]. A recent study using thyroid tumor cell lines demonstrates that butyrate increases PHB mRNA expression. Furthermore, butyrate as well as other HDAC inhibitors, such as trichostatin A, affects PHB splicing [2], leading to the over-expression of the longer isoform with 39 UTR. Both inhibitors decrease the mRNA levels of the shorter isoform but increase those of the longer isoform, which exerts a growthsuppressive action. Our results showed that butyrate significantly down-regulated the mRNA expression of both PHB and PHB2 in the bovine epithelial cell (FDR ,0.0001). No known isoforms in both genes have been annotated in cattle so far. Annotated PHB and PHB2 genes have 7 and 10 exons, respectively. GSNAP correctly identified all normally splicing exon-exon junctions. Moreover, GSNAP detected novel splice sites. For example, several junction reads detected in the untreated control cells suggest a possible exon skipping event that may result in skipping of Exon#2 in PHB2. Such reads were not detectable in the butyrate-treated cells. In the PHB gene, significantly higher numbers of junction reads in the control group than the butyratetreated group indicated that multiple alternative splicing events involved Exon#1 and Exon#2 and the intron between them. These events occurred in the 59 UTR and did not seem to alter its primary protein structure. In humans, the 39 UTR of PHB is attributed to its anti-tumorigenic and anti-proliferative properties [40]. The biological implication of various splicing events in the 59 UTR of PHB genes in cattle is worthy of further investigation.

Treatments
The bovine epithelial cells used in this study were previously described [11]. The culture and sodium butyrate treatment was essentially the same as reported [26]. Briefly, cells were treated with 10 mM sodium butyrate for 24h. As a result of butyrate treatment, cell cycle arrest was notable. The percentage of G1/G0 cells was increased from ,41% in normal cell populations to 79% in butyrate-treated cell populations, in a good agreement with previous results [10]. Cells were then harvested and high-quality total RNA (RNA Integrity number or RIN .9.0) was processed using an Illumina TruSeq RNA sample prep kit following the manufacturer's instruction (Illumina, San Diego, CA, USA). After quality control procedures, individual RNA-seq libraries were then pooled based on their respective sample-specific 6-bp adaptors and sequenced at 50bp/sequence read using an Illumina HiSeq 2000 sequencer as described previously [41]. The mean number of sequence reads generated per sample was 67,527,11168,330,388.58 (6 sd). A total of eight replicates in two groups, butyrate treated (BT) and untreated (control or CT; N = 4 for each group), were used. Raw sequence reads were deposited to the NCBI Sequence Read Archive (accession# SRA051007.1).

Data Analysis and Bioinformatics
Raw sequence reads were first checked using our quality control pipeline. Nucleotides of each raw read were scanned for low quality and trimmed using SolexaQA [42]. Input reads were then aligned to the bovine reference genome using GSNAP [14] with default parameters. Two mismatches were tolerated (''-m 20 flag). The intron size was specified by ''-w'' = 20,000 bp. Read counts, including junction reads, were extracted from GSNAP -N flag output files and further analyzed.
For differentially expressed isoform detection, Mixture-of-Isoforms program (MISO v 0.4.1, released February 1, 2012) [18] was used. Trimmed reads were first aligned to the bovine genome (UMD3.1) with TopHat [43] using default parameters. The TopHat SAM output files were converted into BAM files and input into MISO with the GTF file from ENSEMBL bovine genebuild release 65. Differentially expressed isoforms were detected using the following filtering parameters: the sum of inclusion and exclusion reads is greater than 10 ($1 inclusion read and $1 exclusion read), DY greater than 0.20, and the Bayes factor $10. In addition, each SAM output file from TopHat alignment, along with the GTF file from ENSEMBL bovine genebuild release v65, were used in the Cuffdiff program [44] in the Cufflink package (v1.3.0) to test for differential expression at the gene level, as described previously [19].

Quantitative RT-PCR
Ten selected alternative splicing events, including 2 exonskipping event, 2 gene fusion events, and 6 transcript isoforms, were tested using real-time RT-PCR as described previously [45]. Briefly, the cDNA synthesis was performed using an iScript cDNA Synthesis kit (Bio-Rad, Hercules, CA). Real-time RT-PCR analysis was carried out with an iQ SYBR Green Supermix kit (Biorad) using 200 nM of each primer and the 1 st -strand cDNA (100 ng of the input total RNA equivalents) in a 25 ml reaction volume. The amplification was carried out on an iCycler iQ TM Real Time PCR Detection System (BioRad) with the following profile: 95uC -60s; 40 cycles of 94uC-15s, 60uC -30s, and 72uC -30s. A melting curve analysis was performed for each primer pair. PCR products were further analyzed on a high-sensitive DNA chip using an Agilent Bioanalyzer 2100 (Agilent, Palo Alto, CA, USA) for product length.