Genome-wide identification and analysis of A-to-I RNA editing events in the malignantly transformed cell lines from BEP2D induced by α-particles radiation

Adenosine (A) to inosine (I) RNA editing is the most prevalent RNA editing mechanism in humans and play critical roles in tumorigenesis. However, the effects of radiation on RNA editing and the mechanisms of radiation-induced cancer were poorly understood. Here, we analyzed human bronchial epithelial BEP2D cells and radiation-induced malignantly transformed cells with next generation sequencing. By performing an integrated analysis of A-to-I RNA editing, we found that genome-encoded single-nucleotide polymorphisms (SNPs) might induce the downregulation of ADAR2 enzymes, and further caused the abnormal occurrence of RNA editing in malignantly transformed cells. These editing events were significantly enriched in differentially expressed genes between normal cells and cancer cells. In addition, oncogenes CTNNB1 and FN1 were highly edited and significantly overexpressed in cancer cells, thus may be responsible for the lung cancer progression. Our work provides a systematic analysis of RNA editing from lung tumor specimens with high-throughput RNA sequencing and DNA sequencing. Moreover, these results demonstrate further evidence for RNA editing as an important tumorigenesis mechanism.


Introduction
Lung cancer remains the leading cause of cancer death in both men and women, and radon exposure is the second most common cause of lung cancer after smoking [1]. However, the molecular mechanisms of radon-induced lung cancer remain unclear.
RNA editing is a post-transcriptional modification process, the deamination of adenosines (A) to inosines (I) is the prominent RNA editing event in humans, where ADAR enzymes convert A to I without affecting the DNA sequence identity [2]. Intriguingly, RNA editing plays an important role in tumorigenesis, such as recoding RNA editing of AZIN1 predisposes to hepatocellular carcinoma [3], RNA editing in RHOQ promotes invasion potential in colorectal cancer [4], and GABRA3 editing suppresses breast cancer metastasis [5]. Recent study has shown that, in non-small cell lung cancer samples, as ADAR gene amplification, the DNA repair enzyme NEIL1 (K242R) will increase recoding [6]. However, there were limited studies to date in further exploring the characteristics of RNA editing in lung cancer. What's more, the effect of radiation on RNA editing is poorly understood.
Here, we investigate A-to-I RNA editing in human bronchial epithelial cells (BEP2D) and malignantly transformed cells (BERP35T1 and BERP35T4), which are important models to characterize the radiation-mediated carcinogenesis of lung [7,8]. By performing high-throughput RNA sequencing, we identified A-to-I editing sites with three robust bioinformatics methods. We then systemically compared editing events in normal cells and cancer cells. Further, by performing genome-wide DNA sequencing, we revealed that the genomic variants in ADAR2 gene was responsible for the abnormal editing events in cancer cells. Finally, we reported two potential editing genes, CTNNB1 and FN1, in lung cancer.

Identification of A-to-I RNA editing
The prevalence and importance of A-to-I RNA editing have been illuminated in recent years largely owing to the rapid adoption of high-throughput sequencing technologies [9,10]. To analyze A-to-I RNA editing in BEP2D cells and malignantly transformed cell lines, we performed high-throughput RNA sequencing (RNA-Seq) on BEP2D cell and transformed BEP2D cells , which were irradiated with 1.5Gy dose of α -particles emitted by 238 PuO 2 . Two transformed cell lines, BERP35T1 and BERP35T4, were investigated (Fig. 1A). Three biological replicates were sequenced and analyzed for each cell line. We then calculated gene expression level with Cufflinks program [11], for each cell line, biological replicates of RNA-seq showed highly reproducible results (Fig. 1B). Thus, our sequencing data were of high quality for downstream analysis.
Recent studies have reported that the most challenging part of identifying RNA editing is the discrimination of RNA editing sites from genome-encoded single-nucleotide polymorphisms (SNPs) and technical artifacts caused by sequencing or read-mapping errors [12][13][14]. To accurately identify RNA editing sites, we performed three famous methods including GIREMI [15], RNAEditor [16] and Separate method from Jin Billy Li [12] (See Methods). The GIREMI method combines statistical inference of mutual information (MI) between pairs of single-nucleotide variants (SNVs) in RNA-seq reads with machine learning to predict RNA editing sites. RNAEditor identify RNA editing by detecting 'editing islands'. Separate method from Jin Billy Li identify RNA editing sites by strict filtering processes. For each sample, we only used RNA editing sites that can be detected in all three methods. For each cell line, we combined RNA editing sites from three biological replicates. As the most prevalent editing type in humans is adenosine-to-inosine (A-to-I) editing and most noncanonical editing are false positives [17], we only analyzed A-to-I RNA editing in this study. Final, 5659, 3820, 2446 A-to-I RNA editing sites were identified in BEP2D cell line and transformed cell lines BERP35T1 and BERP35T4, respectively (Table 1 and Supplemental Table 1).

A-to-I RNA editing and associated genes in BEP2D and malignantly transformed cell
We next investigated the difference of A-to-I RNA editing between BEP2D cell line and malignantly transformed cell lines. First, 3,683~4,217 editing events were disappeared and 1,004~1,844 new editing events occurred from normal BEP2D cell to malignantly transformed cells, indicating a dramatic changes of RNA editing when BEP2D cell was irradiated ( Fig. 2A). Generally, A-to-I editing is pervasive in Alu repeats because of the double-stranded RNA structures formed by inverted Alu repeats in many genes [18,19]. We found that although RNA editing is quite different in BEP2D and malignantly transformed cell lines, editing sites are still conserved in Alu repeats (Fig. 2B) and ~60% of RNA editing events occurred in intergenic regions (Fig. 2C), thus, the basic distribution of A-to-I RNA editing did not change.
Next, we examined genes targeted by A-to-I RNA editing sites (editing genes for short). In general, 484, 426 and 305 genes were edited in BEP2D, BERP35T1 and BERP35T4 cell lines (Supplemental Table 2). We found that, in BEP2D cell line, 88% of editing genes were targeted by BEP2D-specific editing sites, but in BERP35T1 and BERP35T4, only 70% and 53% of editing genes were targeted by cell-specific editing sites (Fig. 2D). This result suggested that the editing rate of genes decreased when cell was irradiated and malignantly transformed. In addition, we performed gene ontology (GO) analysis to reveal the biological function of editing genes. We found that editing genes were enriched in different biological processes. For BEP2D, editing genes were enriched in protein processes and translation process, for BERP3T1, nervous system development and hemophilic cell adhesion process were highlighted and editing genes were enriched in DNA replication in BERP35T4 cell line (Fig. 2E).
To examine whether RNA editing affects transcription activity, we identified differentially expressed genes (DEGs) by performing Cuffdiff program [11] (Supplemental Table 3). We found that DEGs between normal BEP2D cell and malignantly transformed cells were significantly enriched in genes with RNA editing events (p value < 1E-3, hypergeometric test, Fig. 2F). This observation indicated that the dynamics of RNA editing sites was related to gene dysregulation and may further induce tumorigenesis.

ADAR2 down-regulation by genome SNPs
We next investigated the mechanism responsible for the differences observed in RNA editing between normal BEP2D cell and malignantly transformed cells. In human, A-to-I editing is performed by the ADAR family, which contains 3 genes: ADAR1, ADAR2 and ADAR3 [20][21][22]. We thus examined the transcript levels of ADAR genes. The expression level of ADAR1 in BEP2D cell was comparable to that in BERP35T1 and BERP35T4 cell and ADAR3 was silence in both BEP2D cell and malignantly transformed cells (Fig. 3A). However, ADAR2 expression level significantly reduced in BERP35T1 and BERP35T4 (Fig. 3B). Previous studies have confirmed that ADAR2 is lowly expressed in cancer e.g. glioblastoma [23], gastric cancer [24]. Thus the tumor progression of BEP2D seems to mainly be induced by ADAR2 downregulation.
We further explored the possible mechanism of ADAR2 downregulation in malignantly transformed cells. Radiation induced DNA alterations change gene expression and further increase cancer risk [25,26]. We next performed DNA sequencing (DNA-seq) on BEP2D cell line and malignantly transformed cells. We identified genome-encoded single-nucleotide polymorphisms (SNPs) using GATK pipeline [27] (See Methods). Surprisingly, nearly 2-fold SNPs in ADAR2 gene were detected in malignantly transformed cells compared to normal BEP2D (Fig. 3C, Supplemental Table 4) and more SNPs in ADAR2 exon were observed (Fig. 3D). For example, known SNP rs11701974, a genetic variant of HLA-DQB1 associated with human longevity [28], was detected in 3' UTR of ADAR2 and specific in BERP35T1 cell and BERP35T4 cell (Fig. 3E). Moreover, we identified 32 and 24 novel SNPs in BERP35T1 cell and BERP35T4 cell, respectively (Supplemental Table 4). For example, chr21:46643782 was altered in malignantly transformed cells (Fig. 3F). These results indicate that genomic variants induced by radiation may lead ADAR2 downregulation and further decrease the editing rate of malignantly transformed cells.

Oncogene CTNNB1 and FN1 are highly edited and significantly overexpressed in malignantly transformed cell
To gain insights into the biological relevance of RNA editing in malignantly transformed cell, we investigated 285 oncogenes from previous studies [29] (Supplemental Table 5). We found that oncogenes CTNNB1, PABPC1 and VHL were edited in BERP35T1 cell. Notably, the expression level of CTNNB1 in BERP35T1 cell was significantly higher than that in BEP2D cell (p-value = 0.00185, Fig. 4A). Previous studies reported that activating mutations in CTNNB1 have oncogenic activity resulting in tumor development and somatic mutations are found in various tumor types [30][31][32][33]. We found two A-to-I editing events (chr3:41262966 and chr3:41262974) occurred in BERP35T1 cell and CTNNB1 was overexpression in BERP35T1 cell (Fig. 4B). Similarly, we found three oncogenes FN1, METTL14 and VHL were edited in BERP35T4 cell. Notably, the expression level of FN1 in BERP35T4 cell was significantly higher than that in BEP2D cell (p-value = 5E-5, Fig. 4C). Previous studies have reported that transcriptional activation of FN1 and gene fusions of FN1 promote the malignant behavior of multiple cancers [34][35][36]. A strong A-to-I editing event (chr2:216236508) and a weak A-to-I editing event (chr2:216236482) were observed in BERP35T4 and FN1 was overexpression in BERP35T4 cell (Fig. 4D). These results suggest that RNA editing are associated with oncogene overexpression and may further induce cancer progression.

Discussion
Radon is a recognized cause of lung cancer, however, the cellular and molecular mechanisms of radon-induced lung cancer remains unknown. To facilitate the study of this question, in our previous work, we established a model system of α -particle transformed human cells [37]. Then, we found a number of alterations in these cell models, including cytogenetics [38,39], gene expression [40], DNA repair [8], and genomic instability [41]. However, a genome-wide systematic analysis of this model based on next generation of sequencing is absent. In this work, we performed high-throughput RNA sequencing and genome-wide DNA sequencing to this model and discovered a new mechanism probably for tumorigenesis.
We all known the radon-oncogenic effect is DNA damage, but there were few articles about the effects of radiation on RNA editing. In this work, we provided genome-wide identification and analysis of A-to-I RNA editing events in the malignantly transformed cell line BEP2D induced by α -particles radiation, the results show that RNA editing sites changed greatly and the total amount decreased after radiation.
In cancer research, mutated proteins have been widely used as biomarkers and therapeutic targets, it is a fundamental issue to understanding the mechanisms contributing to protein diversity in cancer cells. RNA editing plays an important role in post-translational modification which can affect mRNA's structure and stability [40,42], but little is known about how RNA editing operates in cancer [43]. Our work found that, in these cell models, genome-encoded single-nucleotide polymorphisms (SNPs) induced the downregulation of ADAR2 enzymes, and further caused the abnormal occurrence of RNA editing in malignantly transformed cells. Then, the abnormal occurrence of RNA editing led to abnormal expression of oncogenes, such as, CTNNB1 and FN1, thus may be responsible for the lung cancer progression. These results demonstrate further evidence for RNA editing as an important tumorigenesis mechanism, and RNA editing sites might represent a new class of therapeutic targets.

Cell culture
The No. CRL-9609), which are three tranformants derived from human bronchial epithelial cell (supplementary S1). The BERP35T1 and BERP35T4 malignant transformant cell lines were derived from BEP2D cells irradiated with 1.5 Gy of α -particle emitted from 238 Pu source and were described in detail in a previous paper [8]. The cells were cultured in serum-free LHC-8 medium (Gibco, USA) at 37˚C under a 95% air/5% CO 2 atmosphere.

RNA sequencing
Total RNAs were extracted from cells with RNAiso Reagent (TaKaRa, Dalian, China) following the manufacturer's instruction. RNA degradation and contamination was monitored on 1% agarose gels.
RNA purity was checked using the NanoPhotometer® spectrophotometer (IMPLEN, CA, USA At last, PCR products were purified (AMPure XP system) and library quality was assessed on the Agilent Bioanalyzer 2100 system.

The clustering of the index-coded samples was performed on a cBot Cluster Generation
System using TruSeq PE Cluster Kit v3-cBot-HS (Illumia) according to the manufacturer's instructions.
After cluster generation, the library preparations were sequenced on an Illumina Hiseq platform and 150 bp paired-end reads were generated.

DNA sequencing
Total DNAs were extracted from cells with DNAiso Reagent (TaKaRa, Dalian, China) following the manufacturer's instruction. The quality of isolated genomic DNA was verified by using these two methods in combination: (1) DNA degradation and contamination were monitored on 1% agarose gels.

RNA editing identification
We adopted three previously published methods to accurately identify A-to-I RNA editing sites.
For Jinbilly's method [12], we used the Burrows-Wheeler algorithm (BWA) [45] to align RNA-seq reads to a combination of the reference genome (hg19) and exonic sequences surrounding known splice junctions from available gene models. We chose the length of the splice junction regions to be slightly shorter than the RNA-seq reads to prevent redundant hits. Picard (http://picard.sourceforge.net/ ) was then used to remove identical reads (PCR duplicates) that mapped to the same location. GATK tools were used to perform local realignment around insertion and/or deletion polymorphisms and to recalibrate base quality scores. Variant calling was performed using GATK UnifiedGenotyper tool with options stand_call_conf of 0 and stand_emit_conf of 0. Further filtering were performed as described [46].
For GIREMI method [15], RNA-seq mapping and preprocessing was same as Jinbilly's method. For each mismatch position, a total read coverage of ≧ 5 was required and the variant allele was required to be present in at least three reads. We then the following types of mismatches: those located in simple repeats regions or homopolymer runs of ≧ 5 nt, those associated with reads substantially biased toward one strand, those with extreme variant allele frequencies (>95% or <10%) and those located within 4 nt of a known spliced junction. Finally, we perform GIREMI tool to call RNA editing.
For RNAEditor [16], fastq format files from RNA-seq data were directly used as input for RNAEditor tools to call RNA editing sites.

SNP identification
We used the Bowtie2 [47] to align DNA-seq reads to reference genome hg19, Picard (http://picard.sourceforge.net/ ) was then used to remove identical reads (PCR duplicates) that mapped to the same location. GATK tools were used to perform local realignment around insertion and/or deletion polymorphisms and to recalibrate base quality scores. Variant calling was performed using GATK UnifiedGenotyper tool.

Statistical analysis
Gene expression was calculated using Cufflinks program default parameters. Differentially expressed genes (DEGs) were identified by Cuffdiff program, three biological replicates for each cell line were combined as the input of Cuffdiff and a p-value was reported to show the significance of DEGs.
RNA editing targeted genes were assigned with 'bedops' program. GO analysis were performed by using DAVID [48].

Accession numbers
The sequencing data have been deposited with the Gene Expression Omnibus under the accession ID GSE143212.    Supplementary Tables   Table S1. RNA editing sites in each cell line. Table S2. Genes with RNA editing sites (sheet 1) and genes with tissue-specific RNA editing sites (sheet 2).

Supplementary Materials
Report of Cell Line Authentication.