Identification of RNA biomarkers for chemical safety screening in mouse embryonic stem cells using RNA deep sequencing analysis

Although it is not yet possible to replace in vivo animal testing completely, the need for a more efficient method for toxicity testing, such as an in vitro cell-based assay, has been widely acknowledged. Previous studies have focused on mRNAs as biomarkers; however, recent studies have revealed that non-coding RNAs (ncRNAs) are also efficient novel biomarkers for toxicity testing. Here, we used deep sequencing analysis (RNA-seq) to identify novel RNA biomarkers, including ncRNAs, that exhibited a substantial response to general chemical toxicity from nine chemicals, and to benzene toxicity specifically. The nine chemicals are listed in the Japan Pollutant Release and Transfer Register as class I designated chemical substances. We used undifferentiated mouse embryonic stem cells (mESCs) as a simplified cell-based toxicity assay. RNA-seq revealed that many mRNAs and ncRNAs responded substantially to the chemical compounds in mESCs. This finding indicates that ncRNAs can be used as novel RNA biomarkers for chemical safety screening.


Introduction
The 7th Amendment to the Cosmetics Directive banned animal testing of cosmetic ingredients for human use in 2013 [1].Although it is not yet possible to replace in vivo animal testing completely, the need for a more efficient method for toxicity testing has been widely acknowledged [2].Among the alternative methods to animal testing, the use of in vitro cell-based assays appears to be one of the most appropriate approaches to predict the toxic properties of single chemicals, particulate matter, complex mixtures and environmental pollutants [3][4][5][6][7][8][9].
Over the past decade, global gene expression profiling has been used increasingly to investigate cellular toxicity in transformed and primary cells [6].Almost all previous studies used transformed cells such as Jurkat [10], A549 [5], or HepG2 cells [7,8], or primary cells such as human pulmonary artery endothelial cells [11], human bronchial epithelial cells [12], or human aortic endothelial cells [13].
These previous studies only focused on mRNAs as biomarkers.However, recent studies identified non-coding RNAs (ncRNAs) as efficient novel biomarkers for toxicity testing [14][15][16].ncRNAs can be roughly classified into three groups: small ncRNAs (20-30 nucleotides [nt]) such as microRNAs (miRNAs), intermediate-sized ncRNAs (30-200 nt) such as small nucleolar RNAs (snoRNAs), and long ncRNAs (lncRNAs; >200 nt) such as long intergenic non-coding RNAs (lincRNAs).LncRNAs are defined as RNA molecules greater than 200 nucleotides in length that do not contain any apparent protein-coding potential [17][18][19][20].The majority of lncRNAs are transcribed by RNA polymerase II (Pol II), as evidenced by Pol II occupancy, 5 0 caps, histone modifications associated with Pol II transcriptional elongation, and polyadenylation.Moreover, the previous studies used transformed or primary cells.Transformed cells are genetically altered, typically aneuploid, and may exhibit clinically irrelevant toxic responses to compounds.Primary cells from animal tissues lose their in vivo phenotypes, can exhibit high variability among isolations, and can often only be expanded by dedifferentiation [21].
The present study used deep sequencing analysis (RNA-seq) to identify novel RNA biomarkers including ncRNAs that exhibited substantial responses to general chemical toxicity from nine chemicals, and to benzene toxicity specifically.The nine chemicals are listed in the Japan Pollutant Release and Transfer Register as class I designated chemical substances.Moreover, we used mouse embryonic stem cells (mESCs) because mESCs have three important attributes [9,22]: (i) normality: they are regarded as native cells; (ii) pluripotency, the ability to differentiate into specialized cells; and (iii) self-renewal, the ability to undergo numerous cycles of cell division while remaining undifferentiated in culture.These characteristics make mESCs a promising choice for assessment of toxicity, and overcome the limitations of transformed or primary cells.

RNA-seq and data analysis
RNA-seq analyses were performed by Takara.Ribosomal RNA was removed using a Ribo-Zero Magnetic Gold kit (Human/Mouse/Rat; Illumina, USA).An RNA-seq library was constructed using a TruSeq Standard mRNA Sample Prep kit (Illumina).One hundred base paired-end read RNA-seq tags were generated using an Illumina HiSeq 2500 sequencer according to the standard protocol.The fluorescence images were processed to sequences using the analysis pipeline supplied by Illumina.RNA-seq tags were mapped to the mouse genome (hg19) from the National Center for Biotechnology Information using TopHat mapping software.More than 40 million RNA-seq tags from each sample were analysed.Genic representations using fragments per kilobase of exon per million mapped fragments (FPKM) to normalize for gene length and depth of sequencing were calculated.Sequencing tags were then mapped to the mouse reference genome sequence using mapping software, allowing no mismatches.RNA-seq tags were assigned to corresponding RefSeq transcripts when their genomic coordinates overlapped.We used RNA sequences available from public databases: mRNA from NM of RefSeq and lncRNA candidates from NR of RefSeq [24].In total, 32,586 RNAs from the NM and NR categories of the RefSeq Database were used for RNA annotation.The following expression ratio r (x, y) was used in this study.
where x and y were the FPKMs of the control and treatment groups, respectively.Note that if x and y were zero, then the smallest values (excluding zero) in the control and the treatment groups were used instead of x and y, respectively.

Data access
Short-read sequence archive data in this study are registered in GenBank (http://www.ncbi.nlm.nih.gov/genbank)/DDBJ(http://ddbj.sakura.ne.jp).The data used to determine the expression levels of transcripts are registered as accession numbers DRX076650-DRX076669.

Results
General up-and downregulation of mRNAs and ncRNAs after chemical exposure of chemicals as described previously [16].We identified the 30 RNAs whose expression was most upregulated following the exposure of mESCs to the nine chemicals in general (Table 1).We found that mRNA levels for these genes increased by approximately 100-to 30,000-fold after exposure to the chemicals.To confirm the reproducibility of the RNA-seq data, we determined the RNA expression levels by RT-qPCR in duplicate for Top 3 for upregulation of mRNAs and ncRNAs after benzene exposures.The results showed that the relative quantitative values (exposure/control) of NM_001177607, NM_178734, and NR_027375 were 746.6 ± 96.1, 570.3 ± 150.6, and 606.4 ± 52.4 (mean ± errors), respectively.The data of RT-qPCR were similar to those of RNA-seq.Thus, we confirmed the reproducibility of the RNA-seq data.We then categorized the upregulated mRNAs according to their Gene Ontology (GO) terms (Table 2).
Of the various GO terms, genes for regulation of cellular responses, such as cellular response to mechanical stimulus, cellular response to reactive oxygen species, and negative regulation of inflammatory response, occurred particularly frequently among the upregulated genes.Moreover, two ncRNAs, NR_027375 (Ythdf3_v3) and NR_033430 (Gm2694) were identified as being upregulated by general chemical exposure.The lengths of Ythdf3_v3 and Gm2694 are 5,308 nt and 682 nt, respectively.The functions of these ncRNAs are unknown; therefore, we cannot perform the correspondence analysis between ncRNA and the expression of mRNA.Next, we identified the 30 RNAs whose expression was most downregulated following the exposure of mESCs to the nine chemicals in general (Table 3).We found that the mRNA levels for these genes decreased to approximately 0.0001-to 0.006 times their original levels after exposure to the chemicals.To confirm the reproducibility of the RNA-seq data, we determined the RNA expression levels by RT-qPCR in duplicate for Top 3 for downregulation of mRNAs after benzene exposures.The results showed that the relative quantitative values (exposure/control) of NM_001166648, NM_001163553, and NM_145978 were 0.0006 ± 0.0001, 0.0003 ± 0.0002, and 0.0007 ± 0.0002 (mean ± errors), respectively.The data of RT-qPCR were similar to those of RNA-seq.Thus, we confirmed the reproducibility of the RNA-seq data.We then categorized the downregulated mRNAs according to their GO terms (Table 4).Of the various GO terms, genes for regulation of cellular processes, such as regulation of transcription, negative regulation of apoptosis, and regulation of cellular metabolism, occurred particularly frequently among the

Specific up-and downregulation of mRNAs and ncRNAs after exposure to benzene
We next explored toxic response to specific chemical exposure, using benzene as a representative chemical substance.We identified the 30 RNAs whose expression was most upregulated following the exposure of mESCs to benzene (Table 5).We found that mRNA levels for these genes increased by approximately 3000-to 13,000-fold after exposure to benzene.We then categorized the upregulated mRNAs according to their GO terms (Table 5).Of the various GO terms, genes involved in cellular responses, such as cellular response to mechanical stimulus, inflammatory response, and cellular response to DNA damage, occurred particularly frequently among the upregulated genes.Moreover, two ncRNAs, NR_038062 (Yipf2_v4) and NR_027375 (Ythdf3_v3) were identified as being upregulated by exposure to benzene.The lengths of Yipf2_v4 and Ythdf3_v3 are 1,919 nt and 5,306 nt, respectively.The functions of these ncRNAs are unknown; therefore, we cannot perform the correspondence analysis between ncRNA and the expression of mRNA.Next, we identified the 30 RNAs whose expression was most downregulated following the exposure of mESCs to benzene (Table 6).We found that mRNA levels for these genes decreased to approximately 0.000002 to 0.0002 times their original levels after exposure to benzene.We then categorized the downregulated mRNAs according to their GO terms (Table 6).Of the various GO terms, genes involved in regulation of cellular processes, such as multicellular organism development, cell cycle, and DNA replication, occurred particularly frequently among the downregulated genes.Moreover, two ncRNAs, NR_034050 (Snora44) and NR_102360 (Zbtb24_v4) were identified as being downregulated by exposure to benzene.The lengths of Snora44 and Zbtb24_v4 are 117 nt and 2,872 nt, respectively.Snora44 is a snoRNA.The functions of these ncRNAs are unknown; therefore, we cannot perform the correspondence analysis between ncRNA and the expression of mRNA.Other chemical compound exposure data are shown in S1-S16 Tables.

Discussion
In this study, we used RNA-seq to identify novel RNA biomarkers that exhibited a substantial response to general chemical toxicity from nine chemicals, and to benzene toxicity specifically.Some ncRNAs exhibited substantial responses to the chemical compounds, although fewer ncRNAs than mRNAs responded in this way.We considered that both mRNAs and ncRNAs expression levels might be independently changed by chemical stresses.We identified two ncRNAs (Ythdf3_v3 and Gm2694) that were upregulated and five ncRNAs (4930520O04Rik, F630042J09Rik, Atp11a_v4, Zbtb24_v4 and 1700124L16Rik) that were downregulated in response to general chemical exposure.These results indicate that ncRNAs as well as mRNAs have the potential to be surrogate indicators of chemical safety screening.We also identified two ncRNAs (Yipf2_v4 and Ythdf3_v3) that were upregulated and two ncRNAs (Snora44 and Zbtb24_v4) that were downregulated in benzene-treated cells.These findings indicate that ncRNAs can be used as novel RNA biomarkers for chemical safety screening.
Traditional RNA biomarkers of various types of cell stress have been identified, for example markers of oxidative stress response (Nfkb1, Jun, and Hif1a), DNA damage (Ppp1r15a, Gadd45a, Ddit3, and Cdkn1a), heat shock response (Hsp90aa1 and Hsf1), and endoplasmic reticulum stress (Atf3 and Bbc3), and hypoxia inducible factors (Arnt and Mtf1) [25].However, the expression levels of these RNA biomarkers did not appear among the 30 genes that were the most up-or downregulated by chemical exposure in this study.Therefore, we identified novel RNA biomarkers that were more efficient markers of chemical toxicity than traditional RNA biomarkers.
As expected, we observed upregulation of genes involved in regulation of cellular responses when cells were treated with the nine chemicals in general.This result suggests that the cells responded to the stress by increasing expression of genes involved in cellular responses.A similar phenomenon was observed in cells treated with benzene.Furthermore, we observed downregulation of genes involved in regulation of cellular processes when cells were treated with the nine chemicals in general.This suggests that cells downregulated basic processes such as proliferation in response to the cellular stress by decreasing expression of genes involved in these cellular processes.Profiles for small RNAs such as miRNAs have been reported for several animal species including humans, mice, and rats [26][27][28][29][30]. miRNAs play pivotal roles in regulation of gene expression, and have the potential to be useful biomarkers.However, small RNAs and long RNAs cannot be analysed at the same time using RNA-seq because they require different RNAseq application systems.lncRNAs have great potential to be useful biomarkers; for example, lncRNAs participate in diverse cellular functions including chromatin modification, transcription, splicing, mRNA decay, translation, and protein transport and assembly, and their RNA elements and RNA-protein complex machineries are also thought to be extremely diverse.We therefore focused on lncRNAs in the present study.Moreover, mESCs can differentiate into a variety of cell types [31], and thus allow assessment of chemical exposure risk in a variety of tissues and cell types.However, in the present study we used undifferentiated mESCs because we aimed to provide a basic framework for using mESCs for chemical safety screening.
We propose that many mRNAs and ncRNAs represent novel RNA biomarkers for chemical safety screening using mESCs.This study provides only a basic framework for such an application, and we plan to assess differentiated cells derived from mESCs, such as neurons, cardiomyocytes, and hepatocytes.We believe that these potential RNA biomarkers will be used for chemical safety screening in the future.For example, they could be quantified by a custommade microchip or array [32].