Transcriptome Profiling of Radish (Raphanus sativus L.) Root and Identification of Genes Involved in Response to Lead (Pb) Stress with Next Generation Sequencing

Lead (Pb), one of the most toxic heavy metals, can be absorbed and accumulated by plant roots and then enter the food chain resulting in potential health risks for human beings. The radish (Raphanus sativus L.) is an important root vegetable crop with fleshy taproots as the edible parts. Little is known about the mechanism by which radishes respond to Pb stress at the molecular level. In this study, Next Generation Sequencing (NGS)–based RNA-seq technology was employed to characterize the de novo transcriptome of radish roots and identify differentially expressed genes (DEGs) during Pb stress. A total of 68,940 assembled unique transcripts including 33,337 unigenes were obtained from radish root cDNA samples. Based on the assembled de novo transcriptome, 4,614 DEGs were detected between the two libraries of untreated (CK) and Pb-treated (Pb1000) roots. Gene Ontology (GO) and pathway enrichment analysis revealed that upregulated DEGs under Pb stress are predominately involved in defense responses in cell walls and glutathione metabolism-related processes, while downregulated DEGs were mainly involved in carbohydrate metabolism-related pathways. The expression patterns of 22 selected genes were validated by quantitative real-time PCR, and the results were highly accordant with the Solexa analysis. Furthermore, many candidate genes, which were involved in defense and detoxification mechanisms including signaling protein kinases, transcription factors, metal transporters and chelate compound biosynthesis related enzymes, were successfully identified in response to heavy metal Pb. Identification of potential DEGs involved in responses to Pb stress significantly reflected alterations in major biological processes and metabolic pathways. The molecular basis of the response to Pb stress in radishes was comprehensively characterized. Useful information and new insights were provided for investigating the molecular regulation mechanism of heavy metal Pb accumulation and tolerance in root vegetable crops.


Introduction
The problem of heavy metal pollution is becoming a major worldwide ecological concern due to its potential adverse impacts on human health through the food chain [1,2]. Lead (Pb) is one of the most toxic heavy metals and can be easily taken up by plants and accumulated in different parts [3,4]. The sources of Pb contamination not only include roots in natural processes but also derive anthropogenic activities, such as exhaust fumes of automobiles, emissions from industrial factories and land applications of agrochemicals including insecticides, herbicides and fertilizers [3,5].
Lead is absorbed by plants mainly through the roots, and in most plant species the Pb content in organs tends to decrease in the order of root, leaves, stem, inflorescence and seeds [6][7][8][9]. Many previous studies reported that Pb overload in plants causes a range of negative effects, and the visually nonspecific symptoms of Pb toxicity are rapid inhibition of seed germination, stunted growth of the plant and chlorosis [1,4]. Additionally, Pb can cause oxidative damage by stimulating the formation of free radicals and reactive oxygen species, resulting in oxidative stress and DNA damage [10][11][12].
Radish (Raphanus sativus L.), belonging to the family Brassicaceae, is an economically important vegetable crop with an edible taproot [13]. It was reported that radish roots could accumulate large amounts of lead (Pb), which could cause a potential health risk in polluted conditions [6,14]. Considerable efforts have been invested into investigating Pb stress in radishes, especially its accumulation, translocation, physiological and metabolic variations, and cell deposition [6,[15][16]. However, little is known about the mechanism of radish plant responses to Pb stress at the molecular level,i.e.gene expression variations under Pb stress.
The next-generation sequencing (NGS) technologies including Roche/454's sequencing [17],Illumina/Solexa's sequencing technology (San Diego, California, USA), and Applied Biosystems' SOLiD sequencing technology have provided novel opportunities for accelerating genome projects such as whole-genome resequencing for variation analysis, RNA sequencing (RNA-seq) for transcriptome and non-coding RNAome analysis [18,19]. Among them, NGS-based RNA-seq for transcriptome methods can permit simultaneous acquisition of sequences and can be used to characterize genes, detect gene expression, identify and quantify rare transcripts without prior knowledge of a particular gene, and provide information regarding alternative splicing and sequence variations in identified genes [20]. In recent years, RNA-seq has been proven as a powerful method for understanding the complexity of gene expression and regulation networks in several plant species responding to many kinds of biotic and abiotic stresses, such as cotton [21], chinese cabbage [22], chickpea [23], maize [24], potato [25], Ammopiptanthus mongolicus [26] and soybean [27].
Previous studies indicated that plant in different stress responses is controlled by a range of gene regulatory mechanisms which could act together in various response and defense systems. Transcription factors, transport proteins and some other critical genes involved in certain signal transduction and secondary metabolite pathways were considered to be common stress-related transcripts activated in both biotic and abiotic stresses [27][28][29]. However, there were some unique genes involved in response to a specific stress [28]. For example, plants counteract the heavy metal stress by adopting both common and some specific means to defense and detoxifying the excessive concentration occurring in their surroundings, which includes generating metal ions signal sensing and transduction related pathways, activating numerous metal transporters and modulating related transcription factors [30][31][32][33]. It was found that the major activated genes not only involved in common stress-induced responses but also in some specific pathways including sulfur assimilation, glutathione (GSH) metabolism, IAA and jasmonic acid (JA) biosynthesis in A. thaliana under Pb treatments [34].
Although the molecular regulation mechanisms and massive candidate genes involved in various stress conditions have been successfully revealed by employing NGS technologies based on two main platforms including Roche/454 and Solexa/Illumina, their applications in investigating the dynamically transcriptional changes in response to heavy metal Pb stress in radishes has not been reported. The present study is aimed to elucidate the molecular regulation mechanisms and the critical genes involved in regulating radish responses to Pb stress. NGS-based Illumina paired-end Solexa sequencing platform was employed to characterize the fleshy taproot de novo transcriptome of two radish root cDNA samples, one untreated control (CK) and one Pb-stressed with Pb(NO 3 ) 2 at 1000 mg L 21 (Pb1000). Furthermore, an overall gene expression analysis was compared between CK and Pb1000 based on the assembled de novo transcriptomes. From that, a large set of radish transcript sequences that could be used to discover root-specific functions were generated, an abundance of differentially expressed Pbresponsive genes were quantified, and the enriched networks for regulating heavy metal Pb stress were acquired in radishes. Additionally, expression profiling of some differentially regulated genes were validated by qRT-PCR. The molecular basis of the response to Pb stress was first comprehensively characterized in radish, and the resulting information would facilitate further investigation of the mechanisms of Pb accumulation/tolerance in plants and the effective management of Pb contamination in vegetable crops by genetic manipulation. A large volume data was generated with the Illumina HiSeq  2000 sequencing the two radish cDNA libraries, CK, an untreated  control and Pb1000, a Pb-stressed with Pb(NO 3 ) 2 at 1000 mg L 21 . According to Illumina's RNA transcriptome sequencing, a sequence can generate 26101 basepairs (bps) from each Paired-End (PE) of a cDNA fragment. After removing the low quality reads and trimming off the adapter sequences, 26,381,880 (CK) and 22,535,988 (Pb1000) high-quality, clean paired-end sequencing reads with a total of 2,636,208,012 and 2,251,686,397 nucleotides were obtained, respectively, for the two pools. Because no appropriate reference genome sequence was available for radishes, a de novo assembly program Trinity [35] was selected for de novo assembly of all the 48,918,868 clean reads. Then, we combined all the assembling transcripts with sequencing of 31, 4823 radish expressed sequence tags (ESTs) and 17,187 unigenes from the NCBI database (http://www.ncbi.nlm.nih.gov/). Finally, a total of 68,940 assembled transcripts including 33,337 unigenes were obtained with the contig size ranging from 306 to 16,101 bp in transcript length and the average size of 1,226.63 bp ( Table 1). The size distribution of the assembled transcripts is shown in Figure 1.
To further evaluate the completeness of the de novo assembly transcriptome and predict possible functions, all assembled transcripts were compared against the Cluster of Orthologous Groups (COG database) for the analysis of phylogenetically widespread domain families. The results revealed 22,518 sequences with significant homology and assigned them into the appropriate COG clusters. These COG classifications were grouped into 23 functional categories ( Figure 3). The five largest categories were 'general function prediction only' (30.05%), 'signal transduction mechanisms' (15.89%), 'transcription' (14.70%), 'post translational modification, protein turnover, chaperones' (13.59%), and 'function unknown (12.69%).
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource for the systematic understanding of high-level gene functions in terms of networks of the biological system, such as the cell, organism and ecosystem, from molecular-level information (http://www.genome.jp/kegg/). The assembled transcript sequences were searched against the KEGG database using BLASTX with a cut-off E-value of 10 25 to identify the biological pathways activated in the roots of radishes in response to heavy metal Pb stress. Of a total of 48,864 annotated sequences, 19,480 had a significant match in the database and were assigned to 296 KEGG pathways (

Differentially Expressed Genes Involved in the Response to Pb Stress in Radish Roots
To investigate the changes in gene expression and understand the critical genes in radish roots responding to the stress of heavy metal Pb, clean reads of the Pb1000 and the CK libraries were respectively mapped to the de novo assembly transcriptome reference sequences with Bowtie [36][37] and were assigned to unigenes and isoforms with the RSEM (RNA-Seq by Expectation Maximization) software [38]. The assigned unigene and isoform expression levels were calculated using a normalizing statistic called FPKM (fragments mapped per kilobase of exon per million reads mapped), which provides a measure of expression level that accounts for variation in gene length [39]. The differentially expressed genes (DEGs) (including unigene and isoform) were determined with a log-fold expression change (log FC) greater than 2 or less than -2 using a greater statistically significant value (P,0.005) as well as false discovery rates (FDR ,0.001). From that, a total of 4,614 DEGs (1,398 unigenes) were detected between the two libraries, and these DEGs included both upregulated (2,154 transcripts) and downregulated genes (2,460 transcripts) under the Pb treatment ( Figure 4).

Gene Ontology (GO) and Pathway Functional Enrichment Analysis of the DEGs
All of the DEGs were performed on GO function and pathway enrichment analysis with a hypergeometric test and Bonferroni Correction [40]. GO terms significantly enriched in the DEGs were identified using goatools (https://github.com/tanghaibao/ goatools). According to GO functional enrichment analysis, a total of 270 terms for the upregulated transcripts were significantly enriched at a Bonferroni-corrected P-value #0.05, while for downregulated ones only 107 terms were enriched. The GO terms predominantly enriched for the upregulated transcripts were related to biological processes including 'defense response by callose deposition in cell wall' (GO:0052544), 'defense response by cell wall thickening' (GO:0052482), 'response to oxygen levels' (GO:0070482), 'glycoside catabolic process' (GO:0016139), and 'glutathione conjugation reaction' (GO:0006803), and those related to molecular function include 'glutathione transferase activity'(GO:0004364) ( Table S2). The predominant GO terms identified as enriched among the downregulated transcripts related to 'catalytic activity' (GO: 0003824) in molecular functions, such as 'cellulose synthase (GDP-forming) activity' (GO:0016761), 'carbon-oxygen lyase activity, acting on polysaccharides' (GO: 0016837), and 'cell wall organization or biogenesis' (GO:0071554) in biological processes such as 'cellular cell wall organization (GO:0007047) and cell wall modification (GO:0042545) ( Table  S3). The KEGG Orthology-Based Annotation System (KOBAS) was employed to find most statistically significantly enriched pathways for all the DEGs, which revealed that 13 and six   Table 3).

Validation of Illumina Expression Patterns by qRT-PCR Analysis
To confirm the reliability of the Solexa analysis, 22 candidate DEGs were selected and their expression was detected by qRT-PCR. The expression patterns from qRT-PCR showed general agreement with those from the Solexa sequencing (Table 4). The discrepancies with respect to ratio should be attributed to the essentially different algorithms and sensitivity between the two techniques [41][42]. In the analysis of gene expression profiling, the deep-sequencing method generated absolute rather than relative expression measurements.
For further investigating and verifying the expression variations of the DEGs, transcriptional qRT-PCR analysis of six selected genes including three upregulated (GST3-1, GGT1 and

Identification of DEGs Involved in Signal Sensing and Transduction Proteins
The main pathway by which heavy metal can penetrate into the plants is through root uptake from soils, and the root cell wall is directly in contact with metals in the soil solution [31,43]. When encounter an extra-cellular stimuli, the cell walls could activate a variety of specific stress-responsive signaling proteins to protect the cell from susceptible sites into the protoplast, such as Mitogenactivated protein kinases (MAPKs) and calcium-binding related protein kinase. In eukaryotes, MAPKs were consisted of three sequentially activated protein kinases including MAPK kinase kinase (MAPKKK), MAPK kinase (MAPKK), and MAPK, all of which were involved in response to a variety of environmental, hormonal and developmental stimuli [31]. In this study, four DEGs were identified that were highly homologous to genes encoding MAPKs such as MAPKKK7, MAPK6, MAPK18 and MPAK20, and 14 DEGs were similar to calcium-binding protein genes including CDPK9, CML45, CML22 and EICBP5, and most of these genes were upregulated under Pb stress (Table S4).

Identification of DEGs Involved in Transcription Factors
Transcription factors (TFs) could regulate corresponding transcriptional processes when plants were treated with toxic heavy metals [30]. Numerous TFs such as WRKY, basic leucine zipper (bZIP), ethylene-responsive factor (ERF) and myeloblastosis protein (MYB) have been verified to play a significant role in controlling the expression of specific stress-related genes [44-

Identification of DEGs Involved in Metal Transporters
Metal transporters could play a vital role in alleviating heavy metal toxicity by transporting metal ions out of the cell or sequestering it into the vacuole. It was reported that a wide range of transporter families including ATP binding cassette (ABC), Natural resistance-associated macrophage proteins (Nramps), ZRT, IRT-like proteins (ZIPs) and the Cation diffusion facilitators (CDFs), may contribute to heavy metal resistance [46][47]. In this study, 30 DGEs were identified as candidate genes involved as members of different metal transporter families, which were mainly related to ABC and ZIP families (Table S4).

Identification of Candidate Genes Involved in Biosynthesis of Chelating Compounds and Glutathione Metabolism
Synthesis of Metal-chelating compounds is another mechanism used by plants to combat heavy metal stress, which can sequester and ultimately detoxify the excess metal ions [48]. Metallothioneins (MTs) are low-molecular-weight cysteine-rich metal binding peptides, which were usually classified into four groups (MT1-4). Currently, MT genes have been identified in a number of higher plants such as Arabidopsis and Brassica juncea [48][49]. In the present study, three DEGs were found homologs with genes encoding MT3. Phytochelatins (PCs) is another important class of heavymetal-binding ligands, which can bind metal ions via thiolate coordination. PCs are not formed as a direct result of expression of a metal tolerance gene, but rather as a product of a biosynthetic pathway [50].Numerous physiological, biochemical and genetic studies have confirmed that glutathione (GSH) is the substrate for PC biosynthesis [51]. The conversion of GSH to PCs can be catalyzed by a special c-glutamyl cysteine dipeptidyl transpeptidase (EC 2.3.2.15) called Phytochelatin sythase (PCS), and many genes involved in the GSH metabolism pathway [ko00480] have been found crucial in regulating GSH and PCs levels [52].
In the present study, only one DEG sequence was discovered encoding PCS. Based on the KEGG pathway assignment, 163 transcript sequences were found to encode 18 putative enzymes involved in glutathione metabolism from the present assembled de novo transcriptome background (Table 5, S1). More than one transcript sequences were annotated as the same enzyme, implying that such transcript sequences may represent different fragments of a single transcript, or different members of a gene family [53]. Comparative analysis of sequences from the two cDNA libraries during lead stress revealed that 25 transcript sequences were significantly differentially expressed. These sequences encoded five enzymes including four upregulated, L-ascorbate peroxidase (APX, EC 1.11.1.11), glutathione S-transferase (GST, EC 2.5.1.18), glutathione synthase (GSHB, EC 6.3.2.3) and gammaglutamyltranspeptidase (GGT, EC 2.3.2.2), and one downregulated spermidine synthase (SPDS, EC 2.5.1.16) (Figure 6).

Discussion
Lead (Pb) is one of the most abundant, ubiquitous, toxic heavy metal elements posing a critical concern to human and environmental health [54][55][56]. Therefore, how to control Pb pollution and reduce Pb risks to human health has been an important health issue all over the world. Understanding the molecular mechanisms of plants responding to heavy metal stresses to control its uptake, translocation, and accumulation has been a focus work for plant genetic manipulation and biotechnology research [30,57].
Radish (Raphanus sativus L.) is a major root vegetable crop worldwide, which can accumulate a relatively high concentration of toxic heavy metal elements, such as lead (Pb) and cadmium (Cd) in roots when grown in heavy metal polluted conditions [14][15][16]. However, few works have been reported about the Pb-responsive genes and their regulation networks in radishes under Pb stress.
Because the genome sequence of the radish is still unavailable, expressed sequence tag (EST) analysis has become one of the most valuable tools to discover and identify novel genes. However, to date, only 31, 4823 radish ESTs can be retrieved from the GenBank of NCBI, which has a limited coverage of the transcriptome, and only a few candidate genes involved in complex biosynthetic pathways can be identified. The arrival of Next-Generation Sequencing (NGS) technologies could overcome the limitations and allow investigators to simultaneously measure the changes and regulation at a genome-wide level under certain biological conditions for non-model plants for which the background genomic information is not available [39,[58][59]. In recent years,the newly developed NGS-based RNA-seq technique has been widely used for de novo transcriptome sequencing assembly, discovery of novel genes and investigation of gene expression in many non-model plants such as rice [60], peanuts [61], purple sweet potato [62], medicinal plant Polygonum cuspidatum [63] and floral plant Dendrocalamus latiflorus [64]. In this study, based on de novo transcriptome sequencing and assembly, a total of 68,940 assembled transcripts including 33,337 unigenes were obtained from two radish root cDNA libraries of untreated control (CK) and Pb-treated with Pb(NO 3 ) 2 at 1000 mg L 21 (Pb1000). Comparison of the assembled transcripts to gene catalogs of other species, 48,864 transcript sequences (70.88% of 68,940 transcripts) were annotated by BLAST analysis and functional bioinformatics analyses including the GO, COG and KEGG databases. From the annotation results, it could be found that the main species with BLAST hits to annotated transcripts were A. thaliana, A. lyrata, B. napus, B. oleracea and B. rapa, which indicated that the sequences of the radish transcripts obtained in the present study were assembled and annotated correctly [65]. However, an abundance of unannotated transcripts were generated in the present study, which may represent a specific gene pool for radish root studies. The gene catalog provided a comprehensive understanding of the gene transcription profiles of radishes, and laid a solid foundation for further investigation of Pb-stress mechanisms and identification of novel genes in this species.
Gene expression analysis is often employed in exploring changes among different plant tissues, different developmental stages, or plants under different treatments [20,[66][67][68]. In this study, a total of 4,614 transcripts were identified as differentially expressed genes (DEGs). By analysis of the statistically enriched GO terms and pathways related to the DEGs, it was revealed that the most significant and high frequently enriched terms and pathways for upregulated transcripts during Pb stress were predominately involved in the defense response in cell walls (GO:0052544 and GO:0052482) and glutathione (GSH) metabolism-related (i.e., GO:0006803, GO:0004364, and ko00480) processes, while for downregulated genes, they were mainly related to carbohydrate metabolism-related pathways (i.e., ko00400, ko00500 and ko00520) ( Table 3, S2 and S3). This information could offer more direct biological insights to elucidate the response mecha-nism of Pb stress in radishes, which was in high accordance with previous studies of heavy metal stresses in plants [69][70][71].
The plant cell wall played an important role for heavy metal defense and detoxification, which could act as a barrier limiting metal uptake and penetration into the protoplast [43]. With electron microscopy observation, Inoue et al [15] reported that the presence of Pb in the cell wall of radish roots could form extremely large crystalline-like deposits, which can completely saturate the cell wall and need special accommodation space or thickening of the cell wall at the site. These findings supported the roles of cell wall-related terms 'defense response by callose deposition in cell wall' (GO: 0052544) and 'defense response by cell wall thickening' (GO: 0052482) identified in the present study following Pb stress in radishes. Additionally, plant cell walls have been consided as active metabolic sites, where a variety of specific stress-responsive signaling molecules such as mitogen-activated protein kinases (MAPKs) and calcium-binding related protein kinase could be generated in response to extra-cellular stimuli [31,72]. Many studies from different plant species indicated that several MAPK pathways were activated in response to heavy metal stress. A previous screen in Arabidopsis for cadmium-responsive genes identification revealed that a related MAPK named MAPKKK-MEK K1 could be induced by high concentrations of CdCl 2 at transcriptional level [73]. Jonak et al (2004) reported that exposure of alfalfa (Medicago sativa) seedlings to excess copper or cadmium ions could activated four distinct MAPKs [74]. Currently, four DEGs were identified as MAPKs including MAPKKK7, MAPK6, MAPK18 and MPAK20 in response to Pb treatment in radish. Because the activation of MAPK cascade is plant-species and metal variety-dependent, the role of distinct MAPK modules in radish responding to Pb stress still to be verified for further study. Calmodulin system was reported to be involved in sensing heavy metal stress such as Cd, Ni and Pb, which could regulate ion transport, gene expression and stress tolerance [31]. The different gene expression pattern in our study showed that 14 DEGs were found homologous encoding calmodulin-like protein including both up and down regulated following Pb stress. It was reported that GSH played a pivotal role in protecting plants from lead stress by quenching lead-induced reactive oxygen species (ROS). Liu et al. [34] by genome-wide microarray expression profiling analysis revealed that lead treatment could induce different GSH genes in Arabidopsis thaliana. Estrella-Gomez et al. indicated that lead accumulation in Salvinia minima could increase the accumulation of GSH, the enzymatic activity of glutathione synthase (GS), and cause expression changes of SmGS genes in both leaf and root tissues [75]. In addition Gupta et al. also reported the role of GSH in lead detoxification in S. alfredii [76]. In the present study, 18 GSH-related enzymes were found based on assembled radish transcriptome background and five of them were identified as DEGs including four upregulated and one downregulated under Pb stress (Table 5, Figure 6). The different gene expression showed a similar result to the previous reports. Moreover, many kinds of transcription factors (TFs), metal transporters, and metal-binding ligands have been found to be responsive and defensive to Pb stress in many other species [43]. As expected, these related gene families were also successfully identified in our study.
In summary, Next Generation Sequencing (NGS)-based RNAseq technology was firstly employed in this study to characterize the roots de novo transcriptome of the radish plant under Pb stress, and a total of 68,940 assembled unique transcripts representing 33,337 unigenes were obtained. Based on the assembled de novo transcriptome, 4,614 differentially expressed genes (DEGs) that play significant roles in the response to Pb stress were identified. Gene Ontology (GO) and pathway enrichment analysis revealed that the upregulated DEGs under Pb stress were predominately involved in defense responses in the cell wall and glutathione metabolism-related processes, while downregulated DEGs were mainly related to carbohydrate metabolism pathways. Multiple candidate genes involved in defense and detoxification were successfully identified in response to Pb stress, and expression patterns of selected DEGs were further validated with qRT-PCR, which reflected significant alterations in major biological processes and metabolic pathways during Pb stress. The molecular basis of the response to Pb stress in radishes was first comprehensively characterized in this study, which resulted in useful information and laid a solid foundation for future investigating the molecular regulation mechanism of heavy metal Pb accumulation and tolerance in root vegetable crops.

Plant Materials
The radish (Raphanus sativus L.) advanced inbred line, 'NAU-RG', was used in this study. The surface-sterilized seeds were sown into soil in plastic pots and the seedlings were cultured in a growth chamber with 14 h light at 25uC and 10 h dark at 18uC for 20 days. Seedlings with similar sizes were transplanted into a 20-L plastic container with modified half-strength Hoagland's nutrient solution (pH 5.4). One week later, the plants were treated with Pb(NO 3 ) 2 at 0 and 1000 mg L 21 , respectively, and cultured under glasshouse conditions with natural light and day/night temperature of 28/18uC. For Solexa analysis and qRT-PCR verification, plants were harvested when they were treated with Pb(NO 3 ) 2 at 0 and 1000 mg L 21 after 72 h. For qRT-PCR analysis of the different concentrations and temporal variation of Pb treatments, plants were collected after 72 h with different concentrations of Pb(NO 3 ) 2 at 0, 200 and 500 mg L 21 , and for the concentration at 1000 mg L 21 were collected after 0, 24, 48, and 72 h, respectively.
All samples were washed with deionized water and immediately frozen in liquid nitrogen and stored at 280uC for RNA extraction.

RNA Isolation and Illumina Sequencing
Total RNA from different samples was isolated using the RNAprep pure Plant Kit according to the manufacturer's protocol (Tiangen Biotech Co., Ltd., China). To avoid genomic DNA contamination, RNA samples were treated with RNase-free DNase I (Takara, Japan). Two radish root cDNA libraries were constructed using an mRNA-seq assay for paired-end transcriptome sequencing, which was performed by the Shanghai Majorbio Biopharm Technology Co., Ltd. (Shanghai, China). Poly(A) mRNA was enriched from total RNA by using Sera-mag Magnetic Oligo (dT) Beads (Thermo Fisher Scientific, USA) and then mRNA-enriched RNAs were chemically fragmented to short pieces using 16 fragmentation solution (Ambion, USA) for 2.5 min at 94uC. Double-stranded cDNA was generated using the SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen, USA). After that, the Illumina Paired End Sample Prep kit was used for RNA-seq library construction and then was sequenced using Illumina HiSeq TM 2000. The sequencing data were deposited in NCBI Sequence Read Archive (SRA, http:// www.ncbi.nlm.nih.gov/Traces/sra) with accession numbers SRX256970 (CK) and SRX263753 (Pb1000).

Raw Sequence Processing and de novo Assembly
Raw reads generated by Illumina Hiseq TM 2000 were initially processed to get clean reads by removing the adapter sequences and low quality bases at the 3' end. Then, transcriptome de novo assembly was carried out using the short-read assembling program Trinity [35]. First, clean reads with a certain length of overlap were combined to form longer contiguous sequences (contigs), and then these reads were mapped back onto the contigs. The distance and relation among these contigs could be calculated based on paired-end reads, which enabled the detection of contigs from the same transcript and also calculation of the distances among these contigs. Finally, contigs were further assembled using Trinity, and the contigs that could not be extended on either end were defined as unique transcripts. Additionally, the unique assembled transcripts were further subjected to the process of sequencesplicing redundancy removal with sequence clustering software to acquire non-redundant transcripts called unigenes.

Functional Annotation of the Assembled Transcripts
All the assembled transcripts were searched against the NCBI non-redundant (nr) Swiss-Prot, COG and KEGG databases using BLASTX to identify the proteins that had the highest sequence similarity with the given transcripts to retrieve their function annotations and a typical cut-off E-value of ,10 25 was set. For the nr annotations, the BLAST2GO program was used to get GO annotations of unique assembled transcripts for describing biological processes, molecular functions and cellular components [77] After getting GO annotations for each transcript, WEGO software [78] was used to conduct GO functional classification for understanding the distribution of gene functions at the macroscopic level.

Identification and Functional Enrichment Analysis of the Differentially Expressed Genes (DEGs)
To identify DEGs between the two different treatments, the expression level for each transcript was calculated using the fragments per kilobase of exon per million mapped reads (FRKM) method. R statistical package software edgeR (Empirical analysis of Digital Gene Expression in R) was employed to quantify differential gene expression [79]. In addition, functional-enrichment analysis including GO and KEGG were performed to identify which DEGs were significantly enriched in GO terms and metabolic pathways at Bonferroni-corrected P-value #0.05 compared with the whole-transcriptome background using the formula described in the previous studies [24,42].
Validation of DEG Expression with Quantitative Real-time PCR (qRT-PCR) Quantitative real-time PCR was performed on a MyiQ Real-Time PCR Detection System (Bio-Rad) platform using the SYBR Green Master ROX (Roche, Japan) following the manufacturer's instructions. Primers were designed using Beacon Designer 7.0 software, and Actin2/7 (ACT) ( Table S5) was selected as the internal control gene according to a previous report [80]. The amplification was achieved by the following PCR program of first denaturation 95uC for 3 min, then 40 cycles of denaturation at 95uC for 5 s, annealing and extension at 58uC [62,80]. The relative expression levels of the selected transcripts normalized to ACT were calculated using the 2 2DDCt method. All reactions were performed with three replicates, and the data were analyzed using Bio-Rad CFX Manager software.

Supporting Information
Table S1 KEGG pathways of the assembled transcripts. (XLS)