De Novo Characterization of the Mung Bean Transcriptome and Transcriptomic Analysis of Adventitious Rooting in Seedlings Using RNA-Seq

Adventitious rooting is the most important mechanism underlying vegetative propagation and an important strategy for plant propagation under environmental stress. The present study was conducted to obtain transcriptomic data and examine gene expression using RNA-Seq and bioinformatics analysis, thereby providing a foundation for understanding the molecular mechanisms controlling adventitious rooting. Three cDNA libraries constructed from mRNA samples from mung bean hypocotyls during adventitious rooting were sequenced. These three samples generated a total of 73 million, 60 million, and 59 million 100-bp reads, respectively. These reads were assembled into 78,697 unigenes with an average length of 832 bp, totaling 65 Mb. The unigenes were aligned against six public protein databases, and 29,029 unigenes (36.77%) were annotated using BLASTx. Among them, 28,225 (35.75%) and 28,119 (35.62%) unigenes had homologs in the TrEMBL and NCBI non-redundant (Nr) databases, respectively. Of these unigenes, 21,140 were assigned to gene ontology classes, and a total of 11,990 unigenes were classified into 25 KOG functional categories. A total of 7,357 unigenes were annotated to 4,524 KOs, and 4,651 unigenes were mapped onto 342 KEGG pathways using BLAST comparison against the KEGG database. A total of 11,717 unigenes were differentially expressed (fold change>2) during the root induction stage, with 8,772 unigenes down-regulated and 2,945 unigenes up-regulated. A total of 12,737 unigenes were differentially expressed during the root initiation stage, with 9,303 unigenes down-regulated and 3,434 unigenes up-regulated. A total of 5,334 unigenes were differentially expressed between the root induction and initiation stage, with 2,167 unigenes down-regulated and 3,167 unigenes up-regulated. qRT-PCR validation of the 39 genes with known functions indicated a strong correlation (92.3%) with the RNA-Seq data. The GO enrichment, pathway mapping, and gene expression profiles reveal molecular traits for root induction and initiation. This study provides a platform for functional genomic research with this species.


Introduction
Adventitious roots refer to roots that form from any tissue that is not a root, such as leaves and stems.Adventitious rooting is one of the most important mechanisms of vegetative propagation in plants and one of the most important methods for the commercial production of horticultural species throughout the world [1].As an alternative or supplement to seed propagation in ecosystems where soil disturbances occur frequently, adventitious rooting is an important plant response to environmental stresses and a strategy for plant propagation under stress [2].The formation of adventitious roots has been associated with an important aspect of tissue dedifferentiation that involves shifting cells from normal morphogenetic pathways to functions associated with the development of root primordia [3].This shift leads to de novo root formation and a multitude of metabolic changes involving the enzymes and macromolecules associated with the induction, initiation, and development of root primordia in plant cuttings [4].Although the physiological and biochemical changes that occur during adventitious root formation have been extensively studied, the molecular mechanisms involved remain less well understood.
Various molecular and genetic approaches have been used to study adventitious root development in Arabidopsis and other plants [5].The physiological and biochemical changes that occur during the complex process of in vitro root development must be attributed to the presence and activity of metabolic pathways.In turn, these metabolic pathways must be controlled by the regulation of RNA transcription.Identifying the RNA transcription profile during this process will thus improve our understanding of the fundamental processes that control adventitious rooting.To this end, several studies have sought to investigate the transcriptional changes and differences in gene expression that occur during adventitious root formation using proteomic and cDNA microarrays [6][7][8][9][10].Using the latter method, Brinker et al. ( 2004) identified 220 genes that were changed significantly during root development in hypocotyl cuttings of Pinus contorta [6].Proteomic analyses were also used to investigate the proteins involved in the adventitious rooting of Arabidopsis thaliana mutants by Sorin et al. (2006), who identified 11 proteins predicted to be involved in different biological processes, including the regulation of auxin homeostasis and light-associated metabolic pathways [7].Using the Medicago GeneChip, Holmes et al. (2010) identified 904 and 993 up-and down-regulated probe sets in root-forming cultures of Medicago truncatula as well as significant changes in metabolism, signaling and the expression of transcription factors linked to in vitro adventitious root formation processes [8].Recently, using a NimbleGen microarray, Rigal et al. (2012) identified 7,107 transcript levels that changed during early stages of adventitious root development in the model tree Populus trichocarpa [9].
A major limitation of the microarray method is that only a portion of the total transcripts can be assayed.Many genes are not represented on the microarrays, while genes from large and highly similar families may yield ambiguous expression results due to non-specific hybridization [11].Recently, a high-throughput deep-sequencing technology (i.e., next-generation sequencing, NGS), RNA-Seq, has been widely used to explore transcriptomic data and study gene expression at the whole genome level in model and non-model organisms [12,13].Emerging de novo short read assembly technology has been successfully applied to identify gene expression profiles and discover new genes without a reference genome sequence [13,14].This technology platform enables the precise elucidation of transcripts present within a particular sample and can be used to calculate gene expression based on absolute transcript abundance [15].
The process of adventitious rooting consists of three successive but interdependent physiological stages, namely, induction, initiation and expression.The induction stage comprises molecular and biochemical events without visible changes.The initiation stage is characterized by cell division and organization of the root primordia [1,16].Studies in herbaceous plants reveal that the critical events that culminate in the formation of adventitious roots in hypocotyl cuttings occur within the first 3-12 h after excision of the primary roots [3,17].In mung bean hypocotyl cuttings, the induction stage lasts from 0 h to 12 h after primary root excision, and the initiation stage lasts from 12 h to 48 h.The first emerging adventitious root primordia were clearly visible at 48 h and adventitious roots grown through the epidermis of the hypocotyls within 72 h of the start of the cutting cultures [16,18].In cuttings of woody plants such as in Malus and Populus, cell divisions as early as 48 h after auxin exposure [17,19].Early significant physiological and biochemical changes in endogenous hormone pools occur during the first 48 h after excision [20].Transcriptome monitoring in Populus trichocarpa cuttings revealed significant shifts during 0-48 h time period after excision.27% of the genes were differentially regulated between 0 and 6 h, 36% between 6 and 24 h, and 4% between 24 and 48 h [2,19].The critical dedifferentiation events during the process of adventitious rooting occur within these two stages.
Herein, we exploited RNA-Seq technology to characterize the mung bean transcriptome and further to highlight global changes in gene expression during early stage of root development (i.e., induction and initiation stages) in mung bean hypocotyl cuttings.Mung bean is one of the most important tropical grain legumes that serves as a significant and a cheap source of carbohydrates and easily digestible protein for the people of Asia and Africa, but increasingly extends into Australia, USA, Canada and Ethiopia [21].However, the genomic and transcriptomic data of this plant have not been revealed so far.Furthermore, this plant has been widely used as a model plant species for studying physiological, biochemical, and molecular mechanisms under the process of adventitious root formation [16,18,[22][23][24][25].In present study, we aimed to characterize the molecular basis of physiological processes that occur during early stage of root development and to identify differentially expressed genes (DEGs) and metabolic pathways.Real-time quantitative PCR was used to validate several of the transcriptional changes observed.

Plant material and culture conditions
Mung bean [Vigna radiata (L.) R. Wilczek] seeds were washed in distilled water and sterilized in a 6% sodium hypochlorite solution for 15 min.The seeds were subsequently washed three times in sterile distilled water, sown in Petri dishes (30 seeds per a 12 cm-Petri dishes), covered with a 5 mm-layer of sterilized perlite, and incubated in a growth chamber at 25±1°C for 36 h in the dark and then at 25±1°C with a 14-h photoperiod under white fluorescent lamps (PAR of 100 μM m −2 s −1 ).Five days after germination, seedlings that were 5 cm in height were used for the experiments.To investigate gene expression changes during adventitious rooting, the primary roots of the seedlings were removed from the bases of the hypocotyls, and the resulting explants (10 per beaker) were cultured in 50-mL beakers containing 40 mL sterilized distilled water for 6 h (Wat6) or 24 h (Wat24) under the same aseptic conditions applied to the seedling culture.The basal 0.5 cm of each hypocotyl, where adventitious roots originated in vitro, was cut and harvested after a 6-or 24-h incubation.The same parts of seedling hypocotyls were directly harvested and used as the control tissues (Con) (Fig 1).The three parallel treatments were set in each group.All of the harvested tissues were immediately frozen in liquid nitrogen and stored at -80°C until further analysis.

Total RNA extraction
The tissues from 10 hypocotyls were fully ground in liquid nitrogen, and approximately 50 mg of tissue powder was mixed with 600 μL buffer Rlysis-P (from kit SK8631, Sangon, Shanghai, China) in a 1.5 mL RNase-free tube for 5 min in a water bath at 65°C to ensure sufficient lysis.
Next, 60 μL buffer PCA (from kit SK8631, Sangon) was added and mixed thoroughly, and the mixture was incubated at -20°C for 3 min.After centrifugation at 10,000 g for 5 min at 4°C, an equal volume of cooled phenol chloroform (phenol water) was added to the supernatant, mixed, and then centrifuged at 12,000 g for 5 min at 4°C.An equal volume of cooled chloroform was added to the supernatant and mixed.Following centrifugation at 12,000 g for 5 min at 4°C, an equal volume of cooled isopropanol was added to the supernatant, shaken gently, and left to precipitate for 10 min.After centrifugation at 12,000 g for 20 min at 4°C, the pellet was recovered, washed twice with 75% ethanol, dried for 5-15 min at ambient temperature, dissolved in 50 μL RNase-free water, and stored at -80°C.A 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) was used to confirm RNA integrity with RNA Integrity Number (RIN) values of 8.1-9.9.RNA concentration was determined using a NanoDrop ND-1000 Spectrophotometer (NanoDrop, Wilmington, DE, USA).

cDNA library construction and transcriptome sequencing
Equal amounts of total RNA from each sample were pooled to construct the cDNA library.Oligo(dT) 25 beads (Invitrogen) were used to enrich for poly(A) mRNAs from the total RNA pool.Following purification, the mRNA was cleaved into fragments using Fragment Mix reactive system at 94°C for 4 min.First-strand cDNA was synthesized using Superscript II reverse transcriptase (18064-014, Invitrogen), First Strand Master Mix, random hexamer (N6) primers, and the fragmented mRNA templates.The reaction was performed at 25°C for 10 min, 42°C for 50 min, 70°C for 15 min, and then held at 4°C.Subsequently, the second strand cDNA was synthesized using Second Strand Master Mix (18064-014, Invitrogen).The synthesized dscDNA fragments were purified with Agencourt AMPure XP Beads (Agencourt).The End Repair Control and AMPure XP beads were used to repair the 3' ends and purify the repaired cDNA fragments.Subsequently, adenylation of the 3' ends of the cDNA fragments was conducted using Klenow exo (M0212L, NEB).After end repair and A-tailing, Illumina paired-end adapters were ligated to the cDNA fragments using T4 Ligase (Fermentas) and purified twice with AMPure XP Beads.To prepare the cDNA sequencing library, the ligated cDNA was enriched and amplified using selective PCR.The PCR procedure was performed as follows: 98°C for 30 s; 15 cycles of 98°C for 10 s, 60°C for 30 s, 72°C for 30 s, and 72°C for 5 min; holding at 4°C, followed by purification with AMPure XP beads.The quality and quantity of the cDNA library were measured using the Agilent 2100 Bioanalyzer and Qubit 2.0 (Life Technologies).Finally, paired-end sequencing of the constructed cDNA library was carried out at Sangon Biotech.Co. Ltd. (Shanghai, China) on an Illumina HiSeq 2000 system (Illumina).

De novo assembly and sequence clustering
The raw reads were filtered, and high-quality clean read data were obtained by deleting adaptor sequences, removing reads containing more than 5% ambiguous bases (undetermined bases, N) and low-quality reads (reads containing more than 10% bases with a Q-value 20).The de novo assembly of the clean reads was carried out using the TRINITY paired-end assembly method (Trinity RNA-Seq r2013-02-25, http://trinityrnaseq.sourceforge.net/)[26] with an optimized k-mer length of 25.The assembled sequences were clustered with Chrysalis, a module of Trinity.The longest transcript that could not be extended on either end within each clustered loci was defined as a unigene.The assembled unigenes (longer than 200 bp) have been deposited in the Transcriptome Shotgun Assembly Sequence Database (http://www.ncbi.nih.gov/genbank/tsa.html)at DDBJ/EMBL/GenBank under the accession number GBXO01000001-GBXO01078617.
Similarity searches were performed using locally installed BLAST+ v2.2.27 software [27].The transcripts and unigenes were subjected to similarity searches against protein and nucleotide sequence databases using BLASTx and MEGABLAST, respectively, at an e-value cut-off of e-5.BLAST annotations were filtered using either subject or query coverage (>30%) and sequence identity (>50% for megablast and >30% for blastx).

Mapping reads, calling variations and quantifying transcripts
Due to the lack of a reference sequence, the assembled transcripts were assumed to be the reference sequence to compute transcript expression levels [26,28,29].The expression values were used to create an expression profile with the help of Agilent's GeneSpring program.The read sequences were aligned against these transcript reference sequences using BWA-0.6.2-http://bio-bwa.sourceforge.net/[30] in the end-to-end alignment mode.

Functional annotation and classification
All resulting unigenes that exceeded 200 bp in length were annotated according to their sequence similarity to previously annotated genes.First, the unigenes were aligned using BLASTx to the public protein databases NR, SWISS-PROT, TrEMBL, Pfam, and CDD with similarity set at >30% and an E-value 1e-5.The KOG (Clusters of Orthologous Groups for eukaryotic complete genomes) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway annotations were performed by sequence comparisons against the two databases using BLASTALL and KAAS software (ftp://ftp.ncbi.nih.gov/blast/executables/release/2.2.18/) with an E-value 1e-5.The resulting blast hits were processed using Blast2GO software (version 2.3.5, http://www.blast2go.de/)[31] with an E-value threshold of 1e-5 to retrieve associated GO terms.GO classification was achieved using WEGO software [32].The results that presented the best alignment were used to identify the sequence direction and to predict the coding regions using BLASTx searches against protein databases, with the priority order of NR, SWISS-PROT, KEGG and KOG if conflicting results were obtained.The ESTScan software [33] was used to analyze the unigenes that did not align to any of the above databases.KEGG mapping was used to determine the metabolic pathways.Enzyme codes were extracted, and KEGG [34][35][36] pathways were retrieved from the KEGG web server (http://www.genome.jp/kegg/).To further enrich the pathway annotations, unigenes were submitted to the KEGG Automatic Annotation Server (KAAS) [37], and the single-directional best hit information method was selected.To identify the enriched pathways, the phyper test was used to measure the relative coverage of the annotated KEGG orthologous groups of a pathway against the transcriptome background, and the pathways with a p-value 0.05 were classified as enriched.

Expression analysis and identification of DEGs
The expression levels of unigenes were measured by mapping back the number of clean reads to the assembled unigenes using BWA-0.6.2-http://bio-bwa.sourceforge.net/[30].The number of clean reads mapped to each unigene was calculated and then normalized to RPKM (reads per Kb per million reads) using ERANGE3.1 software [15].Unigene expression levels were analyzed using the DEGseq R package [38] with the MARS (MA-plot-based method with Random Sampling) model.The DEGs between each pair of samples were screened using the Audic-Claverie algorithm [39] with an FDR threshold of 0.001 and an absolute value of log2 1.Multiple test corrections of the p-value and FDR were performed with the Benjamini-Hochberg correction [40].

Real-time quantitative reverse transcription PCR validation
To validate the transcriptome data, 39 genes with known functions that were assumed to play roles in adventitious root initiation were selected for further analysis.Hypocotyl tissues were harvested from three biological replicates subjected to the same experimental design as that of the samples subjected to Illumina sequencing for RNA-Seq.Total RNA was extracted with TRIzol reagent (Invitrogen, Carlsbad, CA, USA) and purified on RNeasy mini spin columns (Qiagen) with on-column DNase I treatment according to the manufacturer's protocol.RNA integrity was examined with an Agilent Bioanalyzer 2100 (Agilent Technologies).First strand cDNA was synthesized using the AMV First Strand cDNA Synthesis Kit (Roche Applied Science, Mannheim, Germany) according to the manufacturer's instructions.The gene-specific primer pairs (S8 Table ) were designed using Primer Premier 5.0 software (Applied Biosystems, Foster City, CA, USA) according to the confirmed sequences.Real-time PCR was run in a LightCycler480 II (Roche Applied Science) with ABI SYBR Green PCR Master Mix (ABI, Foster, USA).The thermal cycling program was 95°C for 3 min and 40 cycles of 95°C for 15 s and 60°C for 40 s.Melting curve analysis was carried out for each primer set to verify the presence of a single melting peak after amplification.'No cDNA' samples (water) and 'no RT' samples were included as negative controls.Output data were generated with Sequence Detector version 1.3.1 software (ABI) and evaluated using Student's t-tests with the delta-delta Ct method described by Livak and Schmittgen [41].The standard error of the mean was calculated for the three biological replicates.Expression levels were calculated relative to the reference gene using the comparative threshold cycle method.

Solexa RNA paired-end sequencing
Total RNA extraction and cDNA synthesis were performed from three samples: hypocotyls (Con), hypocotyls after primary root excision and incubation in water for 6 h (Wat6, root induction stage), and hypocotyls after primary root removal and incubation in water for 24 h (Wat24, root initiation stage).The three cDNA libraries were sequenced separately using the Illumina HiSeq 2000 system and respectively generated 7.361e+09 bp, 5.998e+09 bp, and 5.885e+09 bp raw reads.Raw reads were subjected to quality control using SeqQC.The ratio of >Q20 bases was more than 87% across the three libraries.The percentages of undetermined bases (Ns) were 0.144%, 0.137%, and 0.224% in the three libraries, respectively (Table 1).After deleting adapter sequences and discarding low-quality sequences from the raw data, 6.832 Gbp (92.81% of the total reads), 5.558 Gbp (92.66% of the total reads), and 5.557 Gbp (94.42% of the total reads) of high-quality reads were obtained for the three libraries, respectively.The average length of the clean reads exceeded 95 bp, and the ratio of retained reads was more than 95% by pre-processing (Table 1).To assess the contamination of the processed reads, random sets of one hundred thousand sequences were aligned against the Nr database.The results are presented in S1 Table.This assay indicated that the sequencing quality was high enough for further analysis.These processed paired-end reads were used for transcript assembly.

De novo assembly
The paired-end de novo assembly of the processed reads was performed using the TRINITY transcriptome assembly software program.After filtering out repetitive sequences and those shorter than 200 bases in length, a total of 133,287 transcripts (166 Mb) with a sequence length > 200 bp were generated.The total length of the transcripts was 1.66e+08 bases, and the mean length of the transcripts was approximately 1248 bases (Table 2).The average GC content of the transcripts was 37.84%, indicating that the transcripts were AT-rich at 62.16% (Table 3; S1 Fig) .The N50 was 2132 in this assembly, which was higher than most other plant transcriptome assemblies [12,26,28,42,43].The higher the N50 value, the better the assembly [12].Further clustering using the Chrysalis cluster module of TRINITY resulted in 78,697 unigenes (65 Mb), which represented the longest transcripts in sequence length within each loci.Approximately 47% (37,438) of the unigenes had a length that exceeded 500 bp (Table 2 It has been demonstrated that longer transcripts are easier and more likely to be mapped to correct transcript sequences [44].The lengths of the assembled transcripts and unigenes are shown in S2 Fig.The ratios of mapped reads were 93.55%, 94.08%, and 94.04%, and the expression ratios of unigenes were 91.92% (72,342), 84.71% (66,663), and 82.19% (64,680) in the Con, Wat6, and Wat24 samples, respectively, demonstrating a decreasing trend in gene expression during root development (Table 3).

GO enrichment analysis
GO enrichment analysis is a proven method to identify primary biological functions.The functional enrichment of DEGs indicated that, at FDR0.05, 258 GOs were enriched in the Wat6 versus Con (Wat6:Con), 183 GOs were enriched in the Wat24 versus Con (Wat24:Con), and 222 GOs were enriched in Wat24 versus Wat6 (Wat24:Wat6).The functional enrichment of the DEGs revealed different GOs in the three samples.For example, the functions of oxidoreductase activity, response to oxidative stress, DNA binding transcription factor activity, and photosynthesis were enriched in Wat6, while the functions of ribosome and translation were enriched in Wat24 (S3 Table ).These results suggest that profound cellular and metabolic reorganization occurs during the root induction stage.GO enrichment further demonstrated that total of 897 terms were significantly regulated with 595 up-regulated and 302 down-regulated in Wat6:Con, whereas total of 487 terms were significantly regulated with 232 up-regulated and 255 down-regulated in Wat24:Con, and total of 484 terms were significantly regulated with 128 up-regulated and 356 terms down-regulated from Wat6 to Wat24 (Table 6).The up-regulation and down-regulation of GO categories are presented in Fig 5.In the group of down-regulated terms, the proportions of unigenes in each GO category exhibited a trend of Wat6 > Wat24 > Wat24:Wat6, with the exception of the subcategory of nutrient reservoir activity in molecular function, suggesting that the more significant down-regulation of DEGs occurs during the root induction stage.In the group of upregulated terms, the major GO categories exhibited a trend of Wat24 > Wat6 > Wat24:Wat6, with the exceptions of rhythmic process in biological process and extracellular matrix in cellular component, suggesting that the more significant up-regulation of DEGs occurs during the root induction stage.Comparing between the down-regulated and up-regulated groups, we found that the significant down-regulated categories appeared under molecular function, with the top subcategories of protein binding transcription factor activity, nucleic acid binding transcription factor activity, and molecular transducer activity.The significant up-regulated categories appeared under both cellular component and molecular function, with the top subcategories of antioxidant activity, structural molecule activity, and nutrient reservoir activity in molecular function and extracellular matrix, extracellular region part, cell junction, and macromolecular complex in the cellular component category.The top 10 significant up-and downregulated GO categories are listed in Table 7.Among the top-10 up-regulated GO groups, GO:0003735, GO:0005840, GO:0005198, GO:0022626, GO:0044445, GO:0006412, GO:0044391, and GO:0030529 were all up-regulated in Wat6, Wat24, and Wat24:Wat6.Moreover, the top-10 up-regulated GO categories were identical in Wat24 and Wat24:Wat6.However, in the down-regulated groups, only GO:0001071 and GO:0003700 were both downregulated in Wat6 and Wat24; the others were all associated with different GO categories across the three samples (Table 7, S4 Table ).These results indicate that nearly the same groups of GO categories were significantly up-regulated at the root induction and initiation stages, including ribosome, structural constituent of ribosome, structural molecule activity, translation, ribonucleoprotein complex, ribosomal subunit, cytosolic ribosome, non-membranebounded organelle, intracellular non-membrane-bounded organelle, and cytosolic part.
Clearly, these GO categories are associated with protein synthesis.However, several distinct GO categories were significantly down-regulated at the root induction and initiation stages, including nucleic acid binding transcription factor activity, sequence-specific DNA binding transcription factor activity, RNA biosynthetic process, DNA integration, nucleic acid binding transcription factor activity, sequence-specific DNA binding transcription factor activity, and nucleic acid metabolic process.These GO categories are associated with RNA transcription.Interestingly, GO:0016491, oxidoreductase activity, was significantly up-regulated in Wat6 but significantly down-regulated in Wat24:Wat6, suggesting an increase in cellular oxidoreductase activity during the root induction stage that became a decrease during the root initiation stage.Compared with Wat6, the significant down-regulated GO categories include response to chemical stimulus, oxidoreductase activity, response to endogenous stimulus, response to auxin stimulus, response to stimulus, response to hormone stimulus, and response to organic substance.Clearly, these GO categories involve responses to stimulus and hormone signaling.
These results indicate that more KOs were up-regulated than down-regulated, especially in Wat6, suggesting that the key up-regulation of KOs occurred during the root induction stage.KEGG enrichment analysis further indicated that ko03010 (ribosome), ko0094 (phenylpropanoid biosynthesis), ko00360 (phenylalanine metabolism), and ko00909 (sesquiterpenoid and triterpenoid biosynthesis) were all up-regulated in Wat6, Wat24, and Wat24:Wat6.The significant down-regulated KOs during Wat6 were photosynthesis, carbon fixation in photosynthetic organisms, carotenoid biosynthesis, nitrogen metabolism, sphingolipid metabolism, glycerolipid metabolism, and porphyrin and chlorophyll metabolism.The significant down-regulated KOs during Wat24 were cutin, diterpenoid biosynthesis, cytokine-cytokine receptor interaction, and circadian rhythm-plant, and those in Wat24:Wat6 were oxidative phosphorylation, nitrogen metabolism, plant hormone signal transduction, diterpenoid biosynthesis, photosynthesis, and cysteine and methionine metabolism.Among them, ko00195 (photosynthesis) and ko00910 (nitrogen metabolism) were down-regulated in both Wat6 and Wat24:Wat6, suggesting that photosynthesis and nitrogen metabolism were continuously down-regulated from the root induction stage to the root initiation stage (Tables 8 and 9, S5 Table ).The principal aspects of the KEGG enrichment results were consistent with the GO enrichment results.

Gene expression profiling during adventitious rooting
Gene expression levels can be estimated from Illumina sequencing based on the number of clean reads for a gene.The RPKM method [15] was used to calculate the expression abundances of unigenes during adventitious rooting.The results indicated that the unigenes numbered with RPKM = 100-500, RPKM = 500-1000, and RPKM1000 exhibited a clearly increasing trend from Con to Wat24, suggesting that the expression abundances of certain genes greatly increased during root development (Table 3).A total of 11,717 unigenes showed differential expression (log2 1) in Wat6, with 8,772 unigenes down-regulated and 2,945 unigenes up-regulated.A total of 12,737 unigenes showed differential expression during Wat24, with 9,303 unigenes down-regulated and 3,434 unigenes up-regulated.Compared with Wat6, a total of 5,334 unigenes showed differential expression in the Wat24 sample, with 2,167 unigenes down-regulated and 3,167 unigenes up-regulated.These results indicate that 74.9% and 73.04% of the DEGs were down-regulated at the root induction and initiation stages, respectively, while 59.4% of the DEGs were up-regulated from the root induction stage to the initiation stage (Table 10).Further analysis revealed that 283 unigenes were specifically up-regulated DEGs and 546 unigenes were specifically down-regulated DEGs in Wat6; 619 and 753 unigenes were specifically up-and down-regulated DEGs in Wat24; and 424 and 163 unigenes were specifically up-and down-regulated DEGs from Wat6 to Wat24.Most of the specifically expressed DEGs were low-abundance genes (read number 100).For example, among the specifically expressed DEGs with a read number 100, 34 were up-regulated and 11 were down-regulated in Wat6, 69 were up-regulated and 11 were down-regulated in Wat24, and 29 were up-regulated and 0 were down-regulated from Wat6 to Wat24.Moreover, among the specifically expressed DEGs with both a read number 100 and log2 4, 209 unigenes were up-regulated and 96 were down-regulated in Wat6, 238 were up-regulated and 59 were down-regulated in Wat24, and 100 were up-regulated and 34 were down-regulated from Wat6 to Wat24 (Table 10).These results indicate that many more specific DEGs were significantly up-regulated than down-regulated during adventitious root induction and initiation.

Specifically up-and down-regulated unigenes during adventitious root induction
To evaluate the changes in DEGs during adventitious root induction and initiation, we selected the top 50 DEGs with both a read number >1000 and log2 >5 (fold change >32) (S6 Table ).

Validation of gene expression
To validate the differential expression data obtained through statistical comparisons of RPKM values, a total of 39 interesting DEGs of four types: 17 auxin signaling-related genes, 14 stress response-related genes, 3 LATERAL ORGAN BOUNDARY (LBD)-DOMAIN genes, and 3 internal reference genes were selected for validation of the transcriptomic data using real-time quantitative PCR (qRT-PCR).Detailed information on these genes is presented in S8 Table .According to the RNA-Seq results and the study published by Jian et al. [45], we selected three genes: CPY20, eIF5A, and ACTIN (Actin-related protein 4), as internal reference genes for qRT-PCR.The qRT-PCR results showed that CPY20 was the most stable housekeeping gene, so it was used to calculate the relative expression levels in this study.Out of the 39 selected genes, 36 showed a strong correlation (92.3%) to the RNA-Seq data (Fig 6).The qRT-PCR results confirmed that PER1, PER2, ADH1, LBD29, LBD41, and PIN1 were significantly up-regulated at the two time points; AUX22C, AUX15A, and QORL (Quinone oxidoreductase-like protein) were significantly up-regulated at Wat6 but returned to their original levels by Wat24; and the other genes showed a significant reduction at both time points.

Discussion
Transcriptomic data can reveal gene expression profiles and give fundamental insights into biological processes.As a high-throughput, accurate and low-cost method, RNA-Seq, a new next-generation sequencing (NGS) method, has been widely applied to analyze transcriptomes qualitatively and quantitatively.NGS has proven to be a powerful tool for DEG screening, especially for species without available genomic information [42,43].In this study, the Illumina HiSeq 2000 platform was used to perform a de novo transcriptome sequencing analysis of the mung bean to better understand gene expression changes during adventitious rooting.Pooled RNA samples from hypocotyls and hypocotyls sampled at two time points after primary root excision were used to construct cDNA libraries for deep sequencing.This sequencing generated 7.36 Gbp, 5.998 Gbp, and 5.885 Gbp of sequence data, and obtained approximately 68.32 million, 55.58 million, and 55.57 million paired-end clean reads in the mung bean hypocotyls 0 h, 6 h, and 24 h after primary root excision, respectively.The newly developed Trinity method was used for de novo reads assembly.The Trinity method can recover more full-length transcripts across a broad range of expression levels and provides a unified, sensitive solution for transcriptome reconstruction in species without a reference genome, similar to methods that rely on genome alignments [26].Another study demonstrated that Trinity was a better approach than was SOAPdenovo for assembly, as the assembled unigenes did not contain gaps, and the average unigene length was nearly twice the length of those produced by SOAPdenovo [28].After de novo assembly, we obtained 78,697 unigenes with a mean length of 832 bp, which is longer than has been reported previously in studies using the same technology [26,28,42,43].Among the total number of unigenes, 91.92% (72,342), 84.71% (66,663), and 82.19% (64,680) of the unigenes were expressed in the Con, Wat6, and Wat24 samples, respectively.Consequently, the read number, mapped read number, and expressed genes show decreasing trends during adventitious rooting.
To understand the gene expression profile during rooting, the clean reads were mapped back to the assembled unigenes using the BWA-0.6.2 software.The number of reads mapped to each unigene was then counted and normalized using RPKM [15].Gene expression values were measured using the method described by DEGseq R package [38].We identified a total of 11,717 unigenes that showed differential expression (fold change>2) during the adventitious root induction stage, whereas 12,737 unigenes showed differential expression during the adventitious root initiation stage.Between the induction stage and the initiation stage, 5,334 unigenes showed differential expression, suggesting their possible role in the activation of the primordium and root meristem formation.Using a DNA microarray method, Rigal et al. (2012) studied gene expression changes during adventitious rooting in the model tree Populus trichocarpa.Their results indicated that 5,781 genes were differentially expressed in the organization of the adventitious root primordium; 6,538 genes were differentially expressed during primordium differentiation; and 1,146 genes were differentially expressed between these two stages [9].In another similar study using cDNA microarrays, Brinker et al. ( 2004) identified 220 genes that changed significantly during root development in hypocotyl cuttings of Pinus contorta [6].The results obtained suggest that RNA-Seq is a sensitive, low-cost, and accurate method for deep-sequencing transcriptome of plant without available genomic information and was able to identify more DEGs during the early stages of adventitious rooting relative to the results of DNA microarrays.This technology also enables the precise elucidation of transcripts in the samples.
GO enrichment analysis indicated that the majority of GO categories significantly up-regulated at the root induction and initiation stages were protein synthesis-related, including ribosome, structural constituent of ribosome, translation, ribonucleoprotein complex, ribosomal subunit, cytosolic ribosome, non-membrane-bounded organelle, intracellular non-membranebounded organelle, and cytosolic part.Conversely, the significantly down-regulated GO categories were DNA, RNA synthesis-related, and signal transduction-related, which included DNA integration, RNA biosynthetic, nucleic acid metabolic process, nucleic acid binding transcription factor activity, sequence-specific DNA binding transcription factor activity.These results indicate that during the root induction stage, the cells experience an increase in the assembly of ribosomes and protein synthesis and a reduction of DNA and RNA synthesis [6].GO categories related to response to stimulus and hormone signaling, such as response to chemical stimulus, oxidoreductase activity, response to endogenous stimulus, response to auxin stimulus, response to stimulus, response to hormone stimulus, and response to organic substance were significantly up-regulated at the root induction stage and down-regulated at the root initiation stage.
KEGG enrichment revealed that pathways such as ribosome, phenylpropanoid biosynthesis, phenylalanine metabolism, and terpenoid biosynthesis were up-regulated, whereas pathways such as photosynthesis, carbon fixation, carotenoid biosynthesis, nitrogen metabolism, sphingolipid metabolism, glycerolipid metabolism, cutin, cytokine-cytokine receptor interaction, oxidative phosphorylation, and plant hormone signal transduction were significantly downregulated during the early stage of adventitious rooting.The loss of the photosynthetic function of hypocotyl cells was also revealed in hypocotyl cuttings of P. contorta at an early stage of adventitious root formation [6].
Although many genes specifically involved in the regulation of adventitious rooting have been identified in several plant species, the global profiling of gene expression during this process is not well studied using transcriptomic method.To better understand gene expression patterns during early root development, we selected the genes that exhibited greater than 32-fold changes and higher abundant expression (read number >1000).The most highly upregulated unigenes encoded proteins involved in (1) functions related to stress, such as cationic peroxidase, pathogenesis-related protein, 7-ethoxycoumarin O-deethylase-like (cytochrome P450 monooxygenase), peroxidase C3-like isoform 2, low-temperature-induced 65 kDa protein-like (water stress-induced), early nodulin-like protein 1-like (phytocyanin family of blue copper proteins, a ubiquitous family of plant cupredoxins), late embryogenesis abundant protein (LEA-18, water stress-induced), heat shock 70 kDa protein-like; anthocyanin metabolismassociated and flavone metabolism-associated genes; (2) functions related to cell wall remodeling, such as polygalacturonase and polygalacturonase PG1 precursor (a pectin lyase-like superfamily protein), endoglucanase 17-like, peroxidase C3-like isoform 2 (lignin biosynthesis activity); (3) functions related to protein and lipid metabolism, such as ef1a, metacaspase-9-like (a peptidase), vignain-like (a peptidase), E3 ubiquitin-protein ligase HERC1-like, trypsin protease inhibitor precursor, and patatin group A-3-like (phospholipase A2 activity); (4) functions related to auxin transport and signal transduction, such as auxin efflux carrier, auxinbinding protein ABP19a-like, and proline-rich protein; and (5) a function act as transcription factors, such as ethylene-responsive transcription factor ERF086-like and the MYB family MYB114.In addition, the gene encoding casparian strip membrane protein 2 (unknown function) was specifically expressed in Wat24 compared with Wat6 (Table 14).
During the auxin-induced root initiation stage in hypocotyl cuttings of Pinus contorta, genes involved in cell replication and cell wall weakening and a transcript encoding a PIN-HEAD/ZWILLE-like protein were up-regulated, while genes related to auxin transport, photosynthesis, and cell wall synthesis were down-regulated.During the root meristem formation stage, the transcript abundance of genes involved in auxin transport, auxin responsive transcription, and cell wall synthesis, as well as a gene encoding a B-box zinc finger-like protein, was increased, while transcripts encoding proteins involved in cell wall weakening were decreased [6].The most highly over-expressed transcripts during the induction stage of adventitious rooting in the stem cuttings of Populus trichocarpa were from genes that encoded proteins involved in cell wall remodeling, such as glycoside hydrolases (GHs), pectate lyases, pectin esterases and expansins, auxin-, gibberellin-, or ethylene-responsive genes, as well as genes that have been implicated in signaling, such as Ser/Thr protein kinases.The members of the AP2/ERF, MYB, NAC, WRKY, and bHLH transcription factor families exhibited significant expression changes during this process [9].Table 14 summarizes the most differentially expressed genes during early stage of adventitious rooting with or without auxin treatment in several plant species investigated using DNA microarray or RNA-seq technologies.This summary indicates that, during early stage of adventitious rooting, the common gene functional categories occur in all plants investigated, including stress response, cell wall weakening and modification, plant hormone signaling, signal transduction, and transcription factors.Although the same genes appear in this list, the auxin-induced expression genes, even the plant-specific expression genes are also distinct.For example, genes encoding histone H3 and CDC2 associated with cell replication, genes encoding PINHEAD/ZWILLE-like protein, DNA binding protein, and B-box zinc finger like protein involved in signal transduction, were induced by IBA; and genes encoding AUX1-like and SAMS were repressed by IBA in Pinus contorta [6].Genes encoding GH3, indole-3-acetate O-methyltransferase, and cytokinin oxidases involved in plant hormone signaling, and gene encoding glutathione S-transferases (GSTs) in glutathione synthesis, were induced by IBA; and genes encoding adenylate isopentenyltransferase and cytokinin hydroxylases involved in plant hormone signaling were downregulated by IBA in Camellia sinensis [46].Genes encoding lateral root primordium (lrp1), SCARECROW-like6, PISTILLATA, and AINTEGUMENTA LIKE1 (PtAIL1, PtPLT1.1, and course of adventitious rooting, peroxidase (POD) activity is sharply reduced during the induction stage, increased during the initiation stage and gradually reduced during the expression stage [16,23,47,48].Peroxidases comprise a large family of enzymes that function as antioxidants and also respond to water stress [49].Late embryogenesis abundant (LEA) proteins act as water-binding molecules, membrane-stabilizers, and ion modulators and are induced by drought stress [49][50][51][52].Increases in LEA proteins and pathogenesis-related proteins indicate that the plants were exposed to water stress [6].During the root primordia formation stage in P. contorta, transcripts encoding enzymes of the flavonoid pathway were up-regulated [6].One of these flavonoid pathway proteins, chalcone synthase, and a pathogenesis-related protein contribute to a constitutive defense barrier in the root epidermis of the pea [53].Increases in the expression of isoflavone reductase-like and isoflavone 2'-hydroxylase-like genes suggests a role for the flavonoid pathway in response to stress during early stages of rooting.
The regulation of genes with potential roles in cell wall remodeling is an essential process for adventitious root formation.In this study, the top up-regulated genes included many that encode proteins involved in cell wall synthesis and loosening, such as the significantly up-regulated genes endoglucanase 17-like and endo-1,3(4)-beta-glucanases, which are members of the glycoside hydrolase family and potentially participate in cell wall loosening [54].The phenylpropanoid biosynthesis and phenylalanine metabolism pathways were also significantly upregulated.The phenylpropanoid polymer complex is the component of lignin that accumulates between cellulose, hemicellulose and pectin components in the cell wall.Phenylpropanoid synthesis starts from phenylalanine [55].The common derivatives from phenylpropanoid pathway, phenolic acids, flavonoids, and lignin [56], are crucial regulators in cell division and differentiation [57] and stimulate in vitro rooting [58].During the early stage of adventitious rooting, genes with the potential to be active in cell wall synthesis were down-regulated, while genes involved in weakening cell walls were up-regulated, suggesting that the cell walls were undergoing remodeling and weakening [6,9].
Genes associated with auxin transport and signaling are regulated during the early stage of adventitious rooting.During the root induction stage, the gene encoding auxin efflux carrier was significantly up-regulated.Auxin efflux carriers control auxin distribution to establish and maintain auxin concentration gradients in various tissues [59], triggering the establishment of new growth axes [60].During the root initiation stage, an auxin signaling gene, auxin-binding protein ABP19a-like, was specifically expressed.ABP1 has known to mediate rapid cellular auxin effects through the non-transcriptional auxin response pathway and is essential for auxin-regulated processes.ABP1 also regulates the expression of AUX/IAA genes [61,62], suggesting that active transport of auxin starts during the early stage of root meristem formation.However, down-regulation was observed in another auxin carrier complex gene, ABC transporter G family member 22-like, which functions in cellular auxin efflux and influx [63].
In this study, the gene encoding the ethylene-responsive transcription factor ERF017-like was significantly down-regulated during the root initiation stage.During root primordia formation in P. contorta cuttings, the gene encoding an ethylene responsive element binding protein (EREBP)-like protein was down-regulated [6].Ethylene biosynthesis has been demonstrated to be required for adventitious root formation, and there is crosstalk between ethylene and auxin during the process of adventitious root formation in tomato [64,65].These results suggest that ethylene signaling also mediates adventitious rooting.
The MYB transcription factor is an important regulator of the initiation and primordium formation of adventitious roots.Interestingly, three genes encoding the MYB transcription factor MYB134 were significantly down-regulated during the root induction stage but significantly up-regulated during the root initiation stage (S7 Table and Fig 6) in this study.In cuttings of Populus trichocarpa, MYB family expression levels changed the most during primordium differentiation [9].MYB77 acts in a synergistic manner with ARF7 to enhance the expression of auxin-responsive genes and mediate the auxin response [66].These results suggest that members of the MYB transcription factor family mediate the initiation of and the formation of adventitious root primordium.

Conclusions
In recent years, we have made great efforts to reveal the mechanisms that regulate adventitious rooting in plants at the physiological and molecular levels using a model plant Vigna radiata (L.) R. Wilczek, a tropical legume that serves as a significant source of dietary protein for the people of Asia and Africa.In this paper, we provide the first study to report the transcriptome of seedling hypocotyls of V. radiata and in vitro adventitious rooting of hypocotyl cuttings using RNA-Seq analysis.We obtained 78,697 assembled unigenes using the Trinity de novo assembly method.Among these unigenes, 72,342, 66,663, and 64,680 genes were expressed in hypocotyls or at the 6 h and 24 h time points during adventitious rooting, respectively, but only 29,029 (36.77%) unigenes could be annotated using public databases.The global transcriptomic data reveal that profound cellular and metabolic reorganization occurs during the root induction stage.We used gene clustering and the enrichment of GO terms and KEGG pathways to describe the overall biological processes regulated during this developmental process.We also used RPKM analysis to investigate the differentially expressed gene profiles at the three developmental stages.Furthermore, real-time quantitative PCR was used to confirm the differential expression levels observed for 39 of the unigenes.The results obtained using RNA--Seq were consistent with the average expression levels in three biological replicates.Further investigation of the transcriptional changes at more closely spaced developmental stages will provide additional valuable information.Our full transcript abundance analysis, presented in S2 and S7 Tables, represents a useful resource for further insight into mung bean transcriptome and in vitro adventitious root development and for candidate gene selection.

Fig 1 .
Fig 1.Time course of adventitious root development in mung bean hypocotyls after the primary roots excision.Adventitious root primordia are visible at 48 h after the primary roots excision and adventitious roots grow through the epidermis of the hypocotyls within 96 h.The basal 0.5 cm of hypocotyls at 0 h (Con), 6 h (Wat6), and 24 h (Wat24) after the primary roots excision and incubation in water were harvested and used as study samples.doi:10.1371/journal.pone.0132969.g001

Fig 2 .
Fig 2. Gene Ontology classification of mung bean transcriptome.Unigenes with BLASTx matches against the plant Nr database were classified into three main GO categories (biological process, cellular component, molecular function) and 57 sub-categories.The left-hand scale on the y-axis shows the percentage of the unigenes in each of the categories.The right-hand scale on the y-axis indicates the number of the unigenes in the same category.doi:10.1371/journal.pone.0132969.g002

Fig 6 .
Fig 6.Validation of selected genes involved in adventitious rooting by qRT-PCR.The gene expression levels measured by qRT-PCR were compared with that of RNA-Seq.White histograms represent expression levels determined by RNA-Seq in RPKM units (left axis), while grey columns represent gene expression levels determined by qRT-PCR and normalized to three control genes (right axis).Bars represent the mean (± SE) of three experiments.Different letters (a, b, and c) represent statistically significant differences (P < 0.01) among the data of qRT-PCR, analysed using Student's t-test.doi:10.1371/journal.pone.0132969.g006

Table 1 .
Sequencing data information.

Table 2 .
Statistics of reads assembly with the Trinity method.

Table 3 .
Sample mapping results and unigene abundance measurements in the samples.

Table 4 .
E-value distribution of the BLASTx hits against the Nr and TrEMBL databases for each unigene. doi:10.1371/journal.pone.0132969.t004

Table 6 .
Differentially enriched GO categories and KEGG pathways at the time points during adventitious rooting.

Table 7 .
Top 10 significantly down-and up-regulated GO categories in the samples.

Table 10 .
Statistics of the DEGs (FDR<0.001) at the time points during adventitious rooting in mung bean.

Table 11 .
Top up-and down-regulated DEGs in the Wat6 sample.

Table 12 .
Top up-and down-regulated DEGs in the Wat24 sample.

Table 13 .
Top up-and down-regulated DEGs between Wat24 and Wat6 sample.

Table 14 .
Summary of the most differentially expressed genes during early stage of adventitious rooting in several plants investigated.