Deep Sequencing Reveals the Effect of MeJA on Scutellarin Biosynthesis in Erigeron breviscapus

Background Erigeron breviscapus, a well-known traditional Chinese medicinal herb, is broadly used in the treatment of cerebrovascular disease. Scutellarin, a kind of flavonoids, is considered as the material base of the pharmaceutical activities in E. breviscapus. The stable and high content of scutellarin is critical for the quality and efficiency of E. breviscapus in the clinical use. Therefore, understanding the molecular mechanism of scutellarin biosynthesis is crucial for metabolic engineering to increase the content of the active compound. However, there is virtually no study available yet concerning the genetic research of scutellarin biosynthesis in E. breviscapus. Results Using Illumina sequencing technology, we obtained over three billion bases of high-quality sequence data and conducted de novo assembly and annotation without prior genome information. A total of 182,527 unigenes (mean length = 738 bp) were found. 63,059 unigenes were functionally annotated with a cut-off E-value of 10−5. Next, a total of 238 (200 up-regulated and 38 down-regulated genes) and 513 (375 up-regulated and 138 down-regulated genes) differentially expressed genes were identified at different time points after methyl jasmonate (MeJA) treatment, which fell into categories of ‘metabolic process’ and ‘cellular process’ using GO database, suggesting that MeJA-induced activities of signal pathway in plant mainly led to re-programming of metabolism and cell activity. In addition, 13 predicted genes that might participate in the metabolism of flavonoids were found by two co-expression analyses in E. breviscapus. Conclusions Our study is the first to provide a transcriptome sequence resource for E. breviscapus plants after MeJA treatment and it reveals transcriptome re-programming upon elicitation. As the result, several putative unknown genes involved in the metabolism of flavonoids were predicted. These data provide a valuable resource for the genetic and genomic studies of special flavonoids metabolism and further metabolic engineering in E. breviscapus.


Introduction
Erigeron breviscapus (Vant.) Hand-Mazz. (Deng Zhanhua in Chinese) is a famous traditional Chinese medicine, broadly used for cerebrovascular disease in clinic [1]. The main active compounds of E. breviscapus are flavonoids and their derivatives, such as scutellarin and apigenin 7-O-glucuronide [2]. More than 95% of the clinical efficiency E. breviscapus is performed for the treatment of cerebrovascular disease and apoplexy sequelae [3]. However, such active nature compounds are only found in a few Erigeron species, including E. breviscapus and E. multiradiatus. What's worse, E. breviscapus, mainly distributed in Yunnan province in China, is endangered due to overexploitation [4].
TFs are proteins that bind to specific DNA sequences in the promoters to control and regulate the transcription of genes. Many TF families, such as WRKY, MYB and bHLH, perform extensive regulatory effects to the biosynthesis of flavonoids in plant kingdom [13]. In grape, VvMYB5b can activate the transcription of ANS, CHI and LAR1 directly and contribute to the regulation of anthocyanin and proanthocyanidin biosynthesis [14]. In most cases, MYB-bHLH-WD40 always participates in the biosynthesis of flavonoids as a protein complex [15]. In rice, OsWRKY13 has been found to show effects on the biosynthesis of flavonoids [16]. However, it is still unknown as how TFs regulate the biosynthesis of scutellarin in E. breviscapus.
Many signal molecules, including Methyl jasmonates (MeJA), abscisic acid (ABA) and salicylic acid (SA) show extensive regulations to the secondary metabolism containing biosynthesis of flavonoids [17]. On the one hand, in our previous research, the yields of rosmarinic acid and tanshinone in Salvia miltiorrhiza and lignans in Isatis indigotica were significantly induced by exogenous MeJA. In addition, it is well known that most MeJA-induced plants show a larger accumulation of flavonoids [18]. On the other hand, analysis after MeJA treatment also can be used to unveil the relation between gene and metabolism, find key genes involved in the biosynthesis of active compounds and conduct metabolic engineering [19,20]. However, the effect of MeJA on the biosynthesis of scutellarin in E. breviscapus has not been elucidated up to now.
To get a comprehensive understanding of the global regulation mechanism of MeJA on scutellarin biosynthesis in E. breviscapus, the transcriptomes of MeJA-induced E. breviscapus seedlings were performed in this study. Comparison of the gene expression profiles among three independent groups (0 h, 3 h, and 12 h after treatment) revealed the genes that were up-or down-regulated at different time of MeJA treatment and two gene co-expression networks were analyzed to find 13 putative flavonoids genes.

Illumina sequencing and de novo assembly of sequence reads in E. breviscapus
The sequencing produced 5,409,932 reads with a total of 332,496,270 nucleotides (nt). The number of all unigenes was 182,527, and the max and min lengths of unigenes were 16,893 and 201, respectively ( Fig 1A). Of these unigenes, the length of about 44.05% was more than 500 bp, which was great for further deep analyses ( Fig 1B).

COG and GO classification and KEGG pathway enrichment of unique sequences
The Gene Ontology (GO) analysis was used to classify the functions of all unigenes. A total 17,078 (9.4%) unigenes were classified into 47 subcategories and three main categories. In 'biological processes' group, two subgroups, 'cellular process' and 'metabolic process' had more members than others, indicating that the plants were undergoing extensive metabolic activities. For the 'cellular components', the assignments were mainly given to the 'cell', 'cell part' and 'organelle'. 'Binding' and 'catalytic active' were the most highly represented groups under the 'molecular function' category (Fig 2A). The statistics abovementioned also accorded with such plants as Sophora moorcroftiana [21], Prosopis alba [22] and Chorispora bungeana [23]. This  To classify the orthologous gene products, through the Cluster of Orthologous Groups of proteins (COG) database, 19,274 unique (10.6%) sequences were annotated and classified into 24 different functional COG classes, which were represented by A to Z. The largest group was the 'general function prediction only', followed by 'translation, ribosomal structure and biogenesis'; 'posttranslational modification, protein turnover, chaperones'; and 'replication, recombination and repair' (Fig 2B). Based on the COG analysis, this work helped us to better understand the protein function without reference genome sequence and protein evolution.
To further understand the active biochemical pathways in E. breviscapus, unigenes were mapped into reference canonical pathways in Kyoto Encyclopedia of Genes and Genomes (KEGG) (Fig 2C). A total of 7,984 unique sequences (4.4%) were annotated into 33 pathways of 4 groups ('Cellular Processes', 'Environmental Information Processing', 'Metabolism' and 'Organismal Systems'). By comparing the GO and COG analyses, the pathway-based analysis may predict the accurate function of genes in metabolism. A summary of the unigenes annotation is given in S1 Table.

Candidate genes encoding enzymes involved in the biosynthesis of scutellarin
Based on Nr datasets, the numbers of the unigenes involved in the biosynthesis of scutellarin were discovered in this study (Fig 3A), including PAL, C4H, 4CL, CHS, CHI, FNS II, FNS I, F6H, F3H, UGT.
We have previously mentioned that there must be a F6H, which converts apigenin to scutellarein, followed by glucuronidation catalyzed by F7GAT ( Fig 3B). According to the results, 8 F7GAT were identified, which might convert scutellarein to scutellarin. Unfortunately, candidate F6H was not found in this result and the same result was found in Yang' research [4]. Furthermore, all hydroxylases were discovered and classified into three groups (isoflavone hydroxylase, flavonoids hydroxylase and other hydroxylase). The result showed that the flavonoids hydroxylases only included flavonoid 3'-hydroxylases (24, 9%), flavonoid 3', 5'-hydroxylases (4, 1%) and flavonoid 3-hydroxylases (34, 14%). The reason for this result might be that the F6H has been insufficiently studied due to its restricted occurrence ( Fig 3C).

Identification of transcription factors
TFs are proteins, binding to specific DNA sequences in the promoters to control and regulate the transcription of genes, which are essential for plants to respond to various stresses and pathogen attacks from environment, such as drought, high salt content and high temperature. In this study, 4072 TFs were identified and classified into 17 different common families by searching from unique transcriptoms (Fig 4). AP2/ERF family was the largest group (297, 7%), followed by MYB (189, 5%), bHLH (173, 4%) and WRKY (151, 4%). This result is similar to the wheat transcriptome, whose largest group is bZIP, followed by MBF1, WRKY and MYB [24], and similar to the result from the Tibetan Sophora transcriptome, whose largest group is were classified into 24 different functional COG classes. The Y-axis indicates the number of unigenes in a specific functional cluster and the X-axis indicates the symbols of different functional COG classes. C) Histogram presentation of the KEGG classification of the annotated transcripts. A total of 7,984 unique sequences were annotated into 33 pathways of 4 groups ('Cellular Processes', 'Environmental Information Processing', 'Metabolism' and 'Organismal Systems'). The X-axis indicates the KEGG pathway, and the Yaxis indicates the percentage of unigenes that were assigned to a specific pathway.
doi:10.1371/journal.pone.0143881.g002 bZIP, followed by MYB, bHLH, and WRKY [25]. Despite a minor difference, these results further suggest that AP2/ERF, MYB, bHLH, and WRKY TFs are superfamilies in plants, and the expression of most numbers are affected by abiotic and abiotic stress such as heat, low temperature, MeJA and ethylene. To understand the impact of MeJA on these important transcription factors, expression change comparison was studied (Fig 5 and S2 Table) and the result was concluded in Table 1. Notably, two families (bHLH and WRKY) were susceptible to MeJA, including 46 (24.7%) and 68 (43.6%) significantly regulated genes respectively.
Furthermore, to understand the influence of other stresses from environment, the heat responsive and ethylene responsive proteins were identified through the Nr database. For heat, 554 protein sequences and 46 transcription factors were identified. The result showed that the heat shock protein was the largest group (22, 4%) ( Fig 6A). For ethylene, 287 proteins and 205 transcription factors were identified, which contained the largest AP2/ERF family (119, 41%) ( Fig 6B).

Verification of expression profile by qPCR
To confirm the gene expression data, 8 transcription factors from 4 crucial transcription factor families abovementioned (AP2/ERF, MYB, bHLH, and WRKY) whose expression was up-or down regulated after MeJA treatment were randomly chosen for qRT-PCR analysis. As shown in Fig 4, the unigene expression trends were similar in both sequencing and transcript data (Fig 7).   To identify the genes that were differentially expressed under MeJA, a categorization was carried out by GO analysis (Fig 8B). By comparing MeJA treatment versus control, the differentially expressed genes abovementioned were functionally assigned to the relevant terms in three categories ('Biological Process', 'Cellular Component', and 'Molecular Function') of the   the largest members). However, the type of differentially expressed genes at 12 h was more than that in 3 h. It is also worth noting that the number of 'metabolic process' and 'cellular process' at 12 h was more than that at 3 h. These data, overall, suggested that 'metabolic process' and 'cellular process' were strongly affected, probably because MeJA-induced activities of  Table). 30 up-regulated genes and 11 downregulated genes (>10 folds) at both 3 and 12 h after MeJA treatment was identified. The results told that MeJA showed a positive effect on more genes to biosynthesis of flavonoids.

MeJA-induced changes in the metabolites
Here, we focused on the scutellarin and apigenin 7-O-glucoside, the most important compounds for its medicinal values. To get the richest relative abundance of precursor ions and product ions, the best parameters were chosen (S1 Fig). The spectra of full scan product ion of precursor ions of the two analytes and chromatograms of the standard substance were also listed in S1 Fig The  content of scutellarin performed a 2.1-fold increase approximately in day 1 and 2 after MeJA treatment. The content of apigenin 7-O-glucoside increased 3.2 fold in day 2 (Fig 9). Integration of transcripts and metabolites data will be crucial for the study of metabolism in plants, both at the regulatory and catalytic levels. Result indicated that inductive effect of MeJA on compounds' Fig 10. Co-expression analysis of known flavonoids biosynthetic genes. Two different co-expression analyses were performed using CressExpress [23] and a high-resolution root spatio-temporal (HRRS) expression dataset [24]. In each of these PAL2 (AT3G53260), PAL4 (AT3G10340), C4H (AT2G30490), 4CL1 (AT1G51680), 4CL2 (AT3G21240), 4CL3 (AT1G65060), 4CL5 (AT3G21230), CHS (AT5G13930), CHI (AT3G55120), FNS I (AT5G08640) and FNS II (AT5G63580) were used as bait. Candidate genes had to be co-expressed with two or more flavonoids biosynthesis genes in order to be retained on the co-expressed gene list. 13 genes were found to be co-expressed in all two analyses. Gene identities are given in S5 Table. doi:10.1371/journal.pone.0143881.g010 accumulation had an obvious delay. Compared the variation trend of genes' expression (S2 Fig)  and compounds' accumulation (Fig 9), the genes showed similar variation trend with compounds and higher expression level (> 10 fold) may more like strongly catalyze the metabolism of scutellarin, such as PAL1, PAL9, C4H2, 4CL29, CHS10, F3H30, FNS II3, F7GAT1.

Co-expression analyses
Here, we predicted more candidates for involvement in the biosynthesis of flavonoids, based on analyses designed to identify genes co-expressed with known components of the flavonoids biosynthetic pathway (Fig 10). Of the 13 genes in E. breviscapus and 9 genes in Arabidopsis thaliana identified by each of two different co-expression software tools, publicly available datasets and our datasets, 2 were certificated genes to regulate the flavonoids metabolism in A. thaliana (AT5G48930 and AT5G44110), and others also showed an indirect interaction with flavonoids metabolism, such as lignin biosynthetic process (AT5G13420), coumarin biosynthetic process (AT4G23850). However the 13 predicted genes in E. breviscapus had no known role in the biosynthesis of flavonoids (S4 Table).

Plant material
The seeds of E. breviscapus were purchased from Southwest Forestry University (102.

Total RNA isolation, RNA-seq library construction and sequencing
The leaves from the middle position of the plants were collected from the three groups and then placed in liquid nitrogen immediately. Total RNAs were extracted using TRIzol+ Reagent (TianGen) according to the manufacturer's instruction. An equal quantity of RNA from all plants was blended for cDNA library construction to obtain the transcriptome data. A normalized cDNA library was constructed with 1 μg of total RNA. The sequencing libraries were generated using NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB, USA) with the manufacturer's instruction, which were sequenced on an Illumina HiSeq 2000 platform for paired-end reads. All sequencings were purchased from the Suzhou Genewiz Bio-pharm Technology Corporation (Jiangsu, China).

De novo assembly and functional annotation
For high-quality clean data, the raw sequence data was purified by trimming adapter sequences and removing low-quality sequences. The clean reads were assembled using Trinity software as described for assembly without a reference genome (http://trinityrnaseq.sourceforge.net). After assembling, BLASTx alignment (e value < 1.00E-5) of all the unigenes were annotated against protein databases, including the NCBI non-redundant (Nr) protein database, Gene Ontology database (GO), Swiss-Prot protein database, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, and the Cluster of Orthologous Groups (COG) database.

Validation by quantitative real-time PCR
To confirm the gene expression data, 8 unigenes (MYB23, MYB24, AP2/ERF25, AP2/ERF15, bHLH6, bHLH2, WRKY16 and WRKY28) were randomly chosen from four transcription factor abovementioned for qRT-PCR analysis (S4 Table). For the quantitative RT-PCR of the mRNAs, 1 ug of total RNA was reversely transcribed by Superscript III Reverse Transcriptase (Invitrogen, USA). Real-time PCR was performed using SYBR premix Ex Taq (Takara, Japan) following the manufacturer's instruction. The PCR amplification was performed under the following conditions: 95°C for 30 s, followed by 40 cycles of 95°C for 5 s, 53°C for 10 s and 72°C for 20 s. Three independent biological replicates for each sample and three technical replicates for each biological replicate were analyzed. All primers used were listed in S5 Table. 3

.5 Co-expression analyses
Analyses to identify genes that were co-expressed with known flavonoids biosynthesis genes were performed using two different tools/datasets; CressExpress and a high-resolution root spatiotemporal (HRRS) dataset. For each analysis, 11 known flavonoids biosynthesis genes were used as query baits (S6 Table). CressExpress, a co-expression analysis tool, can rank coexpressed genes based on their common connections with two or more query genes [26]. Similarly, the HRRS dataset that was analyzed with the graphical representation tool was featured by Brown et al. to identify candidate genes that were spatially co-expressed with each flavonoids biosynthesis gene [27,28]. The final lists from both the two analyses were compared to reveal common genes identified.

Compounds extraction
Plants were dried at 50°C. The dried sample (2.0 mg) was ground into powder and extracted twice with 70% methanol (1 mL) by sonication for 30 min. The supernatant was diluted with 70% methanol to 2 mL. Before analysis, the extract solution should be filtered through a 0.2 μm organic membrane.

Compound analysis
All samples were carried on an Agilent 1200 series HPLC and interfaced to an Agilent 6410 triple-quadrupole mass spectrometer equipped with an electrospray ionization source (Agilent Corporation, MA, USA). A ZORBAX SB-C18 column (3.5 μm, 2.1×150 mm, I.D. Agilent Corporation, MA, USA) and a C18 guard column (5 μm, 4.0×2.0 mm, Agilent Corporation, MA, USA) was used. Acetonitrile-5 mM ammonium acetate solution (the concentration of acetonitrile remain 30% in 2.0 min, v/v) was used as mobile phase for the gradient elution at a flow rate of 0.3 ml/min. The column temperature was maintained at 30°C. The injection volume was 10 μL and the analysis time was 2.0 per sample. The ESI source in negative mode was chosen.

Conclusions
In this study, we used high-throughput sequencing data to reveal transcriptome re-programming of E. breviscapus upon MeJA elicitation. A large number of candidate genes involved in MeJA response were identified. Furthermore, using two different co-expression analyses, we predicted 13 candidate genes involved in flavonoids biosynthesis. This work represents a fully characterized transcriptome and provides a valuable resource for genetic and genomic studies.
In addition, these results will be helpful for further metabolic engineering to increase the contents of the active compounds.   Table. The list of known flavonoids biosynthesis genes were used as query baits in analyses of co-expression. (XLSX)