Transcriptomic Analysis of Paeonia delavayi Wild Population Flowers to Identify Differentially Expressed Genes Involved in Purple-Red and Yellow Petal Pigmentation

Tree peony (Paeonia suffruticosa Andrews) is a very famous traditional ornamental plant in China. P. delavayi is a species endemic to Southwest China that has aroused great interest from researchers as a precious genetic resource for flower color breeding. However, the current understanding of the molecular mechanisms of flower pigmentation in this plant is limited, hindering the genetic engineering of novel flower color in tree peonies. In this study, we conducted a large-scale transcriptome analysis based on Illumina HiSeq sequencing of cDNA libraries generated from yellow and purple-red P. delavayi petals. A total of 90,202 unigenes were obtained by de novo assembly, with an average length of 721 nt. Using Blastx, 44,811 unigenes (49.68%) were found to have significant similarity to accessions in the NR, NT, and Swiss-Prot databases. We also examined COG, GO and KEGG annotations to better understand the functions of these unigenes. Further analysis of the two digital transcriptomes revealed that 6,855 unigenes were differentially expressed between yellow and purple-red flower petals, with 3,430 up-regulated and 3,425 down-regulated. According to the RNA-Seq data and qRT-PCR analysis, we proposed that four up-regulated key structural genes, including F3H, DFR, ANS and 3GT, might play an important role in purple-red petal pigmentation, while high co-expression of THC2'GT, CHI and FNS II ensures the accumulation of pigments contributing to the yellow color. We also found 50 differentially expressed transcription factors that might be involved in flavonoid biosynthesis. This study is the first to report genetic information for P. delavayi. The large number of gene sequences produced by transcriptome sequencing and the candidate genes identified using pathway mapping and expression profiles will provide a valuable resource for future association studies aimed at better understanding the molecular mechanisms underlying flower pigmentation in tree peonies.

factors, which can be classified into three families: R2R3-MYB, basic-Helix-Loop-Helix (bHLH) and WD40 [19]. To date, a large number of structural genes as well as some regulatory genes have been well characterized and isolated in mutants or crossed lines of snapdragon, maize, petunia and Arabidopsis as model plants [17,18]. Meanwhile, numerous studies on the molecular mechanism of pigmentation have been performed in different ornamental plants, including P. suffruticosa, in which five structural genes (PsCHS1, PsCHI1, PsANS1, PsF3H1 and PsDFR1) were cloned and identified as the most important genes involved in tree peony flower pigmentation [20][21][22]. However, despite the economic and breeding importance of P. delavayi, flower pigmentation in this plant has previously been studied only at the physiological level [8,16]. Therefore, studying the molecular biology of P. delavayi color formation is of great importance to accelerate the use of the unique flower genes in this species.
For woody plants, especially those of high heterozygosity, such as tree peony, whole genome sequencing requires long-term and expensive investment. Transcriptome sequencing based on the Illumina/Solexa high-throughput sequencing platform is a rapid and convenient way to obtain information on the expressed fraction of genome [23,24]. RNA-Seq is not restricted by a reference sequence and is suitable for non-model organisms without genomic sequences [25][26][27][28]. It has been widely applied to model as well as non-model organisms in various studies, including transcript profiling, single nucleotide polymorphism discovery and the identification of genes that are differentially expressed between samples [29][30][31][32][33][34].
In the present work, individuals with purple-red and yellow flowers within a wild P. delavayi population in Yunnan Province, China were used as experimental materials, and two transcriptomes (of petals of each color mixed at different flower opening stages) were sequenced on the Illumina HiSeq 2000 platform. Additionally, by analyzing the data with various bioinformatics tools, we discovered some potential differentially expressed genes involved in purple-red and yellow petal pigmentation, and analyzed the expression profiles of seven candidate genes through real-time PCR. To the best of our knowledge, this is the first exploration of the petal transcriptome of the wild tree peony species P. delavayi. These transcriptome sequences may provide a theoretical basis to understand the molecular mechanisms behind the pigment composition of P. delavayi and a valuable resource for identifying genes expressed in this wild population.

Illumina sequencing and de novo transcriptome assembly
To achieve a broad survey of the genes associated with P. delavayi flower pigmentation, two cDNA libraries (Pl and Pd) were generated for RNA-Seq. Each included a mixture of equal amounts of RNA from petals at five flower opening stages (Pl, yellow petals; Pd, purple-red). The mRNA was isolated, enriched, sheared into smaller fragments, and reverse-transcribed into cDNA, which was subjected to sequencing on an Illumina HiSeq 2000 sequencing platform. In total, we got 52.2 million raw reads with an average length of 83.42 nt and 54.3 million reads with an average length of 84.94 nt from the Pl and Pd sequencing libraries, respectively. After removal of adaptor sequences, ambiguous reads, and low-quality reads, 48,401,848 highquality clean reads comprising 4,356,166,320 nucleotides (4.35 Gb) with a Q20 percentage of 98.31% and a GC percentage of 45.47% were generated from Pl transcriptome sequencing, while 51,300,426 high-quality clean reads comprising 4,617,038,340 nucleotides (4.61 Gb) with the same Q20 percentage as that of Pl and a GC percentage of 45.69% were generated from Pd transcriptome sequencing (Table 1). All high-quality reads were assembled de novo by the Trinity program [35]. From the Pl dataset, 149,470 contigs were produced with an N50 of 542 nt (i.e. 50% of the assembled bases were incorporated into contigs of 542 nt or longer), while 144,205 contigs with an N50 of 579 nt were generated from the Pd sample. After linking the contigs together, we obtained 88,330 unigenes with an average length of 603 nt from the Pl dataset, most of which were 200-3000 bp long, and 82,720 unigenes with an average length of 615 nt from the Pd sample. Finally, 90,202 non-redundant All-unigenes were obtained by splicing and reducing the redundancy of the unigene sequences, with average length and N50 values of 721 nt and 1093 nt, respectively (Table 1). Of the total unigenes, only 814 from Pl and 823 from Pd were longer than 3 kb. The length distributions of the unigenes are shown in S1A Fig. These results showed that the throughput and sequencing quality was high enough for the following analyses.
The all-unigene sequences were aligned against protein databases using Blastx (evalue < 0.00001) in the following order: the non-redundant protein database (NR), the Swiss-Prot protein database (Swiss-Prot), the Kyoto Encyclopedia of Genes and Genomes database (KEGG), the Cluster of Orthologous Groups of proteins database (COG), the Gene Ontology database (GO) and the non-redundant nucleotide database (NT) [36]; a total of 44,811 (49.68%) were annotated (S1 Table). The size distribution of the CDSs and predicted proteins are shown in S1B Fig. Additionally, the CDSs of 4,083 all-unigenes that had no hits in Blastx were predicted by ESTScan and then translated into peptide sequences [37]; their size distribution is shown in S1C Fig.
The KEGG database can help study the complicated biological behaviors of genes and provide pathway annotations [38]. A total of 24,262 All-unigenes were mapped into 128 KEGG pathways, representing metabolism, genetic information processing, organism systems and cellular processes (S3 Table). The two largest pathways were 'metabolic pathways' and 'biosynthesis of secondary metabolites' with 5,288 and 2,552 All-unigenes, respectively. Two pathways contained only four All-unigenes each, 'caffeine metabolism' and 'betalain biosynthesis'; these were the least represented categories. Many All-unigenes corresponded to flower pigmentation related pathways involved in the biosynthesis of secondary metabolites, including 'flavonoid biosynthesis' (179, 0.74%), 'flavone and flavonol biosynthesis' (100, 0.41%), 'anthocyanin biosynthesis' (10, 0.04%) and 'isoflavonoid biosynthesis' (65, 0.27%). These annotations provide a valuable resource for the investigation of specific processes, functions and pathways in P. delavayi, and the genes in these flavonoid biosynthesis-related pathways might play an important role in flower coloration.

Analysis of genes differentially expressed between the transcriptomes of Pl and Pd
To identify the candidate genes controlling P. delavayi yellow and purple-red flower pigmentation, we performed differentially expressed gene (DEG) analysis by calculating fragments per kb per million fragments (FPKM) values [39]. All-unigene differential expression analysis was carried out by comparing the two samples to predict genes with different expression levels. The thresholds false discovery rate (FDR) 0.001 and absolute value of log 2 ratio 1 were used to judge the significance of the DEGs (Pl vs. Pd). A total of 18,784 All-unigenes were identified as differentially expressed, among which 6,855 (36.5%) were aligned with the NR database, with 3,430 up-regulated and 3,425 down-regulated (Fig 2).

Functional analysis of DEGs
GO analysis provided functional classification annotation as well as functional enrichment analysis for the DEGs, and pathway analysis helped to further understand their main biological functions. As shown in Fig 3, 5,052 DEGs were categorized into 55 functional groups, including 24, 15 and 16 groups in the biological process, cellular component and molecular function categories, respectively. In biological process, the major classifications for these All-unigene products were 'metabolic process' (3,108, 61.5%) and 'cellular process' (3,052, 60.4%). The most frequent cellular component terms were 'cell' and 'cell part', each with the same number of All- Many studies have shown that the coloration of yellow and purple-red P. delavayi petals is closely related to flavonoids [8,12,16]. Using the KEGG database, we mapped 3,969 of 18,784 DEGs to 127 pathways, and noted that 59, 39, and 5 DEGs were involved in 'flavonoid biosynthesis' (ko00941), 'flavone and flavonol biosynthesis' (ko00944) and 'anthocyanin biosynthesis' (ko00942), respectively, and all belonged to the flavonoid biosynthesis pathway. Based on NR, NT and Swiss-Prot annotations, GO functional analysis and KEGG pathway analysis, we subsequently predicted 15 up-regulated and 12 down-regulated All-unigenes involved in petal coloration ( Table 3). The annotations for these unigenes included most of the structural genes in the anthocyanin biosynthetic pathway, and the proposed pathway showed that most steps in anthocyanin biosynthesis were up-regulated in purple-red petals (Fig 4).
In the flavonoid biosynthetic pathway, the regulation of structural gene expression appears tightly organized in a spatial and temporal way during plant development, and is orchestrated by a ternary complex involving transcription factors from the R2R3-MYB, basic helix-loophelix (bHLH), and WD40 classes [19]. Thus, besides structural genes, we also investigated the expression levels of these three transcription factor families. As shown in S4 Table, 50 differentially expressed All-unigenes were annotated as transcription factors between yellow and purple-red petals, including 21 All-unigenes encoding R2R3-MYBs, 12 encoding bHLHs and 17 encoding WD40s. Among these transcription factors, 21 were highly expressed in purple-red petals and 29 were highly expressed in yellow petals.

Expression analysis of candidate DEGs probably involved in flavonoid biosynthesis
To validate the unigene expression profiles from the transcriptome sequencing data, seven flavonoid biosynthesis-related All-unigenes that exhibited particularly strong expression in anthocyanin-pigmented petals compared with other flower organs, such as sepals, carpels, leaves, and stamens (data not shown), and showed the largest difference in expression level between purple-red and yellow petals, were selected for quantitative RT-PCR assays. To get more information in this analysis, we examined the expression patterns of these genes at five different developmental stages during flower opening. As shown in Fig 5, purple-red petals showed much higher gene expression levels than yellow petals for most of the selected genes, including CHS, F3H, DFR, ANS and 3GT. Among them, DFR in particular was expressed at a statistically significantly higher level in purple-red petals than in yellow ones all through the flower pigmentation period, with p < 0.01 (Fig 5), suggesting that it might play an extremely important role in purple-red petal coloration.
Conversely, the expression levels of the other two selected genes were much higher in yellow petals (Fig 5). THC2 0 GT showed prominent expression at stage 1 prior to petal pigmentation and at stage 5 when the petals were fully pigmented, both with almost 8-fold greater expression than in purple-red petals at the corresponding stages, implying that a high expression level of this gene prior to petal pigmentation is essential for yellow pigment accumulation. For FNS II, the transcript levels were extremely high in yellow petals but hardly detected in purple-red ones throughout the entire process of flower pigmentation. These results demonstrated that the expression patterns of the seven selected DEGs analyzed by real-time RT-PCR were consistent with their respective digital expression data.

Discussion
In recent years, as a cost-effective and efficient tool to discover novel genes and analyze molecular mechanisms, high-throughput mRNA sequencing based on the Illumina/Solexa platform has been widely used for transcriptome analysis of many ornamental plants without genomic sequence data, such as P. suffruticosa [33,34], Gardenia jasminoides [40], Chrysanthemum lavandulifolium [41] and Zantedeschia aethiopica [42].
Prior to this study, most sequencing efforts in wild tree peony species were based on EST sequencing; very few tags had been reported in public databases and little genetic or genomic information was available. Here, we report the results of a large-scale transcriptome analysis of petal pigmentation in an endemic species of China, P. delavayi, using RNA-Seq technology. Two cDNA libraries constructed from the petals of yellow (Pl) and purple-red (Pd) flowers at different opening stages were analyzed, leading to the identification of 44,811 unigenes that were annotated using known sequences in various databases and DEGs involved in flavonoid biosynthesis. The relative transcript levels of seven candidate DEGs were further validated using real-time PCR. To the best of our knowledge, this is the first attempt to use Illumina paired-end sequencing technology for the de novo sequencing and assembly of the P. delavayi flower transcriptome. We believe our data will provide important new insights and facilitate further studies of P. delavayi genes and their functions. Compared with previous floral transcriptome studies in the tree peony P. suffruticosa [33,34] and other woody ornamental plants, such as Rosa [43], Camellia chekiangoleosa [41] and G. jasminoides [40], we herein report more contigs and unigenes, suggesting that the wild tree peony species P. delavayi contains abundant gene resources. In previous studies of peach [44] and safflower [27] flowers, transcriptome dynamics associated with petal pigmentation were investigated. Millions of transcripts were generated from petals of different colors, and enriched functional category analysis based on GO annotations showed that genes associated with 'cell', 'cell part', 'binding', 'cellular processes', 'metabolic processes' and 'catalytic activity' were the highest-represented groups. In the present study, similar results were obtained (Fig 1), indicating that the P. delavayi flowers were undergoing rapid development and carrying out intensive metabolic activities with complicated regulation by many transcription factors during petal pigmentation.
The accumulation of secondary metabolites plays an important role in the formation and development of flower color, especially flavonoids [45,46]. Flavonoid biosynthesis is regulated by a series of structural and regulatory genes, and the structural genes directly control flavonoid biosynthesis and accumulation [47]. A number of researchers have demonstrated that the primary pigments related to P. delavayi flower color are flavonoid compounds, including chalcones, flavones and flavonols, and anthocyanins, mainly Pn3G5G, which contribute to a variety of colors, such as yellow, orange, and purple-red [8,12,13,16]. The flower coloration of yellow P. delavayi individuals is mainly due to the biosynthesis of the 2'-glucoside of 2',4',6',4-tetrapydroxychalcone (THC)-isosalipurposide (ISP), chrysoeriol and apigenin [8,16], whereas purple-red coloration is dependent on anthocyanin contents, particularly peonidin-3,5-glucosides (Pn3G5G) [12]. Therefore, in this work, unigenes participating in the flavonoid biosynthetic pathway were specifically selected and studied in detail, providing new insight into the regulation of flavonoid and anthocyanin biosynthesis in P. delavayi that should accelerate engineering of this pathway in the future. qRT-PCR analysis of seven flavonoid biosynthetic pathway-related candidate unigenes in Pl and Pd. qRT-PCR analysis was performed using total RNA from petals at each floral developmental stage (S1-S5). The expression in yellow petals at stage 1 was used as a calibration standard. Data from real-time PCR were normalized to the helicase gene expression value, and relative transcript levels are presented as means with standard errors (S.E.) of five replications. As shown in Table 2, almost all of the structural genes in the main flavonoid biosynthesis pathway were identified, indicating that this pathway is well conserved in P. delavayi. We also observed that many of these genes appeared to form multi-gene families, implying that the genome of P. delavayi, like those of many other higher plants, has undergone one or more rounds of genome duplication during its evolution. Based on the results of unigene differential expression analysis, we predicted 27 DEGs as key structural genes involved in coloration from our sequencing data, including 15 up-regulated and 12 down-regulated genes (Table 3). In this pathway (Fig 4), CHS is the first committed enzyme, catalyzing the condensation of one molecule of 4-coumaroyl-CoA and three molecules of malonyl-CoA to generate 4,2 0 ,4 0 ,6 0 -tetrahydroxychalcone. We analyzed the expression patterns of 12 novel transcripts related to CHS and found that most of them, especially Unigene45464_All, were significantly up-regulated in purple-red petals (Fig 5). We propose that a sharp increase of transcripts encoding CHS may provide more substrate for anthocyanin biosynthesis, which is mainly responsible for petal coloration in purple-red flowers. 4,2 0 ,4 0 ,6 0 -Tetrahydroxychalcone is a key intermediate in P. delavayi flower coloration; it can be glycosylated by THC2 0 GT or isomerized by CHI, leading to the production of chalcone isosalipurposide and other flavonoids, such as flavones and flavonols, and anthocyanins, which contribute to the yellow and purple-red colors seen in P. delavayi, respectively. Mutations of the CHI gene have been shown to be necessary for isosalipurposide accumulation in carnation [48] and barley [49]. Additional mutation of the DFR gene confers better yellow coloration in carnation [50]. The existence of several genes encoding THC2 0 GT in carnation [51,52] indicates that several non-specific enzymes may catalyze the reaction in vivo. In this work, only Unigene18441_All showed sequence homology to THC2'GT from Catharanthus roseus (GenBank: BAF75901) [53], and its expression in yellow petals was almost 2-fold higher than in purple-red petals, suggesting an important role in yellow flower pigmentation. Interestingly, among the eight novel transcripts related to CHI, only one unigene (CL7622.Contig1_All) was down-regulated in yellow petals. Additionally, both of the transcripts (CL8830.Contig1_All and CL8830.Contig2_All) annotated as FNS II, which is responsible for flavone formation [54,55], were obviously up-regulated in yellow petals. Park et al. [56] reported that in Scutellaria baicalensis, SbCHI-overexpressing hairy root lines not only showed enhanced SbCHI gene expression, but also produced more flavones than the control. Thus, considering that P. delavayi yellow flowers are pigmented by the yellow chalcone isosalipurposide and by colorless or yellow flavonol and flavone glycosides, we speculate that the high co-expression of THC2'GT, CHI and FNS II ensures the accumulation of pigments contributing to the yellow color.
Regarding the downstream enzymatic genes involved in the P. delavayi anthocyanin biosynthetic pathway, F3H, DFR, ANS and 3GT were expressed at a significantly higher level in purple-red petals than in yellow petals according to both transcriptome profiling and qRT-PCR analysis ( Table 3, Fig 5). We can see from Fig 4 that F3H catalyzes the formation of DHK, which can be further hydroxylated by F3'H to form DHQ. Then, DHK and DHQ are deoxidized to leucoanthocyanidins by DFR, leading to synthetic branches catalyzed by ANS and GT that produce corresponding pelargonidin-based (orange to red) and cyanidin-based (red to magenta) pigments, respectively. Cyanidin is the precursor pigment of other anthocyanidins, and can be transformed into peonidin derivatives by the action of a 3 0 -O-methyltransferase [57][58][59]. Because the purple-red color in P. delavayi flowers is mainly related to the synthesis of cyanidin-derived anthocyanins (cyanidin and peonidin based), our results suggest that high expression levels of F3H, DFR, ANS and 3GT genes induce cyanidin and peonidin production to make flowers purple-red. In contrast, the low transcription levels of these genes observed in yellow petals ensure the inhibition of anthocyanin formation. Thus, we propose that the molecular mechanism underlying the differences in P. delavayi flower pigmentation is related primarily to structural genes downstream of CHI in the flavonoid synthetic pathway. These results are consistent with previous studies on flower color in many other plants [60][61][62] and the petal pigments of yellow and purple-red P. delavayi individuals [8,12,16].
Transcriptional regulation of structural genes appears to be a major mechanism by which anthocyanin biosynthesis is regulated in plants. R2R3-MYB, bHLH and WD40 proteins represent the three classes of transcription factors influencing anthocyanin biosynthesis intensity and patterns, and generally controlling the expression of many different structural genes [63][64][65]. A large number of previous reports have demonstrated that transcription factors, especially R2R3-MYBs, play an important role in color differences in many crops, including grape [66], apple [67], cauliflower [68], oriental hybrid lily [69], and flowering peach [70]. In this study, we detected 50 differentially expressed transcription factors from these three classes, and found that some were significantly up-regulated in purple-red petals and showed positive correlation with F3H, DFR, ANS and 3GT genes in the anthocyanin biosynthesis pathway (S4 Table). These results provide an informative list of candidate regulatory transcription factors involved in P. delavayi flower pigmentation, but further research related to these candidate regulatory genes including sequence, expression and biological function analyses needs to be carried out.

Conclusions
The Illumina HiSeq sequencing technology was used for sequencing and transcriptome analysis of the non-model plant P. delavayi, and provided an efficient approach for identifying critical genes involved in P. delavayi flower pigmentation. We identified potential candidate genes encoding key enzymes and reconstructed the flavonoid biosynthetic pathway in P. delavayi. The up-regulated unigenes encoding F3H, DFR, ANS and 3GT might play an important role in purple-red petal pigmentation, while high co-expression of THC2'GT, CHI and FNS II ensures the accumulation of pigments contributing to the yellow color. In addition, differentially expressed transcription factors related to anthocyanin biosynthesis were detected. Our results will provide valuable resources and a substantial basis for understanding the molecular mechanisms controlling P. delavayi flower pigmentation, and will eventually accelerate the genetic engineering of flower color in tree peony.

Ethics Statement
All Plant protocols were reviewed and approved by the Research Institute of Forestry, Chinese Academy of Forestry, State Forestry Administration. All necessary permits were obtained for field studies from the Diqing Forestry Bureau, Shangri-La County, Yunnan Province, China. The fieldwork conducted for sampling did not affect the local ecology and did not involve endangered species.

Plant materials
Petal samples were separately detached from purple-red-flowered and yellow-flowered individuals from a wild P. delavayi population in Shangri-La County (27°57 0 N, 99°35 0 E), Yunnan Province, China (Fig 6) at five different opening stages in the afternoon during the end of April to early May, 2013. The flower opening stages were described by Zhou et al. [20]: stage 1, unpigmented tight bud; stage 2, slightly pigmented soft bud; stage 3, initially opened flower; stage 4, half opened flower; stage 5, fully opened and pigmented flower with exposed anthers. The petals were immediately frozen in liquid nitrogen and after transport to the laboratory stored at −80°C until RNA extraction.

RNA extraction, cDNA library construction and transcriptome sequencing
Total RNA was extracted from petals by the cetyltrimethylammonium bromide (CTAB) method with some modification [71]. RNase-free DNase I (Tiangen; Beijing, China) was applied to remove contaminating genomic DNA. The RNA purity was determined with a NanoDrop 2000 spectrophotometer (NanoDrop Technologies; Wilmington, DE, USA) and 2% agarose gels were run to verify RNA integrity. Two cDNA libraries were constructed by mixing equal amounts of RNA from purple-red or yellow petals at five different opening stages, and named Pd and Pl, respectively. Enrichment of mRNA, fragment interruption, addition of adapters, size selection and PCR amplification and RNA-Seq were performed by staff at the Beijng Genome Institute (BGI; Shenzhen, China). mRNA was isolated with magnetic oligo (dT) beads. The mRNA was mixed with fragmentation buffer, fragmented into short fragments and used as a template for first-strand and second-strand cDNA synthesis. The cDNA fragments were purified using a Qia-Quick PCR extraction kit (Qiagen; Valencia, CA, USA), and resolved with elution buffer for end reparation and single nucleotide A (adenine) addition. The short fragments were connected with adapters and suitable fragments were selected for PCR amplification as templates. An Agilent 2100 Bioanalyzer (Agilent Technologies; Palo Alto, CA, USA) and an ABI StepOnePlus Real-Time PCR System (Applied Biosystems; Foster City, CA, USA) were used during the quality control steps. Finally, the two libraries were sequenced at the BGI using an Illumina HiSeq 2000 sequencing platform (Illumina Inc.; San Diego, CA, USA).

De novo assembly and sequence analysis
The raw reads were first filtered by removing adaptor sequences, empty reads, repeated reads and low-quality reads to get clean reads for subsequent analysis. The sequencing data for the clean reads were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra) with accession number SRP052291. Then, de novo assembly of the clean reads was performed to generate non-redundant unigenes using the Trinity software with an optimized k-mer length of 25 [35].
For sequence analysis, the resulting unigenes were aligned by Blastx to several protein databases (e-value < 0.00001) such as NCBI NR, Swiss-Prot, KEGG and COG, and the best-aligned results were used to decide the sequence direction of the unigenes. If the results from different databases conflicted, a priority order of NR > Swiss-Prot > KEGG > COG was followed when deciding the sequence direction of the unigenes. If a unigene was unaligned to any of the above databases, the ESTScan software was used to decide its sequence direction [37]. For unigenes with sequence directions, we recorded their sequences from the 5 0 end to the 3 0 end; for those without any direction, the sequences were provided by the assembly software. The length of the sequences assembled was a criterion of assembly success. The contig and unigene length distributions were calculated.

Functional annotation of unigenes
To obtain functional annotations, unigene sequence-based alignments were performed against three public databases (NR, Swiss-Prot and KEGG; e-value < 0.00001), and domain-based alignments were carried out against the COG database (e-value < 0.00001). Additionally, a Blastn search was performed against the NCBI NT database with an e-value < 0.00001. The proteins with the highest sequence similarity to the unigenes were retrieved for analysis. KEGG mapping was used to study the complex biological behavior and obtain pathway annotations for the unigenes, while COG matched each annotated sequence to an ancient conserved domain, and was used to predict and classify the possible functions of the unigenes. Based on NR annotations, the Blast2GO program (version 2.3.5, http://www.blast2go.de/) was used to retrieve associated GO terms describing biological processes, molecular functions and cellular components [72]. After getting GO annotations, the WEGO (http://wego.genomics.org.cn/cgibin/wego/index.pl) software was used to perform GO functional classification for all unigenes and to understand the distribution of gene functions of the species at the macro level [73].
Gene expression analysis using RNA-Seq data Unigene expression was calculated using the FPKM method [39], which eliminates the influence of different gene lengths and sequencing level on the calculation of gene expression. The false discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for p-value. When we obtained the FDR, we used the ratio of FPKMs from the two samples at the same time. The smaller the FDR and the larger the ratio, the larger the difference in expression level between the two samples; therefore, for this analysis, thresholds of FDR 0.001 and ratio > 2 were chosen to judge the DEGs (Pl vs. Pd) [74]. The confirmed DEGs were subjected to GO functional enrichment analysis and KEGG pathway analysis. Then, based on NR, NT and Swiss-Prot annotations, GO functional analysis, KEGG pathway analysis and flower coloration studies in tree peony [8,[11][12][13][14][15][20][21][22], the DEGs involved in petal coloration were screened for up/down-regulated unigenes, among which those DEGs with the highest absolute values for the FPKM ratio (Pl vs. Pd) were predicted as key genes for petal coloration and used for further analysis.

Quantitative Real-time PCR (qRT-PCR) Analysis
qRT-PCR was performed to verify the expression of seven genes probably involved in purplered and yellow flower pigmentation in P delavayi. Total RNA was extracted from petals at five different opening stages. After treatment with RNase-free DNase I (Tiangen; Beijing, China) according to the user manual, 1 μg of total RNA was reverse-transcribed to first-strand cDNA using the PrimeScript RT reagent kit (Takara; Otsu, Japan). qRT-PCR experiments were performed in a 96-well plate with an ABI Prism 7500 Sequence Detector (Applied Biosystems, USA), using a SYBR Premix Ex TaqKit (Takara; Otsu, Japan) to monitor cDNA amplification, according to the manufacturer's protocol. As a control, parallel amplification reactions for the tree peony housekeeping gene helicase (GenBank: EF608942) were also performed. Each primer set was designed based on the 3 0 -end cDNA sequence of the corresponding gene using the Primer Premier 5.0 software; all primer sequences are listed in S5 Table. Each 20 μL PCR reaction contained 10 μL of SYBR Premix Ex Taq (2×), 2 μL of 20× diluted RT-product, 0.8 μL of each forward and reverse primers (10 μM), 0.4 μL of ROX Reference Dye II, and 6 μL of ddH 2 O. The thermal conditions were 95°C for 30 s and 40 cycles of 95°C for 5 s and 60°C for 35 s, and then 95°C for 15 s, 60°C for 20 s and 95°C for 15 s for the dissociation stage. After the real-time PCR, the absence of unwanted by-products was confirmed by automated melting curve analysis and agarose gel electrophoresis of the PCR products. In all experiments, five replicates for each RNA sample were included; averages were calculated and differences in the threshold cycle (Ct) were evaluated by the 7500 System Sequence Detection Software v1.3.1.
For data analysis, the relative expression ratios were calculated by the comparative ΔΔCt method (ABI Prism 7500 Sequence Detection System, Applied Biosystems, USA) of relative gene quantification. To monitor the expression patterns of the seven selected transcripts during flower pigmentation, the relative quantification of gene expression was achieved by calibrating the transcription level in petals at different stages to that in yellow petals at stage 1. The expression level calculated by the formula 2 −ΔΔCt represents the x-fold difference from the calibrator.