Browse Subject Areas

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Transcriptome analysis of Asparagus officinalis reveals genes involved in the biosynthesis of rutin and protodioscin

Transcriptome analysis of Asparagus officinalis reveals genes involved in the biosynthesis of rutin and protodioscin

  • Tae Gyu Yi, 
  • Young Rog Yeoung, 
  • Ik-Young Choi, 
  • Nam-Il Park


Garden asparagus (Asparagus officinalis L.) is a popular vegetable cultivated worldwide. The secondary metabolites in its shoot are helpful for human health. We analyzed A. officinalis transcriptomes and identified differentially expressed genes (DEGs) involved in the biosynthesis of rutin and protodioscin, which are health-promoting functional compounds, and determined their association with stem color. We sequenced the complete mRNA transcriptome using the Illumina high-throughput sequencing platform in one white, three green, and one purple asparagus cultivars. A gene set was generated by de novo assembly of the transcriptome sequences and annotated using a BLASTx search. To investigate the relationship between the contents of rutin and protodioscin and their gene expression levels, rutin and protodioscin were analyzed using high-performance liquid chromatography. A secondary metabolite analysis using high-performance liquid chromatography showed that the rutin content was higher in green asparagus, while the protodioscin content was higher in white asparagus. We studied the genes associated with the biosynthesis of the rutin and protodioscin. The transcriptomes of the five cultivars generated 336 599 498 high-quality clean reads, which were assembled into 239 873 contigs with an average length of 694 bp, using the Trinity v2.4.0 program. The green and white asparagus cultivars showed 58 932 DEGs. A comparison of rutin and protodioscin biosynthesis genes revealed that 12 of the 57 genes associated with rutin and two of the 50 genes associated with protodioscin showed more than four-fold differences in expression. These DEGs might have caused a variation in the contents of these two metabolites between green and white asparagus. The present study is possibly the first to report transcriptomic gene sets in asparagus. The DEGs putatively involved in rutin and protodioscin biosynthesis might be useful for molecular engineering in asparagus.


Asparagus officinalis L. has been used as a medicinal plant and widely consumed globally for a long time, making it is an economically valuable plant [1]. Extracts of asparagus have been reported to increase insulin levels, which is helpful for improving type 2 diabetes [2] and protecting liver cells [3]. Furthermore, it exhibits anti-inflammatory, antimicrobial, and anticancer effects [4].

The major functional components of asparagus are flavonoids and steroidal saponins [5]. Flavonoids have anti-allergic, anti-inflammatory, antiviral, and antioxidant activities [6]. Rutin comprises the largest proportion of flavonoids in asparagus [7]. Rutin is a glycoside of quercetin and has been reported to have antimicrobial and anti-inflammatory [8] effects. It has also been demonstrated to have beneficial effects on diabetes [9] and arthritis [8]. Additionally, it exhibits protective effects against hepatocellular toxicity [10]. Various steroidal saponins, which impart a bitter taste, have been isolated from asparagus [11],. These compounds have been reported to affect cholesterol metabolism [12] and protect the liver [13]. They have different structures and are, therefore, useful for pharmaceutical and other commercial applications [14]. The most common steroidal saponin in asparagus is protodioscin, which affects sexual function and sex hormones [15].

Asparagus can have a variety of colors, including green, white, and purple, depending on the cultivation environment and cultivar. It contains both rutin and protodioscin, and its antioxidant ability varies with color [16]. In particular, green and white asparagus varieties are formed according to the light conditions during their cultivation. Asparagus becomes green when it receives light and white when its exposure to light is blocked. This variation is associated with differences in rutin and protodioscin contents of asparagus [16]. Green asparagus has a higher content of rutin than white asparagus, while white asparagus has a higher content of protodioscin [17]. Many studies have been conducted on the antioxidant effects and functional components of asparagus [18]. However, most molecular biology studies focused on only one of the green and white characteristics [19, 20]. Therefore, in order to compare the differences in the rutin and protodioscin content of green and white asparagus, molecular biology studies are needed to compare the differences in gene expression between the two colored varieties.

The genes involved in the secondary metabolism are more diverse than those involved in the primary metabolism. The present study aimed to analyze the whole-genome transcriptome of A. officinalis and identify differentially expressed genes (DEGs) involved in the biosynthesis of rutin and protodioscin (as functional health-promoting compounds), and their association with stem color. The identification of potential candidate genes associated with the secondary metabolic pathways based on transcriptome and DEG analyses will improve our understanding of the regulation and biosynthesis of secondary metabolites [14].

The present study aimed to investigate the genetic differences between rutin and protodioscin, among the green and white varieties of asparagus.

Materials and methods

Plant materials

Asparagus was cultured in a farm located in Chuncheon-si, Gangwon-do, Korea. In the present study, the commonly used cultivars ‘Atlas’ (green and white), ‘Gijnlim’ (green), ‘Pacific purple’ (purple), and ‘UC157’ (green), which were grown for 5 years, were used (Table 1). Asparagus spears (20–25 cm length) were sampled for transcriptome sequencing and metabolite content analysis. The samples were washed with distilled water, immediately frozen in liquid nitrogen, and stored at −80°C for RNA extraction and metabolite content analysis.

Table 1. The garden asparagus samples used for secondary metabolite and RNA sequencing analysis.

RNA extraction and Illumina sequencing

Asparagus spears of ‘Atlas,’ ‘Gijnlim,’ ‘Pacific purple,’ and ‘UC157’ were frozen in liquid nitrogen and homogenized using a mortar and pestle. Each asparagus spear was individually sampled with three tissues of each line. The extracted total mRNA was pooled from each of the three biological replicate samples. Total RNA was extracted using the Ribospin Kit (GeneAll Biotechnology Co., Seoul, Korea) according to the manufacturer’s protocol. DNA digestion was performed using DNase I (Sigma, St. Louis, MO, USA). The total RNA was quantified by measuring the absorption of light at A260, and the quality was checked by using a microvolume UV-vis spectrophotometer (Colibri, Titertek-Berthold, Germany). The cDNA library and massive parallel sequencing reads were generated using the Illumina mRNA Sequencing Kit and high-throughput sequencing platform according to the manufacturer’s protocol (Illumina, Inc., San Diego, CA US) by a professional sequencing provider (Macrogen Inc., Seoul, South Korea).

De novo assembly and DEG analysis of the transcriptome

We generated a transcriptome by de novo assembly of the mRNA sequences from the five varieties of asparagus and compared the genes involved in rutin and protodioscin biosynthetic pathways in green and white asparagus varieties cultivated under different light conditions. Briefly, the RNA-seq data generated by the Illumina platform were filtered by removing the adapter sequences and low-quality reads with a Phred quality score < 20 [21]. The clean sequencing reads were used for de novo assembly with the Trinity software v2.4.0. [22], which was operated using the default settings with a k-mer of 25 and a minimum contig length of 100. All contigs were defined as unigenes by the BLASTx analysis and were not extended. We focused on comparing the DEGs associated with the synthesis of the secondary metabolites, rutin and protodioscin between Atlas_W of white shoot asparagus (hereafter referred to as “Atlas_W”) and Atlas_G of green shoot asparagus (hereafter referred to as “Atlas_G”). DEGs were identified by estimating the relative transcript abundance, i.e., the reads aligned to all unique contigs. The aligned fragments were normalized using the fragment per kilo-base of exon model per million (FPKM) mapped reads method [23].

Functional annotation and gene ontology (GO) classification

All unique contigs were used to identify candidate coding regions with TransDecoder ( The candidate gene contigs were searched against the known genes by performing a BLASTx analysis in the National Center for Biotechnology Information (NCBI) non-redundant (nr) protein database. Other databases, including EGGNOG, KO (KEGG Orthology), and UniprotKB, were also used for gene annotation. Unigenes were identified based on high sequence similarity to the genes of other species at an E-value cutoff of 1.0E-05. The unigenes identified based on NR annotation were classified by a gene ontology (GO) analysis using the GOstats ( and Blast2GO ( programs. In the GO analysis, the genes were classified at the second level under the biological process, cellular components, and molecular function categories.

cDNA synthesis and quantitative real-time PCR

cDNA was synthesized from 1 μg of total RNA using the PrimeScript RT Reagent Kit (Takara, Kusatsu, Japan). For all target genes and the actin gene (as an internal reference), the primers were designed with the PrimerQuest Tool of Integrated DNA Technologies ( to conduct a quantitative real-time PCR (S1 Table). The quantitative real-time PCR was performed in a BIORAD CFX96 Real-time PCR system (Bio-Rad Laboratories, Hercules, CA, USA) with SYBR Premix Ex TaqII (Takara, Kusatsu, Japan) under the following conditions: pre-denaturation at 95°C for 15 min, 39 cycles of denaturation at 95°C for 20 s, annealing at 60°C for 40 s, and extension at 72°C for 20 s, with a final extension at 72°C for 10 min. The differences between the treatment means were evaluated using three independent replicates for each sample. Subsequently, the mean and standard deviation values of the three replicates of each sample were analyzed.

High-performance liquid chromatography analysis

All chemicals used were of the analytical reagent grade and used as received. All solvents were purchased from Daejung Chemicals (Siheung, Korea), and the standards were purchased from Sigma-Aldrich (St. Louis, MO, USA). The Atlas green and white spear samples were stored at −70°C and freeze-dried for 48 h. Each sample was pulverized using a mortar and pestle. The powdered sample (0.1 g) was extracted with 1 mL of 70% methanol at 30°C for 24 h. The supernatant was collected by centrifugation at 14,000 rpm at 25°C for 10 min. After filtration using a 0.22-μm syringe filter (Woongki Science, Seoul, Korea), the filtrate was used for analysis. The sample was separated on a C18 column (250 mm × 4.6 mm, particle size 5 μm; Shimadzu, Kyoto, Japan) using a Prominence high-performance liquid chromatography (HPLC) system (Shimadzu, Kyoto, Japan) equipped with a diode array UV-vis detector. Solvent A was water (formic acid 0.1%) and solvent B was acetonitrile (formic acid 0.1%). The flow rate was 0.7 mL/min, the injection volume was 10 μL, and the column temperature was maintained at 40°C. The analytical wavelengths were 205 and 330 nm. Initially, the concentration of solvent B was 12%, which was then increased to 30% and 80% at 20 and 50 min, respectively.

Statistical analysis

All analyses were repeated three times, and the data were expressed as means and standard deviations. The statistical analysis was performed using the SAS software (SAS 1998) and Duncan's multi-range test was performed with a significance level of p < 0.05.


Illumina sequencing and de novo transcriptome assembly

We successfully obtained a complete transcriptome through de novo assembly of the Illumina RNA-seq reads from asparagus spears. A total of 340 311 532 paired-end reads were generated from the transcriptomes of five varieties of asparagus. The raw data can be accessed at GenBank NCBI, SRA PRJNA522348 ( After removing the adapter sequences, ambiguous reads, and low-quality reads, 336 599 498 high-quality clean reads were obtained with an average GC content of 47.824%. Among these reads, 41 218 018 were from Atlas (green), 42 688 134 were from Atlas (white), 53 569 526 were from Gijnlim, 50 989 974 were from Pacific purple, and 148 133 846 were from UC157 (Table 2). All clean reads were assembled into 239 873 contigs using the Trinity program, with an average contig length of 694 bp, a maximum contig length of 16 179 bp, and a minimum contig length of 201 bp. A short average contig length is normal in the de novo assembly using the Illumina high-throughput platform sequencing data in plants [24, 25]. The GC content of these contigs was 40.56% and 40.65% in all contigs and 181 710 unique transcriptomes, respectively (Table 3). A slightly higher than the GC ratio of the range of 39.1 ~ 39.5% depends of the chromosome sequence in the NCBI Asparagus officinalis annotation release 100 ( The largest contig of 16 179 bp was matched to the auxin transport protein BIG (5109AA, XP_010942266 in NCBI) in Elaeis guineensis through a BlastX search. The de novo assembly was performed by comparing the summary to the other case, mapping the cleaned reads back to the unigenes, using TransRate for quality assessment and manual search for a few contig sequences in the NCBI database. We defined 181 710 putative unique genes with a total size of 166 Mb in asparagus. These data are enough for gene annotation and for analyzing the genes associated with the biosynthesis of the secondary metabolites, rutin and protodioscin.

Table 2. Summary of transcriptome cDNA sequencing of the five varieties of asparagus.

Table 3. Summary of de novo assembly of cDNA sequences from the transcriptomes of five varieties of asparagus.

Gene annotation

To identify putative genes, all assembled contigs were queried against various databases, including NR, EGGNOG, KO, and UniprotKB. The number of genes identified varied from 70 098 to 90 160, depending on the database (S2 Table). We identified a putative unigene set by performing a BLASTx analysis using public protein databases, including the NCBI NR protein database. In total, 89 418 genes were matched to the genes in the NR database. Among them, 79 896 unigenes had an E-value < 1e-5 (S3 Table). Furthermore, 102 140 contigs of the putative unigenes were found in the NR database through a BLASTx search. Asparagus could be identified as a plant with highly different genes lacking contig annotation as shown by a BLAST search using public databases. We next analyzed the species distribution of the matched genes. Among the numerous plant species, E. guineensis (16 813), followed by Phoenix dactylifera (13 770), Musa acuminata (5312), Nelumbo nucifera (1568), Oryza sativa (1171), and Vitis vinifera (1057), had the highest number of hits to the transcriptome of asparagus. Similar results have been reported by previous studies. In a study on sex expression mechanisms in asparagus, E. guineensis (6736), P. dactylifera (5515), M. acuminata (2695), and O. sativa (1853) had the highest number of sequence hits to asparagus unigenes [26].

Based on the BLAST results against the NR and KO databases, the sequences were annotated and classified into GO classes, including biological processes, cellular components, and molecular functions. In total, 36% of the genes were classified, with 30 106 (12.55%), 24 163 (10.07%), and 23 215 (9.68%) genes included under the biological process, cellular components, and molecular function categories, respectively (Fig 1). In the biological process category, metabolic process (7820), unclassified (4829), and biological regulation (3667) were the prominently represented sub-categories. Under the cellular components category, cell part (10 447) and organelle (5455) were the most highly represented sub-categories. Under the molecular function category, the highest proportions of genes were clustered under the catalytic activity (8597), binding (8459), and unclassified (3636) sub-categories.

Fig 1. GO analysis of asparagus.

(A: biological process, B: cellular component, C: molecular function).

Differentially expressed genes among green and white asparagus

The expression levels showed similar ratios in Atlas_G and Atlas_W in terms of the FPKM values ranging from 0 to 13 (Fig 2). A DEG analysis revealed that a total of 58 932 genes were differentially expressed between Atlas_G and Atlas_W. A specific gene responsible for functional material determination or metabolite differentiation should not exhibit a similar expression pattern. Therefore, we identified genes with different expression patterns between Atlas_G and Atlas_W for further analysis. Among 58 932 DEGs, 29 442 exhibited higher expression in Atlas_G than in Atlas_W, and 1951 DEGs showed a greater than four-fold difference in their expression levels. In Atlas_W, 29 490 DEGs exhibited higher expression, and 2 214 DEGs showed a greater than four-fold difference in expression compared with Atlas_G. The metabolic genes of rutin and protodioscin were searched using the NCBI NR database. Fifty-seven genes related to rutin (flavonoid) biosynthesis were identified, 12 of which showed more than four-fold differences in expression between Atlas_G of and Atlas_W. Fifty genes related to protodioscin (terpenoid) biosynthesis were found, of which two showed more than four-fold differences in expression between Atlas_G and Atlas_W (S4 Table).

Fig 2. Distribution of expression levels between Atlas_G of green shoot asparagus and Atlas_W of white shoot asparagus.

Gene expression associated with the synthesis of flavonoids and terpenoids in Atlas_G and Atlas_W

The biosynthesis of flavonoid compounds, such as rutin, starts with phenylalanine [27]. The rutin biosynthetic pathway leads to naringenin through p-coumaric acid, and then dihydrokaempferol, dihydromyricetin, and dihydroquercetin. Subsequently, quercetin and rutin are biosynthesized in the dihydroquercetin pathway.

Through transcriptome analysis, we identified 17 and 40 upregulated genes related to flavonoid biosynthesis in Atlas_G and Atlas_W, respectively (Fig 3). Furthermore, among the enzyme genes involved in biosynthesis, the F3′H gene was found only in Atlas_G, and the C4H and CHS genes were found only in Atlas_W. The actual rutin content of Atlas_G was higher than that of Atlas_W (S1 Fig). Therefore, the genes with high expression levels in Atlas_G could play an important role in rutin biosynthesis. However, the genes encoding the enzymes involved in subsequent processes require further research.

Fig 3. Biosynthetic pathways and compartmentalization of flavonoid synthesis in plants.

4CL, 4-coumarate:CoA ligase; C4H, cinnamate-4-hydroxylase; CHS, chalcone synthase; CHI, chalcone isomerase; F3H, flavanone 3-hydroxylase; F3′H, flavonoid 3′-hydroxylase; F3′5′H, flavonoid 3′,5′-hydroxylase; F3Rha, flavonol-3-O-glucoside l-rhamnosyltransferase; FLS, flavonol synthase; FNS, flavone synthase; PAL, phenylalanine ammonia-lyase. The numbering of the 3′, 4′, and 5′ carbon positions is shown in the structure. Enzyme names are in italics.

Protodioscin is a representative triterpenoid of A. officinalis and belongs to the steroidal sapogenin family. Steroidal sapogenins are synthesized from cholesterol in several plants [28], and terpenoid compounds are divided into the mevalonate (MVA) and non-mevalonate (MEP) pathways. The MVA pathway begins with condensation of acetyl-CoA to acetoacetyl-CoA, which is then converted to isopentenyl diphosphate (IPP) through 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA) and mevalonate (MVA). IPP acts as a precursor of triterpenoids. The other synthetic pathway, the MEP pathway, begins with the production of 1-deoxy-d-xylulose 5-phosphate (DXP) from pyruvate and glyceraldehyde-3-phosphate (G3P). In this process, IPP is produced via 2-C-methyl-d-erythritol 4-phosphate (MEP) and (E)-4-hydroxy-3-methylbut-2-enyl diphosphate (HMBPP) and then biosynthesized into triterpenoids.

When the MVA and MEP pathways were examined, 29 genes were found to be highly expressed in Atlas_G, and 21 genes were highly expressed in Atlas_W (Fig 4). In addition, some genes, including MVK, DXR, MCT, MDS, and HDR, appeared only in Atlas_G. However, a comparison of the actual rutin and protodioscin content revealed that Atlas_W contained more protodioscin than Atlas_G (S2 Fig). Therefore, the genes highly expressed in Atlas_W might play an important role in the biosynthesis of terpenoids. However, the information on the enzymes that act after the MVA and MEP pathways is limited. We suggest that the enzymes involved in the subsequent processes might play an important role in terpenoid biosynthesis of asparagus, but this requires further study to confirm.

Fig 4. Biosynthetic pathways and compartmentalization of terpenoid synthesis in plants.

AACT, acetoacetyl-CoA thiolase; AcAc-CoA, acetoacetyl-CoA; CDP-ME, 4-(cytidine 5′-diphospho)-2-C-methyl-d-erythritol; CDP-ME2P, 4-(cytidine 5′-diphospho)-2-C-methyl-d-erythritol phosphate; CMK, CDP-ME kinase; DMAPP, dimethylallyl diphosphate; DXP, 1-deoxy-d-xylulose 5-phosphate; DXR, DXP reductoisomerase; DXS, DXP synthase; FDS, farnesyl diphosphate synthase; FPP, farnesyl diphosphate; G3P, glyceraldehyde-3-phosphate; GDS, geranyl diphosphate synthase; GPP, geranyl diphosphate; HDR, (E)-4-hydroxy-3-methylbut-2-enyl diphosphate reductase; HDS, (E)-4-hydroxy-3-methylbut-2-enyl diphosphate synthase; HMBPP, (E)-4-hydroxy-3-methylbut-2-enyl diphosphate; HMG-CoA, 3-hydroxy-3-methylglutaryl-CoA; HMGR, HMG-CoA reductase; HMGS, HMG-CoA synthase; IDI, isopentenyl diphosphate isomerase; IPP, isopentenyl diphosphate; ISPS, isoprene synthase; MCT, 2-C-methyl-d-erythritol 4-phosphate cytidylyltransferase; MDS, 2-C-methyl-d-erythritol 2,4-cyclodiphosphate synthase; ME-2,4cPP, 2-C-methyl-d-erythritol 2,4-cyclodiphosphate; MEP, 2-C-methyl-d-erythritol 4-phosphate; MVD, mevalonate diphosphate decarboxylase; MVK, mevalonate kinase; PMK, phosphomevalonate kinase; TPS, terpene synthase. Enzyme names are in italics.

Relationship of genes expressed during flavonoid and terpenoid biosynthesis

We identified significant DEGs involved in the biosynthesis of rutin and protodioscin (Table 4, Fig 5). The NR annotation suggested that four 4CL genes and three CHS genes were similar to the genes of Ornithogalum saundersiae. Notably, these genes were all highly expressed in Atlas_W. This suggests that the biosynthetic pathways in Atlas_W and O. saundersiae are highly similar, and that this species might be helpful for studying the rutin and protodioscin biosynthesis of Atlas_W. However, the flavonoid rutin is not a major substance in Atlas_W (S1 Fig). Therefore, these genes might play a role in the biosynthesis of flavonoids other than rutin, and this information might contribute to the study of flavonoid biosynthesis in asparagus. Among the genes associated with asparagus flavonoid biosynthesis, the genes similar to the 4CL gene of P. dactylifera had higher expression in Atlas_G. Furthermore, these genes showed the largest difference among the flavonoid genes of asparagus. As one of the main compounds of Atlas_G (S1 Fig), rutin is a flavonoid-based compound and, therefore, the expression of these genes might play an important role in the biosynthesis of rutin. Genes from E. guineensis, P. dactylifera and Narcissus tazetta were also associated with the flavonoid and terpenoid biosynthetic genes of asparagus that showed more than four-fold differences between Atlas_G and Atlas_W. In addition, the E. guineensis and P. dactylifera genes were found to match many asparagus genes in the gene annotation. Therefore, these genes were expected to have a great influence on the biosynthetic pathways in asparagus.

Fig 5. Heat map of expression differences in 57 flavonoid and 50 terpenoid biosynthesis genes of asparagus based on FPKM values (names in yellow represent flavonoid-related genes and names in pink represent terpenoid-related genes).

Table 4. The significant DEGs related to rutin and protodioscin biosynthesis between Atlas_W of white shoot asparagus and Atlas_G of green shoot asparagus.

The RNA-seq gene expression levels were screened using an absolute value of log2 fold change > 1 and FPKM > 1. A qRT-PCR was performed to validate the sequencing results. Primers were designed based on the alignments of the most conserved domains with the Integrated DNA Technologies database (; actin was used as the reference gene. We tested the expression of 14 DEGs involved in flavonoid and terpenoid biosynthesis with qRT-PCR to validate the RNA-seq results (Fig 6). Nine of the 14 genes were similar to those of RNA-seq. Flavonoid-related genes showed high expression levels in Atlas_W according to the FPKM values. The expression levels of c65364 and c72399 were approximately 6.3 and 3.2 times higher in Atlas_W than in Atlas_G. The expression levels of c76795, c83185 and c84298 were 3.7, 3.6, and 2.8 times higher, respectively, in Atlas_W than in Atlas_G. In particular, c151453 showed the largest difference; its expression level was 116.3 times higher in Atlas_W. However, some genes were hard to compare, including c47648, c78488, and c88325, because their expression levels were very low. The c92756_g1_i1 and c92756_g1_i2 nucleotide sequences were almost identical, making it difficult to distinguish them with the primers used. Therefore, the expression levels of both genes were combined. In contrast to the RNA-seq results, the c92756 gene showed higher expression in Atlas_G than Atlas_W. For two terpenoid-related genes, c81038 showed higher expression in Atlas_G according to the FPKM values, with approximately 2-fold difference, while c71742 showed a higher expression level in Atlas_W in qRT-PCR in contrast to the FPKM values. In the case of c92756 and c71742, the qRT-PCR results differed from the FPKM values, which was considered to be due to the correction of FPKM.

Fig 6. Comparison of the expression patterns of genes involved in flavonoid and terpenoid biosynthesis in Atlas_W of white shoot asparagus and Atlas_G of green shoot asparagus between the RNA-seq FPKM and qRT-PCR data.

(A: RNA-seq FPKM data, B: qRT-PCR data). Values in a row with different letters are significantly different (p < 0.05), mean ± SD (n = 3).

The growth condition that causes the greatest difference between Atlas_G and Atlas_W is light intensity. Therefore, the light intensity conditions can affect the expression levels of the biosynthetic genes of the secondary metabolites, rutin and protodioscin. These differences might contribute significantly to the differences of the rutin and protodioscin of Atlas_G and Atlas_W. The genes showing distinct differences in the rutin and protodioscin biosynthetic pathways were identified. The flavonoid- and terpenoid-related genes with expression differences greater than four-fold might be responsible for the differences in rutin and protodioscin between Atlas_G and Atlas_W. Thus, these genes might play an important role in the regulation of secondary metabolites, rutin and protodioscin in asparagus.

Analysis of secondary metabolite content

Gene expression and compound contents can vary significantly. Thus, a quantitative analysis of rutin- and protodioscin-based components of asparagus was conducted to investigate the secondary metabolite contents of asparagus. Rutin was detected at 330 nm and observed at 20.4 min (S1 Fig). Protodioscin was detected at 205 nm and observed at 30.3 min (S2 Fig). The quantitative analysis revealed that the rutin content in Atlas_G extract was 94.281 μg/g, while the protodioscin content was only 5.093 μg/g. By contrast, the protodioscin content in Atlas_W extract was 124.103 μg/g, which was significantly higher than in Atlas_G extract. However, the rutin content was 2.418 μg/g, which was significantly lower than in Atlas_G (Fig 7). These differences in metabolite contents between Atlas_G and Atlas_W are similar to those reported in previous studies [16, 17]. The analysis of secondary metabolites suggested that there were differences in gene expression between Atlas_G and Atlas_W. The expression analysis of the genes involved in the rutin biosynthetic pathway showed that Atlas_W had higher expression levels than Atlas_G of some genes. Conversely, among the genes involved in the protodioscin biosynthetic pathway, Atlas_G had higher expression levels than Atlas_W of some genes. When comparing these results with the contents of the secondary metabolites studied, the genes with higher expression levels in Atlas_G than in Atlas_W might be considered to be involved in rutin biosynthesis. In the case of protodioscin, the genes showing higher expression in Atlas_W than in Atlas_G might be considered to be involved in protodioscin biosynthesis.

Fig 7. Rutin and protodioscin contents in Atlas_W of white shoot asparagus and Atlas_G of green shoot asparagus.

Values in a row with different letters are significantly different (p < 0.05), mean ± SD (n = 3).


Genes involved in rutin and protodioscin biosynthesis

In our transcriptome analysis, we identified seven phenylalanine ammonia-lyase (PAL) genes, which are associated with phenylalanine and initiate flavonoid biosynthesis. The most frequently detected genes were the 4-coumarate:CoA ligase (4CL) genes; among the 57 flavonoid-related biosynthetic genes found, 30 were 4CL genes. The 4CL gene encodes an enzyme involved in the biosynthesis of p-coumaroyl CoA from p-coumaric acid, and has been reported to be associated with the expression of PAL genes [29]. Flavonol compounds, to which rutin belongs, are biosynthesized from naringenin and dihydrokaempferol. Eleven F3H genes related to the biosynthesis of naringenin and dihydrokaempferol were found. However, the information about the biosynthetic genes involved in subsequent steps is very limited. There are only two reports of the biosynthetic genes for quercetin or dihydroquercetin, which are directly related to rutin.

In Fagopyrum esculentum, which has a high rutin content, transcript sequencing was performed to observe the rutin accumulation and variation in seeds among varieties [30]. Similar to our study, the expression levels of the PAL, C4H, 4CL, CHS, CHI, F3H, F3H, and FLS genes were compared and analyzed. As a result of this analysis, it was reported that the expression level of the FLS1 gene was related to rutin accumulation in F. esculentum. This suggests that the expression levels of the PAL, C4H, 4CL, CHS, CHI, F3H, and FLS genes are important for rutin and flavonoid contents, but the authors highlighted the importance of the FLS1 gene. However, asparagus has very limited information on related biosynthetic genes in the pathway after naringenin and the F3H genes. There are only two reports of the biosynthetic genes for quercetin or dihydroquercetin, which are directly related to rutin.

There are two terpenoid biosynthetic pathways, the MVA pathway and the MEP pathway. The MVA pathway begins with acetyl-CoA becoming HMG-CoA via HMGS. We found 34 genes in the MVA pathway to IPP, among which the HMGS genes were the most frequent. The other biosynthetic pathway, the MEP pathway, begins with DXS enzymes that biosynthesize DXP. We found 13 genes in the MEP pathway to IPP, among which the DXS genes were the most frequent. However, the information on biosynthesis steps after IPP is very limited. Only three terpenoid biosynthetic enzyme genes acting after IPP were found, and no information was found about the biosynthetic enzyme genes leading to the synthesis of triterpenoids, including protodioscin.

The terpenoid metabolism pathways in Chlorophytum borivilianum and Valeriana fauriei roots have been reported [31, 32]. The analysis of their terpenoid metabolic pathways focused on the genes related to the MVA and MEP pathways. At present, reports on protodioscin are very rare. Therefore, we had to analyze the metabolism of terpenoids, a group to which protodioscin belongs, and it was dependent on the information available for the genes involved in the MVA and MEP pathways.

For both flavonoid and terpenoid biosynthetic pathways, a little information is available on the posterior biosynthetic enzymes, necessitating further studies on the biosynthetic enzymes for plant secondary metabolites. In our asparagus transcriptome analysis, many potential genes were not annotated. Studies of these unannotated genes will be necessary to identify the biosynthetic enzyme genes in asparagus. Such studies might contribute to the study of secondary metabolite biosynthesis in other plants.

Gene expression and actual secondary metabolite contents

Comparative genomics is a powerful tool for determining the genetic basis of biological functions. Forty of the 57 flavonoid biosynthetic genes identified in this experiment showed higher expression in Atlas_W than in Atlas_G. In addition, five 4CL, three CHS, one CHI, and two FH3 genes showed more than four times higher expression in Atlas_W than in Atlas_G. However, the rutin content of Atlas_G was 18 times higher than that of Atlas_W in this study. Gene expression can be activated by the sensing of environmental stresses, such as wounding and UV light irradiation [33, 34]. Because phenylpropanoid compounds can protect against these stresses, such stresses can promote their biosynthesis. Therefore, environmental stress in white asparagus cultivated under light-blocked conditions might increase the expression of the phenylpropanoid genes. In the case of rutin, the more light at the site, the more active the biosynthesis [17]. However, in white asparagus, the light is blocked and rutin biosynthesis is not performed efficiently. Further studies are needed to determine the roles of the highly expressed flavonoid-related genes in white asparagus.

Of the 50 genes involved in terpenoid biosynthesis, 29 genes showed higher expression in Atlas_G than in Atlas_W. Among these terpenoid biosynthetic enzyme genes, one HMGS gene and one DXS gene showed four times greater expression in Atlas_G. However, the protodioscin content was 60 times higher in Atlas_W than in Atlas_G in this study. A study of Achyranthes bidentata suggested that triterpenoid saponins are first synthesized in the leaves and then moved to the stem and root [35]. Experiments with leaves and root tissues of Chlorophytum borivilianum showed that the initial reaction of the terpenoid biosynthetic pathway occurs in the leaves [36]. This might explain the high expression levels of the terpenoid genes in Atlas_G used in this study. However, further studies are needed to determine the exact roles of these genes.

Difference in gene regulation due to light conditions

The occurrence of green and white shoot asparagus is due to light conditions, in which the absence of light results in the formation of white shoot asparagus. There is a big difference in biological systems, including genetic and epigenetic factors related to the environmental stress of light.

Through the transcriptome analysis, we showed that the gene expression in green asparagus is more affected by light than that in white asparagus. Many light-related genes, such as light-regulated protein, photosystem I reaction center subunit, and chlorophyll A/B binding protein, were highly expressed in green asparagus. These genes are expressed in the chloroplasts of a cell, and their expression level vary according to the light conditions [37]. They play a role in photosystem-related complex formation and interaction stabilization [38] and affect plant development and growth through the synthesis of chlorophyll [39].

The difference in light conditions affected the expression of the genes related to the amino acid synthesis [40, 41]. Amino acids, such as glutamate and glutamine, have been reported to be affected by light and have important effects on hormone and secondary metabolite biosynthesis [42, 43]. The expression levels of the glutamate synthase and glutamine synthetase genes were compared among the genes analyzed in this experiment. A total of 12 glutamate synthase genes were detected, and they were highly expressed in Atlas_W. In addition, one of the glutamate synthase genes showed four times or more difference in expression. A total of seven glutamine synthetase genes were found, of which five genes were highly expressed in Atlas_G. Of these five genes, two showed more than 4-fold difference in expression.

Therefore, the difference in secondary metabolite abundance between white and green asparagus spears according to the light conditions is complex in biological systems, including genetic and epigenetic factors. The genetic factors, including DEGs involved in the biosynthesis of glutamate, can lead to the biosynthesis of the light-induced amino acids. The further effects of epigenetic factors should be studied to understand the difference of metabolite abundance between the white and green asparagus spears resulting from the difference in light conditions.


De novo transcriptome sequencing was performed to identify functional differences between green and white asparaguses. A total of 239 873 sequences were assembled in an asparagus spear, and 89 418 distinct sequences were successfully annotated using the NR database. One hundred and seven genes were thought to be involved in the biosynthesis of rutin and protodioscin, which were the representative functional materials of green and white asparaguses. Among them, 14 genes showed a four-fold difference or more in their expression levels. Furthermore, the content of rutin and protodioscin was significantly different between Atlas_W and Atlas_G. This suggests that growing asparagus spears under different light conditions affects the expression of genes related to rutin and protodioscin biosynthesis, thus affecting the content of these compounds. Comparative analysis of the secondary metabolite content and gene expression levels can help in understanding the biosynthesis of functional compounds in green and white asparaguses. This research is expected to contribute to molecular, biological, and natural chemical studies of asparagus in the future.

Supporting information

S2 Table. Gene annotations based on the NR, EGGNOG, KO and UniprotKB databases.


S3 Table. The putative unigene set identified by BLASTx analysis using public protein databases, including the NCBI NR protein database.


S4 Table. List of 57 flavonoid and 50 terpenoid biosynthesis genes identified based on FPKM values in asparagus.


S1 Fig. HPLC chromatograms of the rutin standard and methanol extracts of freeze-dried white and green asparagus spears at 330 nm.


S2 Fig. HPLC chromatograms of the protodioscin standard and methanol extracts of freeze-dried white and green asparagus spears at 205 nm.



  1. 1. Pontaroli AC, Camadro EL. Somaclonal variation in Asparagus officinalis plants regenerated by organogenesis from long-term callus cultures. Genetics and Molecular Biology. 2005; 28: 423–430.
  2. 2. Hafizur RM, Kabir N, Chishti S. Asparagus officinalis extract controls blood glucose by improving insulin secretion and β-cell function in streptozotocin-induced type 2 diabetic rats. British Journal of Nutrition. 2012; 108: 1586–1595. pmid:22221560
  3. 3. Kim BY, Cui ZG, Lee SR, Kim SJ, Kang HK, Lee YK et al. Effects of Asparagus officinalis extracts on liver cell toxicity and ethanol metabolism. Journal of food science. 2009; 74.
  4. 4. Bousserouel S, Le Grandois J, Gossé F, Werner D, Barth SW, Marchioni E et al. Methanolic extract of white asparagus shoots activates TRAIL apoptotic death pathway in human cancer cells and inhibits colon carcinogenesis in a preclinical model. International journal of oncology. 2013; 43: 394–404. pmid:23754197
  5. 5. Fuentes-Alventosa J, Jaramillo-Carmona S, Rodríguez-Gutiérrez G, Rodríguez-Arcos R, Fernández-Bolaños J, Guillén-Bejarano R et al. Effect of the extraction method on phytochemical composition and antioxidant activity of high dietary fibre powders obtained from asparagus by-products. Food Chemistry. 2009; 116: 484–490.
  6. 6. Nijveldt RJ, van Nood E, van Hoorn DEC, Boelens PG, van Norren K, van Leeuwen PAM. Flavonoids: a review of probable mechanisms of action and potential applications. The American Journal of Clinical Nutrition. 2001; 74: 418–425. pmid:11566638
  7. 7. Fuentes-Alventosa JM, Rodríguez G, Cermeño P, Jiménez A, Guillén R, Fernández-Bolaños J et al. Identification of flavonoid diglycosides in several genotypes of asparagus from the Huétor-Tájar population variety. Journal of agricultural and food chemistry. 2007; 55: 10028–10035. pmid:17960889
  8. 8. Guardia T, Rotelli AE, Juarez AO, Pelzer LE. Anti-inflammatory properties of plant flavonoids. Effects of rutin, quercetin and hesperidin on adjuvant arthritis in rat. Il farmaco. 2001; 56: 683–687. pmid:11680812
  9. 9. Kamalakkannan N, Prince PSM. Antihyperglycaemic and antioxidant effect of rutin, a polyphenolic flavonoid, in streptozotocin‐induced diabetic wistar rats. Basic & clinical pharmacology & toxicology. 2006; 98: 97–103.
  10. 10. Janbaz KH, Saeed SA, Gilani AH. Protective effect of rutin on paracetamol-and CCl4-induced hepatotoxicity in rodents. Fitoterapia. 2002; 73: 557–563. pmid:12490212
  11. 11. Dawid C, Hofmann T. Structural and sensory characterization of bitter tasting steroidal saponins from asparagus spears (Asparagus officinalis L.). Journal of agricultural and food chemistry. 2012; 60: 11889–11900. pmid:23137023
  12. 12. Roman ID, Thewles A, Coleman R. Fractionation of livers following diosgenin treatment to elevate biliary cholesterol. Biochimica et Biophysica Acta (BBA)-Lipids and Lipid Metabolism. 1995; 1255: 77–81.
  13. 13. Zhu X, Zhang W, Pang X, Wang J, Zhao J, Qu W. Hypolipidemic Effect of n‐Butanol Extract from Asparagus officinalis L. in Mice fed a High‐fat Diet. Phytotherapy research. 2011; 25: 1119–1124. pmid:21280112
  14. 14. Upadhyay S, Phukan UJ, Mishra S, Shukla RK. De novo leaf and root transcriptome analysis identified novel genes involved in steroidal sapogenin biosynthesis in Asparagus racemosus. BMC genomics. 2014; 15: 746. pmid:25174837
  15. 15. Gauthaman K, Ganesan AP. The hormonal effects of Tribulus terrestris and its role in the management of male erectile dysfunction–an evaluation using primates, rabbit and rat. Phytomedicine. 2008; 15: 44–54. pmid:18068966
  16. 16. Maeda T, Kakuta H, Sonoda T, Motoki S, Ueno R, Suzuki T et al. Antioxidation capacities of extracts from green, purple, and white asparagus spears related to polyphenol concentration. HortScience. 2005; 40: 1221–1224.
  17. 17. Lee EJ, Yoo KS, Patil BS. Development of a Rapid HPLC‐UV Method for Simultaneous Quantification of Protodioscin and Rutin in White and Green Asparagus Spears. Journal of food science. 2010; 75.
  18. 18. Eichholz I, Rohn S, Gamm A, Beesk N, Herppich WB, Kroh LW et al. UV-B-mediated flavonoid synthesis in white asparagus (Asparagus officinalis L.). Food research international. 2012; 48: 196–201.
  19. 19. Park J-H, Ishikawa Y, Yoshida R, Kanno A, Kameya T. Expression of AODEF, a B-functional MADS-box gene, in stamens and inner tepals of the dioecious species Asparagus officinalis L. Plant molecular biology. 2003; 51: 867–875. pmid:12777047
  20. 20. Papoulias E, Siomos AS, Koukounaras A, Gerasopoulos D, Kazakis E. Effects of genetic, pre-and post-harvest factors on phenolic content and antioxidant capacity of white asparagus spears. International journal of molecular sciences. 2009; 10: 5370–5380. pmid:20054475
  21. 21. Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces usingPhred. I. Accuracy assessment. Genome research. 1998; 8: 175–185. pmid:9521921
  22. 22. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature biotechnology. 2011; 29: 644. pmid:21572440
  23. 23. Lin Y, Golovnina K, Chen Z-X, Lee HN, Negron YLS, Sultana H et al. Comparison of normalization and differential expression analyses using RNA-Seq data from 726 individual Drosophila melanogaster. BMC genomics. 2016; 17: 28. pmid:26732976
  24. 24. Garg R, Patel RK, Tyagi AK, Jain M. De novo assembly of chickpea transcriptome using short reads for gene discovery and marker identification. DNA research. 2011; 18: 53–63. pmid:21217129
  25. 25. Zhang J, Liang S, Duan J, Wang J, Chen S, Cheng Z et al. De novo assembly and characterisation of the transcriptome during seed development, and generation of genic-SSR markers in peanut (Arachis hypogaea L.). BMC genomics. 2012; 13: 90. pmid:22409576
  26. 26. Li S-F, Zhang G-J, Zhang X-J, Yuan J-H, Deng C-L, Gao W-J. Comparative transcriptome analysis reveals differentially expressed genes associated with sex expression in garden asparagus (Asparagus officinalis). BMC plant biology. 2017; 17: 143. pmid:28830346
  27. 27. Verhoeyen M, Bovy A, Collins G, Muir S, Robinson S, De Vos C et al. Increasing antioxidant levels in tomatoes through modification of the flavonoid biosynthetic pathway. Journal of Experimental Botany. 2002; 53: 2099–2106. pmid:12324533
  28. 28. Mahato SB, Nandy AK, Roy G. Triterpenoids. Phytochemistry. 1992; 31: 2199–2249. pmid:1368386
  29. 29. Lee D, Ellard M, Wanner LA, Davis KR, Douglas CJ. The Arabidopsis thaliana 4-coumarate: CoA ligase (4CL) gene: stress and developmentally regulated expression and nucleotide sequence of its cDNA. Plant molecular biology. 1995; 28: 871–884. pmid:7640359
  30. 30. Gao J, Wang T, Liu M, Liu J, Zhang Z. Transcriptome analysis of filling stage seeds among three buckwheat species with emphasis on rutin accumulation. PLOS ONE. 2017; 12: e0189672. pmid:29261741
  31. 31. Park YJ, Li X, Noh SJ, Kim JK, Lim SS, Park NI et al. Transcriptome and metabolome analysis in shoot and root of Valeriana fauriei. BMC genomics. 2016; 17: 303. pmid:27107812
  32. 32. Kalra S, Puniya BL, Kulshreshtha D, Kumar S, Kaur J, Ramachandran S et al. De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum. PLOS ONE. 2013; 8: e83336. pmid:24376689
  33. 33. Kubasek WL, Shirley BW, McKillop A, Goodman HM, Briggs W, Ausubel FM. Regulation of flavonoid biosynthetic genes in germinating Arabidopsis seedlings. The Plant Cell. 1992; 4: 1229–1236. pmid:12297632
  34. 34. K Hahlbrock a, Scheel D. Physiology and Molecular Biology of Phenylpropanoid Metabolism. Annual Review of Plant Physiology and Plant Molecular Biology. 1989; 40: 347–369.
  35. 35. Li J, Hu Z. Accumulation and dynamic trends of triterpenoid saponin in vegetative organs of Achyranthus bidentata. Journal of integrative plant biology. 2009; 51: 122–129. pmid:19200150
  36. 36. Kumar S, Kalra S, Kumar S, Kaur J, Singh K. Differentially expressed transcripts from leaf and root tissue of Chlorophytum borivilianum: A plant with high medicinal value. Gene. 2012; 511: 79–87. pmid:23000016
  37. 37. Yang C, Hu H, Ren H, Kong Y, Lin H, Guo J et al. LIGHT-INDUCED RICE1 regulates light-dependent attachment of LEAF-TYPE FERREDOXIN-NADP+ OXIDOREDUCTASE to the thylakoid membrane in rice and Arabidopsis. The Plant Cell. 2016; 28: 712–728. pmid:26941088
  38. 38. Suorsa M, Sirpiö S, Allahverdiyeva Y, Paakkarinen V, Mamedov F, Styring S et al. PsbR, a missing link in the assembly of the oxygen-evolving complex of plant photosystem II. Journal of Biological Chemistry. 2006; 281: 145–150. pmid:16282331
  39. 39. Paddock T, Lima D, Mason ME, Apel K, Armstrong GA. Arabidopsis light-dependent protochlorophyllide oxidoreductase A (PORA) is essential for normal plant growth and development. Plant molecular biology. 2012; 78: 447–460. pmid:22278767
  40. 40. McGrath RB, Coruzzi GM. A gene network controlling glutamine and asparagine biosynthesis in plants. The Plant Journal. 1991; 1: 275–280. pmid:1688250
  41. 41. Ngai N, Coruzzi G: Dissecting light repression of the asparagine synthetase gene (AS1) in Arabidopsis. In: Cellular Integration of Signalling Pathways in Plant Development. Springer; 1998: 147–157.
  42. 42. Lea PJ, Ireland RJ. Nitrogen metabolism in higher plants. Plant amino acids: biochemistry and biotechnology Marcel Dekker, New York. 1999: 1–47.
  43. 43. Rose A, Last R. Molecular genetics of amino acid, nucleotide, and vitamin biosynthesis. COLD SPRING HARBOR MONOGRAPH SERIES. 1994; 27: 835–835.