De Novo Sequencing and Assembly Analysis of the Pseudostellaria heterophylla Transcriptome

  • Jun Li,

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Wei Zheng,

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Dengkai Long,

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Ling Ding,

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Anhui Gong,

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Chenghong Xiao,

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Weike Jiang,

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Xiaoqing Liu,

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Tao Zhou,

    taozhou88@163.com

    Affiliations Guiyang University of Chinese Medicine, Guiyang 550025, China, National Engineering Research Center of Miao’s Medicines, Guiyang 550025, China

  • Luqi Huang

    Affiliation State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medical, China Academy of Chinese Medical Sciences, Beijing 100700, China

Correction

6 Jan 2017: Li J, Zheng W, Long D, Ding L, Gong A, et al. (2017) Correction: De Novo Sequencing and Assembly Analysis of the Pseudostellaria heterophylla Transcriptome. PLOS ONE 12(1): e0170134. https://doi.org/10.1371/journal.pone.0170134

Abstract

Pseudostellaria heterophylla (Miq.) Pax is a mild tonic herb widely cultivated in the southern part of China. The tuberous roots of P. heterophylla accumulate high levels of medicinally valuable secondary metabolites such as saponins, flavonoids, and isoquinoline alkaloids. Despite numerous studies on the pharmacological importance and purification of these compounds in P. heterophylla, their biosynthesis is not well understood. In the present study, we used the Illumina HiSeq 4000 sequencing platform to sequence RNA from flowers, leaves, stem, root cortex and xylem tissues of P. heterophylla. We obtained 616,413,316 clean reads that we assembled into 127,334 unique sequences with an N50 length of 951 bp. Among these unigenes, 53,184 (41.76%) were annotated in a public database and 39,795 were assigned to 356 KEGG pathways; 23,714 unigenes (8.82%) had high homology with genes from Beta vulgaris. We discovered 32,095 DEGs in different tissues and performed GO and KEGG enrichment analysis. The most enriched KEGG pathways of secondary metabolism showed up-regulated expression in tuberous roots compared with the aboveground parts of P. heterophylla. Moreover, we identified 72 candidate genes involved in triterpenoid saponin biosynthesis in P. heterophylla. The expression profiles of 11 candidate unigenes were analyzed by quantitative real-time PCR (RT-qPCR). Our study established a global transcriptome database of P. heterophylla for gene identification and regulation. We also identified the candidate unigenes involved in triterpenoid saponin biosynthesis. Our results provide an invaluable resource for studying secondary metabolites and physiological processes in different tissues of P. heterophylla.

Introduction

Pseudostellaria heterophylla (Miq.) Pax, known as Hai Er Shen (HES) and false starwort, belongs to the Caryophyllaceae family. The Chung Yao Chi New Chinese Materia Medica has recorded the collection of HES plants since 1959 because of their local and ethnic use. P. heterophylla is distributed widely in the southern parts of China, including Fujian, Jiangsu, Anhui, Shandong, Shanxi, Zhejiang, Jiangxi, Hubei, Shanxi, and Guangzhou provinces. P. heterophylla is a mild tonic herb, weaker than Panax ginseng, and is popularly used in Traditional Chinese Medicine (TCM) products such as Jiangzhong Jianweixiaoshi Tablets and Composite Pseudostellaria granule. The mitogenic fraction (PH-I) from the hot water extract of P. heterophylla has potent anti-tumor activity against Ehrlich ascites tumor (EAT) cells in mice in vivo, but not in vitro, through the release of tumor necrosis factor (TNF) [1]. The ethyl acetate fraction extracted from the roots of P. heterophylla markedly reduced the number of coughs and prolonged the latent cough period in a rat model of stable-phase chronic obstructive pulmonary disease induced by cigarette smoke exposure [2].

Saponins in P. heterophylla (PHS) are primary bioactive compounds and consist of Pseudostellarinoside A and A-cutifolisde D, both of which are oleanyl-type saponins [3]. PHS extracts have significant anti-fatigue and anti-anoxia activities [4] and protect the cell membrane of H9c2 cells from oxidative injury by preventing increased oxidative stress [5]. The precursor for the biosynthesis of triterpenoid saponins is 2,3-oxidosqualene, which is synthesized via the MVA pathway [6]. Oxidosqualene cyclase (OSC) catalyzes the cyclization of 2,3-oxidosqualene to produce various triterpene skeletons. Some candidate genes involved in triterpene saponin biosynthesis have been isolated from P. quinquefolium [7], P. ginseng [8], and P. notoginseng [9], but none have been identified from P. heterophylla.

Tuberous roots or stems are the primary medicinal organs of TCM plants such as P. heterophylla [10], Fallopia multiflora [11], Panax notoginseng [12], and Salvia miltiorrhiza [13]. Chemical techniques have helped identify the secondary metabolites in these plants, including flavonoids, isoquinoline alkaloids, terpenoids, and phenylpropanoids [3]; however, there have been no molecular studies on the secondary metabolism pathways involved in their biosynthesis and degradation. Hua et al. [14] (2016) performed de novo sequencing and transcriptome analysis of P. heterophylla tuberous roots, but no transcriptomic or genomic information from the aboveground parts (leaf, stem, and flower) is available in the nucleotide databases of the National Center for Biotechnology Information (NCBI). Studying the molecular basis of traits related to saponin biosynthesis and secondary metabolism in P. heterophylla will facilitate its breeding and improvement. RNA-seq is a useful tool for studying the expressed transcripts in different tissues and stages [15]. In this study, we used the Illumina HiSeq 4000 sequencing platform to sequence the mRNA of P. heterophylla from various tissues (flowers, leaves, stem, root cortex and xylem). A global transcriptome database of P. heterophylla was constructed to identify the differentially expressed genes (DEGs) in different tissues and putative genes encoding the enzymes involved in the biosynthesis of triterpene saponins.

Methods

Plant materials and RNA extraction

P. heterophylla cultivar ‘Shitai 1’ was selected and grown in a commercial planting base in Sibing County, Guizhou Province, China. Five tissues (S1 Fig) were collected separately from three randomly selected individuals. After cleaning, all samples were cut into small pieces for RNA isolation, and partial materials were used for gene cloning and RT-qPCR. Total RNA was extracted following the instructions of the Transzol Plant RNA Extraction Kit (TransGen Biotech, Beijing, China). DNA contamination was removed using DNase I (Takara, Tokyo, Japan).

cDNA library preparation and transcriptome sequencing

The construction of the cDNA libraries and the RNA-seq were performed by Shanghai Majorbio Bio-pharm Technology Co., Ltd. (Shanghai, China). First, mRNA was purified separately from 12 μg of total RNA from each of the five tissues (flowers, leaves, stem, root cortex and xylem) using Oligo(dT) magnetic beads. Then, the mRNA was disrupted into small fragments (200 ± 25 bp), which were used for the second-strand cDNA synthesis. These cDNA fragments were ligated with the Illumina paired-end sequencing adaptors. Finally, these libraries were sequenced on a paired-end flow cell using the Illumina HiSeq 4000 platform. We obtained 5–8 GB of reads from each sample for de novo assembly.

De novo assembly and Gene annotation

Before assembly, the adaptors and unknown nucleotides in raw reads were filtered with SeqPrep (https://github.com/jstjohn/SeqPrep) and Sickle software (https://github.com/najoshi/sickle). Then the high-quality clean reads from 15 samples were used for de novo assembly with Trinity software [16] (http://trinityrnaseq.sourceforge.net/). Finally, the redundant Trinity-generated contigs were clustered and removed using TIGR Gene Indices Clustering Tools (TGICL) (http://www.tigr.org/tdb/tgi/software/).

ORF prediction was performed using the Markov model as described at http://trinityrnaseq.sourceforge.net/analysis/extract_proteins_from_trinity_transcripts.html. Then, the results were checked against the Pfam database (http://pfam.xfam.org/). All unigenes were annotated using BLASTx by sequence comparison with various protein databases [i.e., Nr, Swissprot, Cluster of Orthologous Groups of proteins (COG), Kyoto Encyclopedia of Genes and Genomes (KEGG)], with an e-value cutoff of 1e-5. Functional analysis of all unigenes was performed using Gene Ontology (GO). The Blast2GO program (https://www.blast2go.com/) was used to identify the GO terms for all assembled unigenes. Finally, we used the WEGO software (http://wego.genomics.org.cn/) to perform GO function classification and determine the distribution of gene functions in P. heterophylla at the macromolecular level.
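To illustrate how an e-value cutoff such as 1e-5 can be applied to BLAST results, the following minimal Python sketch keeps the lowest-e-value hit per unigene. It is not the pipeline used in this study: it assumes the standard 12-column tabular BLAST output (query ID in column 1, subject ID in column 2, e-value in column 11), treats hits at exactly the cutoff as passing, and the function name best_hits is hypothetical.

```python
from typing import Dict, Iterable, Tuple


def best_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float]]:
    """Return {query: (subject, evalue)} for the lowest-e-value hit per query.

    Assumes 12-column tab-separated BLAST output; hits with an e-value
    above max_evalue are discarded (assumption: the cutoff is inclusive).
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Made-up example rows (IDs and values are illustrative only).
rows = [
    "c1_g1\tP1\t90.0\t100\t5\t0\t1\t300\t1\t100\t1e-30\t200",
    "c1_g1\tP2\t95.0\t100\t2\t0\t1\t300\t1\t100\t1e-50\t300",
    "c2_g1\tP3\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.01\t30",
]
annotated = best_hits(rows)
```

In this toy input, c2_g1 has no hit below the cutoff and would therefore count as unannotated.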

Digital gene expression profiling

Gene expression profiling was performed using RSEM (http://deweylab.biostat.wisc.edu/rsem/). The reads per kb per million reads (RPKM) were used to normalize the expression levels for each gene in each tissue of P. heterophylla. The RPKM values of all isoforms of the same gene were summed to give the RPKM of that gene. Cluster 3.0 software (http://bonsai.hgc.jp/~mdehoon/software/cluster/) was used to normalize the expression levels of genes involved in triterpene saponin biosynthesis. Sample names are shown on the heat maps.
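The RPKM normalization and the per-gene summation described above can be sketched as follows. This is the textbook RPKM definition (reads × 10^9 / (transcript length in bp × total mapped reads)), not RSEM's internal model; the function names and the example values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def rpkm(read_count: float, length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def gene_rpkm(
    isoforms: Iterable[Tuple[str, float, int]], total_mapped_reads: int
) -> Dict[str, float]:
    """Sum isoform-level RPKM values into one value per gene.

    isoforms: (gene_id, read_count, length_bp) for each isoform.
    """
    totals: Dict[str, float] = defaultdict(float)
    for gene_id, read_count, length_bp in isoforms:
        totals[gene_id] += rpkm(read_count, length_bp, total_mapped_reads)
    return dict(totals)


# Made-up isoform table: two isoforms of c1_g1 and one of c2_g1.
example = [("c1_g1", 500, 1000), ("c1_g1", 250, 500), ("c2_g1", 100, 2000)]
per_gene = gene_rpkm(example, total_mapped_reads=10_000_000)
```

Dividing by transcript length before summing means that a long and a short isoform with the same read density contribute equally to the gene-level value.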

Identification of the unigenes involved in triterpene saponins

The amino acid sequences of enzymes involved in triterpene saponin biosynthesis were downloaded from NCBI and used to search the P. heterophylla transcriptomic database. Putative saponin biosynthesis genes in P. heterophylla were identified using the BlastP program with an e-value of 1e-10. The default hits were removed manually.

Real-Time PCR verification

Total RNA was extracted from different tissues of P. heterophylla and first-strand cDNA synthesis was performed using Reverse Transcriptase M-MLV (Takara, Japan). We used the ABI 7500 real-time PCR system (Life Technologies, Carlsbad, CA, USA) to determine expression by real-time PCR. All reactions were performed using SYBR® Premix Ex Taq™ II (Takara Biotechnology, China) according to the manufacturer's procedure, with ten-fold diluted cDNA as templates. Reactions were first incubated at 95°C for 30 s, followed by 40 cycles of amplification at 95°C for 5 s and 60°C for 34 s, and then by a final cycle at 95°C for 15 s, 60°C for 1 min and 95°C for 15 s. The raw data were analyzed using ABI 7500 software, and expression levels were normalized to the PhACT2 gene (gi: KT363848) to minimize the variation in cDNA template contents. The expression level was calculated using the 2^−ΔCt method. The experiments were performed in three individual biological replicates.
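The 2^−ΔCt calculation can be written out as a short sketch, where ΔCt is the Ct of the target gene minus the Ct of the reference gene (PhACT2 in this study). The Ct values below are invented for illustration, and the function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Iterable, Tuple


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """2^-dCt, with dCt = Ct(target) - Ct(reference)."""
    return 2.0 ** -(ct_target - ct_reference)


def summarize(replicates: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Mean and sample SD of 2^-dCt over (ct_target, ct_reference) pairs."""
    values = [relative_expression(t, r) for t, r in replicates]
    return mean(values), stdev(values)


# Three invented biological replicates: (Ct of target, Ct of reference).
avg, sd = summarize([(22.0, 20.0), (23.0, 20.0), (21.0, 20.0)])
```

A target that crosses the threshold five cycles later than the reference gives 2^−5, that is, about 3% of the reference level.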

Results

Illumina paired-end sequencing data and De novo assembly

We obtained 87 Gb of sequencing data including 645,961,688 raw reads and 616,413,316 clean reads with an average base error rate below 0.02%. A brief overview of the transcriptome assembly statistics is shown in Table 1. We used the Trinity program for the de novo assembly of the clean data because a P. heterophylla reference genome was not available. After removal of ambiguous reads and low-quality reads (quality score < 20), 127,334 unique sequences were obtained from the cDNA libraries constructed from P. heterophylla flowers, stem, leaves, and tuberous roots (Table 1). The Q20 percentage (sequencing error rate < 1%) and Q30 percentage were 98% and 93.81%, respectively. The average GC content of aboveground parts (leaves, stem, and flowers) and underground parts (cortex and xylem of tuberous roots) was 51.5% and 43.5%, respectively. The length of unigenes ranged from 201 to 82,236 bp, with an N50 length of 951 bp. A total of 48,860 coding sequences (CDSs) were obtained from all P. heterophylla unigene sequences, and 30,396 CDSs (62.21%) were longer than 1000 bp.
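For readers unfamiliar with the N50 statistic reported above, a minimal sketch of its usual definition follows: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The function name and the toy lengths are illustrative; this is not the script that produced Table 1.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


# Toy assembly: total is 1500 bp, and 500 + 400 = 900 bp is the first
# cumulative sum to reach half of it.
toy_n50 = n50([100, 200, 300, 400, 500])
```

An N50 of 951 bp therefore means that at least half of the assembled bases lie in unigenes of 951 bp or longer.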

Functional annotation

Gene annotation showed that only 52,937 unigenes (41.57%) had significant matches in public databases. The annotation rate was much lower than those of previous reports [17, 18]. About 74,150 unigenes (58.43%) had no matches to known genes, and these unaligned unigenes may represent specific genes and novel transcripts in P. heterophylla. Our results showed that 20,104 unigenes had high similarity (greater than 80%) to sequences in the Nr database and 20,497 unigenes (16.09%) had significant homology (e-value < 10^−30) (Fig 1A and 1B). The mapping rates of unigenes against the Pfam, Swissprot, KEGG, and String databases were 38.83%, 68.24%, 44.32% and 18.49%, respectively. The numbers of unigenes annotated in only one database were as follows: 101 unigenes in the Pfam database, 90 unigenes in the SwissProt database, 36 unigenes in the KEGG database and 11,213 unigenes in the Nr database (Table 2). Species distribution analysis showed that only 23,711 unigenes (18.62%) had high homology with genes from Beta vulgaris, followed by Vitis vinifera (1,182, 0.93%) and Theobroma cacao (380, 0.30%), while 17,126 unigenes had high homology with sequences from other organisms (Fig 1C).

Fig 1. Species distribution of unigenes from P. heterophylla.

a: Similarity distribution of top BLAST hits for each unigene; b: E-value distribution of BLAST hits with a cut off E-value of 1.0E−5; c: Species distribution for top BLAST hits in the Nr database.

https://doi.org/10.1371/journal.pone.0164235.g001

Unigenes showing high similarities with genes from Microbotryum violaceum (686 unigenes), Fusarium oxysporum (509 unigenes), Leptosphaeria maculans (321 unigenes), Pseudomonas fluorescens (272 unigenes), Rhodosporidium toruloides (255 unigenes) may belong to endophytes surviving in different parts of P. heterophylla [19]. Three unigenes from each species were validated by RT-PCR (S1 Table & S2 Fig).

Functional classification

We classified the functions of all unigenes using the Nr annotation and Gene Ontology (GO) classification (Fig 2, S2 Table). We assigned 28,210 unigenes to one or more gene ontology categories: 24,129 unigenes to molecular function, 15,544 unigenes to cellular component, and 23,751 unigenes to biological process. In the molecular function group, we found unigenes related to “catalytic activity” (15,220, 53.95%) and “binding” (14,909, 52.85%). For the cellular component category, “cell” (7,660, 45.78%), “cell part” (7,659, 45.77%), “organelle” (5,601, 33.47%), “membrane” (4,380, 26.18%), and “macromolecular complex” (3,485, 20.83%) represented the majority of unique sequences. In the biological process category, unigenes assigned to “metabolic process” (11,388, 68.06%), “cellular process” (10,343, 61.81%), and “single-organism process” (8,446, 50.47%) were the most abundant. A high percentage of genes were grouped into the “biological regulation” (3,217, 19.22%), “response to stimulus” (3,084, 18.43%), “regulation of biological process” (3,015, 18.02%), and “cellular component organization or biogenesis” (2,461, 14.71%) categories.

Fig 2. Gene Ontology classification of assembled unigenes.

The unigenes were categorized into three main categories: biological process, cellular component, and molecular function.

https://doi.org/10.1371/journal.pone.0164235.g002

The COG database was used for the function prediction and classification of all unigenes (Fig 3). In brief, 5,140 unigenes were grouped into 25 COG classifications. The largest group in the 25 COG categories was “translation, ribosomal structure and biogenesis” (803, 14.74%), followed by “general function prediction” (631, 11.58%), “signal transduction mechanisms” (565, 10.37%), and “posttranslational modification, protein turnover, chaperones” (544, 9.99%).

KEGG classification

All unigenes were compared against KEGG using BLASTx, with an e-value < 1e-10, to search for active biochemical pathways in P. heterophylla. We assigned 39,795 unigenes to 356 KEGG pathways (Fig 4). “Ribosome” had the largest number of unigenes (1,075 unigenes), followed by “protein processing in endoplasmic reticulum” (404 unigenes), “oxidative phosphorylation” (390 unigenes), “glycolysis/gluconeogenesis” (315 unigenes), “endocytosis” (309 unigenes), and “spliceosome” (287 unigenes). The metabolic pathways in our study were: “carbohydrate metabolism” (1,398 unigenes), “amino acid metabolism” (1,193 unigenes), “energy metabolism” (1,124 unigenes), “lipid metabolism” (653 unigenes), “metabolism of cofactors and vitamins” (425 unigenes), “metabolism of other amino acids” (345 unigenes), “nucleotide metabolism” (343 unigenes), “glycan biosynthesis and metabolism” (306 unigenes), “metabolism of terpenoids and polyketides” (299 unigenes), and “biosynthesis of secondary metabolites” (270 unigenes). KEGG genetic information processing included “folding, sorting and degradation” (914 unigenes), followed by “replication” (413 unigenes) and “transcription” (108 unigenes). In the environmental information processing category, the most abundant subcategories were “signal transduction” (1,215 unigenes), “signaling molecules and interaction” (251 unigenes), and “membrane transport” (233 unigenes) (S3 Table).

Fig 4. Pathway assignment based on the Kyoto Encyclopedia of Genes and Genomes (KEGG).

(A) Classification based on metabolism categories, (B) Classification based on genetic information processing categories, (C) Classification based on environmental information processing categories, (D) Classification based on cellular processes categories, and (E) Classification based on organismal systems categories.

https://doi.org/10.1371/journal.pone.0164235.g004

Differential Expression Analysis of P. heterophylla

We used our assembled data as a reference and compared the unigenes from different tissues of P. heterophylla (Fig 5A). A unigene was regarded as a differentially expressed gene (DEG) when FDR < 0.05 and |log2FC| ≥ 1. There were 32,095 DEGs between root cortex and xylem, of which 21,073 were down-regulated and 11,022 were up-regulated (Fig 5B). There were 30,070 DEGs between root cortex and leaf, of which 18,495 were down-regulated and 11,575 were up-regulated. Moreover, we identified 31,555 DEGs between root cortex and stem, 18,212 of which were down-regulated and 13,343 of which were up-regulated. Between root cortex and flower, 17,073 DEGs were down-regulated while 6,948 DEGs were up-regulated. Overall, we identified 2,289 common DEGs from the four comparison groups. Root cortex showed the highest number of up-regulated unigenes among all tissues.
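The DEG criterion stated above (FDR < 0.05 and |log2FC| ≥ 1) can be expressed as a small classification sketch. It assumes that log2 fold changes and FDR values have already been computed by a differential expression tool; the function names and the example values are hypothetical, and the sign convention (positive means up-regulated) is an assumption.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify_deg(
    log2_fc: float,
    fdr: float,
    fdr_cutoff: float = 0.05,
    min_abs_log2_fc: float = 1.0,
) -> str:
    """Label one unigene as 'up', 'down' or 'not_de'."""
    if fdr < fdr_cutoff and abs(log2_fc) >= min_abs_log2_fc:
        return "up" if log2_fc > 0 else "down"
    return "not_de"


def tally(results: Iterable[Tuple[float, float]]) -> Counter:
    """Count labels over (log2_fc, fdr) pairs for one pairwise comparison."""
    return Counter(classify_deg(fc, fdr) for fc, fdr in results)


# Invented values for four unigenes in one pairwise comparison.
counts = tally([(2.3, 0.001), (-1.0, 0.04), (0.4, 0.001), (3.1, 0.20)])
```

Both conditions must hold: the third unigene fails on fold change despite a very low FDR, and the fourth fails on FDR despite a large fold change.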

Fig 5. Venn diagrams of unigenes of three libraries and statistical analysis of the differentially expressed genes (DEGs).

(A) Distribution of the unigenes of the three libraries; (B) The red columns indicate the up-regulated DEGs and the green columns represent the down-regulated DEGs in three pair-wise comparisons (FDR ≤ 0.001 and an absolute value of log2 ratio ≥ 1 were used as the significance threshold for DEGs).

https://doi.org/10.1371/journal.pone.0164235.g005

GO enrichment analysis and KEGG enrichment analysis of DEGs in P. heterophylla

The GO and KEGG enrichment analyses elucidated the functional differences of DEGs from different P. heterophylla samples (S3 Fig). In the GO enrichment analysis, a function was regarded as enriched if its corrected p-value was below 0.05. The results showed that the unigenes involved in “response to fungus”, “oligosaccharide metabolic process”, “defense response to other organism”, “chloroplast envelope”, “hydrolase activity, hydrolyzing O-glycosyl compounds”, and “sucrose metabolic process” were enriched between root cortex and flower (S4 Fig). Highly enriched DEGs were involved in “response to auxin”, “root development”, “plastid thylakoid”, “chloroplast thylakoid”, and “chloroplast stroma” between root cortex and leaf (S5 Fig). The DEGs involved in “pollen development”, “gametophyte development”, “response to auxin”, “response to external stimulus”, and “thylakoid” were enriched between root cortex and stem (S6 Fig). Other highly enriched genes were related to “oxidation-reduction process”, “naringenin-chalcone synthase activity”, “flavonoid metabolic process”, and “protein disulfide oxidoreductase activity” between root cortex and xylem (S7 Fig). Moreover, we also analyzed 31 response categories related to DEGs using a heat map based on the total RPKM values of all the DEGs in each pathway (Fig 6). Most of these categories were up-regulated in underground parts (root cortex and xylem), including “response to biotic stimulus”, “response to insect”, “response to carbohydrate”, “response to endogenous stimulus”, “response to fungus”, “response to bacterium” and “response to wounding”. The categories activated specifically in leaf were “response to cytokinin”, “response to jasmonic acid”, “response to light stimulus”, and “response to cold”. Our results showed that 11 out of 31 response pathways had up-regulated expression in both leaf and stem. These included “response to salt stress”, “response to brassinosteroid”, “response to auxin”, “response to water deprivation”, and “response to gibberellin”.
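The text does not state which multiple-testing correction produced the corrected p-values. As one common possibility only, the Benjamini-Hochberg procedure is sketched below; its use here is an assumption, not a description of the method applied in this study, and the input p-values are invented.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
enriched = [q < 0.05 for q in corrected]
```

Under this assumed procedure, a GO term would be called enriched when its adjusted value falls below 0.05, matching the threshold given in the text.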

Fig 6. GO annotation of DEGs.

The heat map shows 31 categories of DEGs in different tissues, including leaf, stem, flower, root cortex and xylem. Different colors indicate different expression levels. Green indicates down-regulated expression and red represents up-regulated expression. The heat map of all genes involved in each category was constructed using the log10 values of RPKM. 1_G_M, 3_G_M, 4_G_M represent root xylem; 1_G_P, 3_G_P, 4_G_P represent root cortex; 1_G_YD, 3_G_YD, 4_G_YD represent leaf; 1_G_J, 3_G_J, 4_G_J represent stem; 1_G_H, 3_G_H, 4_G_H represent flower from three individual plants.

https://doi.org/10.1371/journal.pone.0164235.g006

To study the DEGs further, the KEGG database was used to search for significantly enriched biochemical pathways. Between root cortex and flower, the most significantly enriched pathway was “plant hormone signal transduction”, which contained down-regulated DEGs in aboveground parts. Most of the DEGs that were involved in “plant-pathogen interaction”, “starch and sucrose metabolism”, “phenylpropanoid biosynthesis”, “alpha-Linolenic acid metabolism”, “circadian rhythm–plant”, “glycosylphosphatidylinositol (GPI)-anchor biosynthesis”, and “N-Glycan biosynthesis” were down-regulated. On the other hand, the DEGs involved in “diterpenoid biosynthesis”, “isoquinoline alkaloid biosynthesis”, “monoterpenoid biosynthesis”, “stilbenoid, diarylheptanoid and gingerol biosynthesis”, “ubiquinone and other terpenoid-quinone biosynthesis”, and “zeatin biosynthesis” were up-regulated (S8 Fig). We observed similar results when each underground part (root cortex and xylem) was compared with each aboveground part (leaf, stem and flower) of P. heterophylla (S9–S11 Figs).

We further used a heat map to analyze 14 KEGG pathways involved in the biosynthesis of secondary metabolites in different tissues (Fig 7). Our analysis showed that 6 out of 14 pathways were up-regulated in underground parts (root cortex and xylem), including “monoterpenoid biosynthesis”, “zeatin biosynthesis”, “tropane, piperidine and pyridine alkaloid biosynthesis”, “sesquiterpenoid and triterpenoid biosynthesis”, “ubiquinone and other terpenoid-quinone biosynthesis” and “isoquinoline alkaloid biosynthesis.” These results explain why the tuberous root, including root cortex and xylem, is the principal medicinal part of P. heterophylla.

Fig 7. KEGG annotation of DEGs.

The heat map shows 31 pathways of secondary metabolism in different tissues, including leaf, stem, flower, root cortex and xylem. Expression differences are shown in different colors. Red represents high expression and green represents low expression. 1_G_M, 3_G_M, 4_G_M indicate root xylem; 1_G_P, 3_G_P, 4_G_P indicate root cortex; 1_G_YD, 3_G_YD, 4_G_YD indicate leaf; 1_G_J, 3_G_J, 4_G_J indicate stem; 1_G_H, 3_G_H, 4_G_H indicate flower from three individual plants.

https://doi.org/10.1371/journal.pone.0164235.g007

Identification of genes involved in triterpenoids saponins biosynthesis of P. heterophylla

We identified 70 candidate genes in P. heterophylla, including AACT (acetyl-CoA acetyltransferase), HMGS (HMG-CoA synthase), HMGR (HMG-CoA reductase), MVK (mevalonate kinase), PMK (phosphomevalonate kinase), MVD (mevalonate diphosphate decarboxylase), GGPS (geranylgeranyl pyrophosphate synthase), FPS (farnesyl diphosphate synthase), IDI (isopentenyl diphosphate isomerase), SS (squalene synthase), SE (squalene epoxidase), LuS (lupeol synthase), and β-A28O (β-amyrin 28-oxidase) (S4 Table). Oxidosqualene cyclase, which cyclizes 2,3-oxidosqualene, is the key enzyme at the first committed step, and the skeleton of triterpenoid saponins in plants depends on its activity. Notably, three unigenes (c24484_g1, c60124_g1, c27529_g1) encoding lupeol synthase were identified from our transcriptome data, but none encoding β-amyrin synthase or dammarenediol-II synthase were identified.

The heat map showed that most unigenes encoding AACT, HMGS, MK, PMK, MDD and IDI had high expression levels in flowers, leaves, stem, root cortex, and xylem (Fig 8). Some members of the HMGR, SE and β-A28O gene families were up-regulated in the root cortex and xylem while others were down-regulated. The unigenes encoding GGPS (c12012_g1, c99329_g1) and IDI (c1497_g1) were up-regulated specifically in leaf and stem. Some of the investigated genes showed high expression levels in the root cortex and xylem, such as FPS (c51143_g1, c54472_g1), SS (c65449_g2, c65449_g4, c66040_g4) and LuS (c60124_g1). The identification of genes involved in triterpenoid saponin biosynthesis may help explain the accumulation of saponins in different tissues of P. heterophylla. We validated the expression levels of 11 randomly selected genes using real-time PCR. The expression profiles of these unigenes were consistent with the transcriptomic data (Fig 9). Gene-specific primers were designed based on the gene sequences and are shown in S5 Table.

Fig 8. The expression profiles of unigenes involved in triterpene saponin biosynthesis of P. heterophylla.

Expression differences are shown in different colors. Red represents high expression and green represents the low expression.

https://doi.org/10.1371/journal.pone.0164235.g008

Fig 9. The expression validation of candidate genes in triterpene saponin biosynthesis of P. heterophylla by qRT-PCR.

Error bars represent the mean (± SD) of three individual biologic experiments.

https://doi.org/10.1371/journal.pone.0164235.g009

Discussion

High-throughput transcriptome sequencing has become a popular tool for sequencing non-model organisms such as Ginkgo kernels [20], Rehmannia glutinosa [21], Gossypium hirsutum [22], Liriodendron chinense [23], Ramia [24], and Centella asiatica [25]. We used the Illumina HiSeq 4000 sequencing platform to sequence RNA from flowers, leaves, stem, root cortex, and xylem of P. heterophylla. The number of unigenes (127,334) identified in our study was much higher than that in the previous transcriptomic study of Pseudostellariae radix [14]. Our data provide a useful resource for gene identification and regulation in different tissues of P. heterophylla.

Our transcriptomic data identified unigenes related to five endophytes. Three of these endophytes (M. violaceum, L. maculans, and P. fluorescens) are harmful to the development of plant organs [26–28]. F. oxysporum is an important replant disease pathogen in Pseudostellaria heterophylla rhizospheric soil [29] and has also been isolated from Chamaecyparis lawsoniana [30], Quercus variabilis [31] and Ephedra fasciculata [32]. Some active chemicals were previously purified from F. oxysporum, such as oxysporidinone (a pyridine, antifungal) and beauvericin (a cyclic peptide, anti-cancer) [33]. R. toruloides is an oleaginous yeast used for lipid production [34]. The transcriptome data and reverse transcription PCR indicated that transcripts of unigenes from M. violaceum, P. fluorescens and R. toruloides were detected in aboveground parts (leaf, stem and flower), whereas transcripts of unigenes from R. toruloides and L. maculans were detected in underground parts (root cortex and xylem). These results suggest that endophytes may participate in the interaction between plants and microorganisms and thus provide a novel guideline for the planting of P. heterophylla.

The transcriptomic data from different tissues showed that most DEGs were up-regulated either in aboveground parts (leaf, stem, and flower) or in underground parts (root cortex and xylem), while a few DEGs showed tissue-specific expression. The tuberous roots of sweet potato, cassava, and dahlia store nutrients, which permits survival from one year to the next. The formation of an enlarged area and secondary metabolic biosynthesis in the tuberous root are influenced by environmental factors including fungi, bacteria, and wounding [35, 36]. In this study, the pathways responding to these factors were up-regulated in both root cortex and xylem. The pathways related to the response to cytokinin, jasmonic acid, light stimulus, and cold were specifically activated in the leaves. These results provide a better understanding of gene expression and regulation in different tissues of P. heterophylla.

The unigenes involved in triterpenoid saponin biosynthesis of P. heterophylla were identified. The cyclization of 2,3-oxidosqualene, catalyzed by 2,3-oxidosqualene cyclases (OSCs), is the first committed step in triterpenoid saponin biosynthesis and gives rise to the various triterpene skeletons [37]. Although the structure of saponins in P. heterophylla is similar to that of saponins in P. vietnamensis and P. notoginseng, we did not identify any unigenes encoding β-amyrin synthase or dammarenediol-II synthase. The OSCs in plants comprise four genes encoding β-amyrin synthase, dammarenediol-II synthase, lupeol synthase and cycloartenol synthase, respectively. Because of their high similarity, these pentacyclic triterpene synthases may have evolved in a complicated order from a common progenitor in triterpenoid saponin biosynthesis and sterol biosynthesis [38]. The in vitro activities of OSCs were analyzed by expressing them in Saccharomyces cerevisiae; strains carrying OSC2 accumulated α-, β-, and δ-amyrin, and strains carrying LuS accumulated α-amyrin and lupeol [39]. These findings suggest that the cyclization of 2,3-oxidosqualene in triterpenoid saponin biosynthesis of P. heterophylla may rely mainly on the activity of lupeol synthase. Moreover, the discovery of β-amyrin synthase will require more precise sequencing technology in the future.

Our qRT-PCR results and transcriptome data showed that two unigenes (c65449_g2, c65449_g4) encoding squalene synthase and two (c59462_g1, c55401_g1) encoding squalene epoxidase in triterpenoid saponin biosynthesis were up-regulated in both root cortex and xylem. Unigenes encoding GGPS (c99329_g1), IDI (c1497_g1), and MDD (c53051_g1) showed high expression in both above-ground parts (leaf and stem) and underground parts (root cortex and xylem). Triterpene saponins can be extracted from both the underground parts (tuberous root) and the aerial parts (leaf and stem) of P. heterophylla [40]; however, these triterpene saponins may also accumulate in specific tissues. Our study provides valuable information about pathways for the synthesis of triterpenoid saponins. Future studies involving isolation of key enzyme genes (OSCs) and their functional analysis are imperative for a complete understanding of the triterpenoid biosynthetic pathways.

Supporting Information

S2 Fig. Reverse transcription PCR determination of unigenes from endophytes.

https://doi.org/10.1371/journal.pone.0164235.s002

(PDF)

S3 Fig. Heatmap_Cortex_Flower_Leaf_Stem_Xylem.

https://doi.org/10.1371/journal.pone.0164235.s003

(PDF)

S4 Fig. Cortex_vs_Flower.DE.list.enrichment.detail.xls.go.

https://doi.org/10.1371/journal.pone.0164235.s004

(PDF)

S5 Fig. Cortex_vs_Leaf.DE.list.enrichment.detail.xls.go.

https://doi.org/10.1371/journal.pone.0164235.s005

(PDF)

S6 Fig. Cortex_vs_Stem.DE.list.enrichment.detail.xls.go.

https://doi.org/10.1371/journal.pone.0164235.s006

(PDF)

S7 Fig. Cortex_vs_Xylem.DE.list.enrichment.detail.xls.go.

https://doi.org/10.1371/journal.pone.0164235.s007

(PDF)

S1 Table. Primers of unigenes from endophytes used in reverse transcription PCR.

https://doi.org/10.1371/journal.pone.0164235.s012

(XLSX)

S4 Table. Candidate genes involved in triterpenoid saponin biosynthesis of P. heterophylla.

https://doi.org/10.1371/journal.pone.0164235.s015

(XLSX)

S5 Table. List of gene-specific primers used in real-time PCR.

https://doi.org/10.1371/journal.pone.0164235.s016

(XLSX)

Author Contributions

  1. Conceptualization: JL TZ.
  2. Data curation: JL LD WZ.
  3. Formal analysis: DL XL.
  4. Funding acquisition: TZ JL.
  5. Investigation: LD WZ.
  6. Methodology: JL TZ LD WZ.
  7. Project administration: TZ LH.
  8. Resources: TZ WJ LH.
  9. Software: DL LD.
  10. Validation: JL AG CX.
  11. Visualization: LD WZ DL.
  12. Writing – original draft: JL TZ.
  13. Writing – review & editing: JL TZ.

References

  1. Wong CK, Leung KN, Fung KP, Pang PK, Choy YM. Mitogenic and tumor-necrosis-factor producing activities of Pseudostellaria-Heterophylla. Int J Immunopharmacol. 1992; 14(8): 1315–1320. pmid:1464465
  2. Pang W, Lin S, Dai Q, Zhang H, Hu J. Antitussive Activity of Pseudostellaria heterophylla (Miq.) Pax extracts and improvement in lung function via adjustment of multi-cytokine levels. Molecules. 2011; 16(4): 3360–3370. pmid:21512444
  3. Wang ZX, Xu SX, Zhang GG, Qiu F, Su XF, Zhang XQ, et al. Studies on the chemical constituents of pseudotellaria heterophylla (miq.) pax ex. Chinese J. Med. Chem. 1992; 2: 65–67.
  4. Liu XH, Cheng B, Wang YX. Preliminary studies on pharmacological action of total saponins from radix pseudostellariae. Jiangsu Pharm. Clin. Res. 2000; 8: 6–8.
  5. Wang Z, Liao SG, He Y, Li J, Zhong RF, He X, et al. Protective effects of fractions from Pseudostellaria heterophylla against cobalt chloride-induced hypoxic injury in H9c2 cell. J Ethnopharmacol. 2013; 147(2): 540–5. pmid:23542142
  6. Haralampidis K, Trojanowska M, Osbourn AE. Biosynthesis of triterpenoid saponins in plants. Adv. Biochem. Eng. Biotechnol. 2002; 75: 31–49. pmid:11783842
  7. Wu Q, Song J, Sun Y, Suo F, Li C, Luo H, et al. Transcript profiles of Panax quinquefolius from flower, leaf and root bring new insights into genes related to ginsenosides biosynthesis and transcriptional regulation. Physiol Plant. 2010; 138(2): 134–49. pmid:19947964
  8. Li C, Zhu Y, Guo X, Sun C, Luo H, Song J, et al. Transcriptome analysis reveals ginsenosides biosynthetic genes, microRNAs and simple sequence repeats in Panax ginseng C. A. Meyer. BMC Genomics. 2013; 14: 245. pmid:23577925
  9. Liu MH, Yang BR, Cheung WF, Yang KY, Zhou HF, Kwok JS, et al. Transcriptome analysis of leaves, roots and flowers of Panax notoginseng identifies genes involved in ginsenoside and alkaloid biosynthesis. BMC Genomics. 2015; 16: 265. pmid:25886736
  10. Reinecke MG, Zhao YY. Phytochemical studies of the chinese herb Tai-Zi-Shen, Pseudostellaria-Heterophylla. J. Nat. Prod. 1988; 51: 1236–1240.
  11. Zheng CJ, Zhao SJ, Zhao ZH, Guo J. Molecular authentication of the traditional medicinal plant Fallopia multiflora. Planta Med. 2009; 75: 870–872. pmid:19242903
  12. Yoshikawa M, Murakami T, Ueno T, Yashiro K, Hirokawa N, Murakami N, et al. Bioactive saponins and glycosides. VIII. Notoginseng (1): new dammarane-type triterpene oligoglycosides, notoginsenosides-A, -B, -C, and -D, from the dried root of Panax notoginseng (Burk.) F.H. Chen. Chem. Pharm. Bull. (Tokyo). 1997; 45: 1039–1045.
  13. Hatfield MJ, Tsurkan LG, Hyatt JL, Edwards CC, Lemoff A, Jeffries C, et al. Modulation of esterified drug metabolism by tanshinones from Salvia miltiorrhiza ("Danshen"). J Nat Prod. 2013; 76(1): 36–44. pmid:23286284
  14. Hua Y, Wang S, Liu Z, Liu X, Zou L, Gu W, et al. Transcriptomic analysis of Pseudostellariae Radix from different fields using RNA-seq. Gene. 2016; 588(1): 7–18. pmid:27125225
  15. De Donato M, Peters SO, Mitchell SE, Hussain T, Imumorin IG. Genotyping-by-sequencing (GBS): a novel, efficient and cost-effective genotyping method for cattle using next-generation sequencing. PLoS One. 2013; 8(5): e62137. pmid:23690931
  16. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011; 29(7): 644–652. pmid:21572440
  17. Zhang GH, Ma CH, Zhang JJ, Chen JW, Tang QY, He MH, et al. Transcriptome analysis of Panax vietnamensis var. fuscidicus discovers putative ocotillol-type ginsenosides biosynthesis genes and genetic markers. BMC Genomics. 2015; 16: 159. pmid:25765814
  18. Muriira NG, Xu W, Muchugi A, Xu J, Liu A. De novo sequencing and assembly analysis of transcriptome in the Sodom apple (Calotropis gigantea). BMC Genomics. 2015; 16: 723. pmid:26395839
  19. Cai QY, Li S, Xiao GQ, Zeng PY, Huang ZL, Wu JZ. Study on Bioactivity of Endophytic Fungi from Psudostellaria heterophylla (Miqi) Pax ex Pax et Hoffm. J. Fujian University TCM. 2011; 21: 41–46.
  20. He B, Gu Y, Xu M, Wang J, Cao F, Xu LA. Transcriptome analysis of Ginkgo biloba kernels. Front Plant Sci. 2015; 6: 819. pmid:26500663
  21. Sun P, Xiao X, Duan L, Guo Y, Qi J, Liao D, et al. Dynamic transcriptional profiling provides insights into tuberous root development in Rehmannia glutinosa. Front Plant Sci. 2015; 6: 396. pmid:26113849
  22. Ma Q, Wu M, Pei W, Wang X, Zhai H, Wang W, et al. RNA-seq-mediated transcriptome analysis of a fiberless mutant cotton and its possible origin based on snp markers. PLoS One. 2016; 11(3): e0151994. pmid:26990639
  23. Yang Y, Xu M, Luo Q, Wang J, Li H. De novo transcriptome analysis of Liriodendron chinense petals and leaves by Illumina sequencing. Gene. 2014; 534(2): 155–62. pmid:24239772
  24. Guillén Y, Rius N, Delprat A, Williford A, Muyas F, Puig M, et al. Genomics of ecological adaptation in cactophilic Drosophila. Genome Biol Evol. 2014; 7(1): 349–66. pmid:25552534
  25. Sangwan RS, Tripathi S, Singh J, Narnoliya LK, Sangwan NS. De novo sequencing and assembly of Centella asiatica leaf transcriptome for mapping of structural, functional and regulatory genes with special reference to secondary metabolism. Gene. 2013; 525(1): 58–76. pmid:23644021
  26. Bucheli E, Gautschi B, Shykoff JA. Differences in population structure of the anther smut fungus Microbotryum violaceum on two closely related host species, Silene latifolia and S. dioica. Mol Ecol. 2001; 10(2): 285–94. pmid:11298945
  27. Bohman S, Staal J, Thomma BP, Wang M, Dixelius C. Characterisation of an Arabidopsis-Leptosphaeria maculans pathosystem: resistance partially requires camalexin biosynthesis and is independent of salicylic acid, ethylene and jasmonic acid signalling. Plant J. 2004; 37(1): 9–20. pmid:14675428
  28. Rainey PB. Adaptation of Pseudomonas fluorescens to the plant rhizosphere. Environ Microbiol. 1999; 1(3): 243–57. pmid:11207743
  29. Zhao Y, Wu L, Chu L, Yang Y, Li Z, Azeem S, et al. Interaction of Pseudostellaria heterophylla with Fusarium oxysporum f.sp. heterophylla mediated by its root exudates in a consecutive monoculture system. Sci Rep. 2015; 5: 8197. pmid:25645742
  30. Liu J, Cao D, Zhao K, Ping WX, Zhou DP. Separating endophytes from port-orford-cedar. J. Sci. Teachers' college and university. 2008; 28: 62–64.
  31. Musavi SF, Balakrishnan RM. Biodiversity, antimicrobial potential and phylogenetic placement of an endophytic Fusarium oxysporum nfx 06 isolated from Nothapodytes foetida. J. Mycol. 2013; 12: 1–10.
  32. Breinhold J, Ludvigsen S, Rassing BR, Rosendahl CN, Nielsen SE, Olsen CE. Oxysporidinone: a novel, antifungal N-methyl-4-hydroxy-2-pyridone from Fusarium oxysporum. J. Nat. Prod. 1997; 60: 33–35. pmid:9014349
  33. Lee HS, Kim KA, Seo DG, Lee C. Effects of 14C-labelled precursor feeding on production of beauvericin, enniatins H, I, and MK1688 by Fusarium oxysporum KFCC11363P. J. Biosci. Bioeng. 2012; 113: 58–62. pmid:22019403
  34. Wiebe MG, Koivuranta K, Penttilä M, Ruohonen L. Lipid production in batch and fed-batch cultures of Rhodosporidium toruloides from 5 and 6 carbon carbohydrates. BMC Biotechnol. 2012; 12: 26. pmid:22646156
  35. Washio K, Ishikawa K. Organ-specific and hormone-dependent expression of genes for serine carboxypeptidases during development and following germination of rice grains. Plant Physiol. 1994; 105(4): 1275–80. pmid:7972496
  36. Valcu CM, Junqueira M, Shevchenko A, Schlink K. Comparative proteomic analysis of responses to pathogen infection and wounding in Fagus sylvatica. J. Proteome Res. 2009; 8: 4077–4091. pmid:19575529
  37. Sawai S, Saito K. Triterpenoid biosynthesis and engineering in plants. Front. Plant Sci. 2011; 2(25):25–25. pmid:22639586
  38. Hayashi H, Huang P, Kirakosyan A, Inoue K, Hiraoka N, Ikeshiro Y, et al. Cloning and characterization of a cDNA encoding beta-amyrin synthase involved in glycyrrhizin and soyasaponin biosyntheses in licorice. Biol. Pharm. Bull. 2001; 24: 912–916. pmid:11510484
  39. Moses T, Pollier J, Shen Q, Soetaert S, Reed J, Erffelinck ML, et al. OSC2 and CYP716A14v2 catalyze the biosynthesis of triterpenoids for the cuticle of aerial organs of Artemisia annua. Plant Cell. 2015; 27(1): 286–301. pmid:25576188
  40. Zeng LN, Yuan YH, Huang SY, Fu CJ, Zheng YS, Rong JD, et al. A comparision of effective medicinal ingredients content of Pseudostellaria heterophylla in different harvest times and medicinal parts. Fujian Linye. 2014; 2: 25–26.