Integrated analysis of the transcriptome and metabolome of purple and green leaves of Tetrastigma hemsleyanum reveals gene expression patterns involved in anthocyanin biosynthesis

To gain better insight into the regulatory networks of anthocyanin biosynthesis, an integrated analysis of the metabolome and transcriptome in purple and green leaves of Tetrastigma hemsleyanum was conducted. Transcript and metabolite profiles were obtained by RNA-sequencing data analysis and LC-ESI-MS/MS, respectively. In total, 209 metabolites were differentially accumulated and 4211 transcripts were differentially expressed between purple and green leaves. Correlation tests of anthocyanin contents and transcriptional changes showed 141 significant correlations (Pearson correlation coefficient >0.8) between 16 compounds and 14 transcripts involved in the anthocyanin biosynthesis pathway. Some novel genes and metabolites were discovered as potential candidate targets for the improvement of anthocyanin content and the breeding of superior cultivars.


Introduction
Tetrastigma hemsleyanum Diels et Gilg (T. hemsleyanum, belonging to the family Vitaceae), also known as "Sanyeqing" (SYQ), is distributed in tropical to subtropical areas in Asia, mainly in southern China, such as Zhejiang, Guizhou and Guangxi provinces. The entire herb and its root tubers have long been used in China as a broad-spectrum antibiotic material for the treatment of fever and sore throat. Previous findings have demonstrated that the principal functional components of SYQ, such as polysaccharides and polyphenols, are beneficial for health [1,2].
Flavonoids such as anthocyanins and flavanols are the most dominant and important components in SYQ [3,4]. These compounds are beneficial for human health and are an important group of pigments that colour the leaves and flowers of many plants [5]. Anthocyanins are also involved in biotic and abiotic stress defence responses [6,7]. It is vital to elucidate the flavonoid biosynthesis and regulatory pathways in SYQ. Although several genes in the anthocyanin regulation pathway have been identified [8], the unique regulation mechanism in SYQ remains unclear.
In the past decade, RNA sequencing has rapidly become an efficient approach to analyse the function of genes in a high-throughput way [11]. With this approach, transcripts with low abundance and previously unknown transcripts can be identified [12]. Liquid chromatography/mass spectrometry (LC/MS) has advanced metabolomics by enabling the discovery of a large number of compounds compared with traditional chemical analysis [13,14]. Recently, the integrated analysis of metabolite and gene expression data has been widely used to explore the networks and correlations between metabolites and genes [15][16][17].
In this study, we explored the regulatory networks of flavonoid and anthocyanin biosynthesis in green (RG) and purple leaves (PL) of SYQ at the transcriptome and metabolome levels. The accumulation of different types of anthocyanin and the expression of related genes were investigated. A connection network was constructed to highlight the regulatory genes associated with specific metabolites. Our findings provide insights into the accumulation mechanism of SYQ leaf colour pigments and the regulation of anthocyanin biosynthesis.

Plant materials and growth conditions
The green (RG) and purple leaf (PL) genotypes (S1 Fig) of Tetrastigma hemsleyanum (Sanyeqing) were collected from Guangxi and Hunan provinces, respectively. They were grown in the plant garden of Hangzhou Academy of Agricultural Sciences (Hangzhou, Zhejiang Province, China). Leaves at the third node away from the top were collected, frozen in liquid nitrogen and stored at -80˚C for RNA and metabolite isolation.

RNA isolation and transcriptome sequencing
Total RNA was isolated with TRIzol reagent (Invitrogen, USA) according to the manufacturer's protocol. The transcriptome sequence library was constructed using NEBNext Ultra RNA Library Prep Kits for Illumina (NEB, USA). Sequencing was performed on an Illumina HiSeq 2500 platform (Novogene, China). After reads containing adapters or poly-N and low-quality reads were removed from the raw data, the remaining reads were aligned to the genome using TopHat (2.0.9) software. The reads per kilobase per million mapped reads (RPKM) value of each gene was calculated based on the length of the gene and the count of reads mapped to this gene. GO annotation was implemented using Blast2GO software. KOBAS (2.0) software was used for KEGG enrichment analysis of differentially expressed genes.
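The RPKM normalisation mentioned above can be illustrated with a minimal sketch. It assumes the standard definition of RPKM (read count scaled by gene length in kilobases and by millions of mapped reads); the gene names, counts, lengths and library size below are hypothetical and are not taken from this study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads.

    Standard definition: count / (length_kb * mapped_reads_in_millions),
    which simplifies to count * 1e9 / (length_bp * mapped_reads).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical inputs for illustration only.
counts = {"gene_a": 600, "gene_b": 150}       # reads mapped to each gene
lengths = {"gene_a": 2000, "gene_b": 500}     # gene lengths in bp
total_mapped = 30_000_000                     # mapped reads in the library

values = {gene: rpkm(counts[gene], lengths[gene], total_mapped) for gene in counts}
```

In this toy example the two genes differ fourfold in raw counts but also fourfold in length, so they receive the same RPKM, which is the purpose of the length correction.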

Metabolic profiling
The samples were freeze-dried and crushed into powder for metabolite isolation. One hundred milligrams of each sample was extracted with 1.0 ml of 70% aqueous methanol at 4˚C overnight.

Verification of candidate genes by quantitative real-time PCR (qRT-PCR)
The expression levels of transcripts involved in anthocyanin biosynthesis pathways were validated by qRT-PCR using the same RNA samples used for sequencing. The primers were designed using Beacon Designer 7.8 software, and cDNAs were synthesized using SuperScript™ III First-Strand Synthesis SuperMix for qRT-PCR (Invitrogen). Power SYBR® Green PCR Master Mix (Applied Biosystems) was used for amplification and detection of the PCR products on a CFX384 Real-time PCR system (Bio-Rad). Three replicates were performed for each sample, and the primers used are listed in S1 Table.

Metabolic differences in green leaves and purple leaves
To compare the metabolite composition in the two accessions, LC-MS was used for metabolite identification. In total, 597 metabolites with known structures were detected in RG and PL. Differentially accumulated metabolites in PL were defined as those with a variable importance for projection (VIP) value >1.5 compared to RG (Fig 1). Among the detected metabolites, 111 metabolites were upregulated and 98 metabolites were downregulated. The top 10 most differentially accumulated metabolites are listed in Table 1. All differentially accumulated metabolites were subjected to KEGG analysis (Fig 2). Metabolites participating in flavonoid biosynthesis, flavone and flavanol biosynthesis, tryptophan metabolism and biosynthesis of alkaloids derived from the shikimate pathway were predominantly enriched.
Principal component analysis (PCA) was employed to identify the differences in metabolite profiles among samples (Fig 3). The results showed that the first principal component (PC1, explaining 52.98% of the total variance) clearly separated the RG and PL samples, indicating that the accumulation patterns of metabolites were different in green and purple leaves.
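How a single principal component can separate two groups, and how its share of the total variance is obtained, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this study: the software and scaling used for the PCA are not specified here, so the sketch assumes mean-centring only and uses a simple power iteration instead of a statistics package. The sample matrix is hypothetical (three RG-like and three PL-like rows, three metabolites).

```python
import math


def first_principal_component(samples, iterations=200):
    """Return (fraction of variance explained by PC1, PC1 scores).

    samples: list of rows (one per sample), each a list of metabolite levels.
    Assumption: data are mean-centred only (no unit-variance scaling).
    """
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in samples]

    # Sample covariance matrix (p x p).
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[i][i] for i in range(p))

    # Power iteration for the leading eigenvector; the non-uniform start
    # reduces the chance of beginning orthogonal to that eigenvector.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]

    # Rayleigh quotient gives the leading eigenvalue (variance along PC1).
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centred]
    return eigenvalue / total_variance, scores


# Hypothetical metabolite matrix: rows 0-2 mimic RG, rows 3-5 mimic PL.
toy = [
    [10.0, 2.0, 5.0], [11.0, 2.2, 5.1], [9.5, 1.9, 4.9],
    [3.0, 8.0, 5.0], [2.5, 8.5, 5.2], [3.2, 7.8, 4.8],
]
explained, scores = first_principal_component(toy)
rg_scores, pl_scores = scores[:3], scores[3:]
```

When the between-group difference dominates the data, as in this toy matrix, PC1 carries most of the variance and the two groups fall on opposite sides of zero along it.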

Transcriptome analysis in green leaves and purple leaves
After removing the low-quality reads, approximately 30,000,000 reads of each sample were generated. De novo transcriptome assembly using Trinity built 205,387 transcripts and 100,540 unigenes for the samples, with an average transcript length of 1151 bp. The unigenes were mapped to the NR database and then applied to the Pfam database for annotation.
Using a false discovery rate (FDR) <0.01 and |log2FC| >1 as threshold values, 4211 genes were found to be differentially expressed in PL vs. RG. Among them, 2035 genes were upregulated and 2176 genes were downregulated (Fig 4).
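The thresholding step can be sketched as a small filter. The sketch assumes that the fold-change cut-off applies to the absolute value of log2FC, since both up- and downregulated genes are reported; the gene names and statistics below are hypothetical and serve only to show how each gene is classified.

```python
def classify_gene(log2fc, fdr, fdr_cutoff=0.01, log2fc_cutoff=1.0):
    """Label a gene as 'up', 'down' or 'not_significant' in PL vs. RG.

    Assumption: a gene is differentially expressed when FDR < fdr_cutoff
    and |log2FC| > log2fc_cutoff.
    """
    if fdr >= fdr_cutoff or abs(log2fc) <= log2fc_cutoff:
        return "not_significant"
    return "up" if log2fc > 0 else "down"


# Hypothetical (name, log2FC, FDR) records for illustration only.
records = [
    ("gene_1", 4.2, 1e-6),    # strong, significant increase
    ("gene_2", -2.3, 0.002),  # significant decrease
    ("gene_3", 0.4, 0.0001),  # significant but small change
    ("gene_4", 3.1, 0.2),     # large change but not significant
]
labels = {name: classify_gene(lfc, fdr) for name, lfc, fdr in records}
n_up = sum(1 for label in labels.values() if label == "up")
n_down = sum(1 for label in labels.values() if label == "down")
```

A log2FC value converts to a linear fold change as 2 raised to that value, so the cut-off of 1 corresponds to at least a twofold difference in either direction.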
The enriched GO (Gene Ontology) terms for DEGs were analysed (Fig 5). In the biological process (BP) category, the most significant terms were metabolic process, cellular process, and single-organism process. In the molecular function (MF) and cellular component (CC) categories, binding and catalytic activity, and cell and cell part were the most enriched, respectively. The identified genes were mapped to 105 KEGG reference pathways (Fig 6). These pathways included flavonoid biosynthesis (ko00941) and purine metabolism (ko00230), among others; plant pathogen interaction (ko04626) and carbon metabolism (ko01200) had the highest enrichment factor values. In the flavonoid biosynthesis pathway, most genes were upregulated in purple leaves.
Transcription factors are essential regulators in anthocyanin biosynthesis. A total of 176 TFs were identified as either up- or downregulated between purple and green leaves. Of the most extensively studied TFs, one MYB and six bHLHs were significantly different between purple and green leaves. The MYB was highly expressed in purple leaves, four bHLHs were upregulated, and two were downregulated.

Network analysis of metabolites and transcripts in SYQ
The identified anthocyanins, their relevant compounds and the related genes were mapped onto their corresponding positions in the anthocyanin pathway. The results indicated that the compounds in this pathway differed between purple and green leaves. The abundances of the transcripts and the composition of the compounds are shown via heatmap (Fig 7).
As shown in Fig 7, pelargonidin, dihydrokaempferol, and pelargonidin 3-O-glucoside were significantly accumulated in purple leaves, and dihydromyricetin, cyanidin, and cyanidin 3-glucoside had higher levels in green leaves. The majority of the transcripts were upregulated in purple leaves, although the expression levels of CHS and CHI were reduced. In our results, the levels of the transcripts and metabolites in the pelargonidin biosynthesis pathway were increased.
To gain a better understanding of the regulatory network of the leaf colour in the two accessions, the Pearson correlation test was performed for the metabolites and the transcripts. In total, 14 genes and 16 metabolites involved in flavonoid and anthocyanin pathways were subjected to Pearson correlation analysis (Fig 8). There were 141 significant correlation combinations between the genes and metabolites that had a Pearson correlation coefficient >0.8 and p value <0.05.
The expression levels of the key genes in the anthocyanin biosynthesis pathway were analysed by real-time quantitative PCR (qPCR). As shown in Fig 9, UGFT, ANS, DFR, F3'5'H, and F3H were expressed at a much higher level in the purple leaves, whereas the expression of CHI was reduced in purple leaves. The expression pattern of the genes involved in the anthocyanin biosynthesis pathway revealed by qPCR was consistent with the RNA-seq results.

PLOS ONE

Discussion
In this study, the differences in transcripts and metabolites in green and purple leaves were compared. Using ultra-performance liquid chromatography and tandem mass spectrometry, 597 metabolites were detected (Fig 1). As shown by hierarchical sample clustering, there was a clear separation between purple and green leaves (S2 Fig). Among the differentially accumulated metabolites, six metabolites annotated as anthocyanins were more abundant in purple leaves, while only two anthocyanin compounds were less abundant. To investigate the genes regulating the differential pigmentation of SYQ leaves, transcriptome profiling was performed. Anthocyanin accumulation corresponds to the expression of the genes involved in the biosynthesis pathway [5,18]. Our study showed that most of the key genes in the anthocyanin biosynthesis pathway were upregulated in purple leaves (Fig 7), indicating that the anthocyanin pathway was promoted in purple leaves. This finding was consistent with the qPCR results for the key genes in the anthocyanin biosynthesis pathway (Fig 9). The largest log2FC (purple/green) value was found for F3H, which had a value of 5.869. F3H catalyses the conversion of flavanones into dihydroflavonols [19]. The upregulation of F3H facilitates the accumulation of anthocyanins.
There are many types of anthocyanins; cyanidin, delphinidin and pelargonidin have been recognized as the most common anthocyanins in plants. Different distributions of anthocyanins result in different colours in fruits and leaves. For instance, the major type of anthocyanin in berries is cyanidin [20]; red pigment is generated from the accumulation of pelargonidin [21], and delphinidin gives plants a purple colour. In our results, the biosynthesis pathway of pelargonidin-3-glucoside was promoted at both the transcriptional and metabolic levels. The genes and metabolites work together to regulate the production of anthocyanins.
Transcription factors including MYB, bHLH and WD40 play vital roles in the regulation of the flavonoid and anthocyanin pathways [22][23][24]. Only one MYB had a higher expression level in purple leaves, and several bHLH transcripts were differentially expressed between the two leaf types. Eight WRKYs showed higher expression levels in purple leaves. As suggested previously, WRKY may be involved in the regulation of anthocyanin biosynthesis: more anthocyanin accumulated in WRKY-overexpressing lines than in the wild type in Arabidopsis [25].
Polysaccharides, phenolic acids and flavanols are the main compounds in SYQ with biological activities that can be used to cure some diseases. Although there are abundant bioactive compounds in SYQ root tubers [26], the shoots may be more practical for use due to their easy access. In our results, the content of the bioactive compounds was higher in purple leaves than in green leaves. Purple leaves may have higher antioxidant activity than green leaves. Therefore, purple leaves with higher anthocyanin contents are a superior resource for the improvement of SYQ quality.
Network analysis of the metabolites and transcripts was conducted. There were 141 significant correlation combinations between 14 genes and 16 metabolites. Transcript c90950.graph_c0 (CHS) was correlated with 14 metabolites, the most among the transcripts investigated (Fig 8). CHS is the initial enzyme of flavonoid biosynthesis; it catalyses the synthesis of naringenin chalcone from 4-coumaroyl-CoA and malonyl-CoA. In different plant species, the expression patterns of CHS in red and green tissues differ [27]. The expression level of CHS was found to be significantly upregulated in red tomato fruits, while CHS expression was similar between pigmented and non-pigmented tomato mutants [28]. Our data indicate that the metabolites and genes work coordinately to regulate anthocyanin biosynthesis. Regulatory genes that were highly correlated with the accumulation of metabolites were identified; these could provide new insights into the regulatory mechanism of anthocyanin biosynthesis.
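The gene-metabolite correlation screen can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: it computes the Pearson coefficient directly and keeps pairs with r > 0.8, as stated above, but it omits the accompanying p value <0.05 criterion, which would normally come from a statistics library (for example scipy.stats.pearsonr). The gene and metabolite names and all values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(gene_expression, metabolite_levels, r_cutoff=0.8):
    """Return (gene, metabolite, r) for every pair with r above the cut-off."""
    pairs = []
    for gene, expr in gene_expression.items():
        for metabolite, level in metabolite_levels.items():
            r = pearson_r(expr, level)
            if r > r_cutoff:
                pairs.append((gene, metabolite, r))
    return pairs


# Hypothetical profiles across six samples (three RG-like, three PL-like).
genes = {"gene_x": [1.0, 1.2, 0.9, 5.0, 5.5, 4.8]}
metabolites = {
    "met_a": [10.0, 11.0, 9.0, 40.0, 44.0, 39.0],  # rises with gene_x
    "met_b": [30.0, 28.0, 31.0, 5.0, 6.0, 4.0],    # falls as gene_x rises
}
pairs = correlated_pairs(genes, metabolites)
```

With 14 genes and 16 metabolites, such a screen evaluates 224 gene-metabolite pairs; only those passing both the coefficient and significance cut-offs would be reported as network edges.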

Conclusions
We analysed the regulatory network of anthocyanin biosynthesis by integrating the metabolome and transcriptome in the purple and green leaves of SYQ. Correlation analysis of the metabolites and transcripts involved in the anthocyanin biosynthesis pathway was conducted. The resulting regulatory network of metabolites and transcripts related to leaf colour could be a resource for exploring the mechanism of anthocyanin regulation in SYQ.