Transcriptome analysis of filling stage seeds among three buckwheat species with emphasis on rutin accumulation

Buckwheat is an important minor crop with pharmaceutical functions due to rutin enrichment in the seed. Seeds of common buckwheat cultivars (Fagopyrum esculentum, Fes) usually have much lower rutin content than tartary buckwheat (F. tartaricum, Ft). We previously found a wild species of common buckwheat (F. esculentum ssp. ancestrale, Fea), with seeds that are high in rutin, similar to Ft. In the present study, we investigated the mechanism by which rutin production varies among different buckwheat cultivars, Fea, a Ft variety (Xide) and a Fes variety (No.2 Pingqiao) using RNA sequencing of filling stage seeds. Sequencing data generated approximately 43.78-Gb of clean bases, all these data were pooled together and assembled 180,568 transcripts, and 109,952 unigenes. We established seed gene expression profiles of each buckwheat sample and assessed genes involved in flavonoid biosynthesis, storage proteins production, CYP450 family, starch and sucrose metabolism, and transcription factors. Differentially expressed genes between Fea and Fes were further analyzed due to their close relationship than with Ft. Expression levels of flavonoid biosynthesis gene FLS1 (Flavonol synthase 1) were similar in Fea and Ft, and much higher than in Fes, which was validated by qRT-PCR. This suggests that FLS1 transcript levels may be associated with rutin accumulation in filling stage seeds of buckwheat species. Further, we explored transcription factors by iTAK, and multiple gene families were identified as being involved in the coordinate regulation of metabolism and development. Our extensive transcriptomic data sets provide a complete description of metabolically related genes that are differentially expressed in filling stage buckwheat seeds and suggests that FLS1 is a key controller of rutin synthesis in buckwheat species. FLS1 can effectively convert dihydroflavonoids into flavonol products. These findings provide a basis for further studies of flavonoid biosynthesis in buckwheat breeding to help accelerate flavonoid metabolic engineering that would increase rutin content in cultivars of common buckwheat.

Introduction to investigate the variation in seed rutin accumulation differentiation among buckwheat species. Here, we collected a wild ancestral species of common buckwheat (F. esculentum ssp. ancestrale, Fea) from southwestern China, with smaller floral organs and seed size than cultivated species and seed rutin content that is similar to Ft (Fig 1). In the present work, we employed RNA-seq technology to identify genes involved in Fes, Fea and Ft metabolism, emphasizing elucidation of genes involved in rutin production and those differentially expressed among these buckwheat species.

Analysis of rutin in developing seeds by high performance liquid chromatography (HPLC)
Samples of the immature seeds were separately collected at three stages examined. In the first stage as the (1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12), the ovulary starts to enlarge until the top of the seed is equal to the petal tip. In the second stage (15)(16)(17)(18)(19)(20), the seed top exceeds the petal tip till it is almost full length. In stage 3 (21-30DAP), the seed is filled with starch and still green in color. Seeds were separately freeze-dried for 48 h at -51˚C, then ground into a fine powder. Powdered seed samples (100 mg each) were then dissolved in 1mL methanol and put into a 37˚C water bath with ultrasonication for 1 h. After, these samples were centrifuged 5 min at room temperature and the supernatants filtrated with a 0.22 μM filter before loading 20 μL onto Ultimate LP-C18 column (4.6 × 250 mm, 5 μm; Welch, Jinhua, China) on a Shimadzu LC-2010A HPLC analyzer (Shimadzu, Kyoto, Japan). The column was held at 30˚C, with a flow rate of 1 mL/min. Rutin standards were diluted into five gradients in pure filtered methanol and run with a mobile phase of methanol: water: acetic acid [5:92.5:2.5 (v: v: v) % for buffer A; 95:2.5:2.5 (v: v: v) % for buffer B]. Rutin was detected at 365 nm and identified according to the retention time and Ultra Violet spectra of standards. Linear gradient elution began with 100% buffer A and 0% buffer B, followed by 65% buffer B at 8 min, 80% buffer B for 3 min, then 100% Buffer A for 5 min. The elution curve and rutin concentration of each sample were recorded and calculated for three replicates.

RNA extraction and cDNA library construction
The filling stage seeds were applied to RNA-seq. Fresh filling stage seeds were fast frozen and stored in liquid nitrogen until use. Total RNA from each sample was extracted using an RNeasy Plant kit (Qiagen, Valencia, USA). The purity of each RNA sample was checked with a Nano-Photometer spectrophotometer (IMPLEN, Munich, Germany), the integrity and concentration were determined using an Agilent Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, USA) and Qubit 2.0 Fluorimeter (Life Technologies, South San Francisco, USA), respectively.
The plant mRNA was prepared to construct the cDNA library, and RNA-seq was carried out on an Illumina HiSeq TM 2500 platform. In brief, mRNA was isolated from total RNA using oligodT magnetic beads. Short mRNA fragments (150-250 bp) were used as templates for first-strand cDNA synthesized with random hexamer primers and M-MuLV reverse transcriptase. Second-strand cDNA synthesis was carried out by adding dNTPs, RNase H, and DNA polymerase I. After purification, ends were repaired, adenine-base were added, and each end of the double-stranded cDNA fragments were ligated with Illumina-indexed adaptors. Properly sized fragments were selected and amplified by polymerase chain reaction (PCR) to generate templates. The cDNA templates were further enriched by amplification to generate the cDNA library. The cDNA library was then sequenced on an Illumina HiSeq™ 2500 platform using a paired-end pipeline.

De novo transcriptome assembly
Raw reads were trimmed off adaptor sequences for quality control. Then, reads containing more than 5% vague nucleotides and those with low quality containing more than 50% bases with a Q-phred value < 20 were eliminated; clean reads of samples were generated and used for RNA de novo assembly with Trinity program[version r20140413] [17], and the longest transcripts were taken and defined as unigenes. All unigenes were arranged in descending order from the first unigene. When the assembled length covered half the total length of all unigenes, the length of the current unigenes was considered to be N50 statistics. When the assembled length covered 90% of the total length, the length of the current unigene was considered to be N90 statistics.

Annotation and classification of the de novo transcriptome
BLASTx alignment (E<10 −5 ) between unigenes and protein databases was performed. The best alignment results were used to determine the sequence direction, coding regions and amino acid sequence. All unigenes were searched for homologous genes using BLAST and annotation against the NCBI Non-Redundant Protein (NR) and Nucleotide (NT) databases (http://www.ncbi.nlm.nih.gov/) using an E-value cut-off of 10 −5 . Unigene sequences were also aligned by BLASTx to various protein databases in the following order: . The unigenes were sorted to recover proteins with the most similarity to unigenes with putative functional annotations. GO terms at the second level were used to perform the GO annotation of unigenes which assigned into the biological process, molecular functions and cellular components classification. Unigene sequences were also aligned to the KOG database to predict and classify possible functions, and pathway assignments were provided according to the KEGG pathway database.

Identification and annotation of seed metabolism relevant genes
A set of representative keywords was used to predict starch metabolism, flavonoid biosynthesis-related genes, and other metabolic relevant genes important for seed development based on BLAST annotation results. Each gene was individually searched by gene name and related gene regions that were conserved in other species were identified. To find the most important genes related to flavonoid and starch metabolic system functions, GO term and KEGG pathway information were also used.

Quantitative reverse transcriptase-PCR (qRT-PCR)
A QuantiTect Reverse Transcription Kit (Qiagen, Valencia, USA) was used according to the manufacturer's instructions to generate the first strand cDNA after extracting total RNA from samples subjected to RNA-seq. The expression levels of 11 flavonoid biosynthesis DEGs were confirmed by qRT-PCR with the resulting cDNA products as templates. Primer sets were designed from the coding region results of sequencing data. Conserved domains were chosen as the area without single nucleotide polymorphisms (SNPs) and other mutations which may affect the efficiency of amplification by resequencing and redesign. FastStart Universal SYBR Green Master Mix (10μL; Roche, Indianapolis, USA) was mixed with gene-specific primers (3μL), sterilized water (5μL), and the synthesized cDNA (2 μL) in a 20 μL total as the total reaction volume. Reactions were performed on an ABI 7300 Real-Time cycler (Applied Biosystems, San Francisco, USA). Preheated to 95˚C for 10 min; the three-step program (50 cycles) began at 95˚C for 15 s, followed by 60˚C for 30 s, and then72˚C for 30 s. The 2 -ΔΔCt method was used to calculate the relative expression level of each unigene and tartary buckwheat histone H3 was selected as the reference gene for normalization [18].

Statistical analyses
Statistical analyses were performed with Excel (2016). The data are expressed as the means and standard deviations of three replications. One way ANOVA analysis of variance was used to compare the significant difference. We declared significance at an α level 0.05 for all statistical analyses followed by Tukey's post hoc comparisons. A P < 0.05 ( Ã ) was considered statistically significant and P < 0.01 ( ÃÃ ) was considered statistically extremely significant for all data.

HPLC analysis of rutin content in developing buckwheat seeds
HPLC analysis was used to quantify the rutin content in developing Fea, Ft, and Fes seeds at three different stages of development (Fig 2). In all samples, changes in rutin content with growth were similar, reaching their highest value at the very beginning of seed development and decreasing with maturation. However, the absolute content varied by species. In the filling stage, rutin content in Fes decreased to half that in Ft, while levels in Fea were higher than Ft.
In stage 3, the rutin content decreased to almost undetectable levels in Fes; rutin levels in Ft and Fea were similar and 10-times higher than in Fes.

RNA-seq and de novo assembly in filling stage buckwheat seeds, classification and annotation
In order to assess the mechanism of rutin production, we performed RNA-seq on Fea, Ft, and Fes buckwheat seeds. Three cDNA libraries were constructed from fresh filling stage seeds. Sequencing of these libraries generated 29.74, 27.12 and 30.61 million clean reads (filtered from 30.35, 28.56, and 31.9 million raw reads respectively) with Q20 scores of 95.48%, 95.2%, and 95.09%, respectively, and the GC contents of these species were 45.98%, 47.15%, and 47.97%, respectively (S1 Table). These data were pooled together to create a comprehensive reference transcriptome, which was then used to facilitate analysis of differential gene expression and abundance in each sample. The clean reads were de novo assembled into 109,952 unigenes using Trinity [17] and had a mean length of 677 bp; the minimum length was controlled by a quality limitation to 201bp (S2 Table). The size distribution of these unigenes is shown in   Table) to check the uniform of annotated unigene sequence involved in flavonoid biosynthesis. Based on the NR annotation and E-value distribution, 34.3% of the mapped sequences showed significant homology (E<10 −30 ) and 32.4% showed very significant homology (E<10 −100 ) to the available plant sequences. Analysis of species annotation distribution showed that Fagopyrum spp. had the most matched homologous unigenes with Beta vulgaris (23%), followed by Vitis vinifera (11.3%), Theobroma cacao (3.7%), Jatropha curcas (3.2%), Nelumbo nucifera (3.1%), and others (55.7%) [Fig 3].
A total of 25,666 differentially expressed genes were detected among three Fagopyrum spp. libraries, of which 11,658, 24,335, and 2385 DEGs were predicted from Fea versus Ft, Ft versus Fes, and Fes versus Fea, respectively ( Fig 5). Q <0.001 and the absolute value of log2 ratio !1 was used as the screening threshold for differentially expressed genes [19].
We also used KEGG assignments to classify the functions of the DEGs in pairwise comparisons among the filling stage seeds in the FT, FEA and FES. For the KEGG pathways, the specific enrichment of unigenes was observed for several pathways that were involved in flavonoids metabolic processes, such as phenylalanine metabolism (ko00360,19 unigenes), phenylpropanoid biosynthesis (ko00940,20 unigenes) and flavonoid biosynthesis (ko00941,9 unigenes), with multiple terms assigned to the same unigenes. When Fea was compared with Fes, enriched genes were involved in phenylpropanoid biosynthesis, flavonoid biosynthesis and alpha-linolenic acid metabolism (ko00592), ubiquinotne and other terpenoid-quinone biosynthesis (ko00130), tyrosine metabolism (ko00350), diterpenoid biosynthesis (ko00904), phenylalanine, tyrosine and tryptophan biosynthesis (ko00400). Fea compare to Ft was down regulated in ribosome biogenesis in eukaryotes (ko03008). Up-regulated unigenes in diterpenoid biosynthesis (ko00904), selenocompound metabolism (ko00450), Flavonoid biosynthesis (ko00941). Fes versus Ft was down regulated in fatty acid elongation (ko00062).
Eukaryotic transcripts were classified by the KOG database. KOG annotation was performed for the whole Fagopyrum spp. as showed in Fig 4C. The result showed that most of the genes (2539 unigenes) were clustered under the general function prediction category due to the lack of related genome under the Polygonaceae order. We found 1928 unigenes were categorized as posttranslational modifications, protein turnover, and chaperones, which illustrates active biosynthesis during seed development; 1301 unigenes were involved in signal transduction mechanism; 1104 unigenes were related to translation, ribosomal structure, and biogenesis; 897 unigenes were associated with intracellular trafficking, secretion, and vesicular transport; 862 unigenes were implicated in transcription;844 unigenes were involved in secondary metabolite biosynthesis, transport, and catabolism; 807 unigenes were related to carbohydrate transport and metabolism. Other genes involved in metabolism were divided into lipid transport and metabolism (647 unigenes), amino acid transport and metabolism (608 unigenes), inorganic ion transport (428 unigenes), coenzyme transport and metabolism (158 unigenes), and nucleotide transport and metabolism (183 unigenes).

DEGs involved in flavonoid biosynthesis and qRT-PCR validation
The RNA-seq gene expression level was screened using absolute value of log2 fold change >1 and the FPKM >1. qRT-PCR validation was performed based on the sequencing results. Primers were designed based on the alignment of the most conserved domain on the UniProt database (http://www.uniprot.org); histone H3 was used as the reference gene. We tested the expression of 11 DEGs involved in flavonoid biosynthesis with qRT-PCR to validate results using FPKM levels ( Fig 6C). Based on log2-fold change measurements, the correlation between RNA-seq and qRT-PCR were evaluated (S5 Table). The results showed that the relative expression trends of these DEGs were significantly correlated to the data as those of RNA-seq (r 2 = 0.85). leucoanthocyantin reductase, cinnamic acid 4-hydroxylase, chalcone synthase, dihydroflavonol 4-reductase, flavonol synthase 1, anthocyanidin synthase, anthocyanidin reductase and flavonoid 3'-hydroxylase were the differentially expressed structural genes. Differentially expressed regulator genes: Myb family APL isoform X1, bHLH94-like protein, and bHLH67 were also validated (S6 Table).
FLS1 was highly expressed in Ft, being 7.38-fold higher than in Fes, whereas expressed 4.32-fold in Fea comparing to Fes. Myb family APL isoform X1 and bHLH67 showed significant abundance in Fea and Fes versus Ft, whereas bHLH94-like protein was almost 1.9-fold higher in Ft than in Fea and Fes. CHS expression was more than 1 -fold higher in Fea versus Ft. C4H expression was more than 2-fold higher in both Fes and Fea compared to Ft, while DFR expression was about 1.5-fold higher in Fes versus Ft; however, ANS expression levels in Fes and Fea were almost the same that is more than 5 -fold in Ft. Taken together, these results suggest that the FLS1 transcript level may be associated with rutin accumulation in filling stage seeds of buckwheat species.

Storage proteins involved in the development of Fagopyrum spp. seeds
Seed storage proteins are important for the determination of total protein content in seeds and their quality for various end uses, especially as a dietary protein source. There are four groups of storage proteins: globulins, albumins, prolamins, and glutelins. As a pseudocereal crop, buckwheat seeds have a high content of proteins with balanced ratio and concentrations of essential amino acids. During seed development, storage proteins accumulated. Among them, the globulins are the most abundant ones. 13S legumin-like protein and vicilin-like protein and 2S albumin are three main storage proteins in buckwheat seeds. [20] In the buckwheat transcriptome, 13S globulin seed storage proteins (legumin-like) were annotated with 3 unigenes (c72845_g2, c72845_g1, and c65459_g3), 7S globulin got hits with 9 unigenes (c69879_ g1, c68087_g1, c72404_g1, c74029_g1, c71235_g1, c77859_g1, c69916_g1, c73369_g2, and c43633_g1; Fig 7A). legumin A-and B-like proteins each got 1 unigene hit (c73084_g1and c73084_g2, respectively), glutelin type-B 5-like was annotated with 1 unigene (c73921_g1), seed storage 2S albumin superfamily protein was annotated with two unigenes (c47296_g1 and c101651_g1) and vicilin-like protein was annotated with 4 unigenes (c66488_g1, c66488_g2, c74713_g1, and c59598_g1). These genes were highly expressed in Fes and Fea species. There were no hits with the prolamins in any of the buckwheat species. These seed storage proteins are important for seed development and further germination, which could cause some allergic responses in human beings.
Proteins annotated with allergens were searched manually. The pollen allergen MetE from Amaranthus retroflexus was expressed in all three species (c80709_g1), and c63473_g1 was annotated as calcium-binding allergen Ole from B. vulgaris ssp. vulgaris. Two unigenes for different 16-kDa allergens were found in Fea, Fes, and Ft (c60579_g1 and c52860_g1), though c52860_g1 had a relatively lower FPKM value. Unigene c68583_g1 annotated as a BW10-kDa allergen protein from Ft; c69345_g1 also annotated as an allergenic protein from Ft, but with a 1000-2000 FPKM level in Fes and Fea and almost 0 FPKM in Ft (Fig 7B).
New genes of P450 family found in seed transcriptome annotation data of buckwheat species P450 is one of the biggest gene super families [21]. Transcriptome analyses have contributed to the increase in P450 sequences; however, among the annotated genes, most were hypothetical proteins with unknown functions in the present study. Plant cytochrome P450s are implicated in a wide variety of biosynthetic reactions and interact with many biomolecules. Recently, investigations have focused on plant hormones and secondary metabolites, lignin biosynthesis and defensive compounds. The CYP78A5 protein annotated by 2 unigenes (c70046_g1 and c41289_g1) may function at the meristem or organ boundary by regulating directional growth [22]. It may also be required in developing ovules to activate cell proliferation and promote seed growth in association with CYP78A7. CYP76AD1 was annotated by 10 unigenes (c56758_g1, c69801_g2, c47614_g1, c58218_g1, c5119_g1, c55327_g1, c51949_g1, c97945_g1, c89448_g1, and c74352_g2) and may be involved in the betalain biosynthetic pathway, which is part of pigment biosynthesis. CYP82C4 was annotated by 2 unigenes (c52803_g1 and c69344_g2) and may function in the early iron deficiency response. CYP94A1, annotated by 6 unigenes (c75089_g1, c78982_g1, c78409_g1, c53666_g1, c71976_g2, c41575_g1) could catalyze the hydroxylation of various fatty acids and may also be involved in plant defense. CYP714A1, annotated with 2 unigenes (c71622_g3 and c46063_g1), may be involved in the repression of early gibberellin intermediates, and CYP71A1 annotated with 10 unigenes (c73439_g1, c63140_g1, c57034_g1, c66035_g1, c62008_g2, c75811_g1, c52794_g2, c79372_g1, c57105_g1, and c77010_g1) and may function in the development of flavor during seed maturation by acting as trans-cinnamic acid 4-hydroxylase (Fig 7D).

Starch and sucrose metabolism genes
Starch accumulated in the amyloplasts of seeds over a long growing season. Structurally two types of starch are synthesized by different enzymes. In the buckwheat seeds, Granulebound starch synthase I was annotated with 9 unigenes (c77665_g1, c77516_g1, c74850_g1, c73329_g1, and c64737_g1). c71042_g1 was annotated as granule-bound starch synthase II (Fig 8A). c77612_g1 and c58034_g1 were identified as the ADP-glucose pyrophosphorylase catalytic subunit from Perilla frutescens. Starch synthase (SS) I annotated as c76571_g1, c60931_g1 was annotated as SSII from A. cruentus, 3 unigenes (c80736_g1, c61026_g1 and c76188_g1) were annotated as SSIII, and 4 unigenes (c74379_g2, c71948_g1, c76970_g1, and c76970_g2) were annotated as SSIV ( Fig  8B). In the starch biosynthesis pathway, c59083_g1 and c20595_g1 were annotated as starch branching enzyme I from Medicago truncatula and part of starch branching enzyme IIa from Triticum aestivum, respectively. c79579_g1 was annotated as an isoamylase1-type starch debrancing enzyme, c79406_g1 was annotated as isoamylase 2 from B. vulgaris ssp. vulgaris, and c78640_g1 was predicted as isoamylase3 from Citrus sinensis (Fig 8C).
For the carbohydrate fixation pathway, there is a close correlation between the metabolism of starch and sucrose. Sucrose synthase genes were identified: c78731_g1 had the highest expression and was annotated from A. cruentus as sucrose synthase1; c58555_g1 and c66318_ g2 were also identified as sucrose synthase1, but with 100-and 1000-times lower FPKM values, respectively, compared with c78731_g1. c80681_g2 and c80681_g1 wereas annotated as sucrose synthase 2 isoforms from A. hypochondriacus and Eucalyptus grandis, respectively; c79495_g1 was predicted as sucrose synthase 5 isoform X1 (Fig 8D).

Discussion
De novo sequencing methods have been performed on Fagopyrum spp. flowers and roots, and the flavonoids biosynthetic pathway-related genes have been discussed [24,15]. In the present study, we sequenced the seed transcriptomes of three Fagopyrum spp. by Next Generation RNA-seq, focusing on nutritional properties and rutin content of Ft, Fes and Fea filling-stage seeds was determined. In the filling stage , RNA transcription activity is more vigorously than in mature seeds [25]. Most flavonoids biosynthesis enzymes are conserved among buckwheat species. In this study we were managing to compare the relative expression levels of unigenes that may influence the final products in flavonoid biosynthesis pathway. We used the pooled data to assemble a uniform reference transcriptome. Then we compared the FPKM value of each candidate unigene. Though the method used in this study has the potential to be improved, the quantitative RT-PCR proved the coherence of the RNA seq results. Enrichment analysis showed that the transcription levels of enzymes in the flavonoid biosynthesis pathway correlated with higher rutin content in developing Fea seeds compared to Fes. However, the DEGs analysis in Ft and Fes did not show obvious enrichment in those flavonoid pathways, which may be attributed to the huge difference in transcript background between these two species. As such, DEGs analysis also showed a lower amount of differentially expressed genes between Fes versus Fea and Fea versus Ft relative to Fes Versus Ft. Hence, Ft and Fes cultivars cannot cross with each other due to their dramatically genome size and structure [26]. Though both are diploid plants, common buckwheat has more repeated sequences and complex chromosomal structures distributing all over its genome [24]. Comparative BLAST analysis of orthologous genes among three buckwheat species showed the evolutionary relationship between Fea and Fes was more closely related than the one between Fes and Ft.
These results may pave the way for future breeding strategies to create high rutin content common buckwheat germplasms.
In previous studies, flavonoid level accumulated in the seeds confirmed the correlation between species and the stages of development, as well as metabolic state, and duration and intensity of environmental stresses [27][28][29][30]. Furthermore, although there was a fluctuation in rutin content during seed development in all species, the final stage of Fagopyrum spp. seed maturation was found to contain different levels of rutin. Our results showed that 20-30 DAP, Ft contained similar amounts of rutin as Fea, whereas Fes rutin content was hardly detected by HPLC. Flavonoid synthase is an important gene family in the biosynthesis pathways.  reported two kinds of FLS in common buckwheat, FLS1 and FLS2 [31]. Based on the current BLAST results, at least two isoforms of FLS were found in the transcriptome of Fagopyrum spp. FLS2 was expressed with almost no difference in all three species (validated by q RT-PCR), and may function as a stress response enzyme [32,33][32] [33]. However, FLS1 was highly expressed in both Ft and Fea, whereas Fes had significantly lower expression. As in Arabidopsis, only FLS1 functions in flavonoid biosynthesis in a manner analogous to Fagopyrum spp. [34]. FLS also controls the divergence of the flow to different products: quercetin and anthocyanin. Anthocyanins and flavonols reportedly share common precursors and that their accumulation pattern is regulated by miR156-targeted SPL, a transcription factor that disrupts the of MYB-bHLH-WD40 complex, leading to accumulation of anthocyanins instead of flavonols [35,36]. These findings showed that dihydroflavonols (dihydrokaempferol and dihydroquercetin) were the same substrates of DFR and FLS1. There should be an equilibrium between these two enzymes as they compete for the same substrate, leading to different products in plants. In crabapple, the ratio between FLS and DFR transcript abundances determine the color of the leaves; when McFLS was overexpressed or Mc5 was silenced, flavonols production was elevated [37]. The content of pigments in rose, peach, carnatio, zealea, camellia, and petunia flowers were determined, as well as the FLS and DFR gene expression levels between white and red flowers. FLS and DFR enzymes direct the production of flavonols and anthocyanins, respectively [36]. Elucidating biosynthetic mechanisms will make metabolic engineering of plants easier in order to produce different final products [15,38]. Current qRT-PCR and FPKM results showed that some enzymes coding for late stage flavonoid biosynthesis were highly expressed in Fes and Fea, especially those involved in proanthocyanin biosynthesis (LAR, DFR, ANS, and ANR). FLS and DFR enzymes may compete for the same substrates to direct synthesis of anthocyanins or flavonols like rutin [36]. Our results showed that FLS1 expression levels are higher in Fea than Fes, which may play a crucial role in fine tuning of anthocyanin and flavonol biosynthesis.
Despite enzymes involved in metabolism, the majority of buckwheat seed proteins are storage proteins which mainly accumulated in the late embryogenesis stages. The accumulation of storage proteins may influence seed germination. In buckwheat species, the FPKM results indicate that the most abundant unigenes of storage proteins are 13S globulins, vicilin like proteins, BW10KD allergen protein, and legumin A like proteins, legumin B like proteins. Besides, Annexin proteins also expressed highly in Fea and Fes which may function in infection, wounding, and abiotic stress responses. These results indicated that the seed storage protein expression patterns may be different among buckwheat species. Fea shared similar pattern with Fes.
Developing seeds import sucrose, which is cleaved to provide carbon skeletons for the synthesis of other storage compounds [39]. We have detected the sucrose-phosphate synthase and sucrose phosphatase which play key roles in sucrose biosynthesis, whereas the sucrose synthase function in the cleavage of sucrose mainly. In our data, the sucrose synthase transcripts expression levels were much higher in the seeds of three buckwheat species during the filling stage.
In contrast, the abundance of enzymes of sucrose biosynthesis are almost 100 times lower than sucrose synthase. The sucrose may be catalyzed by sucrose synthase in the buckwheat seeds to produce UDP-glucose and fructose for the further synthesis of starch in the endosperm. Sucrose also facilitate the storage cell differentiation and reserve accumulation in the developing plant embryos [40].
The granule bound starch synthase and starch synthase gene expression ratio may correlate with the amylose and amylopectin ratio in the seeds such as maize [41]. We have found several GBSS transcripts, c77516_g1 was annotated as Fagopyrum tataricum granule bound starch synthase 1, expressed in the level about 100 FPKM in filling stage seeds of three species. While there is a transcript of c77665_g1 annotated as granule bound starch synthase 1 in Nelumbo nucifera expressed as 2979 FPKM in FEA, 4354 in Fes and 1808 in Ft which may function in other ways. For the starch synthase 1 (c76571_g1) expressed in the level more than 200 FPKM; starch synthase 2(c60931_g1) expressed at relative much lower levels as around 1 FPKM in those seeds.
Collectively, we sequenced RNA from filling stage seeds from three buckwheat species. The discovery of new genes in buckwheat for the first time in different nutrients pathways will provide further bases and data for future studies. Understanding the molecular regulation of flavonoid biosynthesis in buckwheat species will bolster the breeding and engineering of this plant for different end uses for human beings.