Identification and Expression of Fructose-1,6-Bisphosphate Aldolase Genes and Their Relations to Oil Content in Developing Seeds of Tea Oil Tree (Camellia oleifera)

Tea oil tree (Camellia oleifera, Co) provides a fine edible oil source in China. Tea oil from the seeds is very beneficial to human health. Fructose-1,6-bisphosphate aldolase (FBA) catalyzes the reversible cleavage of fructose-1,6-bisphosphate into dihydroxyacetone phosphate and glyceraldehyde 3-phosphate, two critical metabolites for oil biosynthesis. The objectives of this study were to identify FBA genes and investigate the relationship between FBA gene expression and oil content in developing seeds of tea oil tree. In this paper, four developmentally up-regulated CoFBA genes were identified in Camellia oleifera seeds based on the transcriptome from two seed developmental stages corresponding to the initiation and peak stages of lipid biosynthesis. The expression of the CoFBA genes, along with that of three key oil biosynthesis genes (CoACP, CoFAD2 and CoSAD), was analyzed in seeds from eight developmental stages by real-time quantitative PCR. The oil content and fatty acid composition were also analyzed. The results showed that CoFBA and CoSAD mRNA levels were well correlated with oil content, whereas CoFAD2 gene expression levels were correlated with fatty acid composition in Camellia seeds. We propose that CoFBA and CoSAD are two important factors for determining tea oil yield because the CoFBA gene controls the flux of key intermediates for oil biosynthesis and the CoSAD gene controls the synthesis of oleic acid, which accounts for 80% of fatty acids in tea oil. These findings suggest that tea oil yield could be improved by enhanced expression of CoFBA and CoSAD genes in transgenic plants.


Introduction
Camellia oleifera (Co, tea oil tree) provides high quality edible oil. This small tree originated in China and is widely cultivated there. Tea oil (Camellia oil) from the seeds is a sweetish seasoning and cooking oil with over 80% monounsaturated fatty acids. The seed residue after oil extraction has been widely used for laundry purposes in rural areas. Tea oil is sold as cooking oil in supermarkets throughout China as well as Australia, New Zealand and the United States.
Tea oil is very beneficial to human health. Tea oil can lower cholesterol, decrease lipid concentration and prevent hypertension and hardening of arteries [1][2][3]. Tea oil has anti-microbial and antioxidant properties and is known to induce cell cycle arrest and apoptosis in cancerous cell lines [1,4,5]. Tea oil has also been shown to have antiulcer effects against ketoprofen-induced oxidative damage in the stomach and intestine [6]. Flavonoids from seed shells have analgesic and anti-inflammatory effects [7,8]. Defatted seeds of the plant contain saponins with anti-inflammatory properties [2]. However, the development of the tea oil industry is limited by the low oil yield of the tea oil tree. Most of the early research efforts were focused on fatty acid analysis of the seeds. In order to improve tea oil production, many genes coding for key enzymes in the tea oil biosynthesis pathway have been identified in the tree [9][10][11][12][13][14].
Fructose-1,6-bisphosphate aldolase (FBA, EC 4.1.2.13, or simply aldolase) is a key enzyme in the glycolytic pathway. FBA catalyzes the reversible conversion of fructose-1,6-bisphosphate (FBP) into dihydroxyacetone phosphate (DHAP) and glyceraldehyde 3-phosphate (G3P) [15]. DHAP and G3P are two key intermediates for oil biosynthesis. DHAP is converted by glycerol phosphate dehydrogenase [16] to glycerol-3-phosphate, which is used to generate phosphatidic acid by the action of acyltransferases. Phosphatidic acid is then hydrolyzed by phosphatidic acid phosphatase to produce diacylglycerol (DAG), the key substrate of diacylglycerol acyltransferase (DGAT) for the synthesis of triacylglycerols (TAGs) [17,18]. Meanwhile, G3P is converted through multiple enzymatic reactions to pyruvate, which is used to generate acetyl-CoA. Acetyl-CoA is converted to malonyl-CoA by acetyl-CoA carboxylase. Malonyl-CoA is then utilized for the synthesis of fatty acids, the other key element for oil biosynthesis [19]. Therefore, FBA not only affects fatty acid synthesis but also supplies the precursor of the glycerol backbone for oil biosynthesis.
The objectives of this study were to identify FBA genes, analyze the expression patterns of these genes and investigate the relationship between FBA gene expression and oil content/fatty acid composition in developing seeds of tea oil tree. In this paper, the full-length cDNAs for CoFBA1, CoFBA2, CoFBA3 and CoFBA4 genes were identified from Camellia oleifera based on the analysis of its seed transcriptome digital library. The expression of CoFBA genes was quantified by real-time quantitative PCR (RT-qPCR) using RNA from eight developmental stages of seeds. Expression profiles of three key fatty acid synthesis genes coding for acyl carrier protein (CoACP) [11], stearoyl-ACP desaturase (CoSAD) [13] and oleate desaturase (CoFAD2) [20] were also analyzed in the seeds. We further analyzed the oil content and fatty acid composition of the seeds from eight developmental stages. Our results showed that CoFBA as well as CoSAD gene expression was well correlated with oil content in Camellia seeds.

Ethics Statement
No specific permits were required for collecting the samples because the trees were publicly owned and the field studies did not involve protected species.

Plant Materials and RNA Isolation
Camellia oleifera var. 'Hua shuo' was used for this study. Seeds from eight stages were collected for gene expression analysis (May 5, June 5, July 4, August 4, September 4, September 11, September 26 and October 24, 2010). Seeds from eight different stages were used for oil content and fatty acid composition analysis (August 25, September 4, September 11, September 19, September 26, October 3, October 10 and October 24, 2010). RNA from the seeds collected on June 5 and October 24 was used to construct transcriptome libraries. Total RNA was isolated from the seeds with the PureLink RNA Mini Kit according to the manufacturer's instructions (Invitrogen, USA). Potential genomic DNA contamination in the RNA samples was eliminated by RNase-free DNase I digestion (Fermentas, Canada).

cDNA library Construction
Poly-A mRNA was purified from total RNA isolated from the June 5 and October 24 seeds using oligo (dT) magnetic beads and fragmented into 200-500 bp pieces using divalent cations. The mRNA fragments were reverse transcribed into first-strand cDNA using SuperScript II reverse transcriptase and random primers (Life Technologies). The second-strand cDNA was synthesized by E. coli DNA polymerase I (Invitrogen, USA). After double-stranded cDNA synthesis, fragments were end-repaired and A-tailed. The final cDNA library was created by purifying and enriching the double-stranded cDNA with PCR.

Transcriptome Sequencing and Unigene Analysis
The cDNA sequences were determined through a paired-end flow cell using the HiSeq 2000 platform (Beijing Genomics Institute, Shenzhen, China). The clean reads after sequencing were de novo assembled using Trinity with the default k-mer size of 25 [21]. The contigs without ambiguous bases were obtained by conjoining the k-mers in an unambiguous path. The clean reads were mapped back to contigs using Trinity to construct unigenes with the paired-end information. Finally, the contigs were connected with Trinity, and sequences that could not be extended on either end were defined as unigenes for each library. Unigene sequences were aligned with those in the online databases NR, SwissProt, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups of proteins (COG). Unigenes were categorized into gene ontologies (GO) and functionally annotated using Blast2GO software [22]. Unigenes with p-value < 0.00001 were regarded as known genes [23]. Pathway enrichment analysis of unigenes was performed using the KEGG database. FBA genes were identified from the unigenes for digital expression profile analysis [24].

Identification and Digital Analysis of FBA Gene Expression
CoFBA genes were identified from the unigenes for digital expression profile analysis [24]. NCBI's Blast program was used to identify CoFBA genes from the Camellia oleifera seed transcriptome libraries based on sequence conservation with FBA genes from other plants. The full-length cDNAs of CoFBA genes were obtained from cDNA libraries by PCR using the RACE technique. To compare differences in FBA gene expression, tag frequencies of both libraries were analyzed by IDEG6 software (http://telethon.bio.unipd.it/bioinfo/IDEG6/) according to the method described by Audic and Claverie [24]. The false discovery rate (FDR) was used to determine the threshold P-value for multiple testing. Unigene expression was calculated by the reads per kb per million reads (RPKM) method [25]. FDR < 0.001 and an absolute value of the log2 RPKM ratio > 1 were used as the thresholds to determine significant differences in gene expression.
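The RPKM normalization and the two-part significance threshold described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names, the read counts and the FDR values are hypothetical, and the sketch assumes both RPKM values are positive so that the log2 ratio is defined.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_differentially_expressed(rpkm_a, rpkm_b, fdr,
                                fdr_cutoff=0.001, log2_cutoff=1.0):
    """Apply the thresholds from the text: FDR < 0.001 and |log2 ratio| > 1.

    Assumes rpkm_a and rpkm_b are both > 0 (hypothetical helper).
    """
    log2_ratio = math.log2(rpkm_b / rpkm_a)
    return fdr < fdr_cutoff and abs(log2_ratio) > log2_cutoff


# Hypothetical unigene of 1,000 bp in two libraries of different depth.
early = rpkm(read_count=50, gene_length_bp=1000, total_mapped_reads=10_000_000)
peak = rpkm(read_count=400, gene_length_bp=1000, total_mapped_reads=20_000_000)

print(early, peak, math.log2(peak / early))
print(is_differentially_expressed(early, peak, fdr=0.0001))
```

Because RPKM divides by both transcript length and library size, the two libraries can be compared directly even though the second one is twice as deep in this made-up example.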

Quantitative Gene Expression Analysis
Genes coding for several CoFBA isoforms and three oil biosynthetic enzymes/proteins (CoACP, CoFAD2 and CoSAD) were selected for quantitative expression analysis by RT-qPCR. ACP, SAD and FAD2 were selected for comparative studies and used as positive controls related to oil biosynthesis in the seeds because they are known components of the oil biosynthesis pathway. Based on earlier experimental results [26], the CoGAPDH gene was used as the reference gene for quantitative gene expression by RT-qPCR. The names, GenBank accession numbers and primer sequences of the genes for RT-qPCR assays are shown in Table 1. The relative abundance of mRNAs at different developmental stages was analyzed by CFX Manager software (Bio-Rad, USA). RT-qPCR was performed in triplicate by SYBR Green qPCR assay essentially as described [27] with 10 mM each of the forward and reverse primers using the following PCR program: initial denaturation at 95 °C for 5 min, followed by 40 cycles of 95 °C for 10 s and 55 °C for 30 s. The relative gene expression for RT-qPCR data was calculated by the 2^(-ΔΔCT) method [28].
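The 2^(-ΔΔCT) calculation can be illustrated with a short sketch. It assumes roughly 100% amplification efficiency for both the target and the reference gene, which the method requires; the Ct values below are invented, and the choice of calibrator sample (for instance the earliest seed stage) is an assumption of this example rather than something stated in the text.

```python
from statistics import mean


def relative_expression(target_cts, reference_cts,
                        calibrator_target_cts, calibrator_reference_cts):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument is a list of technical-replicate Ct values.
    Assumes ~100% amplification efficiency for target and reference genes.
    """
    # dCt: target normalized to the reference gene within each sample
    delta_ct_sample = mean(target_cts) - mean(reference_cts)
    delta_ct_calibrator = mean(calibrator_target_cts) - mean(calibrator_reference_cts)
    # ddCt: sample compared with the calibrator sample
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values (not data from this study).
fold_change = relative_expression(
    target_cts=[22.1, 22.0, 21.9],              # target gene, later stage
    reference_cts=[18.0, 18.1, 17.9],           # reference gene, later stage
    calibrator_target_cts=[26.0, 26.1, 25.9],   # target gene, calibrator stage
    calibrator_reference_cts=[18.0, 18.0, 18.0],
)
print(f"fold change relative to calibrator: {fold_change:.1f}")
```

In this made-up case the target crosses threshold four cycles earlier (after normalization) than in the calibrator, which corresponds to about a 16-fold higher relative mRNA level.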

Oil Content and Fatty Acid Composition Analysis
Tea oil was extracted from seeds collected at eight developmental stages by the Soxhlet extraction method according to the standard protocol of ''Determination of crude fat in foods'' (State Standard of the People's Republic of China, GB/T 14772-2008). Briefly, seed oil was extracted from the seeds with petroleum ether, dried and weighed. The extraction was repeated three times for each oil sample. Oil-yield rate = (oil quantity/seed quantity) × 100%. Seed lipids in the oil extract were converted to methyl esters by KOH-methanol solution and extracted with heptane. The organic phase containing lipids was transferred into a vial for GC analysis using a gas chromatograph (SHIMADZU GC-2014) following the basic methyl esterification method [18,29]. GC analysis was performed using an FFAP capillary column (30 m × 0.25 mm × 0.25 µm) with the following oven program: 60 °C → 180 °C (25 °C/min, hold 1 min) → 210 °C (3 °C/min, hold 1 min) → 212 °C (0.3 °C/min, hold 1 min) → 240 °C (8 °C/min, hold 2 min). Other conditions were: inlet and detector temperature, 240 °C; split ratio, 1:50; carrier gas, N2 at 1.23 mL/min; compressed air, 400 mL/min; H2, 40 mL/min; sample injection, 1 µL. The oil content and fatty acid composition data were analyzed by SPSS Statistics 17.0 software (http://www.stathome.cn).
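As a small worked example of the oil-yield formula above, the sketch below computes the yield for three replicate extractions and summarizes them. The masses are invented for illustration, and reporting the mean and standard deviation of the triplicate is an assumption about how replicates might be summarized, not a description of the SPSS analysis.

```python
from statistics import mean, stdev


def oil_yield_percent(oil_mass_g, seed_mass_g):
    """Oil-yield rate = (oil quantity / seed quantity) x 100%."""
    if seed_mass_g <= 0:
        raise ValueError("seed mass must be positive")
    return oil_mass_g / seed_mass_g * 100.0


# Hypothetical (oil mass, seed mass) pairs in grams for three extractions.
replicates = [(2.0, 5.0), (2.1, 5.0), (1.9, 5.0)]
yields = [oil_yield_percent(oil, seed) for oil, seed in replicates]

mean_yield = mean(yields)
sd_yield = stdev(yields)
print(f"oil yield: {mean_yield:.1f}% (SD {sd_yield:.1f}%, n={len(yields)})")
```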

Grey Correlation Analysis
The correlation coefficient between gene expression levels and oil content or fatty acid composition was obtained with the grey correlation analysis software (V2.1) [30,31]. The oil content was used as the reference series and the mRNA levels of the seven genes were used as comparison series. A higher correlation coefficient between the mRNA levels and oil content/fatty acid composition indicates a more positive effect of the gene product on oil content/fatty acid composition. The basic idea of grey relational analysis is the similarity of the geometric shapes of the sequence curves: the closer the curves, the larger the correlation between the corresponding sequences, and the larger the grey relational degree, the more consistent the trends of the two factors.
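The idea of grey relational analysis can be made concrete with a short sketch of the commonly used Deng formulation. This is not the V2.1 software used in the study: the mean normalization, the distinguishing coefficient rho = 0.5, the function names and the toy series are all assumptions made for illustration.

```python
def normalize_by_mean(series):
    """Scale a series by its mean so curves of different units are comparable."""
    m = sum(series) / len(series)
    return [v / m for v in series]


def grey_relational_grades(reference, comparisons, rho=0.5):
    """Grey relational grade of each comparison series against the reference.

    reference:   list of values (e.g. oil content per stage)
    comparisons: dict of name -> list of values (e.g. mRNA level per stage)
    rho:         distinguishing coefficient (0.5 is a conventional choice)
    """
    ref = normalize_by_mean(reference)
    diffs = {}
    for name, series in comparisons.items():
        if len(series) != len(reference):
            raise ValueError(f"{name}: series length differs from reference")
        comp = normalize_by_mean(series)
        diffs[name] = [abs(r - c) for r, c in zip(ref, comp)]

    # Global minimum and maximum differences over all series and time points
    all_diffs = [d for row in diffs.values() for d in row]
    d_min, d_max = min(all_diffs), max(all_diffs)
    if d_max == 0:
        return {name: 1.0 for name in diffs}

    # Relational coefficient per point, averaged into one grade per series
    return {
        name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for name, row in diffs.items()
    }


# Toy data: gene_a tracks the reference curve, gene_b runs opposite to it.
oil_content = [10, 20, 30, 40]
grades = grey_relational_grades(
    oil_content,
    {"gene_a": [1, 2, 3, 4], "gene_b": [4, 3, 2, 1]},
)
ranking = sorted(grades, key=grades.get, reverse=True)
print(grades, ranking)
```

A series whose normalized curve coincides with the reference receives a grade of 1, and the grade falls as the curves diverge, which is the sense in which a larger grey relational degree indicates more consistent trends.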

Statistics of Transcriptome Sequencing Data
Transcriptome sequencing of the paired-end cDNA libraries from two developmental stages of Camellia oleifera seeds collected on June 5 and October 24 generated 65,536 non-redundant unigenes with a total length of 24,154,817 nucleotides. The lengths of these unigenes were mainly distributed between 100 and 500 nucleotides (Figure 1). These results indicated that our transcriptome sequencing was completed with high quality. All non-redundant unigenes could be classified into three GO (gene ontology) categories including molecular function, cellular component and biological process, which consisted of 43 functional groups (Figure 2).

CoFBA Expression Analysis of the Transcriptome Data
Six FBA unigenes were identified from the transcriptome libraries constructed using two developmental stages of the oilseeds: Unigene 23265, Unigene 47982, Unigene 58261, Unigene 62390, Unigene 37785 and Unigene 63726. These CoFBA genes exhibited differential expression at different developmental stages of Camellia oleifera seeds (Table 2). Unigene 47982, Unigene 58261 and Unigene 62390 had a false discovery rate (FDR) < 0.01, and their expression differed by more than two-fold between the peak and initial developmental stages (Table 2). These expression results suggest that the three FBA genes might play significant roles during the tea oil biosynthesis process. In contrast, Unigene 37785 and Unigene 63726 were weakly expressed at the initial developmental stage and their expression was down-regulated during the peak stage of oil accumulation (Table 2), suggesting that these two genes might have little effect on lipid biosynthesis in the seeds.

CoFBA Gene Expression Patterns during Seed Development
The expression patterns of CoFBA genes were experimentally studied by RT-qPCR using RNA isolated from eight seed developmental stages (Figure 3). The four FBA genes, CoFBA1 (Unigene 58261), CoFBA2 (Unigene 47982), CoFBA3 (Unigene 23265) and CoFBA4 (Unigene 62390), showed similar expression patterns (Figure 3). Their mRNA levels were low but detectable at the initial oil synthesis stage (May 5 sample), gradually increased during the next two months, rose sharply in early September and reached peak levels in late September (Figure 3). In more mature seeds collected in late October, the expression levels of CoFBA1 and CoFBA3 were slightly down-regulated, whereas those of CoFBA2 and CoFBA4 continued to increase. These qPCR data (Figure 3) are in agreement with the transcriptome data described above (Table 2). Both lines of data support the roles of these genes in oil accumulation in tea oil seeds.

CoACP, CoSAD and CoFAD2 Gene Expression during Seed Development
RT-qPCR was also used to analyze the expression patterns of CoACP, CoSAD and CoFAD2, three genes directly involved in the fatty acid metabolic pathway in the oilseeds. The common feature of these three genes was that their relative expression levels in the seeds were extremely low during early seed development, increased dramatically in early August and reached peak levels in late September, followed by a sharp decline in late October (Figure 4). Some minor differences in mRNA levels among the three genes were noticed. During the period from May to October, CoACP mRNA levels exhibited two peaks of expression: a small peak in early August followed by a small decline, and a large peak in late September followed by a sharp decline (Figure 4). CoSAD and CoFAD2 exhibited only one peak of expression, in late September, during the same seed stages from May to October. CoFAD2 mRNA levels were similar between early August and early September (Figure 4), whereas CoSAD mRNA levels increased continually until they peaked in late September (Figure 4).

Oil Content and Fatty Acid Composition
Tea oil content and fatty acid composition were determined by the Soxhlet extraction method and GC using the seeds collected from late August to late October. Seeds from these developmental stages were shown by qPCR to have active expression of CoFBA, CoACP, CoSAD and CoFAD2 genes (Figures 3 and 4). A representative chromatogram of the fatty acid profile from seeds collected in late October is shown in Figure 5. Oleic acid was the most abundant fatty acid in the seeds and accounted for 88% of the total fatty acids in the late stage of tea oil seeds, whereas other fatty acids accounted for only a minor percentage of the tea oil. The developmental profile of oil accumulation in tea oil seeds showed that oil content gradually increased from late August to late October and that the rate of increase was smaller after late September, indicating that seeds began to mature after late September (Figure 6). The composition of fatty acids from the seeds collected at the eight developmental stages showed a remarkable correlation between the total oil content and oleic acid content in the seeds (Figure 6A). Oleic acid accounted for approximately 80% of the fatty acids in all stages of the seeds analyzed, and the other four fatty acids accounted for only 20% of the fatty acids in the seeds (Figure 6B). The general trend of the seed fatty acid composition was that the relative percentage of oleic acid increased by a few percentage points and the other four fatty acids decreased accordingly in the seeds (Figure 6B). According to oil content, lipid synthesis was in the initial stage before early August. Oil composition could not be detected accurately in the seeds collected during these early stages by the Soxhlet extraction method (data not shown).

Correlation between CoFBA, CoACP, CoFAD2 and CoSAD Gene Expression and Oil Content and Fatty Acid Composition
CoACP, CoFAD2 and CoSAD encode key enzymes/proteins for lipid synthesis. The expression of these genes is therefore expected to be closely related to oil content and/or fatty acid composition. The grey correlation analysis software was used to evaluate the relevance between the mRNA levels of the CoFBA genes, along with these three reference genes, and oil content (Table 3). Relevance ranking showed that the order of the correlation coefficients of the seven genes with tea oil content in the seeds was CoSAD > CoFBA3 > CoFBA1 > CoACP > CoFBA4 > CoFAD2 > CoFBA2. The correlation coefficient of CoFBA gene expression to fatty acid composition was less than 0.6. In contrast, there was a high correlation coefficient between CoFAD2 expression levels and fatty acid composition in the seeds (Table 4).

Discussion
Triacylglycerols are the major form of energy storage in eukaryotes. In addition, they serve as a reservoir of fatty acids for cellular membrane biogenesis and lead to obesity when excessively accumulated in adipose tissues [32]. Understanding plant TAG biosynthesis will facilitate creating oilseed crops with value-added properties [33]. However, choice of target genes for genetic engineering of plant oils is difficult because the plant oil is synthesized by at least 10 enzymatic steps and each step is catalyzed by multiple isozymes [17,33,34]. For example, tung oil biosynthetic pathway contains at least three genes coding for diacylglycerol acyltransferases (DGAT) [18,34], three genes coding for fatty acid desaturases [35], and five genes coding for oleosins [36,37].
Camellia oleifera (tea oil tree) provides a widely used, high quality cooking oil from its seeds. Tea oil is also very beneficial to human health. We have studied a number of genes in the oil biosynthetic pathway of tea oil tree to gain information for improving oil content and/or fatty acid composition in the seeds. In this study, we focused on the FBA gene family, which encodes a key enzyme that catalyzes the conversion of FBP into DHAP and G3P, two key intermediate metabolites for oil biosynthesis. We identified six forms of FBA from the seed transcriptome and found that the expression levels of four of them, along with those of three key oil biosynthesis genes, were well correlated with oil content in developing tea oil seeds.
FBA is a key regulatory enzyme in the glycolytic pathway [38], which is located upstream of the fatty acid biosynthesis pathway in oilseeds. In this study, we showed by RT-qPCR that the expression of CoFBA genes in tea oil seeds exhibited a trend of up-regulation before the fruit enlargement stage and slow down-regulation after reaching the peak stage of oil accumulation, a pattern highly similar to that of the CoSAD gene in developing tea oil seeds. The members of the CoFBA gene family showed different expression patterns in the developing seeds. Using grey correlation analysis software, the correlation between expression levels of CoFBA genes and seed oil content/fatty acid composition was evaluated. Our results showed that the expression levels of CoFBA genes, especially CoFBA3 and CoFBA1, were highly correlated with the amount (only behind the CoSAD gene) but less with the fatty acid composition of tea oil in the seeds. These results suggest that FBA may control the flux of carbohydrates and therefore play an important role in oil yield but has less effect on fatty acid composition in the oilseeds.
Table 3. The relationship between gene expression levels and oil content at different developmental stages in tea oil seeds by Grey correlation degree analysis.
ACP, SAD and FAD2 are known enzymes/proteins in the plant oil biosynthesis pathway. During the developmental process of Camellia oleifera seeds, the expression levels of these genes are coordinated with the oil content and fatty acid composition in the seeds [39]. ACP is located at the center of the fatty acid synthetase multi-enzyme complex and functions as an acyl carrier by transferring the acyl moiety from one enzymatic reaction to another [40]. SAD dehydrogenates the saturated fatty acid stearic acid (C18:0) to form the monounsaturated fatty acid oleic acid (C18:1). Therefore, SAD may determine the total content of unsaturated oleic acid and the ratio of saturated stearic acid to unsaturated oleic acid [41,42]. Since 80% of the fatty acids in tea oil are oleic acid, SAD may play an essential role in determining the amount of oleic acid and therefore the yield of tea oil in the seeds. This conclusion is supported by correlation analysis showing that the expression levels of the CoSAD gene were highly correlated with the amount of tea oil in seeds (correlation coefficient was 0.97801). However, it is not clear from this analysis why the correlation between SAD mRNA levels and oleic acid itself was lower (0.652996, ranked second only to FAD2 with 0.798968). Under FAD gene family regulation, oleic acid is further dehydrogenated to form linoleic acid (C18:2) and linolenic acid (C18:3). In particular, FAD2 catalyzes double-bond formation between the 12th and 13th carbon atoms of oleic acid to produce linoleic acid. Therefore, FAD2 is considered an essential enzyme for determining the ratio of oleic acid to linoleic acid [43,44]. This is supported by our data showing that CoFAD2 mRNA levels were well correlated with fatty acid composition in the oilseeds (ranked first for every fatty acid tested). However, only 2% of total fatty acids in tea oil are linoleic acid.
Our correlation analysis is in agreement with the expectation that the correlation between CoFAD2 expression and oil content would be poor (ranked second to last). Therefore, CoFAD2 probably plays a minor role in the determination of oil content in tea oilseeds. Since plant oil is synthesized by at least 10 enzymatic steps and each step is catalyzed by multiple isozymes [17,33,34], it is important to identify other genes critical for increasing tea oil content and improving fatty acid composition in future studies.

Conclusions
This study identified multiple forms of CoFBA genes from seed transcriptome analysis, quantitatively evaluated the expression of four CoFBA genes in developing seeds, along with CoACP, CoSAD and CoFAD2 genes, and analyzed the developmental profiles of oil content and fatty acid composition of tea oil. The correlation study indicated that the mRNA levels of CoFBA and CoSAD genes were positively correlated with oil content. The results suggest that CoFBA and CoSAD are important factors for determining tea oil yield because the CoFBA gene controls the flux of key intermediates for oil biosynthesis and the CoSAD gene controls the synthesis of oleic acid, the predominant fatty acid in tea oil. CoFAD2 probably plays an important role in determining the composition of fatty acids in tea oil. This study suggests that tea oil yield could be improved by over-expression of CoFBA and CoSAD genes in transgenic plants.
Table 4. The relationship between gene expression levels and fatty acid composition at different developmental stages in tea oil seeds by Grey correlation degree analysis.