Reliable reference genes for normalization of gene expression data in tea plants (Camellia sinensis) exposed to metal stresses

Tea plants [Camellia sinensis (L.) O. Kuntze] are an important leaf-type crop that are widely used for the production of non-alcoholic beverages in the world. Exposure to excessive amounts of heavy metals adversely affects the quality and yield of tea leaves. To analyze the molecular responses of tea plants to heavy metals, a reliable quantification of gene expression is important and of major importance herein is the normalization of the measured expression levels for the target genes. Ideally, stably expressed reference genes should be evaluated in all experimental systems. In this study, 12 candidate reference genes (i.e., 18S rRNA, Actin, CYP, EF-1α, eIF-4α, GAPDH, MON1, PP2AA3, TBP, TIP41, TUA, and UBC) were cloned from tea plants, and the stability of their expression was examined systematically in 60 samples exposed to diverse heavy metals (i.e., manganese, aluminum, copper, iron, and zinc). Three Excel-based algorithms (geNorm, NormFinder, and BestKeeper) were used to evaluate the expression stability of these genes. PP2AA3 and 18S rRNA were the most stably expressed genes, even though their expression profiles exhibited some variability. Moreover, commonly used reference genes (i.e., GAPDH and TBP) were the least appropriate reference genes for most samples. To further validate the suitability of the analyzed reference genes, the expression level of a phytochelatin synthase gene (i.e., CsPCS1) was determined using the putative reference genes for data normalizations. Our results may be beneficial for future studies involving the quantification of relative gene expression levels in tea plants.


Introduction
Quantification of gene expression levels is an important part of the systematic characterization of gene transcriptional mechanisms and regulatory networks. The quantitative real-time polymerase chain reaction (qRT-PCR) is one of the most commonly used methods to quantify target genes expression levels due to its practical simplicity, specificity, reproducibility, and highly sensitivity in detecting transcripts with low copy numbers [1,2]. An accepted standard procedure to conduct and interpret qRT-PCR experiments was lacking prior to 2009, which is heavy metal stress conditions. Our data may be useful for developing more accurate and reliable protocols to analyze the expression of other tea plant genes in response to metal stresses.

Plant materials and treatments
Two-year-old tea plants (C. sinensis cv. 'Longjing-changyecha') were collected from the fields of Gaochun District, Jiangsu Province, China (longitude: 118.57E, latitude: 31.19N). The plants were pre-incubated in a control nutrient solution [30] for 28 days (from September 8,2015 to October 6, 2015) in a climate-controlled chamber under a 14-h light (24˚C)/ 10-h dark (20˚C) photoperiod (light intensity: 220 μmol m -2 s -1 ) and relative humidity of 75%. We then added MnSO 4 , Al 2 (SO 4 ) 3    1.8-2.0 and an A 260 /A 230 ratio > 2.0 were used for the subsequent cDNA synthesis. The integrity of the purified RNA was further confirmed by 1.2% agarose gel electrophoresis. We generated cDNA from 1 μg total RNA using the TransScript 1 All-in-One First-Strand cDNA Synthesis SuperMix for qPCR (One-Step gDNA Removal) (TransGen). The resulting cDNA was diluted 20-fold in distilled deionized water and analyzed in a qRT-PCR assay.

Quantitative real-time PCR assay
We conducted the qRT-PCR assay using the TransStart 1 Tip Green qPCR SuperMix (Trans-Gen) and a CFX96 Real-Time System (Bio-Rad, Hercules, CA, USA). The PCR solution (20 μl) contained 10 μl 2×TransStart 1 Tip Green qPCR SuperMix, 0.2 μM each primer, 1 μl diluted cDNA, and nuclease-free water. The PCR reaction conditions were as follows: 95˚C for 2 min; 40 cycles of 95˚C for 10 s, 60˚C for 15 s, and 72˚C for 20 s; 72˚C for 3 min. The final dissociation curve was obtained between 65˚C and 95˚C to verify primer specificity. Each assay included three technical and biological replicates, and involved a standard curve with six serial dilution points. Amplification efficiencies (E) were calculated using standard curves with satisfactory linear relationships (R 2 > 0.99). The following equation was used to calculate the Evalue (%): E = (10 [−1/slope] −1) × 100.

Data analysis
The expression levels of the 12 genes were determined according to the quantification cycle (Cq) value. The raw data are listed in Table A in S1 File. Three different Microsoft Excel-based programs (i.e., geNorm v3.5 [5], NormFinder v0.953 [6], and BestKeeper v1.0 [7]) were used to determine the expression stability of the candidate genes. The raw data were analyzed using BestKeeper. However, for geNorm and NormFinder, the raw Cq values were converted into relative quantity values using the formula 2 −ΔCq (ΔCq = each corresponding Cq value-minimum Cq value; Table B in S1 File). Additionally, the fold change in expression of the target gene (CsPCS1; GenBank: KY264048) relative to the different reference genes at various time points was analyzed using the 2 -ΔΔCq method, in which ΔΔCq = (C q, Target gene −C q, Reference gene ) Day x −(C q, Target gene −C q, Reference gene ) Day 0 [31].

Assessment of primer specificity and amplification efficiency
We designed gene-specific primer pairs based on the sequences of the 12 candidate reference genes cloned from C. sinensis transcriptome sequences (Figures B-M and Table C in S1 File). Details regarding the 12 genes, primer sequences, amplicon lengths, and melting temperatures are privided in Table 1. The melting curve analysis revealed that all reactions produced a single distinct peak (Fig 1A). All primers amplified with a single PCR product of the expected size according to 1.5% agarose gel electrophoresis (Fig 1B). The estimated PCR amplification efficiencies for the candidate reference genes varied from 98.2% for TBP to 105.8% for EF-1α, and the correlation coefficients (R 2 ) ranged from 0.995 for UBC to 1.000 for eIF-4α (Table 1).  Quantification cycle values of candidate reference genes The Cq values in qPCR provided an overview of the gene expression levels of 12 candidate reference genes in 60 samples. There were apparent differences in the transcript abundance among genes. The raw data are listed in Table A in S1 File. The mean Cq values of all reference genes ranged from 17.08 to 23.51 (Fig 2). Low Cq value indicates the high expression levels, conversely mean the low expression levels. Among the 12 analyzed genes, UBC exhibited the highest expression levels with a mean Cq of 17.08, while TBP had the lowest expression levels with a mean Cq of 23.51 (Fig 2).

Expression stability of candidate reference genes
To identify the most suitable reference gene, three Microsoft Excel-based algorithms (geNorm, NormFinder, and BestKeeper) were used to analyze the stability of each reference gene. The geNorm analysis indicated that all genes performed well under individual stress conditions, with M values lower than the default limit of 1.5. However, the most suitable reference gene differed among treatments (Table 2). PP2AA3 and TBP with same M values were the two best reference genes for Mn-treated leaves ( Figure N in S1 File). Additionally, the most appropriate reference genes were MON1 and TIP41 in Al-treated leaves, MON1 and PP2AA3 in Cu-treated leaves, EF-1α and eIF-4α in Fe-treated leaves, and CYP and PP2AA3 in Zn-treated leaves. TUA exhibited unstable expression in all samples. The pairwise variation [i.e., Vn/n+1 (n ! 2)] of the geNorm program was also used to determine the optimal number of reference genes required for data normalizations. Values (V2/3) lower than the recommended threshold of 0.15 indicated that two reference genes were sufficient for normalizing the gene expression data resulting from the exposure to the five metal stress conditions (Fig 3).
The results of the NormFinder analysis revealed that the genes with the lowest values were the most stably expressed ( Table 2). The three most stably expressed references genes for all samples were Actin (0.129), EF-1α (0.134), and TUA (0.146). Actin was the most stably expressed gene in Mn-and Fe-treated tea leaves (Table D in S1 File). In contrast, 18S rRNA  Lower standard deviations and coefficients of variation during the BestKeeper analysis corresponded to more stable gene expression. According to the BestKeeper rankings, PP2AA3 was the most stably expressed gene in Al-and Fe-treated leaves (Table E in S1 File). Additionally, TIP41 and MON1 were the most stably expressed genes in Mn-and Cu-treated leaves, respectively. UBC expression levels were the most unstable in Zn-treated leaves, which contradicted the NormFinder results that suggested UBC is a good reference gene.

Validation of selected reference genes
CsPCS1 gene which is related to metal stresses was selected to further evaluate the reliability of the potential C. sinensis reference genes in qRT-PCR assays [20,32]. The relative CsPCS1 expression level following an Al treatments was determined using the expression levels of 18S rRNA, MON1, PP2AA3, TIP41, CYP, TBP, EF-1α, and TUA to normalize data (Fig 4). Data normalizations using the more stably expressed reference genes (i.e., 18S rRNA and MON1) resulted in consistent CsPCS1 expression patterns, with the highest expression level observed following a 7-day exposure to Al. Similar expression patterns were observed when the less stably expressed reference genes were used to normalize data (i.e., TIP41, CYP, and TBP). However, when the least stably expressed genes were used for data normalization (i.e., EF-1α and TUA), the expression level of CsPCS1 was considerably biased. This results indicated that the least stable genes EF-1α and TUA failed to standardize the expression data effectively.

Discussion
In recent years, qRT-PCR has become an outstanding technique for studying gene expression profiles because of its accuracy, sensitivity, and reproducibility. To reach its maximum analytical potential, it is necessary to introduce appropriate internal reference genes or housekeeping genes for normalizing data. Sun et al. [33] were the first researchers to identify suitable C. sinensis reference genes for qRT-PCR analyses. They determined that Csβ-actin (GenBank: HQ420251) and CsGAPDH (GenBank: GE651107) could be used as reference gene during analyses of different tissues and leaf developmental stages, respectively. Afterwards, Csβ-actin and CsGAPDH have been widely applied for gene expression analyses in tea plants [14,34,35]. However, it is misleading to use previously identified reference genes for normalization without first investigating the stability of their expression levels under specific experimental conditions. Therefore, 18S rRNA, Actin, CYP, EF-1α, eIF-4α, GAPDH, MON1, PP2AA3, TBP, TIP41, TUA, and UBC were selected to be validated under the experimental conditions of the present study, because they were observed to be stably expressed under various abiotic stresses, in particular in response to heavy metals [36][37][38][39].
Expression levels were determined as Cq values by qRT-PCR. The mean Cq values of the genes ranged from 17.08 (UBC)-23.51 (TBP), and the Cq values for all the tested samples were between 15 and 30. Here, the Cq values of PP2AA3, 18S rRNA, and CYP (Cq max −Cq min < 2 cycles; SD = 0.47, 0.50, and 0.46, respectively) were distributed more centrally than those of the other candidate genes, whereas the Cq value of TBP showed the highest variation (Cq max −Cq min > 10 cycles, SD = 3.03). The Cq value comparison can provide a rough estimate on stability of gene expression, but not sufficient for accurate evaluation of expression patterns of reference genes [40,41].
Three statistical algorithms (i.e., geNorm, NormFinder, and BestKeeper) were further used to determine which reference gene is best suited for transcript normalization in tea leaves exposed to heavy metals. Our geNorm results (V2/3 value < 0.15) revealed that two reference genes were sufficient for qRT-PCR data normalizations for the aforementioned five heavy metal stresses, indicating that the addition of third gene had no significant effect for normalization. In Mn-treated tea leaves, the geNorm analysis indicated that PP2AA3, TBP, and UBC were the most appropriate reference genes. Although TBP was identified as the best reference gene according to the geNorm results, it was only the seventh most suitable gene based on the BestKeeper analysis. NormFinder, geNorm, and BestKeeper ranked Actin as the best, thirdbest, and sixth-best reference gene, respectively. Additionally, TIP41 was identified as the best reference gene by the BestKeeper program, while it was only the eighth-best and seventh-best reference gene according to geNorm and NormFinder, respectively. Based on these results, PP2AA3 combined with TBP or UBC were recommended as the best combination of stable reference genes for normalizing gene expression levels in Mn-treated tea leaves. Similarly, 18S rRNA combined with PP2AA3 or TIP41 are sufficient for analyses of Al-treated tea leaves. MON1 combined with 18S rRNA or TIP41 should be used for Cu treatments, eIF-4α combined with Actin or CYP are suitable for Fe treatments. Finally, PP2AA3 combined with CYP or 18S rRNA are sufficient for analyzing Zn-treated tea leaves.
During previous analyses of gene expression levels in tea plant, CsGAPDH (GenBank: GE651107) was often used to normalize qRT-PCR data [13,42,43] because it was stably expressed in many other plant species [44][45][46]. However, according to geNorm and Best-Keeper results of our study, GAPDH was identified as the least stably expressed reference gene in Zn-and Mn-treated tea leaves. These results highlight the fact that there is no single universal reference gene that is stably expressed under all experimental conditions [47][48][49] and that reference genes should be reconfirmed according to specific experimental conditions [50,51].
To validate the suitability of potential reference genes, the expression profiles of CsPCS1 was assessed in Al-treated tea leaves, with 18S rRNA, MON1, PP2AA3, TIP41, CYP, TBP, EF-1α, and TUA as internal reference genes. The CsPCS1 expression patterns were similar when the gene expression data were normalized using the most stably expressed reference gene (i.e., 18S rRNA) and less stably expressed reference genes (i.e., PP2AA3, MON1, and TIP41). In contrast, when TBP, EF-1α, and TUA were used for data normalizations, the expression patterns and transcript levels were obviously different from those obtained following data normalizations with 18S rRNA and other suitable reference genes. Hence, using a stable reference gene is a prerequisite for accurate relative quantifications of gene expression levels [52,53].

Conclusions
To the best of our knowledge, this is the first report describing the identification of suitable reference genes for qRT-PCR analyses in C. sinensis leaves exposed to heavy metal stress. The stability analysis of gene expression by geNorm, NormFinder, and BestKepper indicated that no single gene was stably expressed under different metal stress conditions. We have identified PP2AA3, UBC, and TBP as suitable reference genes under Mn stress. 18S rRNA, PP2AA3, and TIP41 were the most stably expressed genes under Al stress. MON1, TIP41, and 18S rRNA were the most stable ones under Cu stress; eIF-4α, CYP, and Actin under Fe stress; and PP2AA3, CYP, and 18S rRNA under Zn stress. Therefore, reference genes should be selected according to the specific metal stress being investigated. Moreover, the stability of previously used reference genes should be re-assessed to increase the accuracy of expression data and to avoid error propagation under certain experimental conditions. Additionally, the analysis of CsPCS1 expression levels confirmed the importance of selecting appropriate reference genes for the normalization of qRT-PCR data. The reference genes selected in this study provide more choices for further gene expression analysis and functional studies in C. sinensis.