Screening Reliable Reference Genes for RT-qPCR Analysis of Gene Expression in Moringa oleifera

Moringa oleifera is a promising plant species for oil and forage, but its genetic improvement is limited. Our current breeding program in this species focuses on exploiting the functional genes associated with important agronomical traits. Here, we screened reliable reference genes for accurately quantifying the expression of target genes using the technique of real-time quantitative polymerase chain reaction (RT-qPCR) in M. oleifera. Eighteen candidate reference genes were selected from a transcriptome database, and their expression stabilities were examined in 90 samples collected from the pods in different developmental stages, various tissues, and the roots and leaves under different conditions (low or high temperature, sodium chloride (NaCl)- or polyethyleneglycol (PEG)- simulated water stress). Analyses with geNorm, NormFinder and BestKeeper algorithms revealed that the reliable reference genes differed across sample designs and that ribosomal protein L1 (RPL1) and acyl carrier protein 2 (ACP2) were the most suitable reference genes in all tested samples. The experiment results demonstrated the significance of using the properly validated reference genes and suggested the use of more than one reference gene to achieve reliable expression profiles. In addition, we applied three isotypes of the superoxide dismutase (SOD) gene that are associated with plant adaptation to abiotic stress to confirm the efficacy of the validated reference genes under NaCl and PEG water stresses. Our results provide a valuable reference for future studies on identifying important functional genes from their transcriptional expressions via RT-qPCR technique in M. oleifera.

Moringa oleifera is a promising plant species for oil and forage, but its genetic improvement is limited. Our current breeding program in this species focuses on exploiting the functional genes associated with important agronomical traits. Here, we screened reliable reference genes for accurately quantifying the expression of target genes using the technique of realtime quantitative polymerase chain reaction (RT-qPCR) in M. oleifera. Eighteen candidate reference genes were selected from a transcriptome database, and their expression stabilities were examined in 90 samples collected from the pods in different developmental stages, various tissues, and the roots and leaves under different conditions (low or high temperature, sodium chloride (NaCl)-or polyethyleneglycol (PEG)-simulated water stress). Analyses with geNorm, NormFinder and BestKeeper algorithms revealed that the reliable reference genes differed across sample designs and that ribosomal protein L1 (RPL1) and acyl carrier protein 2 (ACP2) were the most suitable reference genes in all tested samples. The experiment results demonstrated the significance of using the properly validated reference genes and suggested the use of more than one reference gene to achieve reliable expression profiles. In addition, we applied three isotypes of the superoxide dismutase (SOD) gene that are associated with plant adaptation to abiotic stress to confirm the efficacy of the validated reference genes under NaCl and PEG water stresses. Our results provide a valuable reference for future studies on identifying important functional genes from their transcriptional expressions via RT-qPCR technique in M. oleifera.

Introduction
Moringa oleifera Lam., belonging to a single-genus family Moringaceae, is a fast-growing tree species and widely distributed in the tropical and subtropical regions [1]. This species, with a great economic value for food and medical industry [1,2,3,4], has gained interest globally, especially in the developing countries because of its rich nutrition in various organs (in particular, the mostly used leaves and seeds). However, the potential utility of M. oleifera is not fully explored owing to the restriction of effective technologies or the lack of high-yielding varieties in seed production or in total biomass. Currently, besides the physiological and medical studies, researchers start to investigate genetic diversity of M. oleifera using molecular markers, and to develop marker-assistant selection (MAS) for genetic improvement [5]. A reference genome of M. oleifera is now publically accessible [6]. The transcriptomes from M. oleifera leaves were sequenced in our lab, which provided a well-assembled and annotated sequence database for gene function research.
One objective in our breeding program is to search for the genes associated with important agronomical traits. At the transcriptional level, the RT-qPCR technique provides an effective approach for assessing gene expression and for rapidly quantifying mRNA transcripts [7,8,9,10]. Thus, this technique helps to identify the function of important genes in agronomy. However, the technique requires reliable reference genes in designing RT-qPCR experiments, which is crucial for accurately interpreting the expression of target genes [9]. Previous experiments showed that use of the unstably expressed reference genes could produce a biased experiment result and a false-positive conclusion [11,12,13,14]. The significance of using reliable reference genes is also reflected from studies on the expression stability of various genes in experiments in different plant species, such as Arabidopsis [15], soybean [16], tomato [17], rice [18] and tobacco [19]. An ideal reference gene, in principle, should possess a property of a general cell function and a relatively invariable expression in differential tissues, or in different developmental stages, or under different experiment conditions [20,21,22]. Housekeeping genes or endogenous control genes were conventionally used for reference genes in RT-qPCR analysis, but some studies showed that these reference genes could exhibit great variations in expression under different experiment conditions [13,23]. For instance, the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene exhibited unstably expressions in papaya during its storage at different temperatures [24], and actin was inappropriate as a reference gene under the salinity stress in potato [25]. Because universal reference genes are not available in plants [26], it is necessary to search for the reliable reference genes that are suitable for experiment study in M. oleifera.
Earlier studies with M. oleifera presumed GAPDH and alpha tubulin (TUA) as reference genes in RT-qPCR, but did not examine the expression stabilities of these two genes under different conditions [27]. Here, we selected 18 candidate genes from the transcriptome database of M. oleifera generated in our lab, and evaluated their expression stabilities. These candidate reference genes were GAPDH, phosphoenolpyruvate carboxylase (PEPC), acyl carrier protein 1 (ACP1) and 2 (ACP2), ubiquitin-conjugating enzyme (UBCE), alpha tubulin 1 (TUA1) and 2 (TUA2), ribosomal protein L1 (RPL1) and L2 (RPL2), malate dehydrogenase 1 (MDH1) and 2 (MDH2), actin (ACT), ubiquitin extension protein (UEP), translation elongation factor 1 (EF1) and 2 (EF2), beta-tubulin (TUB) and cyclophilin 1 (CYP1) and 2 (CYP2). The purpose of this study was to identify more reliable reference genes for normalizing target gene expressions via RT-qPCR. Experimental samples were collected under different conditions, including the pods in different developmental stages, various tissues, the leaves under low and high temperature, and the roots under water stresses (NaCl and PEG-simulated). Furthermore, to evaluate the usefulness of the validated reference genes, we examined the expressions of three isotypes of the superoxide dismutase (SOD) gene under NaCl and PEG water stresses, including copperzinc SOD (Cu/Zn-SOD), iron SOD (Fe-SOD), and manganese SOD (Mn-SOD). These genes were known to be involved in plant adaptation to environmental stresses [28,29,30].
In our data analysis, we applied three algorithms (geNorm [12], NormFinder [31] and Best-Keeper [32]) to assessing reference genes and to identifying the most stably expressed genes under different conditions. In addition, these tools were also used to do normalization over multiple reference genes, which could enhance the robustness for expression quantifications. Our finally selected reference genes are suitable for identifying functional genes under different environmental conditions with M. oleifera.

Plant materials
Six experiments were designed for collecting samples ( Table 1). Samples were obtained from an open plantation (E113°37 0 N23°16 0 ) and some indoor plants grown in a green house in South China Agricultural University (SCAU), Guangzhou, China. Pods in different developmental stages were obtained from the plantation. It generally took 15 to 20 days for pod elongation growth. Pod samples were collected on the 3rd, 8th, 13th, and 18th days after pod formation. Except the pods, all other samples were collected from the seedlings of 2.5 months old grown in a culture room (26°C, 60% relative humidity) with perlites in nursery pots. Experiment materials with consistent height and growth were selected, but samples free from serious defects were chosen randomly. Tissue samples were roots, stems, mature and young leaves. For the samples under low and high temperature, plants after cultivation in the culture room were placed in the 4°C and 40°C illumination incubator, respectively. Leaves were then separately harvested at the 0, 2, 6, 12, 24, and 48 hours after these two treatments. For samples under water stress, plants were treated with 100 mmolL -1 NaCl and 50gL -1 PEG6000 solutions, respectively. Roots were separately collected in 0, 2, 6, 12, and 24 hours after these two treatments. All samples were frozen in liquid nitrogen immediately after collection and stored at -80°C for subsequent RNA isolations. There were 90 samples in total, including biological replicates ( Table 1).

RNA isolation and cDNA synthesis
All RNA samples were extracted with plant total RNA kit (OMEGA 8 ) according to the manufacturer's protocol. Samples were treated with the DNase digestion attached in the RNA kit to avoid genomic DNA contamination. The quality and quantity of the RNA samples were measured with NanoDrop ND1000 (Thermo Scientific). RNA samples with the absorbance ratios at both 260/280 and 260/230 nm being around 2.0 were selected. The integrity was assessed through 1.5% agarose gel electrophoresis to select the DNA-free RNA samples with welldefined bands. We added 1.2ug RNA for 30uL reverse-transcription system, which was equivalent to 400ng RNA for 10uL reverse-transcription system. The PrimeScript II first Strand cDNA Synthesis Kit MIX (Takara) was used according to the manufacturer's instructions. The synthesized cDNA was diluted 15 times before used as the template for RT-qPCR.

Selection of candidate reference genes
Eighteen candidate reference genes from previous reports were selected to screen the most reliable reference genes for target gene expressions via RT-qPCR technique. All genes were selected from our transcriptome database obtained by the high throughput Illumina HiSeq™ 2000 sequencing platform (Gene Denovo, Guangzhou, China), including GAPDH, PEPC, ACP1, ACP2, UBCE, TUA1, RPL1, MDH1, MDH2, ACT, UEP, EF1, TUA2, TUB,CYP1,CYP2, RPL2 and EF2. Information of these genes is detailed in Table 2.

Primer design and RT-qPCR conditions
All primers for RT-qPCR analysis were designed using software primer 5.0 and Oligo 7 according to the sequences of candidate reference genes. Primer pairs were synthesized by Sangon company (Guangzhou, China). Primer pairs with a single product and a correct size were selected through 2.5% agarose gel electrophoresis. Each amplicon was sequenced by Sangon company to ensure the correctness of specific sequence. RT-qPCR was conducted to further check the specificity of amplicon. The RT-qPCR reaction mixture system, which was operated in 96-well optical plates (Bio-Rad, Foster City, CA, USA), consisted of 2μL cDNA, 10μL SYBR Green PCR Master Mix(Takara), 0.3μL of 10uM forward primer, 0.3μL of 10uM reverse primer, and 7.4μL ddH 2 O to a final volume 20μL. RT-qPCR reaction was then conducted by the Roche LightCyler 480 system (Roche 10 ) under the following condition: 30s at 95°C followed by 40 cycles (5s at 95°C, 30s at 60°C, 30s at 72°C). Melting curves were generated and analyzed by following the procedure. PCR amplification efficiency (E) and correlation coefficient (R 2 ) for each gene were calculated through the experiments of fivefold cDNA dilution series, with three technical replicates for each standard curve. Primer pairs with the E-values between 90% and 110% were considered. Apart from the calculation of E and R 2 using the fivefold cDNA dilution series, the synthesized cDNA samples from the six experiment designs were diluted 15 times before used as templates for RT-qPCR.

Statistical analysis
Three commonly used statistical algorithms were applied to analyzing the expression stability of the 18 candidate reference genes under different experiment conditions: geNorm [12], NormFinder [31] and BestKeeper [32]. A threshold cycle (CT value), i.e. the number of cycles required for the fluorescent signal to exceed a specific detection threshold (removing noise signals) in the exponential phase of the PCR reaction, determined the expression levels of the tested candidate reference genes. A low CT value means a high level of gene expression. Each CT value in a single sample was from the average of three replicates.
Data inputs were different among the three algorithms. BestKeeper can directly deal with the raw CT values, without the need of data transformation. GeNorm and NormFinder need data transformation before proceeding calculations [12,31]. All CT values were transformed into relative values. First, the "delta CT" value for each gene expression was obtained by subtracting the lowest CT value in a focal sample set. Hence, the lowest relative CT was 0 while others were greater than 0. Then, each relative CT value was transformed by 2 (-delta Ct) . The gene with the highest 2 (-delta Ct) value (the minimum CT) was rescaled to 1, and was set as the reference gene. All others were rescaled with a reference to the reference gene, and hence were less than 1. Finally, the operated data file for each sample set was separately imported into geNorm and NormFinder.
The approach for determining the stably expressed genes is also different among the three algorithms. GeNorm ranked 18 candidate reference genes according to the estimated expression stability value (M) for each gene. The M value for each gene was calculated according to the average pairwise variation from all tested genes. The most stably expressed gene was the one with the lowest M value. GeNorm determined the optimal number of reference genes according to the relative value V n /V n+1 , which measured the effect of adding one more reference gene on the normalization factor (the geometric mean of the expression values of the selected reference genes) [12]. An additional reference gene was included if it had a significant positive effect on the normalization factor. The value of V n /V n+1 below 0.15 was a cut-off value for determining the optimal number of reference genes (n).
NormFinder provided a stability value for each gene based on the variance analysis [31]. The gene with the lowest value had the most stable expression. BestKeeper provided the correlation (r) for a maximum of ten genes [32]. The gene whose r-value was closest to 1 was the most stably expressed gene.

Normalization of SOD genes
Superoxide dismutase (SOD) catalyzes the dismutation reaction of superoxide radical (O 2 -) and protects plants from oxidative damage caused by reactive oxygen species (ROS), which is an adaptive response to environmental stresses [30]. In plants, there are three isotypes of the SOD gene based on their metal cofactors: Cu/Zn-SOD, Fe-SOD, and Mn-SOD. A study reported that the expression of the SOD gene significantly increased in pea under NaCl water stress [33]. The expression patterns of these genes under NaCl or PEG water stress were reported in previously published studies [34]. Here we selected these three isotype genes from our transcriptome database in M. oleifera to evaluate the validated candidate reference genes. Under the 100mM NaCl and 50gL -1 PEG water stresses, expressions of these three isotype genes were quantified using one or two most stably expressed reference genes. Note that a pre-experiment showed that a high PEG concentration (> 50gL -1 ) could make the seedlings of 2.5 month-old be severely wilted. The primer pairs (S1 Table) of the SOD gene were also verified before RT-qPCR analysis (S1 Fig).

Amplification specificity and efficiency
Sequences of the eighteen candidate reference genes from M. oleifera transcriptome database were retrieved using BLASTX from NCBI and annotated according to the best blast result. Information about candidate reference genes and their primer pairs are summarized in Table 2. The results of sequencing and agarose gel electrophoresis (Fig 1) showed the specificity of 18 amplicons with correct sequences and fragment sizes. Verification of RT-qPCR revealed that each gene had a single peak in the melting curve (S2 Fig), indicating that the designed primers accurately amplified the target genes. PCR amplification efficiency and correlation coefficient of each primer pair were estimated through the simulated linear relationship between the log-transformed cDNA concentrations and the corresponding raw CT values from the fivefold cDNA dilution series experiments. PCR amplification efficiency ranged from 90.28% for MDH2 to 104.18% for UEP, which was within the range from 90% to 110%. Correlation coefficients varied from 0.9941 for ACT and UBCE to 0.9999 for TUA1 ( Table 2).

Expression profiles
Means and standard deviations of CT values are summarized for the 18 candidate reference genes ( Table 3; Fig 2). A large difference existed among candidate genes in mean/median and variations. Genes GAPDH, ACT and CYP1 exhibited higher levels of expressions than other genes in all samples, with the average CT values ranging from 19.18 to 19.97 cycles. CYP2 (average CT = 25.97) exhibited the lowest level of expression. All other genes exhibited intermediate levels of expressions across all samples ( Table 3).
Concerning the variation of gene expressions across all samples, CYP2 and TUA2 showed large CT variations, while RPL2 and EF1 showed small CT variations. Expressions for the rest of the candidate genes exhibited intermediate variations ( Table 3; Fig 2).
The same gene exhibited expression variations in different sample designs ( Table 3). There were no concordant patterns among all gene expressions in different experiment designs. Most genes (except CYP2, EF2, and ACP2) exhibited higher levels of expression in pods than in tissues or under the other conditions. Most genes except UBCE exhibited a higher level of expression under the low temperature (4°C) than under the high temperature (40°C). Gene CYP2 exhibited the lowest levels of expressions in tissues and pods samples or under the high temperature (40°C) and the water stress (NaCl and PEG). Gene GADPH had the lowest expression under the high temperature (40°C), but had relatively higher levels of expressions under water stress ( Table 3).
The pooled CT values for water stress (NaCl and PEG) generally exhibited intermediate levels of expressions and variations, compared with the results under NaCl or PEG water stress alone. The same case occurred for the pooled CT values under temperature stress (4°C and 40°C) ( Table 3).
Taken together, none gene was constantly expressed in all samples. It is necessary to further identify the appropriate reference genes for the data normalization under different experiment conditions.

Expression stability
Analyses with geNorm, NormFinder and BestKeeper algorithms showed that the most stably expressed genes varied in different sample sets or with different algorithms.  3). For pooled samples under both NaCl and PEG water stresses, the most stably expresses genes were ACP2 and MDH2 (M = 0.199) (Fig 3). For all pooled samples, RPL1 and ACP2 were the two most stably expressed genes (M = 0.382), and could be used as the reference genes for multiple samples (Fig 3). CYP2 expression was the least stable among all gene expressions. From the pattern of V n /V n+1 (Fig 4), we selected two reference genes (parsimony) in the nine sample sets for data normalization since their V 2 /V 3 values were below 0.15, although an addition of more genes could further reduce the relative value V n /V n+1 . Generally, the top six stably expressed genes were the same in most samples from both geNorm and NormFinder analyses. Two most stably expressed genes were the same from both NormFinder and geNorm analyses under the temperature (4°C and 40°C) and PEG water stresses. Two most stably expressed genes from the geNorm analysis were in the top six stably expressed genes from the NormFinder analysis, except RPL1 that was ranked at the eleventh place in all sample sets and TUB that was ranked at the ninth place in various tissues from the NormFinder analysis. The most unstably expressed genes were generally the same in samples from the pods in different developmental stages or from various tissues. However, under the temperature (4°C and 40°C) and water stresses (NaCl and PEG), the most unstably expressed genes from the geNorm analysis were the third and fourth unstably expressed genes from the NormFinder analysis, respectively.
BestKeeper analyzed the top ten stably expressed genes that were derived from the geNorm analysis (Table 4). Unlike the results from the geNorm and NormFinder analyses, the ranking order from BestKeeper analysis exhibited larger variations, similar to the reports in cucumber and poplar species [14,35]. For instance, the unstably expressed gene EF2 under 40°C was ranked in the tenth and ninth places by geNorm and Normfinder analyses, respectively, but was ranked in the third place from the analysis with BestKeeper. For the less stably expressed genes, these three algorithms displayed a similar result.
To further assess the gene expressions, the top-ten stably expressed genes from the geNorm analysis were scored from 1 to 10, representing the genes with the most stable to the most unstable expressions, respectively. Similarly, these ten genes from the analyses with both NormFinder and BestKeeper were also scored from 1 to 10 according to their previous ranking orders. The total score for each selected gene was calculated by summing these three scores from different algorithms. All the total scores were ordered. The gene with the lowest total score was thought to be the most stably expressed genes and marked as 1. The final results showed that the expression stability by the three algorithms was quite similar for the most stably and unstably expressed genes, even though their overall orders were different among them (S2 Table). Pairwise variation (V) analysis among the candidate reference genes. The relative value V n /V n+1 was analyzed by geNorm to determine the optimal number of reference genes required for RT-qPCR data normalization. The relative value at V2/V3 indicated the optimal number of genes for normalization in each experiment design. doi:10.1371/journal.pone.0159458.g004

Validation of reference genes
Three isotypes of the SOD gene (Cu/Zn-SOD, Fe-SOD, and Mn-SOD) were employed to test the efficacy of the validated candidate reference genes under NaCl and PEG water stresses, respectively (Fig 6). The relative expressions of these three genes were quantified using one or two of the most stably and unstably expressed genes. Our results revealed that the expressions of three genes increased under NaCl water stress. Similar patterns among the three genes occurred when ACP1 alone or the combination of ACP1 and CYP1 was used as the reference genes (Fig 6). Likewise, expressions of the three genes gradually increased under PEG water stress when RPL2 alone or the combination of RPL2 and EF2 was used as the reference genes (Fig 6). In each case, the use of two reference genes under NaCl water stress (ACP1 and CYP1) or under PEG water stress (RPL2 and EF2) generally improved the quantification of the SOD gene expressions, compared with the results of using a single reference gene. However, the patterns changed when the least stably expressed gene TUA1 under NaCl water stress or gene TUB under PEG water stress was used as the reference gene. This provided the evidence of biased quantifications when the unstably expressed genes were used as the reference genes in M. oleifera.

Discussion
This study was to search for reliable candidate reference genes under different experiment conditions, with an aim at more accurately quantifying the expression of target genes in M. oleifera using RT-qPCR technique. The technique is effective for quantifying gene expression due to its sensitivity and accuracy, given that reliable reference genes are available for compensating the   Selection of Reference Genes in M. oleifera variations brought by sample differences [8]. A reliable reference gene should have a minimal variation in expression, whereas a gene of agronomical interest often changes its expression over the course of plant growth and development [21,36,37]. However, there are no universal reference genes suitable for different experiment conditions as many factors affect the stability of gene expressions [16,26]. In addition, the number of selected reference genes can influence the quantification result for some apportioned experiments [12,38]. There is evidence showing that the traditional strategy for normalization based on a single gene could generate biased results [39,40]. These concerns have not been clarified with M. oleifera, and no any report is available on identifying and validating reference genes in this species.
We examined eighteen candidate reference genes from our transcriptome database. Samples were collected under different experiment conditions, including the pods in different developmental stages, various tissues, the leaves under low and high temperature, and the roots under NaCl and PEG water stresses. Three algorithms (geNorm [12], NormFinder [31] and Best-Keeper [32]) were used to analyze gene expression stability. The results showed that the best candidate genes and the number of reference genes were: ACT and RPL1 for pods, TUB and ACT for tissues (stem, roots and leaves), TUB and TUA2 for leaves under low temperature, RPL2 and ACP1 for leaves under high temperature, CYP1 and ACP1 for roots under NaCl Selection of Reference Genes in M. oleifera water stress, EF2 and RPL2 for roots under PEG water stress, RPL2 and ACP1 for leaves under low or high temperature, ACP2 and MDH2 for roots under NaCl or PEG water stress, and RPL1 and ACP2 for all samples. Although different rankings were yielded among the three algorithms (Figs 3 and 5, Table 4), the most stably or unstably expressed candidate genes were essentially the same. We selected three isotypes of the SOD gene to evaluate the efficacy of validated reference genes, and provided the evidence that the number and the type of reference genes could affect the quantification of target gene expressions in M. oleifera (Fig 6). Such influences were verified to be remarkable when the stably or unstably expressed reference genes were used, indicating the importance of applying reliable reference genes to assessing functional genes in this species. Although the improvement of applying two reference genes over a single reference gene was not substantial for quantifying gene expressions, our results implied the necessity of applying multiple reference genes in some cases.
Conventional reference genes, such as GAPDH, TUA, ACT, EF and CYP, are commonly used for quantifying target gene expressions in many plant species. However, some studies reported that these reference genes were not always stably expressed when tested in different species or under various experiment conditions [8,13,41]. Genes GAPDH and TUA were used to quantify the expression of target genes in M. oleifera [27]. Our study revealed that GAPDH and TUA1 were not the best choice in all tested experiments. TUA2 only exhibited the best stability under low temperature. Our results suggested that more suitable reference genes other than GAPDH and TUA should be taken into account in future studies with M. oleifera. A previous study used ACT to normalize gene expression [13], but here we confirmed its feasibility in the pod samples from different developmental stages and in various tissue samples. However, this gene was not the best reference gene under low temperature, consistent with the results in a previous study with potato species under low temperature [25]. In addition, different levels of variations in the relative expressions of EF and CYP were observed in M. oleifera samples under different conditions, which also occurred in banana species [16]. Thus, similar to the conclusions from the studies in other species, the best candidate reference genes were not all in common under different conditions in M. oleifera.
The expression stability of a gene could be related to its biological function or its polymorphism in a species [42,43,44]. We commonly hold that housekeeping genes are often suitable for reference genes because they are more conservative in function or often have low genetic diversity (note that different alleles often exhibit different levels of expressions) [43,45]. Here we confirmed that ACT, TUB and RPL had the best stability in more than one sample design. Genes ACT and TUB are the basal components for cytoskeleton that maintain the life activity for organisms, and exhibit conservative structure during the evolution, especially gene ACT. Gene TUB also presented a little variation in different tissues of Jatropha curcas and showed good stability under low temperature, indicating that TUB could be a good choice as a reference gene under some conditions [46]. RPL codes a ribosomal protein, the main component of ribosome and important for protein synthesis, is also highly conserved. RPL1 and RPL2 were suggested to be reference genes in our four sample designs. In addition, ACP1 and ACP2 were stably expressed in five sample designs, implying that ACP could be a reliable candidate reference gene in M. oleifera. A challenge for future studies is to examine which allele for each of these validated reference genes, provided that they are polymorphic, is dominantly and stably expressed under different conditions in M. oleifera.
In relation to the breeding program in M. oleifera, the reference genes obtained under various conditions could be used for accelerating selection for multiple objectives. To improve a breeding population of M. oleifera to adapt to different environmental conditions, such as different temperatures, soil moisture and salinization degree, it is an important step to identify and exploit suitable reference genes for quantifying the expression profiles of different target genes. As different tissues or the same tissue in different developmental stages contain different nutrient/medical components or active ingredients, the expression of relevant genes may be used to reveal the genetic basis of their synthetic mechanisms, which helps to better utilize M. oleifera. In addition, the genes related to biotic stresses, exogenous hormone treatments and disease resistance can also be explored. All these studies partly depend on the functional gene research besides the field experiment and genetic mating designs in breeding. RT-qPCR provides an effective technique to reveal the gene expression and relevant mechanisms at the transcriptional level. The type and the number of reference genes obtained in this study could be applied to designing breeding programs to explore multiple functional genes associated with different target traits [8,12].
The algorithms used by geNorm [12], NormFinder [31] and BestKeeper [32] are currently used for analyzing candidate reference genes in RT-qPCR experiments, including selecting reference genes and assessing their expression stabilities. GeNorm, not NormFinder and Best-Keeper, can be used to determine the number of reference genes [12]. As indicated in our study, the difference in ranking order existed among the outputs of these three algorithms [31]. Orders of the stably and unstably expressed reference genes were similar among most samples, especially the order based on the analysis with geNorm and on the total score (S2 Table). There was a relatively large difference between geNorm and BestKeeper analyses, but a small difference between geNorm and NormFinder analyses. Such differences were also reported in previous studies [47]. Thus, when the discordant patterns occur from three algorithms under some specific experiment conditions, a comprehensive comparison is needed in determining the number and the type of appropriate reference genes.
Finally, it is of interest to briefly discuss the necessity of using single versus multiple reference genes. Previous studies often emphasize the use of a single reference gene for calibrating expressions of target genes. However, the common situation is that a single reference gene may exhibit unstable expressions under different experiment conditions, which necessitates the use of multiple reference genes [36,38,41,48,49]. As demonstrated in our work, two stably expressed reference genes can be employed in experiments, and more than two stably expressed reference genes may be necessary when there are more variations under some conditions. Our study showed that a combination of multiple reference genes could improve the accuracy and reliability of normalization to some extents. One issue arising from the use of multiple reference genes is the impacts of the potential interaction among reference genes [43]. Positive or negative interactions among reference gene expressions, such as functional gene associations or linkage disequilibrium among expressed alleles, might alter their use for normalizing target gene expressions. However, when the reference genes are linearly additive in their expressions, use of multiple reference genes could enhance the power for normalization under various experiment conditions. Therefore, how to select multiple reference genes remains to be clarified with deliberate experiment designs.

Conclusion
Our work represents the first report in selecting and validating the expression stability of a set of candidate reference genes in M. oleifera. The results also highlight the significance of selecting proper reference genes for chosen experiments in this species. Furthermore, more than one reference gene is suggested for more accurate normalization in RT-qPCR analysis. The expression normalization results from three isotypes of the target SOD gene further confirmed the necessity of reliable reference genes. All these findings could aid in accurately quantifying the expression profiles of multiple target genes under different conditions in M. oleifera.  Table. Orders of the top-ten candidate reference genes from the analyses with geNorm, NormFinder and BestKeeper algorithms. G, N, B and O represent the orders from the analyses with geNorm, NormFinder, BestKeeper and Overall ranking, respectively. (DOCX)