Identification and validation of superior reference gene for gene expression normalization via RT-qPCR in staminate and pistillate flowers of Jatropha curcas – A biodiesel plant

Physic nut (Jatropha curcas L) seed oil is a natural resource for the alternative production of fossil fuel. Seed oil production is mainly depended on seed yield, which was restricted by the low ratio of staminate flowers to pistillate flowers. Further, the mechanism of physic nut flower sex differentiation has not been fully understood yet. Quantitative Real Time—Polymerase Chain Reaction is a reliable and widely used technique to quantify the gene expression pattern in biological samples. However, for accuracy of qRT-PCR, appropriate reference gene is highly desirable to quantify the target gene level. Hence, the present study was aimed to identify the stable reference genes in staminate and pistillate flowers of J. curcas. In this study, 10 candidate reference genes were selected and evaluated for their expression stability in staminate and pistillate flowers, and their stability was validated by five different algorithms (ΔCt, BestKeeper, NormFinder, GeNorm and RefFinder). Resulting, TUB and EF found to be the two most stably expressed reference for staminate flower; while GAPDH1 and EF found to be the most stably expressed reference gene for pistillate flowers. Finally, RT-qPCR assays of target gene AGAMOUS using the identified most stable reference genes confirmed the reliability of selected reference genes in different stages of flower development. AGAMOUS gene expression levels at different stages were further proved by gene copy number analysis. Therefore, the present study provides guidance for selecting appropriate reference genes for analyzing the expression pattern of floral developmental genes in staminate and pistillate flowers of J. curcas.


Introduction
Physic nut (Jatropha curcas L) is a multipurpose, poisonous, drought resistant, semi-evergreen shrub/small tree belonging to the family Euphorbiaceae and it has spread beyond its center of origin to tropical and subtropical regions because of its simple propagation, rapid growth, a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 drought resistance and wide adaptability to variety of soils and different climatic conditions [1,2]. In the last decade, physic nut seed oil has gained much interest worldwide as a potential renewable natural resource for the production of bio-diesel and bio-jet fuel to replace fossil fuels [3] The viscous oil obtained from seeds has medicinal value, that has long been used as a source of lamp oil and used for manufacturing candles, soap, paraffin and cosmetic industry [4,5]. Additionally, the residual seed cake after extraction of oil can be used as biomass to power electricity plants, fertilizer, animal fodder and it can also be used as feed in digesters and gasifiers to produce biogas [6].
Flowers are reproductive organs that are widely studied for molecular breeding, sex determination and improved production of seeds in agronomically important plants. In physic nut, the productivity factor depends on ratio between staminate and pistillate flowers (1/10-1/30) [7]. A study of physic nut inflorescence has shown that the ratio of staminate flower to pistillate flower is low and different proportions occur in different seasons in most of the available germplasm [8]. In the recent years, forward and reverse genomics approaches have been employed to improve physic nut growth, high seed yield with high oil content and more staminate flowers [3]. For sex determination and to improve the number of staminate flowers of physic nut, differential expression of relative genes in staminate and pistillate flowers should be understood. Although, a report showed selection of best candidate reference genes in different tissues of physic nut at different environmental conditions [9], but there is no detailed studies on selection of potential candidate reference genes in staminate and pistillate flowers of physic nut.
Quantifying gene expression pattern is an important factor for unraveling the function of important pathway genes in biological samples. In connection to this, Real-time quantitative PCR (RT-qPCR) is widely used as the most accurate and most reliable technique that can detect and measure small number of mRNA copies due to its high sensitivity, accuracy, specificity and high throughput [10]. However, the accuracy of qPCR is influenced by a number of factors, such as RNA stability, quantity, purity, enzymatic efficiency in cDNA synthesis and PCR amplification [11]. An improper reference gene can lead to misinterpretation of expression data and thus generate incorrect results [12]. Thus, to avoid the influence of these factors, it is a pre-requisite to select ideal reference gene(s) to normalize RT-qPCR analysis and its expression is presumed stable in control and experimental conditions [13,14]. It is therefore pivotal to identify the best potential reference genes for the experimental system under study. Based on following background information, the present study is carried out with an aim to choose these steadily expressed potential genes in the transcriptome data base for the selection of reliable reference gene expression in staminate and pistillate flowers of physic nut.

Plant materials
The flowers were collected from physic nut plants grown at Jinhe town (26˚56 0 N, 101˚68 0 E) located in Yanyuan County, Sichuan Province, China (Permission provided by Yanyuan county Government, China) and stem cutting grown at Green house, College of Life Sciences, Sichuan University, China, collected from Haikou city, Hainan Province (Permission provided by Haikou city Government, China) (S1 Fig). Inflorescences were collected twice from in between April to June and July to September in wild types and stem cuttings. One inflorescence was collected from one J. curcas plant and totally 20 inflorescence samples were randomly collected each time from J. curcas. Two to three staminate and pistillate flowers were collected from one inflorescence separately in earlier, middle and later according to the flower developmental stages described by Wu et al [15] (S2 Fig). The flowers of earlier stages were separated from inflorescence under the stereo microscope (Olympus, SZ2). The collected flower samples were immediately frozen in liquid nitrogen and stored at −80˚C for further experimental analysis.

RNA isolation and cDNA synthesis
Total RNA was isolated using TaKaRa MiniBEST Plant RNA Extraction reagents according to the manufacturer's instructions (Takara Bio. Inc., Dalian, China). The quantity and quality of total RNA samples were assessed by measuring the absorbance ratio at 260/280 and 260/230 nm using a spectrophotometer (Nanovue, Healthcare Bio-Sciences AB, Sweden). RNA samples with absorbance ratio between 1.9 and 2.2 were retained for further analysis. The total RNA integrity was evaluated on a 2% agarose gel (S3 Fig). The first cDNA strand was synthesized using an aliquot of 2.0 μg of total RNA in a 50 μl reaction using reverse transcription system kit (PrimeScript™ RT Reagent Kit, Takara Bio. Inc., Dalian, China), following the manufacturer's protocol. All of the cDNA samples were diluted at 1:10 with RNase-free water and stored at −80˚C.

Selection of candidate reference genes and design of RT-qPCR primer
The cDNA sequences of these endogenous candidate reference genes (ACTIN, EF, SLEEPER, GAPDH1, GAPDH2, TUB, UBI, DSK2A, CYC, and PLA) were obtained from the NCBI Gene-Bank database. RT-qPCR primers were designed using Primer Premier 6 software with melting temperature between 59 to 61˚C, primer length between 20 to 23 nucleotides, and PCR amplicon length within 103 to 165 base pair (Table 1).

Analysis of expression stability determination using different statistical algorithms
To obtain high accuracy of stability ranking from the C q values of each candidate reference gene, four different statistical algorithms such as ΔC t [16], geNorm (version 3.5) [17], Best-Keeper (version 1.0) [18], and NormFinder (version 0.953) [19] were used to evaluate expression stability of reference genes. The RT-qPCR data obtained from the Bio-Rad CFX connected Real-time system were exported into an Excel datasheet. The raw Cq values (S1 Table) were used directly for stability calculations in BestKeeper analysis and ΔC t method.
Then the raw Cq values were converted into relative quantities and imported into the geNorm and Norm-Finder analysis using the formula Q = E -ΔCq , in which E = amplification efficiency, ΔCq = the corresponding Cq value -minimum Cq. Genes with the lowest standard deviation (SD) and coefficient of variance (CV) values are considered as the most stable reference genes.
In geNorm, the reference gene expression stability measurement (M) value was calculated as the level of pairwise variation for each reference gene with all other control genes and as the standard deviation (SD) of the logarithmically transformed expression ratios [17]. The RefFinder (http://www.leonxie.com/referencegene.php), a web-based comprehensive tool integrating the data obtained from ΔC t method, Genorm, BestKeeper and NormFinder to calculate the recommended comprehensive ranking order [20]. All the software packages were used according to the manufacturer's instructions. All other multiple comparisons were performed with GrapPad Prism 7 statistical software.

Validation of reference gene by expression analysis of AGAMOUS gene
Physic nut AGAMOUS gene is a floral organ identity gene in C class of ABC model which specifies stamen and carpel development [21]. Expression of floral organ identifying gene, AGA-MOUS, was used as a target gene to demonstrate the usefulness of the selected candidate reference genes in RT-qPCR. For the validation of selected reference gene, the expression level of AGAMOUS gene was analyzed using the most stable reference genes and the most varying reference genes in staminate and pistillate flower samples, which was calculated by 2 -ΔΔCq method. These experiments were performed in triplicate for each sample.

Establishment of standard curve for absolute quantification
The purified plasmid was diluted with sterile deionized water to obtain a standard series from 1.49 × 10 3 to 1.49 × 10 7 copies / μL. Quantitative RT-PCR was performed using 25 ng / μL DNA with three replicates. The standard curve was established (S4 Fig) as described in details previously [23].

Results
Specificity and efficiency of RT-qPCR amplification of the candidate reference genes Potentially useful endogenous reference genes were selected based on previous studies, which are generally used for normalization and also routinely used as control for RT-PCR or blotting techniques. We selected 10 candidate reference genes (ACTIN, EF, SLEEPER, GAPDH1, GAPDH2, TUB, UBI, DSK2A, CYC, and PLA) to normalize the gene expression levels in physic nut inflorescence containing staminate and pistillate flowers at three developmental stage using RT-qPCR. Specific primer pairs were designed for each candidate reference gene, with the amplicon lengths ranging from 103 bp to 165 bp. A single DNA band at the correct molecular weight for each product in agarose gel electrophoresis indicating good specificity of all the primer pairs used in RT-qPCR ( Fig 1A). The melting curves for the amplified products of all 10 candidate genes showed a single peak corresponding to a specific melting temperature ( Fig  1B). PCR amplification efficiencies of each candidate reference gene were calculated from standard curves with significant linear relationship (R 2 > 0.99); amplification efficiencies were in between 92.4% to 106.2% (S2 Table).

Expression profile of reference genes as quantification cycle (Cq)
To provide an overview of the transcript levels, the Cq values of the 10 candidate reference genes were obtained by RT-qPCR using both staminate and pistillate flower samples at three developmental stages. The Cq values of all 10 candidate genes were shown in the box-plot chart (Fig 2). Ranking and determination of the optimal reference genes 1. Analysis using geNorm software. The geNorm software is a visual basic application to determine the most stable reference (housekeeping) genes. Ranking of each gene was produced based on their expression stability. Based on geNorm analysis, EF and UBI were considered as top most stable reference genes followed by TUB as second rank for their stability in staminate flower samples. In pistillate flowers, EF and UBI were ranked as first for their stability followed by CYC as second most stable reference gene. Whereas, ACTIN and PLA; Actin and DSK2A were least expressed, and hence, inconsistent genes in staminate and pistillate flower samples respectively (Fig 3).  GeNorm also computes a pairwise variation (V n /V n+1 ) analysis between normalization factors NFn and NFn+1 to determine the benefit of adding extra reference genes for the normalization. Adding another gene is unnecessary when small variation exists between V n / n+1 and V n+1 /V n+2 . A threshold value of 0.15 may be used to determine the necessity of the addition of more reference genes [17,24]. In our analysis, data shows V 2+3 values of staminate and pistillate flower were lower than 0.15, indicating that two reference genes were suitable for gene normalization (Fig 4). Therefore, pairwise variation analysis suggested that two reference genes could be useful for gene normalization of staminate and pistillate flower samples.
2. NormFinder software analysis. The NormFinder software program generates a stability value for evaluating expression variation when using reference genes for normalization. According to the NormFinder software, the gene that has the lowest stability value (SV) is the most stable reference gene for RT-qPCR. The stability values obtained from all 10 reference genes were listed in Table 2. NormFinder analysis results revealed that TUB (0.53), EF (0.99) and SLEEPER (1.00) were the top three stable reference genes in staminate flower samples, while GADPH1 (0.46), SLEEPER (0.54) and EF (0.69) were identified as top three stable reference genes in pistillate flower. TUB was top most stable gene in staminate flower but in pistillate flower, it exhibited low stability and ranked as sixth position. Similarly, GADPH1 was ranked as top most stable gene in pistillate flower, but in staminate flower samples, it ranked as fourth position. PLA and ACTIN exhibited the most unstable value and identified as low stable gene in both staminate and pistillate flower samples.
3. Analysis using BestKeeper software. The BestKeeper software evaluates the expression stability of the candidate reference genes based on the following three variables the standard deviation (SD), the coefficient of covariance (CV) and the coefficient of correlation (r) [18]. The most stable reference genes produced lower standard deviation values. The stability values of each reference gene were calculated by BestKeeper and the results are as shown in Table 3. EF (0.73), GAPDH1 (0.85) and TUB (0.98) were ranked as top three stable reference genes in staminate flower samples, while EF (0.83) was ranked as top first stable reference gene similar staminate flower followed by SLEEPER (0.92) and EF (1.02) as second and third stable genes for pistillate flowers. PLA and ACTIN exhibited low stability and ranked as least stable reference genes in staminate and pistillate flower samples respectively. In this analysis, EF gene was ranked as most stable reference gene in both staminate and pistillate flowers similar to geNorm analysis. The ranking order obtained in present analysis was found different from the results obtained by NormFinder software analysis.  samples hence, it was ranked as the least appropriate reference genes. TUB and GAPDH1 showed high stability in both staminate and pistillate flower similar to NormFinder analysis.
5. RefFinder software analysis. Finally, RefFinder tool was used to generate an integrated comprehensive ranking of the most stable candidate reference genes based on different software programs, including ΔC t , GeNorm, NormFinder, and BestKeeper. Results obtained in RefFinder analysis were formulated in Table 5. According to the comprehensive ranking analysis of RefFinder, the most stable genes were EF and GAPDH1 for total (staminate and pistillate), whereas TUB and EF for staminate flower samples, and GAPDH1 and EF for pistillate flower samples. EF exhibited stability and ranked as second stable reference gene in both staminate and pistillate flower samples. While, PLA and ACTIN exhibited low stability in both samples and ranked as the most unstable reference genes in staminate and pistillate flower samples.

Reference gene validation
It has been evident that the inappropriate reference genes used for target gene validation can dramatically change the interpretation of the expression pattern [13]. Results of the optimal reference gene study indicated that TUB and EF; GAPDH1 and EF were the two top most stable reference genes for RT-qPCR assays for staminate and pistillate flower samples respectively. To confirm the utility of candidate reference genes, the expression of a target gene, AGAMOUS in response to floral organ identity gene, which specifies stamen and carpel development was determined. The two top-most stable reference genes TUB and EF; GAPDH1 and EF for staminate and pistillate flower samples respectively and the least stable reference gene PLA selected from the analysis were used for the validation analysis. When normalized using the two most stable reference genes TUB and EF as staminate flower internal controls, the relative expression levels of AGAMOUS in staminate flower increased in stage 4 and declined in stage 5 (Fig 5).
On the other hand, the two most stable genes GAPDH1 and EF were used as pistillate flower internal controls, the expression level of AGAMOUS gene increased gradually in stage 1 to stage 4 and then declined at stage 5 ( Fig 5). The expression level of AGAMOUS gene normalized with PLA, which was the least stable reference gene as calculated in both staminate and pistillate flowers, the expression pattern showed fluctuations and failed to achieve a consistent expression pattern in both staminate and pistillate flower samples (Fig 5).

AGAMOUS gene copy numbers determination by quantitative RT-PCR
There is no detailed study on AGAMOUS gene expression pattern at different stages of staminate and pistillate flowers of physic nut. Hence, to confirm the selected stable reference genes stability, copy numbers of AGAMOUS gene was calculated based on standard curve obtain as described above. Resulting, the AGAMOUS gene copy numbers calculated at different stages of staminate and pistillate flower development were shown in Fig 6. AGAMOUS gene copy numbers was gradually increasing from stage 1 to stage 3 and then it was high at stage 4 and then it started to decline at stage 5 in both staminate and pistillate flowers (S3 & S4 Tables). Thus, the obtained result was similar to stable reference gene stability analysis data, while, PLA the least stable reference gene as internal control was in contrast to the gene copy number analysis data. Therefore, gene copy number analysis further proved that PLA was the least stable reference gene.

Discussion
Gene expression studies could lead to a better understanding of process involved in commercially important bioenergy crop Physic nut flower development. RT-qPCR has become mainstreams of a biological research tools due to its specificity, accuracy, efficiency and reproducibility for understanding the gene expression pattern of target genes which not only provides insights into the complex regulatory networks but also identifies novel genes relevant to key biological processes. To achieve high accuracy, a reference gene should have a relatively stable expression level into distinct biological samples, such as across tissues, developmental stages and experimental conditions. The present work is the first detailed study for the identification of set of control genes for gene normalization of transcript levels in staminate and pistillate flowers of physic nut. It has been recommended that more than two statistical algorithms should be used for reference gene stability evaluation [25]. Statistical programs, such as geNorm, NormFinder, ΔC t and BestKeeper have been successfully employed to determine the stability of reference gene expression and identify stable reference genes for various plant species. Besides, RefFinder, a comprehensive tool was developed to generate the final overall ranking of tested reference genes based on the geometric mean of every gene calculated by each program. A lower geometric mean of rankings indicates that the gene is more stable, and more narrow error bars indicate that the result is more reliable [25]. In the present study, discrepancies were found in the ranking order among four statistical analytical programs, which might be caused by distinct statistical algorithm procedures. In our study, geNorm analysis ranked EF, UBI and TUB as top most stable genes for staminate flowers, EF, UBI and CYC as best reference genes for pistillate flowers. NormFinder analysis recommended TUB, EF and SLEEPER as top three reference genes for staminate flowers, while GAPDH1, SLEEPER and EF as top three ranked reference genes in pistillate flowers. BestKeeper software generated EF, GAPDH1 and TUB for staminate, EF, GAPDH1 and CYC for pistillate as top most stable reference genes. ΔC t method computed TUB, EF and GAPDH1 for staminate, GAPDH1, EF and SLEEPER for pistillate as the top ranked reference genes. The comprehensive ranking analysis by RefFinder with four statistical programs identified TUB and EF for staminate, GAPDH1 and EF for pistillate as the top most stable reference genes for RT-qPCR of target gene expression studies. Taken all the results into consideration, all 10 reference genes exhibited differential stability in both flower samples.
Traditionally, a single gene served as reference gene without verification of its stability in different species, tissues and specific environmental conditions. Commonly used reference genes are cellular maintenance genes, which play housekeeping roles in basic cellular components and functions, such as elongation factor 1-α (EF1-α), tubulin (TUB), actin (ACTIN), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), 18S ribosomal RNA (18s RNA) and ubiquitin (UBQ). However, recent studies indicated that these classical reference genes were used for gene expression studies in different plant species, but its gene expression level among different species, tissues and environmental conditions were varied [26]. In physic nut flower development studies, TUB1 [7], 18s rRNA [27], 26s rRNA and GAPDH [28], ACTIN1 and ACTIN2 [29] genes were used for gene normalization. However, in this study, ACTIN exhibited unstable expression in both staminate and pistillate flower samples. The TUB gene, which plays a crucial role in cell structural maintenance, has also been widely used as a reliable reference gene in sweet grass [30] and peach [31]. Similarly, in our study, TUB was recommended as most stable reference gene in staminate flowers. However, this gene was ranked as least stably expressed reference gene in all the tested conditions in Lycoris auria [32]. EF exhibited stable expression in Glycine max [33], Populus euphratica [34] and Caragana korshinskii [35] under various environmental conditions. In this current study, EF was ranked as second stable gene in both staminate and pistillate flower samples. GAPDH1 gene, which encodes a key enzyme involved in the glycolysis and gluconeogenesis [36] is another commonly used reference gene. It has been reported as the most stable reference gene in physic nut over different tissues, developmental stages and environmental conditions [9]. Similarly, in our study GAPDH1 was the top most stable reference gene in pistillate flowers. The selected stable reference genes in staminate and pistillate flowers are diverse in various studies that evaluated reference genes in different plant species under specific environmental conditions.
In addition, we analyzed the optimal number of reference genes required for accurate normalization using geNorm software. To date, most studies of reference gene selection have reported that when evaluating levels of target gene expression, the results are more pronounced and reliable when two or more reference genes are utilized [37][38][39]. The pairwise variation studies suggest that a cut off value 0.15 is considered as the ratio, below which there is no additional requirement of any reference gene for normalization [17]. Further, a study indicated that pairwise variation value 0.15 is not an absolute cut off value but rather an ideal value and its use will depend upon the data [40]. However, in our study, the pairwise variation data recommended that two best stable reference genes could be useful for gene normalization in staminate and pistillate flowers of physic nut.
Significant variations of target gene expression levels were found when unstable reference genes were used as the internal control, leading to misinterpretation of experimental results [14,41]. Any gene with a minimal expression level variation in every analyzed sample is considered as best candidate reference gene [32]. To demonstrate the utility of validated reference genes, the expression pattern of AGAMOUS gene was examined in staminate and pistillate flowers of physic nut. AGAMOUS belongs to the C-class genes in the ABC model of floral organ development, which involve in the regulation of stamen and pistil development [21]. AGAMOUS gene expressed remarkably in stage 4 than young stage when TUB and EF; GAPDH1 and EF in staminate and pistillate flowers respectively. However, when the least stable reference gene PLA3 was applied, fluctuating expression pattern of AGAMOUS gene in both staminate and pistillate flowers. Thus, these results indicate that the appropriate selection of reliable reference genes serves important roles for normalization of target gene expression pattern.
Absolute quantification was employed to determine the copy number of AGAMOUS (Gene Bank: XM_012236469) gene cDNA expressed during the different developmental stages of physic nut flower. The absolute quantity analysis of AGAMOUS gene expression exhibits AGA-MOUS gene was expressed at stage 4, which were consistent with the stable reference genes analysis expression pattern, whereas, the least stable reference gene data differed. Hence, this data further confirms that the TUB and EF; GAPDH1 and EF were the stable reference genes for staminate and pistillate flowers respectively.

Conclusion
Investigation of ten candidate reference genes in staminate and pistillate flowers of physic nut using five different statistical algorithms indicated that TUB and EF; GAPDH1 and EF were the two most stable reference genes in staminate and pistillate flowers respectively. On the other hand, PLA and ACTIN were recommended as the least stable reference gene in both staminate and pistillate flowers. Thus, the stable reference genes identified in this report will enhance accuracy of normalization and quantification of target gene expression with RT-qPCR analysis for staminate and pistillate flower developmental studies in physic nut.