Reference Gene Selection for Quantitative Real-time PCR Normalization in Caragana intermedia under Different Abiotic Stress Conditions

Quantitative real-time reverse transcription polymerase chain reaction (qPCR), a sensitive technique for gene expression analysis, depends on the stability of the reference genes used for data normalization. Caragana intermedia, a native desert shrub with strong drought-resistance, sand-fixing capacity and high forage value that is widespread in the desert land of west and northwest China, has not been investigated regarding the identification of reference genes suitable for the normalization of qPCR data. In this study, 10 candidate reference genes were analyzed in C. intermedia subjected to different abiotic (osmotic, salt, cold and heat) stresses, in two distinct plant organs (roots and leaves). The expression stability of these genes was assessed using geNorm, NormFinder and BestKeeper algorithms. The best-ranked reference genes differed across the different sets of samples, but UNK2, PP2A and SAND were the most stable across all tested samples. UNK2 and SAND would be appropriate for normalizing gene expression data for salt-treated roots, whereas the combination of UNK2, SAND and EF-1α would be appropriate for salt-treated leaves. UNK1, UNK2 and PP2A would be appropriate for PEG-treated (osmotic) roots, whereas the combination of TIP41 and PP2A was the most suitable for PEG-treated leaves. SAND, PP2A and TIP41 exhibited the most stable expression in heat-treated leaves. In cold-treated leaves, SAND and EF-1α were the most stably expressed. To further validate the suitability of the reference genes identified in this study, the expression levels of DREB1 and DREB2 (homologs of AtDREB1 and AtDREB2) were studied in parallel. This study is the first systematic analysis for the selection of superior reference genes for qPCR in C. intermedia under different abiotic stress conditions, and will benefit future studies on gene expression in C. intermedia and other species of the leguminous genus Caragana.


Introduction
Quantitative real-time reverse transcription polymerase chain reaction (qPCR) is an efficient, specific, and reproducible method for quantifying transcript expression levels, and is widely used to analyze mRNA in different organisms [1], developmental stages [2,3] and responses to abiotic and biotic stress [4][5][6][7].However, the accuracy of qPCR is influenced by a number of variables, such as RNA stability, quantity, purity, enzymatic efficiency in cDNA synthesis and PCR amplification [8].Thus, to avoid bias, a normalization step is an essential pre-requisite.The most accepted approach for normalization is to include one or a small number of reference genes (internal control genes), whose expression is presumed stable in control and experimental conditions [9,10].
The traditional reference genes are mostly cellular maintenance genes, such as 18S ribosomal RNA (18S rRNA), actin (ACT), tubulin (TUB), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), and elongation factor 1-a (EF1-a) [11,12].However, recent studies indicate that these genes are not always stably expressed when tested in other species or under a wider range of experimental treatments [13][14][15].Recently, some new reference genes were identified by microarray analyses in Arabidopsis thaliana and soybean that show highly stable expression levels [16,17].These reference genes include SAND family protein (SAND), protein phosphatase 2A (PP2A), TIP41-like family protein (TIP41), F-box/kelch-repeat protein (F-box), phosphoenolpyruvate carboxylase-related kinase 1 (PEPKR1) and others.Many of these reference genes were found to outperform traditional reference genes, for example, PP2A in hybrid roses [18], SAND in buckwheat [19] and TIP41 in peanut [20] were the most stably expressed genes in those systems.Therefore, systematic validation of reference genes is essential for certain experimental conditions and in different species [10].Statistical algorithms, such as geNorm [21], NormFinder [22], and BestKeeper [23], have been used to identify the best reference genes for qPCR data normalization under different experimental conditions.
C. intermedia belongs to the family Fabaceae, and is a native desert shrub with strong drought-resistance, sand-fixing capacity and high forage value that is widespread in the desert land of west and northwest China [30].From a scientific standpoint, it has proven an ideal material for studying the mechanisms of drought and salt tolerance of shrubs in China, because of its easy cultivation and strong abiotic resistance [30][31][32].
In this study, 10 candidate reference genes (ACT7, TUA5, EF-1a, PP2A, SAND, TIP41, F-box, PEPKR1, UNK1, UNK2) were selected because of their stable expression in microarray studies in A. thaliana and soybean [16,17].The stability of these genes was analyzed in C. intermedia subjected to different abiotic (osmotic, salt, cold and heat) stresses, in two distinct plant organs (roots and leaves).Furthermore, to validate the selection of candidate reference genes, the expression levels of DREB1 and DREB2 homologues were assessed using different reference genes.This work will benefit future studies on gene expression in C. intermedia and other species of the leguminous genus Caragana.

Expression Profiling of Candidate Reference Genes
A total of 10 candidate reference genes were assessed using qPCR to quantify their mRNA levels (Table 1).The expression levels of the candidate reference genes were determined as quantification cycle (Cq) values, and the transcripts of these genes showed different levels of abundance (Figure 1).The mean Cq values of the genes ranged from 26-35, with most lying between 28 and 30 across all tested samples.EF-1a had the lowest Cq (mean Cq of 25.8), indicating the highest level of expression, SAND, PP2A, TIP41, TUA5, UNK1 and UNK2 were moderately expressed, F-box and ACT7 were expressed at low levels (mean Cq of 32.9 and 34.9).SAND showed the least gene expression variation (coefficient of variation, CV, of 3.05%), while ACT7 (5.78%) and TUA5 (5.74%) were the most variable across all samples.
The variation in relative transcript amount of the reference genes across all tested samples is shown in Figure 2. Transcript amounts are represented as percentages, relative to the aggregated reference transcript pool for each sample.The proportions of PP2A, SAND and UNK2 transcript remained relatively constant, while those of ACT7, TUA5 and F-box were more variable across all samples.The transcript levels of UNK1 remained relatively constant in PEG-treated roots (PR) or leaves (PL), and those of TIP41 were also relatively constant in the PL and cold-treated leaves (CL).Although the expression level of EF-1a was more variable across all samples and particularly high in the PR and salt-treated roots (SR), its expression was relatively constant in the CL and salt-treated leaves (SL).These results clearly suggested that the expression level of none of the reference genes is truly constant, and varies in different spatial and temporal patterns and environmental conditions.

Expression Stability of Candidate Reference Genes
a) geNorm analysis.The expression stability of the 10 reference genes was assessed using the geNorm software.The geNorm algorithm is based on the principle that the logarithmically transformed expression ratio between two genes should be constant if both genes are stably expressed in a given sample set.The candidate reference genes were ranked by geNorm based on the expression stability value M, which is calculated for all genes being investigated (the lower the M value, the higher the gene's expression stability) [21].
Figure 3 shows the ranking of the tested genes according to their expression stability in the C. intermedia samples, using data from six sets of treatment.When all 38 samples were analyzed together, UNK2, PP2A, and SAND were the most stable genes, while ACT7 and EF-1a were the least stable (in order).In salt stress treatments, UNK2 and SAND were the most stable genes, while ACT7 and PEPKR1 were the least stable in the SR and SL treatments, respectively.In the PR treatment, UNK2 and UNK1 were the most stable genes and ACT7 the least; In the PL treatment, TIP41 and PP2A were the most stable and F-box the least.In heat-treated leaves (HL), SAND and PP2A were the most stable genes and ACT7 was the least stably expressed.In the CL treatment, the expression  levels of SAND and EF-1a were the most stable and UNK1 was the least stable.In addition, all of the tested reference genes showed relatively high stability with M values of less than 1.5, which is below the default limit of M #1.5.
geNorm performs a stepwise calculation of the pairwise variation (V n /V n+1 ) between sequential normalization factors (NF n and NF n+1 ) to determine the optimal number of reference genes required for accurate normalization.A large variation means that the added gene had a significant effect and should preferably be included for calculation of a reliable normalization factor [21].As shown in Figure 4, the inclusion of a fourth gene had no significant effect (that is, low V 3/4 value) for all pooled samples or for the SL treatment, so three reference genes would be optimal for normalizing gene expression under those conditions.Similarly, two reference genes would be sufficient for the SR and PL treatments, four for the PR and CL treatments, and five for the HL treatment.In the practical application, three reference genes for the PR (V 3/4 = 0.1335) and HL (V 3/4 = 0.1480) treatments, and two for the CL treatment (V 2/3 = 0.1282) could also be accepted using a threshold value of 0.15 [14,21,33,34].
b) NormFinder analysis.NormFinder program is a Visual Basic application tool for Microsoft Excel used to determine the expression stabilities of reference genes that ranks all reference gene candidates based on intra-and inter-group variations and combines both results into a stability value for each candidate reference gene [35].The results of NormFinder analysis were slightly different from those of geNorm (Table 2).Both methods of analysis ranked as most stable UNK2 and SAND in the SR treatment; UNK2, SAND and EF-1a in the SL treatment; PP2A and TIP41 in the PL treatment; PP2A, UNK1 and UNK2 in the PR treatment.However, in the CL treatment, PP2A and UNK2 emerged as the most stably expressed, whereas they were ranked fifth and fourth, respectively, by geNorm.In the HL treatment, PP2A, UNK2 and SAND were in the top positions, while geNorm ranked UNK2 in the fourth position.When evaluated across all experimental samples, UNK2, PP2A and TIP41 were in the top positions, whereas TIP41 was ranked fifth by geNorm.
c) BestKeeper analysis.BestKeeper determines the most stably expressed genes based on the coefficient of correlation to the BestKeeper Index, which is the geometric mean of the candidate reference gene Cq values.BestKeeper also calculates the standard deviation (SD) and the coefficient of variation (CV) based on the Cq values of all candidate reference genes [23].Genes with SD greater than 1 are considered unacceptable [36].Reference genes are identified as the most stable genes, i.e. those that exhibit the lowest coefficient of variance and standard deviation (CV 6 SD) [37].The results of BestKeeper analysis are shown in Table 3.In the PL and CL treatments, the same four genes were identified by both the BestKeeper and geNorm programs, although their rank order was slightly altered.In the SL treatment, UNK1 emerged as the most stably expressed (ranked seventh by geNorm and NormFinder).In the SR treatment, the same three genes, UNK2, SAND and TIP41, were identified by both the BestKeeper and geNorm programs, although their rank order was slightly altered.In the HL treatment, EF-1a emerged as the most stably expressed (ranked sixth by geNorm and seventh by NormFinder).When evaluated across all experimental samples, SAND, TIP41 and PP2A were in the top positions, whereas TIP41 was ranked fifth by geNorm and third by NormFinder.

Reference Gene Validation
To validate the selection of candidate reference genes, the expression pattern of DREB1 and DREB2 were analyzed using the selected reference genes (Figure 5).In A. thaliana, expression of the DREB1 was induced by low-temperature stress, whereas expression of the DREB2 was induced by drought and high salinity [38].In this study, expression of DREB1 in cold-stressed leaves and expression of DREB2 in salt-stressed roots were assessed.
When the two most stable reference genes, SAND and EF-1a were used for normalization, the expression levels of DREB1 increased sharply after 3 h of treatment, peaked at 6 h, and thereafter decreased (Figure 5A).When the least stable gene UNK1 was used for normalization, the expression patterns and transcript levels were very different.For DREB2, when the two most stable reference genes, UNK2 and SAND were used for normalization, the transcript levels of DREB2 increased rapidly from 1 h, peaked at 3-6 h, and thereafter decreased.The expression level of DREB2 peaked at 3 h using just SAND and at 6 h using just UNK2.Similar expression patterns were generated when the less stable reference gene TUA5 was employed (Figure 5B).Normalization based on the least stable reference gene ACT7 found that the transcript level increased rapidly and peaked at 6 h, decreased between 6 and 24 h, and thereafter increased again up to 48 h, which obviously differed from normalization against SAND and UNK2.

Discussion
In plant molecular biological research, qPCR has become an important tool for understanding gene expression in different experimental conditions.For accurate qPCR measurements, endogenous reference genes are used as internal controls.An ideal reference gene should be representative of the overall expression across all possible tissues (cells) and experimental conditions [10].However, such a perfect reference gene is impossible and nothing close has yet been reported.This means that reference genes need to be validated under certain experimental conditions and among various species.
C. intermedia is a native desert shrub that is widespread in the desert land of west and northwest China, but the application of qPCR in this species has been limited by a lack of information on reference gene stability in a variety of experimental contexts.Here, we describe the analysis of 10 candidate reference genes to improve relative quantification by qPCR for gene expression analysis in C. intermedia.Using three algorithms (geNorm, NormFinder and BestKeeper), we evaluated the expression stability of these ten genes under different abiotic (osmotic, salt, cold and heat) stress conditions in roots and leaves.As far as can be ascertained, this is the first systematic study of the expression stability of reference genes for qPCR in C. intermedia under different abiotic stress conditions.geNorm, NormFinder and BestKeeper are often used to select reference genes.Because they employ different strategies, they can give different results [39,40].For example, in the SL treatment, UNK1 emerged as the most stably expressed using BestKeeper, while it was ranked seventh by geNorm and NormFinder.In the HL treatment, BestKeeper selected EF-1a as the most stably expressed gene, but it was ranked sixth by geNorm and seventh by NormFinder.Finally, in the CL treatment, SAND and EF-1a were identified as the most stably expressed by geNorm, but were ranked third and fifth, respectively, by NormFinder.We considered the results of the three algorithms together when determining suitable reference genes for qPCR normalization (Table S1) [18].
In the SL treatment, the pairwise variation V 3/4 value (0.1321), calculated by geNorm, suggested the use of UNK2, SAND and EF-1a for normalization.These three genes were also identified by NormFinder.BestKeeper, unlike the other two programs, ranked UNK1 as most stable (ranked seventh by geNorm and NormFinder), UNK2 as second, SAND as third and EF-1a as fourth.Based on these results, we recommended UNK2 combined with SAND and EF-1a as the best combination of stable reference genes for qPCR in the SL treatment.
In the HL treatment, the pairwise variation V 3/4 value (0.148) indicated that the three most stable genes (SAND, PP2A and TIP41) can be used for normalization.BestKeeper ranked EF-1a as most stable (ranked seventh by geNorm and NormFinder), TIP41 as second.NormFinder, ranked PP2A as the best reference gene and SAND and TIP41 as third and sixth, respectively.Based on these results, we inferred that SAND, PP2A and TIP41 would be appropriate for qPCR in the HL treatment.
In the CL treatment, the pairwise variation V 2/3 value (0.1282) indicated that the two top ranked genes (SAND and EF-1a) can be used for normalization.NormFinder ranked SAND as third, and EF-1a as fifth, however, their stability values did not differ substantially from those of higher-ranked genes (e.g.0.160 for SAND versus 0.142 for first-ranked PP2A).BestKeeper ranked SAND as most stable, and EF-1a as third, similar the results of geNorm.Altogether, we recommended SAND and EF-1a to be the suitable reference genes for qPCR in the CL treatment.Similarly, UNK2 and SAND would be sufficient for the SR treatment, PP2A, UNK2 and UNK1 for the PR treatments, and SAND, PP2A and TIP41 for the PL treatment.In summary, UNK2, SAND and PP2A were the most stably expressed genes, while ACT7 was the most variable, over all samples.UNK2 has been noted as showing stable expression across tissues and developmental stages in tomato [13], soybean [41] and aspen [42].SAND and PP2A have been noted as showing stable expression across tissues and different abiotic and biotic stress conditions in roses [18] and buckwheat [19].The weakness of ACT7 was also seen in soybean, where its expression was found to be variable [41].
The transcript levels of DREB2 peaked at 3-6 h and then began to decline at 12 h in roots under salt stress conditions when UNK2 and SAND were used for normalization (Figure 5B).A similar expression pattern was described under salt stress conditions in Caragana korshinskii [43], indicating that the reference genes identified in this study are suitable under such conditions.In the expression profile or transcript abundance quantification produced from normalization using the least stable gene ACT7, the transcript level increased rapidly and peaked at 6 h, decreased between 6 and 24 h, and thereafter increased again up to 48 h, which obviously differed from normalization against SAND and UNK2.Obviously, ACT7 is not a suitable reference gene to normalize gene expression in C. intermedia under such conditions.These results indicate that the incorrect use of reference genes without validation may reduce precision or produce misleading results.

Conclusions
To our knowledge, this study is the first systematic analysis for the selection of superior reference genes for qPCR in C. intermedia under different abiotic (osmotic, salt, cold and heat) stress conditions.Analysis of expression stability using geNorm, NormFinder and BestKeeper revealed that UNK2, PP2A and SAND could be considered to be appropriate reference genes for gene expression analysis of different tissues under different abiotic stress conditions, whereas ACT7, PEPKR1 and F-box showed relatively low expression stability.This work will benefit future studies on gene expression under different abiotic stress conditions in C. intermedia and other species of the leguminous genus Caragana.

Plant Materials and Treatments
Seeds of C. intermedia were collected from the Experimental Base (Hohhot, Inner Mongolia, China), Research Institute of Forestry, Chinese Academy of Forestry.Seeds were washed three times with tap water, and then sown in plastic pots filled with peat soil in a growth chamber with a 16 h light/8 h dark photoperiod at 25/ 22uC day/night temperatures and relative humidity 80%.For salt and osmotic stress treatments, three-week-old seedlings were carefully removed from the soil to avoid injury, their roots were washed cleanly with tap water, and they were placed in NaCl (200 mM) or PEG6000 (15%) solutions, respectively, for 0, 1, 3, 6, 12, 24, or 48 h in the growth chamber.For the cold and heat stress treatments, the seedlings in pots were grown at 4uC or 42uC, respectively, for 0, 1, 3, 6, 12, 24, or 48 h.Leaves were collected from the three-week-old seedlings subjected to all four treatments, and roots were collected from the seedlings subjected to salt and osmotic stress treatments.These were immediately frozen in liquid nitrogen and stored at 280uC.Samples above were collected from 3 seedlings to give 3 replicas.

Total RNA Isolation and cDNA Synthesis
Total RNA was extracted from treated tissues using Trizol reagent (Invitrogen, USA) according to the manufacturer's instructions.The remaining DNA was removed by RNase-free DNase according to the manufacturer's instructions (Promega, USA).Total RNA concentration and purity was determined using a Nanodrop ND-1000 spectrophotometer (Nanodrop Technologies, USA).RNA samples with an absorbance ratio at OD260/ 280 between 1.9 and 2.2 and OD260/230 < 2.0 were used for further analysis.RNA integrity was verified by 1.5% agarose gel electrophoresis.Samples with 28S/18S ribosomal RNA between 1.5 and 2.0 and without smears on the agarose gel were used for subsequent experiments.
For each sample, 1 mg of total RNA was reverse transcribed using the RevertAidTM First Strand cDNA Synthesis Kit (Fermentas, Germany) in a 20 ml reaction using oligo dT primers according to manufacturer's instructions.The cDNAs were diluted 1:30 with nuclease-free water prior to the qPCR analyses.

Selection of Candidate Reference Genes
Potential homologues of the ten published reference genes were identified from the transcriptome data sequences of C. intermedia seedlings (unpublished data) (Table 1).

PCR Primer Design and Test of Amplification Efficiency
Primers were designed using the Primer Premier 5 software (http://www.PremierBiosoft.com/primerdesign/primerdesign.html) with melting temperatures 58-62uC, primer lengths 20-24 bp, GC content 45-55% and amplicon lengths 50-230 bp (Table 1, Table S2 and Figure S1).For each primer pair, amplification efficiency estimates were derived from a standard curve generated from a serial dilution of pooled cDNA (1, 10, 10 2 , 10 3 , 10 4 , 10 5 6dilutions; each gene in triplicate) (Figure S2).Mean quantification cycle (Cq) values of each ten-fold dilution were plotted against the logarithm of the pooled cDNA dilution factor.Efficiency (E) for each gene was determined with the slope of a linear regression model [44] using the Cq values and the following equation was used: Quantitative Real-time RT-PCR qPCR reactions were carried out with an ABI Prism 7700 Sequence Detection System (Applied Biosystems, USA), using SYBRH Premix Ex Taq TM (Takara, Japan) in a 20 ml reaction volume (containing 2 ml diluted cDNA, 10 ml 2 6 SYBR Premix Ex Taq TM , 0.4 ml ROX Reference Dye, and 0.4 ml each primer).The reaction conditions were: an initial denaturation step of 95uC/30 s, followed by 40 cycles of 95uC/5 s and 60uC/30 s.The dissociation curve was obtained by heating the amplicon from 60 to 95uC (Figure S3).All qPCR reactions were carried out in biological and technical triplicate.A non-template control was also included in each run for each gene.The final quantification cycle (Cq) values were the means of nine values (biological triplicate, each in technical triplicate).

Statistical Analysis
Three different types of Microsoft Excel-based software, geNorm [45], NormFinder [46] and BestKeeper [47], were used to rank the expression stability of reference genes across all of the experimental sets.Following qPCR data collection, Cq values were converted to relative quantities using the formula: 2 2DCq , in which DCq = each corresponding Cq value 2 minimum Cq value.The sample with the maximum expression level (the minimum Cq value) was used as a calibrator and was set to a value of 1. Relative quantities were used for geNorm and NormFinder, while Best-Keeper analyses were based on untransformed Cq values.All three software packages were used according to the manufacturer's instructions.All other multiple comparisons were performed with SPSS17.0.

Figure 4 .
Figure 4. Determination of the optimal number of reference genes required for effective normalization.Pairwise variation (V n/Vn +1 ) analysis between the normalization factors (NF n and NF n +1 ) was performed by the geNorm program to determine the optimal number of reference genes, and carried out for qPCR data normalization in various sample pools.doi:10.1371/journal.pone.0053196.g004

Figure 5 .
Figure 5. Relative quantification of DREB1 and DREB2 expression using validated reference genes for normalization.The results are represented as mean fold changes in relative expression compared to the first sampling stage (0 h). cDNA samples were taken from the same set used for gene expression stability analysis.(A), leaves were collected from three-week-old seedlings subjected to cold stress after 0, 1, 3, 6, 12, 24 and 48 h treatment.(B), roots were collected from three-week-old seedlings subjected to salt stress after 0, 1, 3, 6, 12, 24 and 48 h treatment.doi:10.1371/journal.pone.0053196.g005

Table 1 .
Caragana intermedia candidate reference genes descriptions and comparison with Arabidopsis orthologs.

Table 2 .
Expression stability of the reference genes calculated by NormFinder software.SR, roots exposed to high-salt treatment; SL, leaves exposed to high-salt treatment; PR, roots exposed to PEG treatment; PL, leaves exposed to PEG treatment; HL, leaves exposed to heat treatment; CL, leaves exposed to cold treatment. doi:10.1371/journal.pone.0053196.t002

Table 3 .
Expression stability of the reference genes calculated by BestKeeper software.SR, roots exposed to high-salt treatment; SL, leaves exposed to high-salt treatment; PR, roots exposed to PEG treatment; PL, leaves exposed to PEG treatment; HL, leaves exposed to heat treatment; CL, leaves exposed to cold treatment.Descriptive statistics of 10 candidate genes based on the coefficient of variance (CV) and standard deviation (SD) of their Cq values were determined using the whole data set.Reference genes were identified as the most stable genes, i.e. those with the lowest coefficient of variance and standard deviation (CV 6 SD). doi:10.1371/journal.pone.0053196.t003