A Comprehensive Approach to Identify Reliable Reference Gene Candidates to Investigate the Link between Alcoholism and Endocrinology in Sprague-Dawley Rats

Gender and hormonal differences are often correlated with alcohol dependence and related complications like addiction and breast cancer. Estrogen (E2) is an important sex hormone because it serves as a key protein involved in organism level signaling pathways. Alcoholism has been reported to affect estrogen receptor signaling; however, identifying the players involved in such multi-faceted syndrome is complex and requires an interdisciplinary approach. In many situations, preliminary investigations included a straight forward, yet informative biotechniques such as gene expression analyses using quantitative real time PCR (qRT-PCR). The validity of qRT-PCR-based conclusions is affected by the choice of reliable internal controls. With this in mind, we compiled a list of 15 commonly used housekeeping genes (HKGs) as potential reference gene candidates in rat biological models. A comprehensive comparison among 5 statistical approaches (geNorm, dCt method, NormFinder, BestKeeper, and RefFinder) was performed to identify the minimal number as well the most stable reference genes required for reliable normalization in experimental rat groups that comprised sham operated (SO), ovariectomized rats in the absence (OVX) or presence of E2 (OVXE2). These rat groups were subdivided into subgroups that received alcohol in liquid diet or isocalroic control liquid diet for 12 weeks. Our results showed that U87, 5S rRNA, GAPDH, and U5a were the most reliable gene candidates for reference genes in heart and brain tissue. However, different gene stability ranking was specific for each tissue input combination. The present preliminary findings highlight the variability in reference gene rankings across different experimental conditions and analytic methods and constitute a fundamental step for gene expression assays.


Introduction
Alcoholism is linked to many different health problems including cardiovascular and neurological impairments and increased cancer risks [1]. It is hard to dissect the mechanism of action of alcohol abuse because it depends on many factors, such as gender, developmental stage, dose, and duration of alcohol consumption [2]. A link between hormones and alcohol dependence was also previously proposed [3]. Studies show that alcohol altered the hormone levels (i.e. progesterone, estrogen) in pre-and post-menopausal females [4] and in ovariectomized monkeys [5].
The significance of endocrinology in the etiology and mechanism of alcohol dependence and addiction has long been discussed [3,6]. Transient and permanent hormonal changes might be key players in alcohol-associated pathologies such as breast cancer [7][8][9][10] and neuro-remodeling phenomena like addiction [6]. These alcoholism-related diseases are promoted by fluctuations in gene expressions of some signaling pathways like estrogen and thyroid hormone receptors [9][10][11]. Researchers utilized different model systems, which include mice [12], rats [13], nematodes [14], fruit flies [15] to investigate the pathways involved in mediating alcohol's impact on the body. Despite the number of studies on ethanol-associated symptoms, many questions remain unanswered and require further investigations on the behavioral, genetic, and biochemical levels.
Quantitative real-time PCR (qRT-PCR) is a gold-standard biotechnique for gene expression analyses. Despite the emergence of the next generation deep sequencing technology, qRT-PCR remains the validation tool of choice. Even though qRT-PCR is a mature biotechnique, it is greatly affected by RNA integrity, purity, and concentration, primer and enzyme efficiencies, genomic DNA contamination, pipetting errors, as well as the choice of proper internal controls (reference genes) [16]. Molecular analyses necessitate reliable normalization to avoid false positive results, which introduce data misinterpretations and imprecise conclusions. An ideal reference gene should have a stable basal expression in different tissues, genders, developmental stages, and experimental conditions and should have similar expression levels to the target genes of interest [17]. So far, there is no one gene whose expression fulfills these criteria [18] although housekeeping genes (HKGs) were widely used as reference genes. The expression levels of HKGs are affected by various experimental conditions [19,20]. Thus, the identification of suitable reference genes is crucial and should precede gene expression analyses [17]. With this in mind, several statistical approaches have been designed to identify relatively more stable reference genes in response to specific experimental conditions. In this study, we evaluate the stability of 15 commonly used housekeeping genes using 5 statistical methods, which included geNorm [21], delta-Ct (dCt) method [22], NormFinder [23], and BestKeeper [24]. For more accurate ranking of the reference gene candidates, RefFinder was designed to provide a comprehensive ranking [25]. These programs ranked gene candidates based on pairwise comparisons (geNorm, dCt method, BestKeeper) as well as model-based approaches (NormFinder) to determine the most suitable genes.
Alcohol consumption is associated with adverse effects on the cardiovascular and neural systems. Based on the emergent roles of the hormone system in mediating alcohol-induced anomalies, we were interested in understanding the link between alcoholism and estrogen signaling in the heart and brain tissue of Sprague-Dawley rats. For that, we investigated the effect of chronic ethanol treatment on the stability of the expression levels of 15 housekeeping genes in rat heart and brain tissue for identifying most reliable genes as reference genes for gene expression analysis.
To perform this study, we treated Sprague-Dawley rats with ethanol (ETOH). The effect of ethanol on the stability of 15 reference gene candidates (Table 1) was investigated using rats with different hormonal backgrounds. The study was based on 6 rat groups. Untreated female (SHAM) and male rats were used as controls. One group of rats underwent ovariectomy (OVX). The last group of rats was ovariectomized and then treated with estrogen hormone (OVXE2). Rats belonging to those two groups were divided into two subgroups. Half of OVX as well as OVXE2 rats served as control (no ETOH treatment), while the remaining rats received ETOH.

Animal Handling and Treatment
Animal use and handling protocols were pre-approved and complied with East Carolina University Animal Use and Care Committee guideline. Female and male Sprague-Dawley (SD) rats (9-10 weeks old; Harlan, Indianapolis, UN, USA) were used. Male rats served as control. Female rats were divided into ovariectomized without (OVX) or with and estrogen supplementation (OVXE2) and sham-operated (SO) groups. Ethanol treatment and tissue collection following euthanasia were performed as in our previous studies [26,27]. Tissue isolation quickly followed. Tissue was flash frozen by liquid nitrogen and then stored at 280uC for subsequent molecular assays.

Sample Collection and RNA Extractions
Total RNA extraction was performed for heart and brain tissue weighing about 100-200 mg according to protocol using mirVana miRNA Isolation Kit (Life Technologies, CA, USA). Briefly, lysis buffer was added to each sample. The sample was kept on ice while being thoroughly homogenized. Then, an acid-phenol extraction separated RNAs from DNA and proteins. After adding 100% ethanol, the sample-ethanol mixture was passed through a glassfilter by centrifugation. Several washes preceded the elution of the RNA with DNase/RNase-free water. RNA was quantified and its purity was assessed using the NanoDrop ND-1000 Micro-Volume UVVis Spectrophotometer (NanoDrop Technologies, Wilmington, DE).

Reverse Transcription and qRT-PCR
Reverse transcription was performed using TaqMan microRNA Reverse Transcription kit (Applied Biosystems, Foster City, CA). Poly(T) was used to reverse transcribe the protein coding genes, while specific RT primers were used for the non-coding genes. A total of 1000 ng RNA were used for each RT reaction. RT-PCR was performed in the thermal cycle at 16uC for 30 min followed by 42uC for 30 min, 85uC for 5 min and were finally held at 4uC. For subsequent qRT-PCR, 100 uL DNase/RNase-free water was added to each RT product.
ViiA TM Real-Time PCR System (Applied Biosystem) was used to quantify the expression levels of 15 reference gene candidates on a 384-well-plate. SYBR Green PCR master mix was from SuperArray Bioscience Corp. (Frederick, MD). Specific reverse and forward primers were used (Table 1). Briefly, 5.5 mL DNase/ RNase free water, 7.5 mL SYBR Green master mix, 1 mL cDNA (1 ug), 1 mL primer mix were added to each well for a final 15 uL reaction. Four biological replicates were used. Initially, the reaction was set at 95uC for 10 min for enzyme activation and was followed by 40 two-step-cycles of denaturation for 15 sec at 95uC and an annealing/extension step for 60 sec at 60uC.
To prepare geNorm [21] input, the minimal Ct value was used for normalization of all Ct values for each gene across all samples (Ct original 2Ct min ). Hence, the lowest value was zero. The difference was then transformed (2 2(Ctoriginal2Ctmin) ). Data was structured such that the gene and sample symbols were in the first row and column, respectively. It was then used as input for geNorm applet. To determine the most stable gene pair, geNorm performs pairwise variation analyses (SD value) for each gene pair across all samples. The software assumes that the genes are not coregulated and that the transformed expression values of an ideal gene pair are identical across all samples. Then, the geometric mean of the SD values for each gene-related pair combinations is used to compute an M-value. A lower M-value reflects higher gene stability. What follows is a step-wise exclusion of the gene pairs with the highest M-values to reach the most stable gene pair. A beneficial feature of geNorm output is the V-value that reflects the minimal number of genes required for reliable normalization. Such is based on calculating a normalization factor ratio starting with the most stable genes. The program follows a step-wise inclusion process (V n /V n+1 ) for more genes until there is no significant change in the normalization factor.
Delta-Ct (dCt) method [22] depends on a concept similar to that of geNorm. However, it does not require a program specialist and can be performed using an excel sheet. This method was designed to overcome limitations associated with small samples like difficulties in using the same standardized mRNA concentrations due to possible protein contaminations. Such technical problems are relieved as genes within one sample are compared to each other by calculating the dCt value. Sensibly, the gene pair with the same dCt value (smallest SD) across all samples is considered to be stable and vice versa. The average of all SD values for each gene set of pairwise combinations is used to rank the gene stability. Genes with lower average SD are more stable than others.
NormFinder [23] is unique as it takes into consideration not only the overall intergroup variation (i.e. control vs. treatment), but also, the intragroup variation (i.e. experimental group replicates). Sample subgroups are taken into account to calculate the most stable gene candidate. Their model adds the two sources of variation to determine the systematic error introduced by the investigated gene. Therefore, this approach is less sensitive to misleading expression patterns for coregulated genes. Meanwhile, the approach takes into account the candidates with less intergroup variations, which might be mistakenly disregarded in the pairwise approaches.
BestKeeper [24] determines the stability of the gene candidates based on the SD of the gene expression levels across samples. Then, genes with least variable Ct values are used for subsequent pairwise comparisons, while those with SD.1 are excluded. The geometric mean of the Ct values of the most highly correlated genes is used to calculate a BestKeeper index. Then, the software calculates Pearson correlation coefficient [r] with a P-value to determine the similarity in the expression levels among the candidates. Thus, genes with least SDs and highest correlation with the index are ranked as the most stable genes. This excelbased applet allows the comparison of only 10 gene candidates. Therefore, we excluded the genes (18S, B2m, BACT, GAD-D45AF, and TBP) ranked as the least stable using geNorm, dCt method and NormFinder.
RefFinder [25] is another web-based interface that was used to deduce the most stable gene candidates among all methods. For each gene, RefFinder calculates the geometric mean of the ranks calculated by each of the previous approaches. Genes with the lowest rank geometric mean are considered as most stable.

Comparing Gene Stabilities by Descriptive Statistics
We calculated the mean and the standard derivation (SD) of the Ct values for heart and brain samples together, heart samples alone, and brain samples alone. In all combinations, GADD45A, TBP, BACT, 18S rRNA, and HPRT had the most variable expression levels reflected in their high SD values. On the other hand, TBP (Ct avg = 33.86), TNKS (Ct avg = 31.60), GADD45A (Ct avg = 29.69), B2m (Ct avg = 28.53), and HPRT (Ct avg = 26.12) had the highest Ct values and were therefore the least expressed among the gene candidates in the heart and brain ( Figure 1; Tables 2 and 3). Thus, TBP, GADD45A, and HPRT are less likely to be good candidates for normalization.
Regardless of sample combination, the genes with highest expression levels were the same, which included Z39 (Ct avg = 20.38), GAPDH (Ct avg = 20.32), U6 (Ct avg = 12.69), U5a (Ct avg = 10.69), and 5S rRNA (Ct avg = 7.77). However, Z39, U6, 5S rRNA, U5a, and U87 had the least variation in their expression in heart and in brain, respectively. When considering the Ct values from both tissue, Z39, U2, GAPDH, 5S rRNA, and U87 had the lowest SD ( Figure 1; Tables 2 and 3). Thus, U87, Z39, and 5S rRNA genes are more stable across groups and in all combinations. With this in mind, we can conclude that Z39 and 5S rRNA are likely to be used for normalization. However, a gene that is more highly abundant than the target genes of interest might mask true changes in expression if used for normalization. On the other hand, we can't evaluate others like U87 solely based on basic statistics because even though its expression was the least variable, there is still ambiguity in evaluating its relative expression level. Thus, more sophisticated statistical approaches should be employed to evaluate a candidate reference gene.

Quantitative Analysis of Reference Candidates Based on GeNorm
To determine the minimal number of genes required for normalization, we computed the V-value by geNorm. Starting with 2 genes, the software sequentially adds another gene and recalculates the normalization factor ratio. If the added gene does not increase the normalization factor ratio above the cutoff value (0.15), then the original pair of genes is enough for normalization. However, if the new ratio is above 0.15, then more genes should be included. We combined the heart and brain tissue for input in geNorm. The first V-value,0.15 was after V7/8 ( Figure 2B). This means that 6 additional genes were required for reliable normalization. The analysis started with a gene pair (i.e. 2 reference genes) and therefore the total would be 8 HKGs for normalization. That accounted for more than 50% of the gene list.
In the following paragraphs, we analyzed the rankings based on 5 statistical methods using the input for combined Ct values from heart and brain tissue. For a higher stringency measure, we only considered the first 6 ranked genes (,50%) for further analyses.

Determining Best Reference Candidates Based on dCt Method in Both Tissues
Gene ranking using the dCt method relies on relative pairwise comparisons. Using raw Ct values, the average SD of each gene set is inversely proportional to gene stability. As shown in Tables 4  and 5   Complimentary to the pairwise comparisons, NormFinder tests the stability of genes within each sample group as well as between groups. When considering both heart and brain tissue, the total number of sample groups was 12, each having 4 biological replicates. U87 (0.76) was the gene with the least inter and intravariation in expression levels; thus, U87 would be the most reliable reference gene. The stability values ranged from 0.82 to 1.11 for the other 5 candidate genes (UBC, GAPDH, 5S rRNA, TNKS, and HPRT) ( Table 5). Interestingly, based on geNorm, UBC and HPRT were among the least stable genes. This result is based on low intragroup, yet similar intergroup variation. Recalling the model-based approach, NormFinder prevents the exclusion of genes which might have consistent intergroup expression levels. Not necessarily 'similar', these genes have 'minimal' intergroup as well as intragroup variation. Nevertheless, a drawback in NormFinder is the requirement of a minimum of 8 samples/ group. For many gene expression studies including our own, it is challenging to have such a large sample size. Taken together, the differences in methodologies might be a reason behind the inclusion of these two genes among the most stable candidates.

Determining Best Reference Candidates Based on BestKeeper in Both Tissues
Due to the input size limitation, BestKeeper only analyzed 10 genes, which were ranked the most stable based on other three programs (geNorm, dCt method, and Normfinder). BestKeeper provided a two-way ranking, which separated the correlation of expression among the genes from the overall variations in expression levels (SD). From each approach, we considered the top 3 genes. Those computed to be highly correlated with p-values  (Table 6). Statistically speaking, this trend is sensible. When the homogeneity of a group increases, the variance (SD) decreases as in the case of U87, 5S rRNA, GAPDH, and Z39 and [r] tends to zero [28]. In fact, BestKeeper calculated the least SD values for these 4 genes. Thus, these genes will share less variation with the others in pairwise variation and will therefore have the least correlation coefficient. That does not render them unsuitable as reference genes candidates and stresses on the importance of taking both criteria ([r] and SD) to choose the best candidate(s). The inclusion of the top three from the [r]-based and SD-based ranking was consistent with 4 out of 6 best ranked genes in geNorm, and 5 out of 6 top genes in dCt-method and NormFinder (Table 5). In addition, consistent with NormFinder, UBC and HPRT were also ranked among the top 6 by BestKeeper (Table 6). Comparing among the different methodologies helped remove the doubt in NormFinder's result which might have aroused from its requirement of a larger sample size.

Comprehensive Ranking of Best Reference Genes Using RefFinder in Both Tissues
Based on the geometric mean (GM) of the rankings obtained from 4 complementary statistical approaches, U87 was the preferred candidate (GM = 1.32). The remaining highly ranked candidates were 5S rRNA, GAPDH, U6, U5a, and UBC with GM values ranging from 2.83 to 5.18, respectively. On the other hand, B2m, BACT, 18S rRNA, TBP, and GADD45A all had GM values higher than 10 ( Table 5). These 5 candidates had the lowest ranking and less likely to serve as reliable reference genes for normalizing gene expression.  The first analysis was performed for all samples together to identify a common reliable reference gene. Then, we also analyzed brain and heart samples separately to see whether or not there was a difference between tissues.
Unlike the results from combined tissue, the first V-value,0.15 was at V2/3 ( Figure 2). Thus, only 2 stable reference genes were needed for gene expression analysis in heart or brain tissue. A closer look at the data in Figure 2, the average of V-values for combined heart and brain tissue was 0.18. Individually, the average V-value for the heart and brain were 0.13 and 0.14, respectively. Thus, the gene candidates were merely more stably expressed in the heart tissue than in the brain tissues. However, though the genes' expressions were consistent and stable within each tissue, it was different between the heart and brain. This can be inferred from the dramatic increase in the V-value average when both tissues were combined for analysis.
The top six most stable reference gene candidates were same for heart and brain tissues. The choice of the gene pair depends on the estimated expression levels of the targeted genes of interest. If the expression profiles of the genes of interest is unknown, then choosing reference candidates from the low and high extremes would be recommended such as U87 and 5S rRNA.

Discussion
Housekeeping genes are commonly used for normalizing gene expression because they are thought to be consistently expressed cross different tissues and among different treatment. However, this was challenged recently. Current studies show no one gene remains stable throughout all experimental conditions. Ideal reference genes vary with different species, strains, developmental stage, tissue and even different sampling times [29]. To maintain the integrity of qRT-PCR as a powerful ''discovery'' and ''validation'' biotechnique, the choice and the number of reference genes used should be customized to every experiment setting. Thus, the first task is to identify reference gene candidates from either systematic gene expression studies like microarrays or by compiling a gene list from previous studies with similar experimental conditions. Subsequently, their relative stability is compared using statistical approaches. In our study, we followed the same workflow to determine reliable reference genes in SD rats. Normalizing to the top-ranked genes will reveal possible roles of hormonal/gender differences, mainly estrogen levels, in alcoholism. Below, we highlight the significance of our study by discussing some shortcomings associated with employing single statistical approaches in reference gene identification. In the end, we show the advantages of our combinatorial approach and present recommendations of control gene candidates to use or to avoid in similar experimental settings. The top 6 most stable reference candidates in the bi-tissue input were U87, 5S rRNA, GAPDH, U6, U5a, and UBC. In the singletissue input, the top 6 were similar and only with a slight difference in the order. Results for the combined and separate inputs differed by only one reference gene 'Z39'. This means that Z39 was stable within each tissue (SD,1), but its expression differed between heart (Ct avg = 19.660.5) and brain (Ct avg = 21.1660.7). Moreover, U5a (DCt avg = 1.64) and U6 (DCt avg = 1.59) also had the highest Ct avg difference between heart and brain tissue ( Table 2). Even though the SD for Z39, U5a, and U6 was ,1 in heart or brain tissue, their expression varied between heart and brain. Due to the inter-tissue variation, using Z39, U5a, and U6 in combined gene expression analysis is not recommended. In all input combinations, 18S rRNA, TBP, GADD45A, and BACT were the least stable among the 15 tested genes. These genes might cause inaccurate conclusions in gene expression normalizations for heart and/or brain tissue and therefore should not be used as reference genes.

Comparing Among Different Methods and Tissue
When considering all brain and heart tissue samples together, U87, 5S rRNA, and GAPDH were commonly ranked among the top 6 most stable reference genes in all 5 statistical approaches. On the other hand, only 'U87 and U5a', and 'U87, U5a, U6, and Z39' were ubiquitously ranked among the top 6 across 5 methods in each of the heart tissue, and brain tissue, respectively. As shown in Figure 3, seniority of U87 was common to all statistical methods and all tissue combinations. If the targeted genes of interest are expressed at a lower level, we recommend the use of U87 with GAPDH for the two tissues. If the targeted genes are expressed at a higher level, then 5S rRNA and U87 would be better for normalization. U87 and U5a were commonly ranked among the best in each of heart and brain tissue. Thus, for gene expression analysis concerned with heart tissue or brain tissue, U87 and U5a would serve as better reference genes.

Fallibility of Normalizations to Single Commonly Used HKGs
There is a wealth of resourceful studies that identified experiment-specific reference genes for normalization. We summarized some of the results for investigations that employed similar experimental conditions (i.e. rats, estrogen, and alcohol). Table 9 shows different HKGs as better reference genes for different tissue, treatments, treatment times, strains, species, and statistical methods. This suggests the necessity of conducting preliminary studies to use reference genes adapted for particular experimental conditions. In our study, U87, 5S rRNA, GAPDH and U5a were ranked as the top gene candidates using a combination of 5 statistical approaches. Even though, 5S rRNA was stable in rat liver treated with hepatotoxins [30], both U87 and 5S rRNA were among the least stable in SD rats suffering from oxygen-induced retinopathy [31]. While some studies reported GAPDH to be a relatively stable housekeeping gene in heart and brain tissue [29,[32][33][34], its expression was nevertheless affected by treatments such as MB in rat brain [35], estrogen in ovariectomized C57BL6 mice [36], male and female fathead minnow fish [37], and RARAW 264.7, ATDCDC5 and HFLS cells [38].
On the contrary, 18S rRNA was among the least stable gene candidates in our settings. This is in agreement with other studies using carotid body from different SD strains under different oxygenation levels [39], oligodendrocyte cells from age-asynchronized Winstar rats treated with LXR agonist [40], male flinder rat hippocampus treated with MB [35], and liver of hypophysectomized male and female SD rats [41]. However, 18S rRNA was considered a good reference gene in Winstar rat livers treated with hepatotoxins [30], human livers with alcoholic liver disease [42], the uterus of ovarietomized C57BL6 mice treated with estradiol [36], and in liver and gonads of male and female fathead minnow fish [37]. We also showed that TBP was unstable and that was supported by another study on heart tissue of young and adult SD rats subjected to PHDI treatments under different oxygen pressures [33]. However, that was not the case in the heart of Zucker obese rats under different glycemic states [43], nor in the hippocampus of SD rats with TBI [34]. TBP was also considered stable in response to estrogen in multiple tissue of the fathead minnow fish [37].
The Importance of Using More than One Statistical Approach No one statistical approach can cover all variables associated with gene expression studies. Therefore basing conclusions on one method can be associated with false positive results and misleading conclusions. In our study, we followed a round-about approach to determine good reference candidates for reliable normalization of gene expression data in Sprague-Dawley rats. This allowed us to correct for some inaccurate ranking such as geNorm's raking of UBC, which was corrected by NormFinder. Also, based on the systematic interpretation, we were able to get a clearer picture on the minimal number of reference genes required for reliable normalization. After removing Z39, U5a and U6, only 3 genes are enough to serve as reference genes for analysis on heart and brain tissue combined. This makes the study more practical (8 vs. 3 control genes) and reliable at the same time.
In conclusion, it is difficult to ascertain whether the inconsistency or variability in the stability of the housekeeping genes arises from the employment of different statistical methodologies or different treatments. For example, in two studies concerned with rat liver, GAPDH was stable after TCDD treatment based on Figure 3. Van Diagram that summarizes the commonly ranked top gene candidates. Firstly, the top 6 genes ranked by each of the geNorm, dCt method, NormFinder, BestKeeper, and RefFinder were compared for each input: Heart+brain, heart, and brain samples. Only genes common for all 5 methods were chosen for each input. Those genes were then compared among all input combinations and presented in the diagram above. doi:10.1371/journal.pone.0094311.g003 geNorm [44], but it was not when using other statistical methods in liver cells under different conditions [32]. What is noteworthy is that the ranking of the reference genes is always relative and that can change simply by changing a few candidates in the gene list. Therefore, despite the superfluous publications, research concerned with the determination of reference remains juvenile. With more efforts being dedicated to tackle this issue, a meta-analysis would help reveal patterns that might redirect and standardize our normalization methods for more accurate interpretation of results.