Browse Subject Areas

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Identification of Novel Reference Genes Using Multiplatform Expression Data and Their Validation for Quantitative Gene Expression Analysis

  • Mi Jeong Kwon,

    Affiliation Laboratory of Molecular Pathology, Department of Pharmacy, College of Pharmacy, Seoul National University, Seoul, Korea

  • Ensel Oh,

    Affiliation Interdiciplinary Program of Bioinformatics, College of Natural Science, Seoul National University, Seoul, Korea

  • Seungmook Lee,

    Affiliation Department of Statistics, College of Natural Science, Seoul National University, Seoul, Korea

  • Mi Ra Roh,

    Affiliation BIT center, CT&D Ltd., Seoul, Korea

  • Si Eun Kim,

    Affiliation Laboratory of Molecular Pathology, Department of Pharmacy, College of Pharmacy, Seoul National University, Seoul, Korea

  • Yangsoon Lee,

    Affiliation LG Life Sciences, Ltd., R&D Research Park, Daejeon, Korea

  • Yoon-La Choi,

    Affiliation Department of Pathology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea

  • Yong-Ho In,

    Affiliation BIT center, CT&D Ltd., Seoul, Korea

  • Taesung Park,

    Affiliation Department of Statistics, College of Natural Science, Seoul National University, Seoul, Korea

  • Sang Seok Koh,

    Affiliation Protein Therapeutics Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea

  • Young Kee Shin

    Affiliations Laboratory of Molecular Pathology, Department of Pharmacy, College of Pharmacy, Seoul National University, Seoul, Korea, Interdiciplinary Program of Bioinformatics, College of Natural Science, Seoul National University, Seoul, Korea

Identification of Novel Reference Genes Using Multiplatform Expression Data and Their Validation for Quantitative Gene Expression Analysis

  • Mi Jeong Kwon, 
  • Ensel Oh, 
  • Seungmook Lee, 
  • Mi Ra Roh, 
  • Si Eun Kim, 
  • Yangsoon Lee, 
  • Yoon-La Choi, 
  • Yong-Ho In, 
  • Taesung Park, 
  • Sang Seok Koh


7 Aug 2009: Kwon MJ, Oh E, Lee S, Roh MR, Kim SE, et al. (2009) Correction: Identification of Novel Reference Genes Using Multiplatform Expression Data and Their Validation for Quantitative Gene Expression Analysis. doi: info:doi/10.1371/annotation/695436c7-3329-4bdc-9832-f427ecc33698 View correction


Normalization of mRNA levels using endogenous reference genes (ERGs) is critical for an accurate comparison of gene expression between different samples. Despite the popularity of traditional ERGs (tERGs) such as GAPDH and ACTB, their expression variability in different tissues or disease status has been reported. Here, we first selected candidate housekeeping genes (HKGs) using human gene expression data from different platforms including EST, SAGE, and microarray, and 13 novel ERGs (nERGs) (ARL8B, CTBP1, CUL1, DIMT1L, FBXW2, GPBP1, LUC7L2, OAZ1, PAPOLA, SPG21, TRIM27, UBQLN1, ZNF207) were further identified from these HKGs. The mean coefficient variation (CV) values of nERGs were significantly lower than those of tERGs and the expression level of most nERGs was relatively lower than high expressing tERGs in all dataset. The higher expression stability and lower expression levels of most nERGs were validated in 108 human samples including formalin-fixed paraffin-embedded (FFPE) tissues, frozen tissues and cell lines, through quantitative real-time RT-PCR (qRT-PCR). Furthermore, the optimal number of nERGs required for accurate normalization was as few as two, while four genes were required when using tERGs in FFPE tissues. Most nERGs identified in this study should be better reference genes than tERGs, based on their higher expression stability and fewer numbers needed for normalization when multiple ERGs are required.


Gene expression analysis is becoming more important in diagnostic fields as it allows for the identification of novel biomarkers relevant to diseases. Endogenous reference genes (ERGs) are widely used to normalize the mRNA level in the relative quantification to provide an accurate comparison of gene expression between different samples [1]. Traditional ERGs (tERGs), such as GAPDH and ACTB, have been used in expression studies without proper validation because of the assumption that they are expressed at constant levels across different samples and regardless of experimental treatments [2], [3]. However, several reports have shown that the expression of tERGs can vary in different tissues and be regulated by experimental treatments or pathological state [2], [4][10]. As the use of inappropriate ERGs in relative quantification of gene expression can result in biased expression profiles [4], [11], [12], the selection of proper ERGs is essential for accurate measurement especially in quantitative methods including qRT-PCR, which is a highly sensitive and accurate method [13].

Although there have been a number of previous studies aimed at finding suitable ERGs, most of them have focused on selecting the most stable genes from commonly used ERGs [14][17]. Moreover, the identification of novel ERGs (nERGs) has been based primarily on microarray data [10], [18][21]. Although short-oligo microarrays such as Affymetrix GeneChips, have the advantage of being highly sensitive in detecting low abundance transcripts (nearly 3–20 transcripts per million (tpm)) [22], they have some disadvantages such as inaccurate cross hybridization between probes and transcripts, differences in hybridization efficiencies between probe sets, limited linear range of signal, and incorrect annotation of transcripts [23], [24]. Therefore, an approach using only microarray data may not be sufficient to identify the most suitable ERGs. Although an ideal universal ERG may not exist [1], [15], a combination of large expression data from different platforms is expected to complement the limitation of each platform [25] and allow for the identification of more suitable ERGs.

Here, we describe an algorithm for the identification of nERGs using the publicly available human gene expression datasets in addition to in-house microarray data. The expression of selected nERGs in datasets was validated by qRT-PCR in 108 human samples and their expression stability was compared to that of tERGs.

Materials and Methods

Ethics Statement

This study was approved by the institutional review board of Samsung Medical Center. Patients' written informed consent was not required to be obtained as the institutional review board approved the use of human tissues in this study without patients' consent with the reason that the data were analyzed anonymously and patients could not be identified.

EST, SAGE, microarray gene expression dataset construction

EST and SAGE human gene expression data were collected from the publicly available Cancer Genome Anatomy Project (CGAP) site ( Microarray data was obtained from the Cancer Genome Expression Database of LG Life Sciences, which is based on the Affymetrix HG-U133 (for the samples included in microarray data, see Table S1)[26]. A detailed description of each dataset construction is provided in the Supplementary Materials and Methods (Text S1) and shown schematically in Figure 1.

Figure 1. Flowchart of the methodology for identification of nERGs.

2,087 candidate HKGs were first identified by selecting the genes meeting the following criteria: 0's prop <0.4 in EST, <0.1 in shortSAGE and <0.3 in longSAGE. 0's prop represents 0's proportion (number of tissues in which the gene is not expressed/total number of tissues, 0≤0's prop≤1). Among the candidate 2,087 HKGs, 13 nERGs with the lowest CVs were further identified by selecting the genes common to all four datasets among the genes with the 400 lowest CVs (approximately 20% of candidate HKGs).

Algorithm for the identification of candidate HKGs and 13 nERGs

The methodology used to identify nERGs is outlined in Figure 1. First, to identify nERGs, we searched for candidate HKGs whose expression is detected in most tissues using 0's proportion in EST and SAGE datasets. 0's proportion is defined as the proportion of different tissues in which a given gene is not expressed and was calculated as follows:

In this equation a value of 0 indicates that the gene is expressed in all tissues, and a value of 1 indicates that the gene is not expressed in any tissue. If a gene was expressed in any one library of multiple libraries corresponding to the same tissue, that gene was considered to be expressed in that tissue. A total of 2,087 candidate HKGs with low 0's proportions in all three datasets were selected (Table S2). Cut-off values for 0's proportion in each dataset were determined to include the majority of the 575 HKGs identified from a previous Affymetrix U95A chip [27] (see Figure 1, Text S1). Affymetrix (Affy) data for 5,317 probe sets corresponding to 2,087 HKGs were obtained. Mean gene expression and CV (%) for each gene were calculated in each dataset. The Student's t-test was used to assess whether the mean gene expression between candidate HKGs and non-HKGs was statistically different. When this set of 2,087 HKGs was classified functionally using the Functional Classification Catalogue (FunCat, Version 2.0) [28], known functions were available for only 1,318 genes of them (for details, see Text S1). For CpG island analysis, sequences upstream of the annotated transcription start site of RefSeq genes were obtained from the UCSC site and CpG islands in their upstream sequences were searched using the CpGIE software [29]. The statistical significance of differences in the frequency of genes with CpG islands between HKGs and non-HKGs was determined using Z-test (R program, P<0.05 was considered to be significantly different.

Among the 2,087 HKGs, potential ERGs were further identified according to the following process: 1) the CV was calculated for each UniGene cluster and 2) the genes in each dataset were ranked by ascending CV values. Among the first 400 genes with the lowest CVs (approximately 20% of 2,087 HKGs), we found 13 nERGs common to all four datasets (i.e. EST, LongSAGE, ShortSAGE, and Affy). The CV values between tERGs and nERGs were compared using Wilcoxon rank sum test.

Genomic variations of nERGs and tERGs

The Database of Genomic Variants ( was used to search for genomic variations of the 2,087 HKGs, including nERGs and tERGs.

RNA preparation from human frozen, FFPE tissues and human cancer cell lines and qRT-PCR

The Department of Pathology at Samsung Medical Center provided 26 human frozen tissues and 60 FFPE tissues and the Korea Cell Line Bank (KCLB) or ATCC provided 22 human cancer cell lines (Table S3). The 60 FFPE tissues consisted of 10 breast cancers, 8 normal stomach, 9 stomach cancers, 10 normal ovaries, 4 ovarian adenomas, 9 ovarian borderline tumors and 10 ovarian cancers.

Total RNA isolation and cDNA synthesis from human samples are described in detail in the Supplementary Materials and Methods (Text S1). Total RNA from frozen tissues and cell lines was isolated using Trizol (Invitrogen Life Technologies). The inclusion criteria for the RNA samples were A260/280≥1.80 and rRNA (28S/18S) ratio≥1.0. The integrity of RNA from frozen tissues and cell lines was assessed using an Agilent Bioanalyzer 2100 (Agilent Technologies, Palo Alto, CA). Paradise Whole Transcript RT Reagent system (Arcturus, CA, USA) was used for RNA isolation and RT of FFPE tissues.

PCR primers and Universal Probe Library (UPL) numbers for this study are provided in Table 1. Primers and probes for each gene were designed to have a short amplicon size. All PCR reactions were performed in a Lightcycler 2.0 (Roche Applied Science) according to standard procedures (for details, see Text S1). PCR efficiency for each gene in frozen tissues and cell lines was determined by both the serial dilution method using Lightcycler 4.0 software and estimation using the LinRegPCR program [30] (Table 1, Text S1). To minimize experimental variation, the same gene in different samples was tested in the same PCR run.

Gene expression stability analysis in qRT-PCR

To analyze gene expression stability, we used the geNorm v.3.4 [1] and NormFinder software [31]. Cp values were converted into relative quantities for analysis with geNorm and NormFinder, taking into consideration the PCR efficiencies of the genes, as shown in Table 1 (for details, see Text S1). The optimal number of ERGs for normalization was also determined using the geNorm program. The statistical differences in gene expression stability values between nERGs and tERGs were determined using Wilcoxon rank sum test. The Pearson and Spearman correlation between the gene expression stability calculated by geNorm or NormFinder (M for geNorm and S for NormFinder) and the CV calculated in each dataset were analyzed using R statistical software. P<0.05 was considered to be significant.


Identification and characterization of 2,087 candidate HKGs

2,087 candidate HKGs were first identified using 0's proportion in each dataset (Figure 1, Table S2). According to the functional classification by FunCat [28], genes with a variety of basic cellular functions were included in this list. In particular, proteins mediating protein fate (23%) and cellular transport (21%) had the highest frequency (Figure 2A). This is in contrast to the previously reported classifications of HKGs, in which metabolic and ribosomal proteins were enriched [27], [32]. We compared the frequency of genes with CpG islands in the upstream sequences of transcription start sites in HKGs relative to non-HKGs. Most HKGs (70%) were found to possess a CpG island within 1,000 bp from the transcription start site, consistent with previous studies [33], [34], while fewer CpG islands were found in the upstream sequences of non-HKGs (P<0.001) (Table 2). Mean expression level of HKGs was significantly higher than that of non-HKGs in all datasets (P<0.001) (Figure 2B), also consistent with previous work [27] (for detailed description on the expression of 2,087 HKGs in 4 datasets, see Text S2). CV values of the 2,087 genes showed a poor correlation between the four datasets, whereas gene expression showed a relatively higher correlation (Table S4).

Figure 2. Characterization of candidate HKGs identified in this study.

(A) The functional distribution of candidate HKGs was classified by FunCat. Among the 2,087 UniGene clusters, a total of 1,605 UniGene clusters which have GO terms under biological processes were classified according to their major functional categories using FunCat (version 2.0) by mapping the GO terms to FunCat categories. A total of 1,318 UniGene clusters were classified by FunCat. The number of UniGene clusters belonging to each category is presented in A. In some cases, UniGene clusters mapped to two or more FunCat categories. (B) Comparison of gene expression between candidate HKGs and non-HKGs in each dataset. Box and Whisker plots provide a simple description of a distribution of values by depicting the 25th and 75th percentile values as the bottom and top of a box, respectively. The Y axis represents the natural logarithm transformed mean gene expression levels. The median expression values of HKGs and non-HKGs are marked by horizontal lines in the boxes and the values are provided to the right of each box. *P<0.001.

Table 2. Comparison of the proportion of genes with CpG islands in sequences upstream of the transcription start site in HKGs and non-HKGs.

Identification and characterization of 13 nERGs

A total of 13 nERGs common to the four datasets (ZNF207, OAZ1, LUC7L2, CTBP1, TRIM27, GPBP1, ARL8B, UBQLN1, PAPOLA, CUL1, DIMT1L, FBXW2, and SPG21, Table 3) were identified from 2,087 HKGs. The highest proportion (5/13) (ZNF207, OAZ1, CTBP1, PAPOLA, FBXW2) of genes were genes involved in cellular metabolism. CpG islands were found in the upstream region from transcription start site of all 13 nERGs (Table S5).

The gene expression for each of the 13 nERGs showed a significant correlation between datasets (P<0.001, Table S6) with high Pearson correlation coefficients (>0.8), although the Spearman correlations of EST versus Affymetrix (0.374, P = 0.206) and shortSAGE versus Affymetrix (0.511, P = 0.076) were not significant. CVs between the datasets were not significantly correlated (P>0.05, Table S6).

To further confirm the suitability of the nERGs, gene copy number variations (CNVs) of the nERGs, which can affect the gene expression, were investigated by searching the Database of Genomic Variants. As shown in Table 4, only OAZ1 and DIMT1L were found to be located in a chromosome region where CNVs were reported, indicating their expression might be deregulated by genomic aberrations.

Comparison of tERGs and nERGs in dataset

We compared the 13 nERGs with 13 commonly used tERGs: GAPDH, ACTB, HPRT1, PPIA, B2M, TBP, HMBS, RPLP0, PGK1, GUSB, TFRC, H6PD, and ALAS1. The mean expression of the nERGs was relatively lower than the highly expressed tERGs, including GAPDH, ACTB, B2M, and PPIA, in all datasets and was expressed at levels similar to those tERGs, which had lower expression levels (Figure 3, Table S7). With respect to variation, most of the tERGs showed relatively higher variation than the nERGs and the mean CV values of nERGs were significantly lower than those of tERGs (P<0.001 in EST, ShortSAGE, and Affymetrix, P = 0.003 in LongSAGE, Table S8), supporting the notion that the identified nERGs are generally expressed more stably and at relatively lower levels than tERGs.

Figure 3. Comparison of gene expression and CV between nERGs and tERGs in each dataset.

(A) Comparison of gene expression between 13 nERGs and 13 tERGs. (B) Comparison of CV between 13 nERGs and 13 tERGs. Empty squares represent nERGs identified in this study and circles represent the tERGs.

A search for genomic variation of tERGs revealed that many of them (ACTB, GAPDH, PGK1, B2M, TBP, TFRC, ALAS1) were located in the genomic locus where CNVs are known, whereas only 2 genes among the nERGs were found in those regions (Table 4), suggesting that the higher expression variation of some tERGs might be due in part to CNVs.

Validation of nERGs by qRT-PCR

To validate both the level and stability of gene expression of nERGs selected from the four datasets, the expression of 13 nERGs and 8 tERGs was measured in a total of 108 human samples, including 26 frozen tissues, 22 cancer cell lines and 60 FFPE tissues, by qRT-PCR using Taqman probes (refer to Text S1 for an explanation for why 8 tERGs among 13 tERGs were chosen for qRT-PCR). Except PPIA, a small amplicon for each gene was designed for the reliable measurement of its expression, especially in FFPE tissues where RNA from these samples is frequently degraded. When the PCR efficiency of each gene was determined using the serial dilution method, each gene was amplified at 90–100% efficiency (Table 1). The CVs of Cp values confirmed that the between-assay precision in two or three repeats was within 5% (data not shown).

First, the expression profiles of these genes in each of the 48 samples, including frozen tissues and cancer cell lines, are presented in Figure 4A. The 13 nERGs were constitutively expressed in all 48 samples. Seven tERGs showed a wide range of expression (Cp: 13.52 to 29.39), but H6PD was not widely expressed in frozen tissues and this gene was consequently excluded from subsequent calculations. The Cp values of 13 nERGs ranged from 18.90 to 28.79 (Figure 4B). tERGs could be divided into highly expressed genes (median <20 cycles) and lowly expressed genes (median>23 cycles). Highly expressed genes included B2M, PPIA, GAPDH, ACTB and lowly expressed genes consisted of HPRT1, TBP and HMBS. Of the nERGs, 12 genes displayed an intermediate expression level between the highly expressed and the lowly expressed tERGs (Figure 4B). The mean expression level of nERGs was significantly lower than that of highly expressed tERGs, whereas it was significantly higher than that of lowly expressed tERGs (P<0.001, Table S9). ZNF207 was the most highly expressed gene, followed by UBQLN1 and CUL1. OAZ1 was the gene with the weakest expression.

Figure 4. The distribution of expression levels of 13 nERGs and 7 tERGs determined by qRT-PCR using Taqman probes in human samples.

(A) The distribution of mRNA levels of tested ERGs in 48 samples, including frozen tissues and cancer cell lines. (B) The mRNA levels of tERGs (red) and nERGs (blue) in Cp values over all 48 samples (left) and 60 FFPE tissues (right). Values are given as “Crossing point” (Cp) values. All measurements of qRT-PCR were repeated three times for frozen tissues and cell lines and twice for FFPE tissues and mean “crossing point” (Cp) values of repeats were calculated. Box and Whisker plots provide a simple description of the distribution of values by depicting the 25th and 75th percentile values as the bottom and top of the box, respectively. The median value is marked by a line within the box and the minimum and maximum values are depicted by error bars, or whiskers, protruding from the box.

We further investigated the expression of the 13 nERGs by qRT-PCR in 60 FFPE tissues to test whether the nERGs could be used in such tissues showing the significant degradation of mRNA. Except DIMT1L, the expression of all genes was measurable in all 60 samples. The Cp range for tERGs was 18.85 to 33.02 and for nERGs was 23.33 to 31.58 (Figure 4B). DIMT1L was not amplified in 5 samples and therefore was excluded from further expression stability analysis. The expression pattern in the FFPE tissues was similar to that of previous 48 samples (26 frozen tissues and 22 cancer lines) despite the discrepancy in sample types. Remarkably, PPIA expression which was detected at high level in frozen tissues/cell lines was observed at markedly decreased level in FFPE tissues. This observation might be due to the long amplicon size of PPIA (326 bp), whereas the amplicon size of other genes is small ranging from 60 to 110 bp (Table 1), indicating that small size of amplicon is required for the detection of gene expression in FFPE tissues in which RNA is frequently degraded.

Gene expression stability of nERGs

We first assessed the gene expression stability (detailed in Text S1) in 48 samples, including 26 frozen tissues and 22 cell lines based on qRT-PCR using two programs, geNorm and NormFinder. All genes tested displayed relatively high expression stability with low M values (<0.9), which were below the default limit of 1.5 in geNorm (Table 5a). GPBP1 and CUL1 were identified as the two most stable genes. B2M was the least stable gene and had the highest M value (0.888), followed by ACTB (0.843), HMBS (0.815), and GAPDH (0.793). When calculated by NormFinder, TBP and PAPOLA were the two most stable genes (i.e. having the lowest S values) (Table 5a). Similar to the results from geNorm, tERGs including B2M, ACTB, GAPDH, HMBS and HPRT1, were found to have less stable expression than nERGs. The mean gene expression stability values in nERGs by geNorm and NormFinder were significantly lower than those in tERGs in both 48 samples and 60 FFPE tissues (P<0.05, Tables S10). Both analyses demonstrated that most nERGs showed relatively higher expression stability compared to tERGs, suggesting that nERGs are more suitable for normalization. Moreover, even when gene expression stability was analyzed with relative expression level calculated by PCR efficiencies using the LinRegPCR program, most of nERGs showed a more stable expression than tERGs (data not shown).

Table 5. nERGs and tERGs ranked according to their expression stability, as calculated by the two programs, geNorm and NormFinder, based on qRT-PCR data in 48 frozen tissues/cell lines and 60 FFPE tissues.

Pearson correlation analysis revealed the higher concordance of both M and S values with CVs in the EST and shortSAGE (Table 6) than in the Affymetrix. High correlation between M and S (Pearson correlation: 0.953, P<0.001) was also observed, indicating that both analyses produced similar results.

Table 6. Correlation between gene expression stability of nERGs and tERGs from qRT-PCR data and CV from each dataset.

Consistently with frozen tissues and cell lines, both gene expression stability values demonstrated that most nERGs with low stability values are superior to tERGs in terms of expression stability in FFPE tissues (Table 5b), proving the usefulness of our nERGs as reliable measurements of gene expression in those tissues. GPBP1 and PAPOLA were the two least variable genes in geNorm and ARL8B and LUC7L2 were the top two ranked genes in NormFinder. However, in the analysis by each tissue, FBXW2, TRIM27, and CUL1 showed high stability values in breast, ovary, and stomach, respectively (Table 7), suggesting that they have high expression variation in each tissue. Also, S values by NormFinder in the ovary and stomach FFPE tissues were calculated based on the combination of intra- and inter-group variations between normal and tumor samples. The relatively high S values of TRIM27 in the ovary and CUL1 in the stomach suggest that their expression might be regulated in specific tumors compared to their normal tissues.

Table 7. nERGs and tERGs ranked according to their expression stability, as calculated by the two programs, geNorm and NormFinder, based on qRT-PCR data in each tissue type of FFPE tissues.

The optimal number of ERGs for normalization was determined using geNorm. In both the 48 human frozen and cell line samples and 60 FFPE tissues, the optimal number of nERGs required for normalization was fewer than when using tERGs (Figure 5). Four tERGs and three nERGs were calculated as the optimal number of ERGs needed in the 48 samples when using a V of 0.15 as the cut-off value [1]. In the FFPE samples, V2/3 was under 0.15 when using nERGs, suggesting that only two genes are sufficient for optimal normalization, whereas four of seven tERGs were necessary for accurate normalization. This indicates that fewer ERGs are required for optimal normalization when using our nERGs rather than using tERGs.

Figure 5. Optimal number of ERGs for normalization calculated by the geNorm program.

Variable V defines the pair-wise variation between two sequential normalization factors containing an increasing number of genes. For example, V2/3 indicates the variation of the normalization factor of two genes in relation to three genes. A large V indicates that the added gene should be included for calculation of the normalization factor. 0.15 was proposed as a cut-off value, below which the inclusion of an additional reference gene is not required. Pair-wise variation analysis to determine the number of ERGs required for accurate normalization was performed in 48 samples including human frozen tissues and cell lines (A), 60 FFPE tissues (B), 33 ovary FFPE tissues (C) and 17 stomach FFPE tissues (D). For each case, the analysis was done for total ERGs, tERGs and nERGs.


In the present study, we identified nERGs in human samples using a comparative analysis of different large datasets of human gene expression profiles, while previous attempts to identify nERGs that are superior to tERGs were limited to the analysis of a microarray [10], [18][21] or EST data [35].

Candidate HKGs were first selected and included 2,087 genes, which is a larger number of genes than previously identified by other groups [27], [32], [36]. Their characteristics, including high levels of expression [27], [36] and the prevalence of CpG islands in the promoter regions [33], [34], were in line with previous studies based on smaller numbers of HKGs, reflecting that “real” HKGs showing constitutive expressions in all tissues are enriched in our list. Thus, this list can be used as a reliable source for the study of HKGs.

The 13 nERGs further identified from candidate HKGs showed relatively lower CV and lower expression than most of the tERGs in all datasets. These findings were further confirmed by qRT-PCR using frozen tissues and cell lines. Generally, the expression of 13 nERGs was lower than the highly expressed tERGs and higher or similar to the weakly expressed tERGs. The expression stability values of the nERGs calculated by both programs also demonstrated that nERGs are generally more stably expressed than tERGs. Although there were slight differences in their rankings between the two programs, PAPOLA, CUL1, TBP, LUC7L2, GPBP1 and TRIM27 were found to be the genes with the most stable expression in 48 samples. The observation that TBP is one of the most stable genes is not surprising because relatively lower variability of TBP among traditional ERGs was already expected in datasets including EST and ShortSAGE (Figure 3). On the other hand, our data further supported the unsuitability of the most commonly used ERGs, like GAPDH, ACTB, for normalization, in line with previous works [7], [8], [14].

Our nERGs were also successfully validated by qRT-PCR in FFPE tissues. Despite the usefulness of archival FFPE tissue specimens in conjunction with clinical data, frequent degradation of RNA from FFPE tissues has been regarded as an obstacle in the gene expression analysis of those samples [37]. The expression of nERGs was measurable with reliable Cp values in all 60 FFPE tissues and most of the nERGs outperformed tERGs with respect to expression stability.

Furthermore, the expression level of target gene can be calculated relative to the expression level of one or multiple nERGs using standard curve or comparative Ct(Cp) method in quantitative gene expression analyses, including qRT-PCR. The most suitable ERG or ERGs in the designed experiment can be selected among 13 nERGs or combination of tERGs and 13 nERGs based on gene expression stability values calculated by the geNorm and/or NormFinder program. Recently developed PCR array, high throughput gene expression measurement using qRT-PCR, also requires more suitable ERGs than conventional tERGs for accurate quantification of gene expression [38]. Recently, normalization using the geometric mean of multiple ERGs has been considered to be more accurate for normalization [1], especially in situations when no optimal ERG has been identified [31]. Moreover, the use of multiple ERGs is more important in the expression analysis using FFPE tissues, where poor RNA quality causes additional variations in gene expression [39]. Five tERGs are included for the normalization of gene expression in the Oncotype DX™ assay (Genomic Health) which is a clinical validated multigene qRT-PCR test using FFPE tissues to predict recurrence of breast cancer [40]. However, the superiority of our nERGs was demonstrated by showing that a fewer number of genes is required for normalization when using the nERGs as compared with using the tERGs. Considering the limited amounts of samples or experiment cost, nERGs also outperformed previous tERGs when multiple genes are needed for reliable normalization. Taken together, nERGs will be a better choice for more reliable normalization in gene expression analysis using FFPE tissues.

The superiority of nERGs over tERGs is based in their lower expression level as well as their high expression stability. Use of ERGs with expression levels similar to target genes is recommended so that the comparisons fall on the same linear scale [19], [41], [42]. Therefore, our nERGs, with relatively lower expression, rather than GAPDH or ACTB showing high expression, can be better candidates for normalization of a wide range of genes, including weakly expressed genes. This is significant given that the majority of transcripts in human tissues are expressed in low abundance [36].

Remarkably, we observed no significant correlation between the stability values calculated from qRT-PCR data and CV in the Affymetrix data, which is in contrast to the significant correlation between stability values and CV in the EST and SAGE dataset. This suggests that EST or shortSAGE may be better sources for exploring nERGs rather than microarrays, which have traditionally been used as a source for screening ERG and supports our initial hypothesis that the microarray might not be a good source for ERGs.

Furthermore, nERGs identified here might be used as reference for relative measurements of gene amplification, which is a frequent genetic alteration leading to unregulated gene expression in cancer [43]. As the relatively constant expression of these genes in both normal and tumor tissues provides the possibility that these genes are located in a chromosomal region in which no genetic alterations are found in human tumors, we investigated the genomic CNVs of nERGs using publicly available databases. Most nERGs, except OAZ1 and DIMT1L, were located in genomic regions where CNVs were not reported, whereas many tERGs were located in regions with CNVs. The relatively lower expression stability of OAZ1 and DIMT1L, as well as tERGs like GAPDH and ACTB, might be explained by these genomic aberrations. However, the suitability of nERGs as a reference for the relative measurement of gene amplification remains to be further investigated and validated through experiments. Meanwhile, even genes with genomic variations can be used for the normalization in gene expression, provided that their expression is not affected by genomic aberrations.

In conclusion, we have identified a set of candidate HKGs and nERGs based on a comparative analysis of EST, SAGE, and Affymetrix datasets. This is the first study using three different platforms to identify nERGs, and most of the 13 ERGs identified in these datasets were demonstrated to be more stably expressed than tERGs through the validation using qRT-PCR in large human samples including cell lines, frozen tissues and FFPE tissues. Moreover, these nERGs were expressed at relatively lower levels than most commonly used high expressing tERGs, making them more suitable for normalization of transcripts from a wide range of expression levels. We have also shown that fewer ERGs are required for accurate normalization using nERGs than using tERGs, especially in FFPE tissues when the use of multiple ERGs is required.

Supporting Information

Text S1.

Supplementary materials and methods


(0.09 MB DOC)

Text S2.

Expression of 2,087 candidate HKGs in the four datasets


(0.04 MB DOC)

Table S1.

List of 567 samples including 13 tissue types in the HG-U133 array used in this study


(0.07 MB DOC)

Table S2.

A list of 2,087 candidate housekeeping genes


(0.86 MB XLS)

Table S3.

Human frozen tissues and cancer cell lines used in qRT-PCR


(0.08 MB DOC)

Table S4.

Correlation of gene expression and CV of 2,087 candidate HKGs between the four datasets


(0.05 MB DOC)

Table S5.

CpG islands analysis in the upstream region from transcription start site of 13 nERGs


(0.04 MB DOC)

Table S6.

Correlation of gene expression and CV of 13 nERGs between the four datasets


(0.05 MB DOC)

Table S7.

tERGs used in this study


(0.05 MB DOC)

Table S8.

Comparison of CV between nERGs and tERGs in the dataset


(0.04 MB DOC)

Table S9.

Comparison of Cp values between nERGs and tERGs in qRT-PCR


(0.04 MB DOC)

Table S10.

Comparison of gene expression stability values between nERGs and tERGs


(0.04 MB DOC)


We are very grateful to Dr. Soonmyung Paik (Division of Pathology, National Surgical Adjuvant Breast and Bowel Project, Pittsburgh, PA) and Dr. Jonathan R. Pollack (Department of Pathology, Standford University School of Medicine, Stanford, CA, USA) for thoughtful suggestions and critical comments.

Author Contributions

Conceived and designed the experiments: MJK YKS. Performed the experiments: MJK SEK. Analyzed the data: MJK EO SL MRR YHI. Contributed reagents/materials/analysis tools: YL YLC SSK. Wrote the paper: MJK. Gave advice on the statistical analysis: TP.


  1. 1. Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, et al. (2002) Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol 3: RESEARCH0034.
  2. 2. Thellin O, Zorzi W, Lakaye B, De Borman B, Coumans B, et al. (1999) Housekeeping genes as internal standards: use and limits. J Biotechnol 75: 291–295.
  3. 3. Bustin SA (2000) Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays. J Mol Endocrinol 25: 169–193.
  4. 4. Tricarico C, Pinzani P, Bianchi S, Paglierani M, Distante V, et al. (2002) Quantitative real-time reverse transcription polymerase chain reaction: normalization to rRNA or single housekeeping genes is inappropriate for human tissue biopsies. Anal Biochem 309: 293–300.
  5. 5. Rubie C, Kempf K, Hans J, Su T, Tilton B, et al. (2005) Housekeeping gene variability in normal and cancerous colorectal, pancreatic, esophageal, gastric and hepatic tissues. Mol Cell Probes 19: 101–109.
  6. 6. Schmittgen TD, Zakrajsek BA (2000) Effect of experimental treatment on housekeeping gene expression: validation by real-time, quantitative RT-PCR. J Biochem Biophys Methods 46: 69–81.
  7. 7. Zhong H, Simons JW (1999) Direct comparison of GAPDH, beta-actin, cyclophilin, and 28S rRNA as internal standards for quantifying RNA levels under hypoxia. Biochem Biophys Res Commun 259: 523–526.
  8. 8. Selvey S, Thompson EW, Matthaei K, Lea RA, Irving MG, et al. (2001) Beta-actin–an unsuitable internal control for RT-PCR. Mol Cell Probes 15: 307–311.
  9. 9. Lee PD, Sladek R, Greenwood CM, Hudson TJ (2002) Control genes and variability: absence of ubiquitous reference transcripts in diverse mammalian expression studies. Genome Res 12: 292–297.
  10. 10. Hamalainen HK, Tubman JC, Vikman S, Kyrola T, Ylikoski E, et al. (2001) Identification and validation of endogenous reference genes for expression profiling of T helper cell differentiation by quantitative real-time RT-PCR. Anal Biochem 299: 63–70.
  11. 11. Dheda K, Huggett JF, Chang JS, Kim LU, Bustin SA, et al. (2005) The implications of using an inappropriate reference gene for real-time reverse transcription PCR data normalization. Anal Biochem 344: 141–143.
  12. 12. de Kok JB, Roelofs RW, Giesendorf BA, Pennings JL, Waas ET, et al. (2005) Normalization of gene expression measurements in tumor tissues: comparison of 13 endogenous control genes. Lab Invest 85: 154–159.
  13. 13. Huggett J, Dheda K, Bustin S, Zumla A (2005) Real-time RT-PCR normalisation; strategies and considerations. Genes Immun 6: 279–284.
  14. 14. Goidin D, Mamessier A, Staquet MJ, Schmitt D, Berthier-Vergnes O (2001) Ribosomal 18S RNA prevails over glyceraldehyde-3-phosphate dehydrogenase and beta-actin genes as internal standard for quantitative comparison of mRNA levels in invasive and noninvasive human melanoma cell subpopulations. Anal Biochem 295: 17–21.
  15. 15. Haller F, Kulle B, Schwager S, Gunawan B, von Heydebreck A, et al. (2004) Equivalence test in quantitative reverse transcription polymerase chain reaction: confirmation of reference genes suitable for normalization. Anal Biochem 335: 1–9.
  16. 16. Ohl F, Jung M, Xu C, Stephan C, Rabien A, et al. (2005) Gene expression studies in prostate cancer tissue: which reference gene should be selected for normalization? J Mol Med 83: 1014–1024.
  17. 17. Radonic A, Thulke S, Mackay IM, Landt O, Siegert W, et al. (2004) Guideline to reference gene selection for quantitative real-time PCR. Biochem Biophys Res Commun 313: 856–862.
  18. 18. Hoerndli FJ, Toigo M, Schild A, Gotz J, Day PJ (2004) Reference genes identified in SH-SY5Y cells using custom-made gene arrays with validation by quantitative polymerase chain reaction. Anal Biochem 335: 30–41.
  19. 19. Czechowski T, Stitt M, Altmann T, Udvardi MK, Scheible WR (2005) Genome-wide identification and testing of superior reference genes for transcript normalization in Arabidopsis. Plant Physiol 139: 5–17.
  20. 20. Jin P, Zhao Y, Ngalame Y, Panelli MC, Nagorsen D, et al. (2004) Selection and validation of endogenous reference genes using a high throughput approach. BMC Genomics 5: 55.
  21. 21. Shulzhenko N, Yambartsev A, Goncalves-Primo A, Gerbase-DeLima M, Morgun A (2005) Selection of control genes for quantitative RT-PCR based on microarray data. Biochem Biophys Res Commun 337: 306–312.
  22. 22. Lipshutz RJ, Fodor SP, Gingeras TR, Lockhart DJ (1999) High density synthetic oligonucleotide arrays. Nat Genet 21: 20–24.
  23. 23. Haverty PM, Hsiao LL, Gullans SR, Hansen U, Weng Z (2004) Limited agreement among three global gene expression methods highlights the requirement for non-global validation. Bioinformatics 20: 3431–3441.
  24. 24. van Ruissen F, Ruijter JM, Schaaf GJ, Asgharnegad L, Zwijnenburg DA, et al. (2005) Evaluation of the similarity of gene expression data estimated with SAGE and Affymetrix GeneChips. BMC Genomics 6: 91.
  25. 25. Sun M, Zhou G, Lee S, Chen J, Shi RZ, et al. (2004) SAGE is far more sensitive than EST for detecting low-abundance transcripts. BMC Genomics 5: 1–4.
  26. 26. Lee S, Jo M, Lee J, Koh SS, Kim S (2007) Identification of novel universal housekeeping genes by statistical analysis of microarray data. J Biochem Mol Biol 40: 226–231.
  27. 27. Eisenberg E, Levanon EY (2003) Human housekeeping genes are compact. Trends Genet 19: 362–365.
  28. 28. Ruepp A, Zollner A, Maier D, Albermann K, Hani J, et al. (2004) The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes. Nucleic Acids Res 32: 5539–5545.
  29. 29. Wang Y, Leung FC (2004) An evaluation of new criteria for CpG islands in the human genome as gene markers. Bioinformatics 20: 1170–1177.
  30. 30. Ramakers C, Ruijter JM, Deprez RH, Moorman AF (2003) Assumption-free analysis of quantitative real-time polymerase chain reaction (PCR) data. Neurosci Lett 339: 62–66.
  31. 31. Andersen CL, Jensen JL, Orntoft TF (2004) Normalization of real-time quantitative reverse transcription-PCR data: a model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Res 64: 5245–5250.
  32. 32. Hsiao LL, Dangond F, Yoshida T, Hong R, Jensen RV, et al. (2001) A compendium of gene expression in normal human tissues. Physiol Genomics 7: 97–104.
  33. 33. Larsen F, Gundersen G, Lopez R, Prydz H (1992) CpG islands as gene markers in the human genome. Genomics 13: 1095–1107.
  34. 34. Ponger L, Duret L, Mouchiroud D (2001) Determinants of CpG islands: expression in early embryo and isochore structure. Genome Res 11: 1854–1860.
  35. 35. Coker JS, Davies E (2003) Selection of candidate housekeeping controls in tomato plants using EST data. Biotechniques 35: 740–742, 744, 746 passim.
  36. 36. Warrington JA, Nair A, Mahadevappa M, Tsyganskaya M (2000) Comparison of human adult and fetal expression and identification of 535 housekeeping/maintenance genes. Physiol Genomics 2: 143–147.
  37. 37. Bibikova M, Talantov D, Chudin E, Yeakley JM, Chen J, et al. (2004) Quantitative gene expression profiling in formalin-fixed, paraffin-embedded tissues using universal bead arrays. Am J Pathol 165: 1799–1807.
  38. 38. Spurgeon SL, Jones RC, Ramakrishnan R (2008) High throughput gene expression measurement with real time PCR in a microfluidic dynamic array. PLoS ONE 3: e1662.
  39. 39. Glenn ST, Jones CA, Liang P, Kaushik D, Gross KW, et al. (2007) Expression profiling of archival renal tumors by quantitative PCR to validate prognostic markers. Biotechniques 43: 639–640, 642-633, 647.
  40. 40. Paik S, Shak S, Tang G, Kim C, Baker J, et al. (2004) A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 351: 2817–2826.
  41. 41. Szabo A, Perou CM, Karaca M, Perreard L, Quackenbush JF, et al. (2004) Statistical modeling for selecting housekeeper genes. Genome Biol 5: R59.
  42. 42. RocheAppliedScience (2005) Technical Note No. LC 15/2005.
  43. 43. Lehmann U, Glockner S, Kleeberger W, von Wasielewski HF, Kreipe H (2000) Detection of gene amplification in archival breast cancer specimens by laser-assisted microdissection and quantitative real-time polymerase chain reaction. Am J Pathol 156: 1855–1864.