Figure 1.
Screenshot of the first 40 lines from reference gene lists.
The complete set of MeSH-classified reference gene lists from 9090 array samples is given in the hyperlinked Supporting Information S1 spreadsheet.
Figure 2.
Graph of the ratio of the number of sets in which a gene has a coefficient of variation (CV) less than a threshold (t) to the number of sets in which the gene is observed.
Graphs were plotted for CV thresholds t = 0.5, t = 0.1, t = 0.05 and t = 0.01. Only genes with a percentage of occurrence (PO) of at least 50% of the total sets are included. The y-axis indicates the number of genes having a ratio greater than the ratio value at the corresponding point on the x-axis. This function is described in the Methods section, with r on the x-axis and fPO(r) on the y-axis. The black curve represents housekeeping genes, while the grey curves show 5 random sets of genes excluding the housekeeping genes. The random sets of genes have the same mean rank distribution as the housekeeping genes.
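The counting function in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `stable_ratio` and `f_po`, the use of `None` for "gene not observed in this set", and the example genes and CV values are all assumptions made for illustration.

```python
def stable_ratio(cvs, t):
    """Fraction of the sets where the gene is observed in which its CV is below t."""
    observed = [cv for cv in cvs if cv is not None]  # None = not observed (assumption)
    if not observed:
        return None
    return sum(cv < t for cv in observed) / len(observed)


def f_po(gene_cvs, t, r, total_sets, min_po=0.5):
    """Number of genes, observed in at least min_po of all sets, whose ratio exceeds r."""
    count = 0
    for cvs in gene_cvs.values():
        n_observed = sum(cv is not None for cv in cvs)
        if n_observed / total_sets < min_po:
            continue  # gene fails the percentage-of-occurrence (PO) filter
        ratio = stable_ratio(cvs, t)
        if ratio is not None and ratio > r:
            count += 1
    return count


# Hypothetical CV values for three genes across four data sets.
example_cvs = {
    "GENE_A": [0.02, 0.03, 0.04, 0.20],  # ratio at t = 0.05 is 3/4
    "GENE_B": [0.30, 0.04, None, 0.50],  # observed in 3 sets, ratio is 1/3
    "GENE_C": [0.01, None, None, None],  # observed in 1 of 4 sets, fails PO >= 50%
}

for r in (0.0, 0.3, 0.5, 0.8):
    print(r, f_po(example_cvs, t=0.05, r=r, total_sets=4))
```

Sweeping r from 0 to 1 and plotting the returned counts would give one curve of the kind shown in the figure; repeating this for each threshold t gives the separate panels.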
Figure 3.
Graph of the ratio of the number of sets in which a gene has a coefficient of variation less than 0.05 to the number of sets in which the gene is observed.
The gene is observed in at least (a) 75%, (b) 50%, (c) 25% and (d) 5% of the total sets. The y-axis indicates the number of genes having a ratio greater than the ratio value at the corresponding point on the x-axis. The red curve represents housekeeping genes, while the curves in other colors show 5 random sets of genes excluding the housekeeping genes. The random sets of genes have the same mean rank distribution as the housekeeping genes.
Figure 4.
Receiver operating characteristic (ROC) curve of the simple threshold-based classifier.
ROC curves of the simple threshold classifier over all datasets and some MeSH categories. The housekeeping gene set of Eisenberg et al. [22] is used as the ground truth. The simple threshold classifier labels every gene with a CV value below a threshold as a housekeeping gene. Varying the CV threshold changes the stringency of the classifier, and the ROC curve is plotted from the resulting values. Sensitivity is the ratio of correctly classified ground-truth genes to all ground-truth genes, and specificity is the ratio of correctly identified non-housekeeping genes to all non-housekeeping genes. The complete set of MeSH-classified ROC curves is given in the hyperlinked Supporting Information S3 spreadsheet.
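The threshold classifier and the two rates defined in this caption can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `roc_points`, the gene names, and the CV values are hypothetical, and each gene is assumed to have a single CV value.

```python
def roc_points(cv_by_gene, housekeeping, thresholds):
    """Return (threshold, sensitivity, specificity) for each CV threshold.

    A gene is called housekeeping when its CV is strictly below the threshold.
    """
    positives = [g for g in cv_by_gene if g in housekeeping]      # ground-truth genes
    negatives = [g for g in cv_by_gene if g not in housekeeping]  # all other genes
    if not positives or not negatives:
        raise ValueError("need at least one gene in each class")
    points = []
    for t in thresholds:
        true_pos = sum(cv_by_gene[g] < t for g in positives)
        true_neg = sum(cv_by_gene[g] >= t for g in negatives)
        points.append((t, true_pos / len(positives), true_neg / len(negatives)))
    return points


# Hypothetical data: two ground-truth housekeeping genes and two other genes.
example_cv = {"HK1": 0.02, "HK2": 0.08, "X1": 0.04, "X2": 0.30}
example_truth = {"HK1", "HK2"}

for t, sens, spec in roc_points(example_cv, example_truth, [0.01, 0.05, 0.1, 0.5]):
    print(f"t={t}: sensitivity={sens}, specificity={spec}")
```

Plotting sensitivity against 1 − specificity over a fine grid of thresholds would trace out the ROC curve.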
Table 1.
MeSH groups and number of NCBI-GEO data sets in each group.
Figure 5.
Graphs of coefficients of variation and relative expression levels of 17 reference genes in RT-qPCR.
The coefficient of variation (CV) was calculated from the relative expression (efficiency^(−ΔCq)) of each housekeeping gene in (A) Liver, (B) Breast, (C) Colon and (D) All cancer cell lines.
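As a hedged illustration of the calculation named in this caption, the sketch below converts ΔCq values to relative expression and takes the CV as standard deviation divided by mean. The function names and the example numbers are hypothetical. The use of the sample standard deviation and an amplification efficiency of 2.0 are assumptions, since the caption does not specify either.

```python
import statistics


def relative_expression(efficiency, delta_cq):
    """Relative expression as efficiency raised to the power of -delta_cq."""
    return efficiency ** (-delta_cq)


def coefficient_of_variation(values):
    """CV as sample standard deviation over mean (sample stdev is an assumption)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical delta-Cq values for one gene across three cell lines,
# assuming an amplification efficiency of 2.0 (perfect doubling per cycle).
example_delta_cq = [0.0, 1.0, 2.0]
example_expression = [relative_expression(2.0, d) for d in example_delta_cq]
print(example_expression)
print(coefficient_of_variation(example_expression))
```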
Table 2.
Stability measures of the reference genes that were used in RT-qPCR analysis.
Figure 6.
Stability analysis of reference genes in RT-qPCR based on CV, geNorm and NormFinder.
Genes are ranked by stability using (A) CV, (B) geNorm and (C) NormFinder. White, light gray, dark gray and black bars represent Liver, Breast, Colon and All cancer cell lines, respectively.
Figure 7.
Stability analysis of reference genes in tissue-specific cell lines in RT-qPCR.
Stability analysis in (A) Liver, (B) Breast, (C) Colon and (D) All cancer cell lines. NormFinder and geNorm results are shown in dark gray and light gray, respectively.
Table 3.
Stability order of 17 housekeeping genes based on CV, NormFinder and geNorm analysis.