Table 1.
Variability of most commonly used control genes in Leucégène and combined TCGA RNA-seq data sets.
Table 2.
Variability of genes identified as stable in microarray experiments, in Leucégène and combined TCGA RNA-seq data sets.
Table 3.
Selection of candidate control genes based on Leucégène RNA-seq data.
Figure 1.
Distribution of coefficient of variation of control genes in relation to all genes in combined TCGA RNA-seq data.
Mean expression represents the average of all RPKM values for a given gene across the combined TCGA data set (1933 samples). Coefficient of variation equals the standard deviation divided by the mean RPKM. Each dot represents a single gene: small grey dots represent the entire transcriptome; dark and light green boxes represent new control genes with mean expression greater than or less than 100 RPKM, respectively; red boxes represent the indicated standard control genes. Curved blue lines represent the 5th, 25th, 50th and 75th percentiles of the coefficient of variation at a given expression level (from darkest to lightest), computed over windows of 2000 ranked genes centered on a given mean RPKM value.
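The coefficient of variation defined in this legend can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `coefficient_of_variation` is hypothetical, and the use of the population standard deviation is an assumption, since the legend does not say which estimator was used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(rpkm_values):
    """Return standard deviation / mean for one gene's RPKM values.

    Assumption: population standard deviation (pstdev); the legend
    does not specify population versus sample standard deviation.
    """
    m = mean(rpkm_values)
    if m == 0:
        raise ValueError("CV is undefined when the mean RPKM is zero")
    return pstdev(rpkm_values) / m


# Hypothetical RPKM values for one gene across eight samples.
example_rpkm = [2, 4, 4, 4, 5, 5, 7, 9]
example_cv = coefficient_of_variation(example_rpkm)
```

A gene with identical RPKM in every sample has a CV of zero; larger values indicate more variable expression relative to the mean.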
Table 4.
Variability of select candidate endogenous control genes in combined TCGA data sets.
Figure 2.
Average expression consistency of control genes in qRT-PCR.
Average expression consistency (M) was calculated with the GeNorm algorithm [18] from qRT-PCR measurements of the indicated control genes on a panel of 14 leukemia samples and one cord blood sample. Lower M values indicate more consistent expression across the samples tested.
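As a rough illustration of the M value, the sketch below computes, for each candidate gene, the mean of the standard deviations of its pairwise Ct differences with every other candidate. It is a single-pass simplification and not the GeNorm software itself: the function name `genorm_m` is hypothetical, Ct differences stand in for log2 expression ratios (which assumes about 100% amplification efficiency), the sample standard deviation is an assumption, and the stepwise exclusion of the least stable gene is omitted.

```python
from statistics import mean, stdev


def genorm_m(ct_by_gene):
    """Return {gene: M} from a dict mapping gene -> list of Ct values.

    All lists must hold the same samples in the same order.
    Assumption: Ct differences approximate log2 expression ratios.
    """
    genes = list(ct_by_gene)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            # Per-sample Ct difference between the two candidate genes.
            diffs = [a - b for a, b in zip(ct_by_gene[gene], ct_by_gene[other])]
            pairwise_sd.append(stdev(diffs))
        # M is the average pairwise variation against all other candidates.
        m_values[gene] = mean(pairwise_sd)
    return m_values


# Hypothetical Ct values for three candidate genes in three samples.
example_ct = {
    "geneA": [20, 21, 22],
    "geneB": [22, 23, 24],
    "geneC": [20, 23, 20],
}
example_m = genorm_m(example_ct)
```

In this toy input, geneA and geneB track each other exactly across samples, so they receive a lower M than geneC.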
Figure 3.
Correlation between RPKM and delta Ct of CD33 calculated with different control genes.
dCt represents the difference between the Ct value of CD33 and that of the indicated control gene, for a given leukemic sample, measured by qRT-PCR. RPKM is plotted on a log-2 scale and represents the Reads Per Kilobase of transcript per Million mapped reads obtained for each leukemic sample by RNA-seq. ρ represents the Spearman correlation coefficient between the RPKM and the dCt obtained with the indicated control gene.
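The quantities in this legend can be reproduced with a short sketch: dCt is the target Ct minus the control Ct, and ρ is the Pearson correlation of the ranks of log2 RPKM and dCt. The helper names (`delta_ct`, `average_ranks`, `spearman`) and the input values are hypothetical; ties receive average ranks, which is the usual convention but is not stated in the source.

```python
from math import log2, sqrt


def delta_ct(ct_target, ct_control):
    """dCt per sample: Ct of the target gene minus Ct of the control gene."""
    return [t - c for t, c in zip(ct_target, ct_control)]


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for four samples.
example_dct = delta_ct([25, 24, 22, 20], [20, 20, 20, 20])
example_log2_rpkm = [log2(v) for v in [1, 2, 8, 32]]
example_rho = spearman(example_log2_rpkm, example_dct)
```

Because a lower Ct means more template, a well-behaved control gene should give a strongly negative ρ between RPKM and dCt, as in this toy input.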
Figure 4.
Comparison of EIF4H gene expression values calculated with GAPDH or HNRNPL.
RQ represents the relative quantification of EIF4H determined by qRT-PCR, calculated using the ddCt method with either GAPDH or HNRNPL as the control gene, relative to the CD34+ cord blood (CB) sample. The X axis indicates the leukemic sample ID. CV (expressed as a percentage) indicates the coefficient of variation and equals the standard deviation divided by the mean RQ of EIF4H calculated using the indicated control gene. MFC (maximum fold change) represents the maximum RQ value divided by the minimum RQ value.
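The ddCt calculation and the two summary statistics in this legend can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, RQ is taken as 2 raised to the power of minus ddCt (which assumes about 100% amplification efficiency), and the sample standard deviation is assumed for the CV.

```python
from statistics import mean, stdev


def relative_quantity(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """RQ by the ddCt method, relative to a calibrator sample.

    Assumption: amplification efficiency near 100%, so RQ = 2 ** -ddCt.
    """
    dct_sample = ct_target - ct_control
    dct_calibrator = ct_target_cal - ct_control_cal
    ddct = dct_sample - dct_calibrator
    return 2 ** (-ddct)


def cv_percent(rq_values):
    """Coefficient of variation of RQ values, expressed as a percentage."""
    return 100 * stdev(rq_values) / mean(rq_values)


def mfc(rq_values):
    """MFC as defined in the legend: maximum RQ divided by minimum RQ."""
    return max(rq_values) / min(rq_values)


# Hypothetical Ct values: (target, control) per sample; calibrator listed last.
calibrator = (25, 20)
samples = [(24, 20), (25, 20), (23, 20)]
example_rq = [relative_quantity(t, c, *calibrator) for t, c in samples]
```

Running the same calculation once per control gene and comparing `cv_percent` and `mfc` between the two sets of RQ values mirrors the comparison summarized in the figure.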