Table 1.
Number of RNA-seq and microarray experiments in ArrayExpress and GEO databases.
Fig 1.
Averaged absolute gene expression correlations (RPA mode).
The plots show the average absolute gene expression correlations between different RNA-seq data processing methods and the microarray. Different points correspond to different numbers of top expressed genes. The correlations are averaged over all samples in the corresponding data sets: (a) the Marioni et al. data set, (b) the LAML data set. The error bars correspond to standard errors of the mean. For the LAML data set, the standard errors are so small that the top and bottom error bars merge in the plot.
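The averaging described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, expression values are assumed to be on a comparable (for example, log) scale already, and ranking genes by their microarray expression to pick the "top expressed" fraction is an assumption, since the caption does not state which platform defines the ranking.

```python
import math
from statistics import mean, stdev

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def top_fraction_correlation(rnaseq, microarray, fraction):
    """Correlation for one sample, restricted to the top expressed genes.

    rnaseq, microarray: dicts mapping gene -> expression value.
    Assumption: genes are ranked by microarray expression.
    """
    shared = set(rnaseq) & set(microarray)
    ranked = sorted(shared, key=lambda g: microarray[g], reverse=True)
    n = max(2, round(len(ranked) * fraction))  # need at least 2 genes
    top = ranked[:n]
    return pearson([rnaseq[g] for g in top], [microarray[g] for g in top])

def mean_and_sem(values):
    """Mean and standard error of the mean over per-sample correlations."""
    m = mean(values)
    if len(values) < 2:
        return m, float("nan")  # SEM is undefined for a single value
    return m, stdev(values) / math.sqrt(len(values))
```

Calling `top_fraction_correlation` once per sample and passing the resulting list to `mean_and_sem` gives one point and one error bar per fraction of top expressed genes.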
Fig 2.
Absolute gene expression correlation scatter plots (RPA mode).
The plots compare the PREBS vs microarray correlations with the MMSEQ vs microarray correlations for all samples in the LAML data set. Each point represents one sample. Two different percentages of top expressed genes are used: (a) 10%, (b) 60%.
Fig 3.
Absolute gene expression scatter plots (RPA mode).
The gene expression values from three different RNA-seq data processing methods (MMSEQ, read counting and PREBS) are plotted against gene expression values from the microarray. Plots are shown for only a single sample in each data set. The top row shows results for the kidney sample from the Marioni et al. data set and the bottom row for sample 2803 from the LAML data set. The figures show the 60% most highly expressed genes. The legend gives the Pearson correlation (r) and the number of genes (n).
Fig 4.
Retrieval accuracy of coupled RNA-seq–microarray experiments (RPA mode).
The plot shows the average precision of retrieving the corresponding microarray experiment from a large collection, based on correlation with expression estimates from RNA-seq, as a function of the number of genes used as the signature. Accuracy is measured as the fraction of samples that have the largest correlation with their true pair.
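The accuracy measure in this caption can be illustrated with a short sketch. It assumes that each profile is a list of expression values already restricted to the same signature genes in the same order; `retrieval_accuracy` and its arguments are hypothetical names, not part of any published tool.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def retrieval_accuracy(queries, collection, true_pair):
    """Fraction of RNA-seq samples whose best match is their true pair.

    queries:    dict RNA-seq sample id -> expression profile
    collection: dict microarray experiment id -> expression profile
    true_pair:  dict RNA-seq sample id -> id of its paired microarray
    """
    hits = 0
    for sample, profile in queries.items():
        # Retrieve the microarray experiment with the largest correlation.
        best = max(collection, key=lambda m: pearson(profile, collection[m]))
        if best == true_pair[sample]:
            hits += 1
    return hits / len(queries)
```

Repeating the call with profiles restricted to signatures of different sizes would give one accuracy value per signature size.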
Fig 5.
Averaged differential gene expression correlations (RPA mode).
The plots show the average log2 fold change correlations between different RNA-seq data processing methods and the microarray. Different points correspond to different numbers of top expressed genes. The correlations are averaged over all samples in the corresponding data sets: (a) the Marioni et al. data set, (b) the LAML data set. The error bars in the LAML data set plot correspond to standard errors of the mean, although the errors are so small that the top and bottom bars merge. Error bars for the Marioni data set plot could not be displayed because log2 fold change values were calculated for only one pair of samples.
Fig 6.
Differential expression scatter plots (RPA mode).
The log2 fold change values for differential expression estimated using different RNA-seq analysis methods are plotted against the corresponding microarray log2 fold change values. The figures show the 60% most highly expressed genes. Plots are shown for only a single sample pair in each data set. The top row shows the fold changes between the kidney and liver samples from the Marioni data set, while the bottom row shows the changes between samples 2803 and 2805 from the LAML data set. The legend gives the Pearson correlation (r) and the number of genes (n).
Fig 7.
Venn diagrams of differentially expressed genes (RPA mode).
The Venn diagrams illustrate the overlap between the lists of genes that are called differentially expressed by different methods. We call a gene significantly differentially expressed if the absolute value of its log2 fold change is higher than 1.5. The pairs of samples analysed are the same as in Fig 6 (kidney and liver for the Marioni data set, 2803 and 2805 for the LAML data set).
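The thresholding and overlap counting behind such a diagram can be sketched as follows. The function names and the example method labels are illustrative assumptions; only the 1.5 cutoff on the absolute log2 fold change comes from the caption.

```python
from collections import Counter

def de_genes(log2fc, threshold=1.5):
    """Genes whose absolute log2 fold change is higher than the threshold.

    log2fc: dict mapping gene -> log2 fold change for one method.
    """
    return {gene for gene, fc in log2fc.items() if abs(fc) > threshold}

def venn_counts(calls):
    """Count genes in each region of the Venn diagram.

    calls: dict mapping method name -> set of genes it calls
           differentially expressed.
    Returns a Counter keyed by the frozenset of methods that call a gene.
    """
    counts = Counter()
    for gene in set().union(*calls.values()):
        region = frozenset(m for m, genes in calls.items() if gene in genes)
        counts[region] += 1
    return counts
```

For example, `venn_counts({"microarray": de_genes(fc_array), "PREBS": de_genes(fc_prebs)})` would return the number of genes called by both methods and by each method alone, where `fc_array` and `fc_prebs` are hypothetical per-method fold change dictionaries.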
Fig 8.
Averaged cross-platform differential gene expression correlations (RPA mode).
The plots show average cross-platform differential gene expression correlations between different RNA-seq data processing methods and the microarray. Different points correspond to different numbers of top expressed genes. The correlations are averaged over all possible pairs of samples in the corresponding data sets: (a) the Marioni et al. data set, (b) the LAML data set.
Fig 9.
Original microarray probe set gene expression scatter plots (RPA mode).
The plots show (a) estimated absolute expression values and (b) estimated log2 fold change values for the original microarray probe sets. The plots show the 60% most highly expressed genes in the Marioni data set.