Fig 1.

Mean-dispersion plots for the human RNA-Seq dataset.

The left panel is for the control group and the right panel is for the E2-treated group. Each group has seven biological replicates. The sequencing depth for this dataset is 30 million. Each point on the plots represents one gene, with its method-of-moments (MOM) dispersion estimate (ϕ^MOM) on the y-axis and its estimated relative mean frequency on the x-axis. The fitted curves for five dispersion models are superimposed on the scatter plot.
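
As a rough illustration of what a method-of-moments dispersion estimate can look like, the sketch below assumes the usual negative binomial parameterization Var = μ + ϕμ² and equal library sizes across replicates, so that ϕ is estimated as (s² − m) / m² from the sample mean m and sample variance s². The function names and the equal-library-size simplification are assumptions made for this example; the article's own estimator may adjust for library size differently.

```python
from statistics import mean, variance


def mom_dispersion(counts):
    """Method-of-moments NB dispersion for one gene.

    Assumes Var = mu + phi * mu^2 and equal library sizes (a simplification).
    """
    m = mean(counts)
    if m <= 0:
        return float("nan")  # dispersion is undefined for an all-zero gene
    s2 = variance(counts)  # unbiased sample variance
    return (s2 - m) / (m * m)


def relative_mean_frequency(counts, library_size):
    """Mean count divided by a common library size (hypothetical helper)."""
    return mean(counts) / library_size


# One gene observed in three replicates.
phi_hat = mom_dispersion([10, 20, 30])
```

Under these assumptions the estimate is negative whenever the sample variance is smaller than the sample mean, and such genes cannot be placed on a log-scale dispersion axis.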

Table 1.

Proportion of variation in log(ϕ^MOM) explained by fitted models.

Table 2.

Estimated level of residual dispersion variation in five real RNA-Seq datasets.

Table 3.

Summary of DE test methods compared.

Fig 2.

True Positive Rate (TPR) vs. False Discovery Rate (FDR) plots for the six DE test methods performed on RNA-Seq datasets simulated to mimic real datasets.

The fold changes of DE genes are estimated from real data. The columns correspond to the following datasets (left to right) used as templates in the simulation: human, mouse, zebrafish, Arabidopsis, and fruit fly. The level of residual dispersion variation, σ, is specified at the estimated value (σ˜) in panels labeled with A (first row), and half the estimated value (0.5σ˜) in panels labeled with B (second row). In each plot, the x-axis is the TPR (which is the same as recall and sensitivity) and the y-axis is the FDR (which is the same as one minus precision). The percentage of truly DE genes is specified at 20% in all datasets. The FDR values are highly variable when TPR is close to 0, since the denominator TP + FP is close to 0.
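
As a minimal sketch of how one curve in such a plot could be traced, the snippet below ranks genes by p-value and, at each cutoff, computes TPR = TP / (number of truly DE genes) and FDR = FP / (TP + FP). The function name `tpr_fdr_curve` and its inputs are hypothetical, ties are not handled specially, and this is not the article's plotting code.

```python
def tpr_fdr_curve(p_values, is_de):
    """Return (TPR, FDR) pairs as the rejection cutoff moves down the ranked list.

    p_values: one p-value per gene.
    is_de:    True for genes that are truly DE (known in a simulation).
    """
    total_de = sum(1 for flag in is_de if flag)
    if total_de == 0:
        raise ValueError("TPR is undefined when no gene is truly DE")

    # Most significant genes first.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])

    tp = fp = 0
    curve = []
    for i in order:
        if is_de[i]:
            tp += 1
        else:
            fp += 1
        tpr = tp / total_de   # recall / sensitivity
        fdr = fp / (tp + fp)  # one minus precision
        curve.append((tpr, fdr))
    return curve


curve = tpr_fdr_curve([0.01, 0.2, 0.03, 0.5], [True, False, True, False])
```

At the first cutoff only one gene has been called, so the FDR can only be 0 or 1; this is the instability near TPR = 0 that the legend mentions.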

Fig 3.

True Positive Rate (TPR) vs. False Discovery Rate (FDR) plots for the six DE test methods performed on RNA-Seq datasets simulated to mimic real datasets.

The fold changes of DE genes are fixed at 1.2 (half of the DE genes are over-expressed and the other half are under-expressed). Other simulation settings are identical to those described in Fig. 2 legend.

Fig 4.

True Positive Rate (TPR) vs. False Discovery Rate (FDR) plots for the six DE test methods performed on RNA-Seq datasets simulated to mimic real datasets.

The fold changes of DE genes are fixed at 1.5 (half of the DE genes are over-expressed and the other half are under-expressed). Other simulation settings are identical to those described in Fig. 2 legend.

Table 4.

Actual FDR for a nominal FDR of 0.1.

Fig 5.

True Positive Rate (TPR) vs. False Discovery Rate (FDR) plots for the six DE test methods performed on an RNA-Seq dataset simulated to mimic the human dataset.

On each curve, we marked the position corresponding to a reported FDR of 10% with a cross. The fold changes of DE genes are fixed at 1.2 (half of the DE genes are over-expressed and the other half are under-expressed). Other simulation settings are identical to those for the upper row of Fig. 2.

Fig 6.

Histograms of p-values for the non-DE genes from the six DE test methods.

The simulation dataset is based on the human dataset with σ specified as the estimated value σ=σ˜. Out of a total of 5,000 genes, 80% are non-DE.

Fig 7.

Histograms of p-values for the non-DE genes from the six DE test methods.

The simulation dataset is based on the human dataset with σ specified as half the estimated value σ=0.5σ˜. Out of a total of 5,000 genes, 80% are non-DE.

Fig 8.

MA plots for the edgeR:trended, NBPSeq:genewise, edgeR:tagwise-trend and QuasiSeq:QLSpline methods performed on the mouse dataset.

Predictive log fold changes (posterior Bayesian estimators of the true log fold changes, the “M” values) are shown on the y-axis. Averages of log counts per million (CPM) are shown on the x-axis (the “A” values). The M- and A- values are calculated using edgeR. The highlighted points correspond to the top 200 DE genes identified by each of the DE test methods.

Table 5.

Summary of RNA-Seq datasets analyzed in this article.

Fig 9.

Estimation accuracy of σ^.

In the simulation, the dispersion is simulated according to an NB2 (left panel) or an NBQ (right panel) trend with added individual variation εi ~ N(0, σ²). The x-axis is the true σ value and the y-axis is the estimated σ^. For each true σ value, the simulation is repeated three times. The blue dots correspond to the median σ^ values.
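
The dispersion-generating step described here can be sketched as follows, under two stated assumptions: the individual variation is normal with mean 0 and standard deviation σ and is added on the log-dispersion scale, and the NB2 trend is taken to be a single constant dispersion shared by all genes. The names `simulate_dispersions` and `residual_sd`, and the value 0.1 for the trend, are hypothetical. The second function is only a sanity check that uses the true simulated dispersions; it is not the article's estimator σ^, which has to work from observed counts.

```python
import math
import random


def simulate_dispersions(n_genes, phi_trend, sigma, seed=0):
    """Draw per-gene dispersions with log(phi_i) = log(phi_trend) + eps_i.

    eps_i is assumed N(0, sigma^2); phi_trend is a constant (assumed NB2 trend).
    """
    rng = random.Random(seed)
    return [phi_trend * math.exp(rng.gauss(0.0, sigma)) for _ in range(n_genes)]


def residual_sd(dispersions, phi_trend):
    """Root-mean-square of log(phi_i) - log(phi_trend), using the known trend."""
    resid = [math.log(p) - math.log(phi_trend) for p in dispersions]
    return math.sqrt(sum(r * r for r in resid) / len(resid))


phis = simulate_dispersions(n_genes=20000, phi_trend=0.1, sigma=1.0)
sigma_check = residual_sd(phis, phi_trend=0.1)
```

With many genes, `sigma_check` should land close to the σ used in the simulation, which gives a quick way to confirm that the noise was added on the intended scale.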

Fig 10.

The calibration plot for estimating residual dispersion variation σ for the mouse dataset.

The x-axis is the σ value used to generate the data. The y-axis is the estimated σ^. The horizontal line corresponds to the σ^ estimated from the mouse dataset.

Table 6.

Calibrated σ˜ values for the five real datasets.
