
Conceived and designed the experiments: ACR JdUA. Performed the experiments: ACR. Analyzed the data: ACR JdUA. Contributed reagents/materials/analysis tools: ACR JdUA. Wrote the paper: ACR JdUA.

The authors have declared that no competing interests exist.

We developed a new multiple hypothesis testing adjustment called SGoF+, implemented as a sequential goodness-of-fit metatest. It is a modification of a previous algorithm, SGoF, that takes advantage of the information contained in the distribution of p-values.

Multiple hypothesis testing has become an important issue since the advent of "omic" technologies (genomics, proteomics, transcriptomics, etc.). These usually involve the simultaneous testing of thousands of hypotheses, producing a set of significant p-values.

The obvious goal when performing multiple testing adjustments is to detect as many true positives as possible while keeping the false positives below a desired threshold. Therefore, for a fixed percentage of existing effects, the higher the number of tests performed, the higher the number of true positives that should be detected. In a previous work we introduced SGoF, a sequential goodness-of-fit metatest designed with this property in mind.

To check the relationship between the power increase with the number of tests and the pFDR committed, we measured the ratio Power/pFDR.

The family of tests was 1,000 one-sample t-tests.

We have plotted the true positive rate (i.e. sensitivity, y-axis) against the false positive rate (1 - specificity, x-axis) across percentages of effects from 1% to 80% for the distinct multiple testing methods when the number of tests is 1,000.

The family of tests was 1,000 one-sample t-tests.

Interestingly, when sample size is the largest (

Because we are performing simulations, we can measure exactly the false discovery rates committed by the different methods. The positive false discovery rate, pFDR, was measured and averaged across replicates. The difference between measuring pFDR and FDR is that the FDR average is taken over all runs, including those without discoveries (which have an FDR of 0), whereas for pFDR only runs with discoveries are averaged.

The family of tests was 1,000 one-sample t-tests.

To assess the performance of the several pFDR estimation methods (see Methods section), we plotted the difference between the estimated (epFDR) and the observed pFDR; positive differences thus indicate conservative estimates.

The family of tests was 1,000 one-sample t-tests.

We performed an example of application of SGoF+, jointly with the estimation of the pFDR, on real data.

We have used the SGoF+ software to estimate π_0 using the four previously checked methods. We can appreciate that the estimates vary; we therefore used the π_0 modal estimate to study which proteins should be considered as differentially expressed after correction with SGoF+, jointly with the consideration of the associated epFDR based on the π_0 modal estimate, if we decide to reject all 26 spots with the smallest p-values.

Method | π_0 |
Bootstrap | 0.80 |
LBE | 0.84 |
SDPB | 0.92 |
Smoothing | 0.61 |
Mode | 0.82 |

Method | # of tests at 5% | epFDR | # of tests at 0.1% | epFDR |
SB | 0 | --- | 0 | --- |
BH | 0 | --- | 0 | --- |
SGoF | 6 | 0.22 | 0 | --- |
SGoF+ | 17 | 0.32 | 1 | 0.20 |

Multiple test adjustment methods that aim to control type I error rates face a problem: type II error increases, i.e. power is lost, when the number of tests is high. Furthermore, it is known that methods controlling the FDR do not control the pFDR in some situations.

In fact, we have shown (

Therefore, it is reasonable to take advantage of a priori information on the proportion of true null hypotheses, π_0.

Thus, we suggest combining SGoF+ with the information provided by the π_0 and pFDR estimates.

The proposed strategy was illustrated by analyzing real data from a protein expression experiment. In the original study

An important issue related to multiple testing in high-throughput experiments is the intrinsic inter-dependency in gene effects

Finally, it is worth mentioning that the above adjustment methods and the pFDR estimation procedures are implemented in the SGoF+ software.

As a conclusion, it seems that SGoF+ shows an improvement in the statistical power to detect true effects with respect to other adjustment methods, including SGoF. Combining SGoF+ with the pFDR estimates described above additionally provides a measure of the reliability of the set of rejected hypotheses.

Consider testing a family of S null hypotheses H_1, H_2, …, H_S at significance level γ. Let p_1 ≤ p_2 ≤ … ≤ p_S be the sorted p-values, with H_i the null hypothesis corresponding to p_i. Individually, each null hypothesis is rejected when its p-value falls below γ. Let K_γ be the observed number of p-values below γ; under the complete null hypothesis, K_γ follows a binomial distribution with parameters S and γ. Whereas SGoF fixes the threshold in advance (typically γ = α), SGoF+ searches through the candidate thresholds and selects γ_0 = arg max_γ N_α(γ), the value of γ that maximizes the number of rejections granted by the metatest defined below.

We perform a goodness-of-fit test, via an exact binomial test or (for S ≥ 10) a chi-squared test with one degree of freedom, of the null hypothesis H_0: E(K_γ0/S) = γ_0, at a desired level α. The procedure is now identical to the previous SGoF version. Let b_α(γ_0) be the largest value of K_γ0 that is not significant for such a goodness-of-fit test (i.e. b_α(γ_0) is the 100(1 - α)% quantile of the binomial distribution with parameters S and γ_0). SGoF+ then rejects the N_α(γ_0) = min(S, K_γ0 - b_α(γ_0)) null hypotheses with the smallest p-values, H_1, H_2, …, H_{N_α(γ_0)}. The quantity K_γ0 - b_α(γ_0) measures the excess of observed significant tests over the number tolerated under the complete null at level α; when γ_0 = α (so that the optimum coincides with the initial level), SGoF+ reduces to the original SGoF procedure.
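As a concrete reading of the metatest above, here is a minimal Python sketch. We assume b_α(γ) is the 100(1 - α)% binomial quantile and that the number of rejections is the excess K_γ - b_α(γ); the exact binomial form is used throughout (no chi-squared shortcut), the function names are ours, and any additional adjustments made by the published SGoF+ implementation are omitted.

```python
from math import comb

def binom_quantile(s, gamma, alpha):
    """Smallest k such that P(Binomial(s, gamma) <= k) >= 1 - alpha:
    the largest number of significant tests tolerated under the
    complete null hypothesis at metatest level alpha."""
    cdf = 0.0
    for k in range(s + 1):
        cdf += comb(s, k) * gamma ** k * (1.0 - gamma) ** (s - k)
        if cdf >= 1.0 - alpha:
            return k
    return s

def sgof_plus(pvalues, alpha=0.05):
    """Number of rejections granted by the SGoF+ metatest: search
    gamma over the observed p-values not exceeding alpha and keep
    the gamma maximizing the excess K_gamma - b_alpha(gamma) of
    observed over tolerated significant tests."""
    p = sorted(pvalues)
    s = len(p)
    best = 0
    for gamma in (x for x in p if x <= alpha):
        k_gamma = sum(1 for x in p if x <= gamma)
        best = max(best, k_gamma - binom_quantile(s, gamma, alpha))
    return best
```

For instance, with 20 p-values of 0.001 among 100 tests, the sketch rejects the excess of the 20 observed over the 1 tolerated under the null.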

It has been shown that, under the complete null hypothesis, the optimization of γ_0 introduced by SGoF+ gives a FWER above the nominal level. This problem is avoided by adding a preliminary step that compares the excess of the empirical p-value distribution over the uniform, K_γ0/S - γ_0, with the critical value of the one-sided Kolmogorov-Smirnov test at level α, c_α. If K_γ0/S - γ_0 < c_α, no rejection is made; this correction (which has been incorporated in the implementation of SGoF+) guarantees a family-wise error rate of 100α% under the complete null.
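A sketch of this preliminary guard, assuming the asymptotic one-sided Kolmogorov-Smirnov critical value c_α = sqrt(ln(1/α)/(2S)) (the exact finite-sample critical value used by the SGoF+ software may differ):

```python
from math import log, sqrt

def ks_guard(pvalues, alpha=0.05):
    """One-sided Kolmogorov-Smirnov preliminary check: the gamma_0
    search is performed only when the maximum excess D+ of the
    empirical p-value CDF over the uniform CDF reaches the
    asymptotic one-sided critical value at level alpha."""
    p = sorted(pvalues)
    s = len(p)
    d_plus = max((i + 1) / s - x for i, x in enumerate(p))
    c_alpha = sqrt(log(1.0 / alpha) / (2.0 * s))  # asymptotic approximation
    return d_plus >= c_alpha
```

When the guard returns False, the sample of p-values is compatible with the uniform distribution and no hypothesis is rejected.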

True positive rate (TPR) is expressed as the power or sensitivity, i.e. the proportion of true effects that were correctly identified. False positive rate (FPR) is expressed as the fraction of false positives out of the negatives, i.e. one minus specificity, where specificity is the proportion of nulls that were correctly identified. Plots of TPR versus FPR were computed for different numbers of tests.
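The two rates defined above can be computed per simulated run as follows (a minimal helper; the argument names are ours):

```python
def tpr_fpr(rejected, is_effect):
    """Sensitivity (TPR) and 1 - specificity (FPR) for one run.
    rejected and is_effect are parallel boolean sequences over tests."""
    tp = sum(r and e for r, e in zip(rejected, is_effect))       # true positives
    fp = sum(r and not e for r, e in zip(rejected, is_effect))   # false positives
    n_effects = sum(is_effect)
    n_nulls = len(is_effect) - n_effects
    return tp / n_effects, fp / n_nulls
```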

The proportion π_0(λ) of features that are null is estimated from λ, the point beyond which the distribution of p-values can be assumed to be uniform.

We fit a natural cubic spline to the data (λ_r, π_0(λ_r)), where λ_r = 0, 0.05, 0.1, …, 0.95 and π_0(λ_r) follows formula (1) below, and we evaluate the spline at the point λ = 1 to obtain the π_0 estimate.
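The grid the spline is fitted to can be sketched as follows, assuming formula (1) is the usual Storey-type estimator π_0(λ) = #{p_i > λ} / (S(1 - λ)); the spline-fitting step itself is not reproduced here.

```python
def pi0_grid(pvalues, width=0.05, points=20):
    """Grid (lambda_r, pi0(lambda_r)) for lambda_r = 0, 0.05, ..., 0.95,
    with pi0(lam) = #{p > lam} / (S * (1 - lam)). A natural cubic
    spline fitted to these points, read off at lam = 1, gives the
    final pi0 estimate."""
    s = len(pvalues)
    grid = []
    for r in range(points):
        lam = r * width
        grid.append((lam, sum(p > lam for p in pvalues) / (s * (1.0 - lam))))
    return grid
```

For uniformly spread p-values the estimate stays near 1 across the whole grid, as expected under the complete null.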

We estimate the point λ by minimizing the mean-squared error of the estimated π_0. This is attained via bootstrapping the p-values; the final π_0 is then computed at the λ that minimizes the bootstrap mean-squared error.

The location-based estimator (LBE) of π_0 proposed in Dalmasso et al. was also used, assuming an upper bound for the variance of the estimator.

The method proposed in Meinshausen and Rice estimates a bound on the proportion of false null hypotheses, so that π_0 = 1 minus that estimated proportion.

Once the proportion of true null hypotheses π_0 is estimated by any of the methods above, the estimated pFDR (epFDR) at threshold γ is computed as epFDR(γ) = π_0 S γ / (K_γ [1 - (1 - γ)^S]). This is the robust pFDR estimation given in Storey.
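A direct transcription of this estimator, assuming the robust Storey form with the usual guard K_γ ≥ 1 (the guard is our addition):

```python
def epfdr(pi0, s, gamma, k_gamma):
    """Robust pFDR estimate at threshold gamma: expected false
    positives pi0 * s * gamma over observed positives k_gamma,
    divided by 1 - (1 - gamma)**s, the probability of observing at
    least one p-value below gamma under the complete null."""
    return (pi0 * s * gamma) / (max(k_gamma, 1) * (1.0 - (1.0 - gamma) ** s))
```

For large S the correction factor is close to 1 and the estimate reduces to expected false positives over observed positives.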

To compare the performance of the above pFDR estimations, we measured the difference between the estimated epFDR and the observed pFDR. In the real-data example we assigned the different π_0 estimates to intervals of length 0.05 in order to compute π_0 as the mode of the different estimates.
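The modal-estimate step can be sketched as follows; the binning convention (half-open intervals anchored at 0) and the reporting of the interval midpoint are our assumptions.

```python
from collections import Counter

def modal_pi0(estimates, width=0.05):
    """Assign each pi0 estimate to an interval of the given width
    and return the midpoint of the most populated interval."""
    # small epsilon guards against floating-point binning artifacts
    bins = Counter(int(e / width + 1e-9) for e in estimates)
    top_bin = bins.most_common(1)[0][0]
    return (top_bin + 0.5) * width
```

With the five estimates of Table 1, three (0.80, 0.82, 0.84) fall in the interval [0.80, 0.85), whose midpoint is 0.825.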

To compare the efficiency of the proposed new SGoF+ metatest jointly with the precision of the pFDR estimation methods, we performed one-sample two-tailed t-tests.

For a given sample size

We assayed different percentages (% effect = 1, 5, 10, 20, 40, 60 and 80%) for the alternative model being true with respect to the total number of tests.


We thank AP Diz for comments on the manuscript and for providing the p-values for the example of application with real data.