Benchmarking network propagation methods for disease gene identification
Fig 2. Additive explanatory models for AUROC and top 20 hits.
Each column corresponds to a different model, and each row shows the 95% confidence interval of one model coefficient. Rows are grouped by the categorical variable they belong to: method, cross-validation scheme, network and disease. Each variable has a reference level, which is absorbed into the intercept and specified in brackets: pr method, classic validation scheme, STRING network and allergy. A positive estimate indicates better performance than the corresponding reference level, whereas a negative estimate indicates worse performance. For example, the data suggest that method rf performs better than the baseline under both metrics, and is the preferred method by top 20 hits. Switching from STRING to the OmniPath network, or from classic to block or representative cross-validation, has a negative effect on both performance metrics. Specific model estimates and confidence intervals can be found in Tables H and I in S1 Appendix.
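To illustrate how reference levels enter an additive model of this kind, the following is a minimal sketch and not the authors' analysis. It assumes an ordinary least squares fit as a stand-in, since the caption does not state the model family, and uses a normal approximation for the 95% intervals. The factor subset, the helper names (`design_row`, `column_names`, `fit_additive`) and all numeric values are hypothetical and synthetic; none are taken from the paper.

```python
import itertools
import numpy as np

# Hypothetical subset of factors; the first level of each is the reference
# level and is absorbed into the intercept.
LEVELS = {
    "method": ["pr", "rf"],
    "cv_scheme": ["classic", "block"],
    "network": ["STRING", "OmniPath"],
}


def column_names():
    """Names of the coefficients: intercept plus one per non-reference level."""
    names = ["(Intercept)"]
    for factor, levels in LEVELS.items():
        names.extend(f"{factor}[{lvl}]" for lvl in levels[1:])
    return names


def design_row(obs):
    """Treatment-code one observation as a row of the design matrix."""
    row = [1.0]
    for factor, levels in LEVELS.items():
        row.extend(1.0 if obs[factor] == lvl else 0.0 for lvl in levels[1:])
    return row


def fit_additive(observations, y):
    """OLS fit; returns {coefficient: (estimate, lower, upper)} with 95% CIs."""
    X = np.array([design_row(o) for o in observations])
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    # 1.96 is the normal-approximation quantile (an assumption of this sketch).
    return {
        name: (b, b - 1.96 * s, b + 1.96 * s)
        for name, b, s in zip(column_names(), beta, se)
    }


# Synthetic toy data: made-up effects relative to the reference levels.
TOY_EFFECTS = {"rf": 0.05, "block": -0.10, "OmniPath": -0.08}
rng = np.random.default_rng(0)
combos = list(itertools.product(*LEVELS.values()))
observations = [dict(zip(LEVELS, c)) for c in combos] * 3  # 3 replicates
y = [
    0.70 + sum(TOY_EFFECTS.get(o[f], 0.0) for f in LEVELS) + rng.normal(0, 0.01)
    for o in observations
]

fit = fit_additive(observations, y)
for name, (est, lo, hi) in fit.items():
    print(f"{name:20s} {est:+.3f}  [{lo:+.3f}, {hi:+.3f}]")
```

In this sketch, an interval lying entirely above zero corresponds to a level that outperforms its reference, and one entirely below zero to a level that underperforms it, which mirrors how the rows of the figure are read.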