Identification of High-Impact cis-Regulatory Mutations Using Transcription Factor Specific Random Forest Models

doi:10.1371/journal.pcbi.1004590

Fig 2

Cross-validation performance for 45 TF models.

A) Area under the precision-recall (AuPR) and receiver operating characteristic (AuROC) curves for different models. Performance of the Mk, M1, M2, and M3 models is estimated by 5-fold cross-validation. The M0 model does not use a training set; its AuROC and AuPR were obtained by varying the PWM score threshold. B) Examples of precision-recall curves for ATF2 and BATF. Random Forest classifiers outperform PWM-based models, and M3 models (using experimental data tracks) outperform M1 models (using sequence only).
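As an illustration of how AuROC and AuPR can be obtained by varying a score threshold, as described for the M0 model, the following is a minimal sketch in plain Python. The scores and labels are made-up toy values, not data from the study, and the function names (`threshold_sweep`, `auroc`, `aupr`) are hypothetical. The AuPR here is computed as a step-wise sum (average precision); the interpolation actually used for the figure is not stated in the caption, so this choice is an assumption.

```python
def threshold_sweep(scores, labels):
    """Yield (fpr, tpr, precision) once per distinct threshold, high to low."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = []
    i = 0
    while i < len(pairs):
        thr = pairs[i][0]
        # Every site scoring >= thr is called a predicted binding site;
        # tied scores are admitted together so they share one threshold.
        while i < len(pairs) and pairs[i][0] == thr:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos, tp / (tp + fp)))
    return points


def auroc(scores, labels):
    """Trapezoidal area under the ROC curve, starting from (0, 0)."""
    pts = [(0.0, 0.0)] + [(f, t) for f, t, _ in threshold_sweep(scores, labels)]
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def aupr(scores, labels):
    """Step-wise area under the precision-recall curve (average precision)."""
    area, prev_recall = 0.0, 0.0
    for _, recall, precision in threshold_sweep(scores, labels):
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Toy example (assumed values): PWM-like scores for six candidate sites,
# with label 1 = bound and 0 = unbound.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
toy_aupr = aupr(toy_scores, toy_labels)
```

For the cross-validated models, the same two functions could be applied to the classifier scores predicted on each held-out fold.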

doi: https://doi.org/10.1371/journal.pcbi.1004590.g002