The ability to classify patients based on gene-expression data varies by algorithm and performance metric

doi:10.1371/journal.pcbi.1009926

Fig 6

Relative performance of classification algorithms using gene-expression and clinical predictors and performing feature selection.

We predicted patient states using gene-expression and clinical predictors with feature selection (Analysis 5). We used nested cross-validation to estimate which features would be optimal for each algorithm in each training set. For each combination of dataset, class variable, and classification algorithm, we calculated the arithmetic mean of the area under the receiver operating characteristic curve (AUROC) values across 5 iterations of Monte Carlo cross-validation. Next, we ranked the algorithms within each dataset/class combination and sorted them by their average rank across all combinations. Each data point overlaid on the box plots represents a particular dataset/class combination.
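The aggregation described in this caption can be illustrated with a minimal sketch. It is not the authors' code: the function names, the record layout, and the AUROC values are hypothetical, and two details the caption does not specify are assumed here, namely that rank 1 goes to the highest mean AUROC and that tied algorithms share the average of their ranks. For brevity the example uses 2 iterations per combination rather than 5.

```python
from collections import defaultdict
from statistics import mean


def mean_auroc(records):
    """Average AUROC over iterations for each (combination, algorithm) pair.

    records: iterable of (combination, algorithm, iteration, auroc) tuples.
    """
    grouped = defaultdict(list)
    for combo, algo, _iteration, auroc in records:
        grouped[(combo, algo)].append(auroc)
    return {key: mean(values) for key, values in grouped.items()}


def rank_descending(scores):
    """Rank algorithms within one combination; rank 1 is the highest score.

    Assumption: tied scores receive the average of the ranks they span.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        tied_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = tied_rank
        i = j + 1
    return ranks


def average_ranks(mean_scores):
    """Average each algorithm's rank across all dataset/class combinations."""
    by_combo = defaultdict(dict)
    for (combo, algo), score in mean_scores.items():
        by_combo[combo][algo] = score
    collected = defaultdict(list)
    for scores in by_combo.values():
        for algo, rank in rank_descending(scores).items():
            collected[algo].append(rank)
    return {algo: mean(ranks) for algo, ranks in collected.items()}


# Hypothetical AUROC values, for illustration only.
records = [
    ("dataset1/classA", "svm", 1, 0.80), ("dataset1/classA", "svm", 2, 0.90),
    ("dataset1/classA", "rf", 1, 0.70), ("dataset1/classA", "rf", 2, 0.80),
    ("dataset1/classA", "knn", 1, 0.60), ("dataset1/classA", "knn", 2, 0.60),
    ("dataset2/classB", "svm", 1, 0.70), ("dataset2/classB", "svm", 2, 0.70),
    ("dataset2/classB", "rf", 1, 0.90), ("dataset2/classB", "rf", 2, 0.80),
    ("dataset2/classB", "knn", 1, 0.70), ("dataset2/classB", "knn", 2, 0.80),
]

means = mean_auroc(records)
avg_rank = average_ranks(means)
# Sort algorithms from best (lowest) to worst average rank; names break ties.
order = sorted(avg_rank, key=lambda algo: (avg_rank[algo], algo))
print(order)
```

Ranking within each combination before averaging keeps a dataset with uniformly high or low AUROC values from dominating the ordering, which is presumably why the figure sorts by average rank rather than by average AUROC.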

doi: https://doi.org/10.1371/journal.pcbi.1009926.g006