Methodological challenges in translational drug response modeling in cancer: A systematic analysis with FORESEE

doi:10.1371/journal.pcbi.1007803

Fig 1.

Performance portrayal of translational models trained on GDSC cell line data using the R-package FORESEE to predict GSE6434 patient drug response.

Of a total of 3,920 modeling pipelines, the best modeling pipeline had the following settings: drug: Docetaxel, cell response type: ln(IC₅₀), cell response transformation: binarization with k-means, sample selection: all, duplication handling: remove all duplicates, homogenization: remove unwanted variation, feature selection: landmark genes, feature preprocessing: none, black box algorithm: elastic net (lasso would have yielded the same performance). (A) The receiver operating characteristic (ROC) curve of the best model reveals an AUC of 0.986. (B) The comparison of the true responders and non-responders and their separation obtained from the best FORESEE model shows an almost perfect distinction, with a t-test p-value of 4.19e-6. (C) The performance distribution of all 3,920 modeling pipelines reveals a median AUC of ROC of 0.579.

Table 1.

Model performance and pipeline settings of the best model for each patient data set.

Fig 2.

Heatmaps of the performance of the best modeling pipeline of each patient data set in each of the other patient data sets.

(A) The color depicts the AUC of ROC of the respective pipelines. (B) The color represents the rank of the modeling pipeline among all 3,920 pipelines that were trained for a specific data set. More details of the modeling pipelines are listed in Table 1. The ranks of the best pipelines of the GSE6434 data set and the Sorafenib subset of the GSE33072 data set are 1.5, as two modeling pipelines yielded the exact same performance for the respective data set.
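The rank of 1.5 for tied pipelines follows from average ranking, where tied entries share the mean of the positions they occupy. A minimal sketch of that convention, assuming AUC values are ranked in descending order; the function name `average_ranks` is hypothetical and not part of FORESEE:

```python
def average_ranks(aucs):
    """Rank AUC values in descending order; ties share the mean of their positions."""
    # Indices of the pipelines, best AUC first.
    order = sorted(range(len(aucs)), key=lambda i: -aucs[i])
    ranks = [0.0] * len(aucs)
    pos = 0
    while pos < len(order):
        # Extend the block [pos, end] over all pipelines tied with order[pos].
        end = pos
        while end + 1 < len(order) and aucs[order[end + 1]] == aucs[order[pos]]:
            end += 1
        # Positions are 1-based, so the block covers ranks pos+1 .. end+1.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks
```

With two pipelines tied for the best AUC, both receive rank (1 + 2) / 2 = 1.5 and the next pipeline receives rank 3.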

Fig 3.

Pearson correlation of the performances of 3,920 FORESEE pipelines among the different cell line- and patient data sets.

Pearson correlation of the performances of 3,920 FORESEE pipelines for (A) seven different patient data sets: GSE6434, GSE18864, GSE51373, GSE33072 Erlotinib cohort, GSE33072 Sorafenib cohort, GSE9782 GPL96 cohort and GSE9782 GPL97 cohort, and (B) six different cell line modeling scenarios with data from GDSC for Docetaxel, Cisplatin, Paclitaxel, Erlotinib, Sorafenib and Bortezomib. The performance measure for the cell line-to-cell line modeling scenarios is the mean AUC of ROC of a 5-fold cross-validation for each drug.
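As an illustration of the quantity shown, the Pearson correlation between two data sets can be computed from their per-pipeline AUC vectors, with entry i of each vector belonging to the same pipeline. This is a sketch under that assumption; the function name `pearson` and the example values are hypothetical:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long performance vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Centered cross-product and sums of squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical AUC values of the same four pipelines in two data sets.
auc_dataset_a = [0.55, 0.60, 0.70, 0.52]
auc_dataset_b = [0.50, 0.58, 0.66, 0.49]
r = pearson(auc_dataset_a, auc_dataset_b)
```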

Fig 4.

Violin plots of the FORESEE performances compared to random distributions.

Distributions of the performances of all 3,920 FORESEE modeling pipelines in each of the seven different patient data sets: GSE6434, GSE18864, GSE51373, GSE33072 Erlotinib cohort, GSE33072 Sorafenib cohort, GSE9782 GPL96 cohort and GSE9782 GPL97 cohort. The actual performance distributions of the translational models (light blue) are compared to two random baselines: distributions in which each of the 3,920 translational models was applied to 1,000 patient objects with randomly permuted gene labels (medium blue), and distributions in which the actual patient response values are compared to 10,000 randomly generated binary vectors to calculate an artificial AUC of ROC measure (dark blue). Data sets are shown in increasing sample size from left to right.
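The random-vector baseline (dark blue) can be sketched as follows. This is a minimal illustration, not the FORESEE implementation: it assumes responses are coded 1 for responders and 0 for non-responders, that each random vector draws 0 or 1 with equal probability, and that ties count as half a correct ordering. The names `auc_roc` and `random_baseline_aucs` are hypothetical.

```python
import random

def auc_roc(labels, scores):
    """AUC of ROC as the fraction of (responder, non-responder) pairs ordered correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("labels must contain both classes")
    # A tie between a responder and a non-responder counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def random_baseline_aucs(labels, n_vectors=10_000, seed=0):
    """AUC of the true labels against n_vectors random binary 'prediction' vectors."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        auc_roc(labels, [rng.randint(0, 1) for _ in labels])
        for _ in range(n_vectors)
    ]

# Hypothetical cohort: 10 responders and 14 non-responders.
labels = [1] * 10 + [0] * 14
baseline = random_baseline_aucs(labels, n_vectors=1_000)
```

Because each random vector carries no information about the response, the resulting AUC values scatter around 0.5, and the scatter narrows as the cohort grows.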

Fig 5.

Drug specificity plot of FORESEE modeling pipelines.

Impact of the training drug on model performance for each of the seven different patient data sets: GSE6434, GSE18864, GSE51373, GSE33072 Erlotinib cohort, GSE33072 Sorafenib cohort, GSE9782 GPL96 cohort and GSE9782 GPL97 cohort. A set of 100 randomly chosen pipelines, listed in S1 Table, was used to train translational models on the GDSC cell line data with each of the 266 drugs contained in the GDSC database individually; each model was then tested on each of the patient data sets. For each of the data sets, the drugs are ordered with respect to the mean AUC of ROC of the 100 random pipelines trained with that drug. The red color marks the drug that is actually applied to the patient. The first-ranked drug is additionally indicated in order to facilitate the comparison of the different drugs and their modes of action. As an exception, for predicting GSE9782 GPL97 patient outcome, six pipelines that include RUV as homogenization method were not trained on those drugs that resulted in a training set that had more samples than features, as this was not compatible with the PCA step performed in this method.

Fig 6.

Heatmap of the performances of 3,920 FORESEE pipelines in the different cell line- and patient data sets averaged for model setting categories.

Heatmap of the performance of 3,920 FORESEE modeling pipelines tested with seven different patient data sets (Cell2Patient): GSE6434, GSE18864, GSE51373, GSE33072 Erlotinib cohort, GSE33072 Sorafenib cohort, GSE9782 GPL96 cohort and GSE9782 GPL97 cohort, and six different cell line modeling scenarios (Cell2Cell): GDSC data for Docetaxel, Cisplatin, Paclitaxel, Erlotinib, Sorafenib and Bortezomib. The performance measure for the cell line-to-cell line modeling scenarios is the mean AUC of ROC of a 5-fold cross-validation for each drug. The color depicts the mean AUC of ROC of modeling pipelines that comply with the corresponding model pipeline setting (x-axis). Black stars denote whether a model pipeline setting is significantly enriched (p < 0.01) in the best 5% of all 3,920 modeling pipelines.
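The caption does not name the statistical test behind the enrichment stars. One common choice for this kind of question is a one-sided hypergeometric test, sketched below purely as an assumption: it asks how likely it is to see at least k pipelines with a given setting among the best 5% if the setting were distributed at random. The function name and the example counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(n_total, n_with_setting, n_top, k_observed):
    """P(X >= k_observed) when n_top pipelines are drawn without replacement
    from n_total, of which n_with_setting use the setting of interest."""
    denom = comb(n_total, n_top)
    upper = min(n_with_setting, n_top)
    tail = sum(
        comb(n_with_setting, i) * comb(n_total - n_with_setting, n_top - i)
        for i in range(k_observed, upper + 1)
    )
    return tail / denom

n_total = 3920            # all pipelines, as stated in the caption
n_top = n_total // 20     # best 5%, i.e. 196 pipelines
# Hypothetical counts: 392 pipelines use the setting, 40 of them are in the top 5%.
p_value = hypergeom_upper_tail(n_total, 392, n_top, 40)
enriched = p_value < 0.01  # threshold taken from the caption
```

Under random assignment about 19.6 of the 196 top pipelines would be expected to use a setting shared by 392 pipelines, so the hypothetical count of 40 would be flagged as enriched.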

Fig 7.

Validation of model setting guidelines.

Boxplots of the performances of all 3,920 FORESEE modeling pipelines for each of the seven different patient data sets: GSE6434, GSE18864, GSE51373, GSE33072 Erlotinib cohort, GSE33072 Sorafenib cohort, GSE9782 GPL96 cohort and GSE9782 GPL97 cohort, compared to only those 35 pipelines that include landmark genes as gene filter, PCA as preprocessing method and linear regression as modeling algorithm. The p-values are the results of a t-test between the two distributions.

Table 2.

Patient data sets from GEO.
