On the estimation of inverse-probability-of-censoring weights for the evaluation of survival prediction error

doi:10.1371/journal.pone.0318349

Fig 1.

Baseline scenario with an exponentially distributed and predictor-free censoring process.

The plots present the values of the log-transformed RWSE score obtained for 1000 simulation runs as a function of the dataset used for fitting the censoring survival function (test, training, or combined dataset). Panel a varies n, the size of the combined dataset (with n_training = n_test); panel b varies n_training with n_test fixed at 500; panel c varies n_test with n_training fixed at 500; and panel d varies the censoring rate (20%, 50%, and 80%, with n_training = n_test = 500). The survival function was fitted using a Cox proportional hazards model, whereas the censoring survival function was fitted using a marginal Kaplan-Meier estimator.
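
As a rough illustration of the marginal Kaplan-Meier step named in this caption, the following Python sketch estimates the censoring survival function G(t) = P(C > t) by flipping the event indicator, and fits it on the training, test, or combined data in turn. The function names, the toy data, and the tie-handling convention (events are assumed to precede censoring at tied times) are illustrative assumptions and not the authors' code.

```python
def km_censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).

    times  : observed follow-up times
    events : 1 if the event of interest was observed, 0 if censored
    Censoring plays the role of the "event" here, so the indicator is flipped.
    Returns a list of (time, G(time)) steps at the distinct censoring times.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_tied = 0
        n_censored = 0
        # Collect all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            n_tied += 1
            n_censored += 1 - data[i][1]
            i += 1
        if n_censored > 0:
            # Assumed convention: subjects with an event at t leave the
            # risk set before the censorings at t are counted.
            n_events = n_tied - n_censored
            surv *= 1.0 - n_censored / (n_at_risk - n_events)
            steps.append((t, surv))
        n_at_risk -= n_tied
    return steps


def step_value(steps, t):
    """Evaluate the right-continuous step function at t (1.0 before the first step)."""
    value = 1.0
    for time, surv in steps:
        if time <= t:
            value = surv
        else:
            break
    return value


# Toy data (hypothetical): the same estimator fitted on three data sources.
train_times, train_events = [2, 3, 5, 7], [1, 0, 1, 0]
test_times, test_events = [1, 4, 6, 8], [0, 1, 0, 1]

fits = {
    "training": km_censoring_survival(train_times, train_events),
    "test": km_censoring_survival(test_times, test_events),
    "combined": km_censoring_survival(train_times + test_times,
                                      train_events + test_events),
}
for name, steps in fits.items():
    print(name, [(t, round(g, 3)) for t, g in steps])
```

The inverse-probability-of-censoring weight for a subject is then the reciprocal of the estimated G at the relevant time, so the choice of data source changes the weights but not the estimator itself.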

Fig 2.

Weibull-Cox scenario with a Weibull distributed and predictor-dependent censoring process.

The plots present the values of the log-transformed RWSE score obtained for 1000 simulation runs as a function of the dataset used for fitting the censoring survival function (test, training, or combined dataset). Panel a varies n, the size of the combined dataset (with n_training = n_test); panel b varies n_training with n_test fixed at 500; panel c varies n_test with n_training fixed at 500; and panel d varies the censoring rate (20%, 50%, and 80%, with n_training = n_test = 500). Both the survival and the censoring survival functions were fitted using a Cox proportional hazards model.

Fig 3.

Weibull-Cox and misspecification scenarios with a Weibull distributed and predictor-dependent censoring process.

The plot presents the values of the log-transformed RWSE score obtained for 1000 simulation runs as a function of the dataset used for fitting the censoring survival function (test, training, or combined dataset), and as a function of the type of model used to fit the censoring survival function (Cox regression [solid lines, correctly specified] or Kaplan-Meier estimator [dashed lines, misspecified]). The total size of the dataset is given by n = n_training + n_test, where n_training = n_test.

Fig 4.

Low-noise scenario.

The plots present the values of the RWSE score obtained for 1000 simulation runs as a function of the dataset used for fitting the censoring distribution (test, training or the combined dataset), and as a function of the type of model used to fit the survival and the censoring survival functions (Cox proportional hazards, panel a; Lasso, panel b; random forest, panel c; XGBoost, panel d). The total size of the dataset is given by n = n_training + n_test = 1000, where n_training = n_test = 500. The censoring rate was set to approximately 50%.

Fig 5.

High-noise scenario.

The plots present the values of the RWSE score obtained for 1000 simulation runs as a function of the dataset used for fitting the censoring distribution (test, training or the combined dataset), and as a function of the type of model used to fit the survival and the censoring survival functions (Cox proportional hazards, panel a; Lasso, panel b; random forest, panel c; XGBoost, panel d) on a dataset with a signal-to-noise ratio among the predictors of 1:5 (10 informative and 50 non-informative predictors). The total size of the dataset is given by n = n_training + n_test = 1000, where n_training = n_test = 500. The censoring rate was set to approximately 50%.

Fig 6.

Analysis of the SEER breast cancer data using a Cox proportional hazards model.

The plot shows the mean (bold center line) and standard deviation (shaded area) of the IPCW Brier score obtained on 10 bootstrap test samples, with IPC weights estimated from either the training, test, or the combined dataset. A Cox proportional hazards model was used for estimating the survival and the censoring survival functions.
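
For orientation, an IPCW Brier score at a single evaluation time t can be sketched as below. The sketch uses the commonly cited weighted form, in which subjects with an event by time t are weighted by 1/G(T_i), subjects still under observation at t by 1/G(t), and subjects censored before t receive weight zero; it may differ in detail from the implementation behind this figure. The function name, its arguments, the toy numbers, and the exponential censoring function are all hypothetical. The censoring survival function is passed as a callable so that it could come from the training, test, or combined data, and the left limit G(T_i-) is approximated by G(T_i), which is exact only for a continuous G.

```python
import math


def ipcw_brier_score(t, times, events, pred_surv, censor_surv):
    """IPCW Brier score at evaluation time t.

    times       : observed follow-up times T_i
    events      : 1 if the event was observed, 0 if censored
    pred_surv   : predicted survival probabilities S(t | x_i) at time t
    censor_surv : callable u -> G(u), estimated censoring survival function
    """
    total = 0.0
    for time, event, s in zip(times, events, pred_surv):
        if time <= t and event == 1:
            # Event by time t: the true survival status at t is 0.
            total += (0.0 - s) ** 2 / censor_surv(time)
        elif time > t:
            # Still under observation at t: the true survival status is 1.
            total += (1.0 - s) ** 2 / censor_surv(t)
        # Censored before t: status at t is unknown, so the weight is zero.
    return total / len(times)


# Toy example (hypothetical values).
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
pred_surv = [0.2, 0.5, 0.6, 0.9]


def no_censoring(u):
    """G(u) = 1 for all u, i.e. no reweighting."""
    return 1.0


def exponential_censoring(u):
    """Assumed exponential censoring survival function with rate 0.1."""
    return math.exp(-0.1 * u)


unweighted = ipcw_brier_score(2.5, times, events, pred_surv, no_censoring)
weighted = ipcw_brier_score(2.5, times, events, pred_surv, exponential_censoring)
print(unweighted, weighted)
```

Swapping the callable is the only change needed to compare weights estimated from different data sources, which is the comparison the figure makes.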

Fig 7.

Analysis of the SEER breast cancer data using random forest.

The plots show the mean (bold center line) and standard deviation (shaded area) of the IPCW Brier score obtained on 10 bootstrap test samples, with IPC weights estimated from either the training, test, or the combined dataset. A random forest was used for estimating the survival and censoring survival functions. The left panel shows the results of an untuned model, with the number of trees set to 500 and all other hyperparameters set to their default values, whereas the model in the right panel was tuned using Bayesian optimization.

Fig 8.

Analysis of the SEER breast cancer data using XGBoost.

The plots show the mean (bold center line) and standard deviation (shaded area) of the IPCW Brier score obtained on 10 bootstrap test samples, with IPC weights estimated from either the training, test, or the combined dataset. XGBoost was used for estimating the survival and censoring survival functions. The left panel shows the results of an untuned model, with the number of boosting rounds set to 500 and all other hyperparameters set to their default values, whereas the model on the right was tuned using cross-validation and Bayesian optimization.
