Fig 1.
Clinical data were generated for a simulated cohort of 1000 patients. The top panel shows the seven clinical phenotype variables. The range of values for each variable is shown in brackets. The plots indicate the combination of values of the critical variables determining treatment responsiveness: orange areas indicate values associated with patients being responsive and blue areas indicate non-response to treatment. The left and right plots indicate X and Y values when Z is present or absent, respectively. By this arrangement, 43.7% of patients are responsive and 56.3% are not responsive to treatment. Patients were randomly assigned to a treatment or placebo group (1:1 randomisation), with trial outcomes based on their true responsiveness and assigned group. The numbers in the ‘Treatment’ and ‘Placebo’ boxes represent the change in outcome measures (mean ± standard deviation) for each group, according to their true responsiveness. We performed traditional inferential statistical analysis on the trial outcome data to obtain estimates of effect size (mean change) and precision (95% confidence intervals). Then, machine learning (ML) analysis with XGBoost was conducted to predict individual patient treatment responses and to identify which clinical phenotype variables influenced these predictions. Finally, we assessed how data deficiencies and excesses impact ML analysis. CI = confidence interval.
Fig 2.
Forest plots illustrating mean and 95% confidence interval change in the outcome measure in patients receiving treatment compared with placebo.
The bottom plot shows results across the entire cohort, with results for individual subgroups plotted above it. CI = confidence interval.
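Each row of a forest plot of this kind summarises a difference in mean change between two groups together with its 95% confidence interval. The following is a minimal sketch of that calculation, assuming a normal approximation with unpooled variances; the caption does not state which interval method was used, and the function name `mean_difference_ci` is hypothetical.

```python
import math
import statistics


def mean_difference_ci(treatment, placebo, z=1.96):
    """Return (difference, lower, upper) for mean(treatment) - mean(placebo).

    Assumption: a normal-approximation 95% interval (z = 1.96) with
    unpooled sample variances. The source does not specify the method.
    """
    diff = statistics.fmean(treatment) - statistics.fmean(placebo)
    # Standard error of the difference between two independent means.
    se = math.sqrt(
        statistics.variance(treatment) / len(treatment)
        + statistics.variance(placebo) / len(placebo)
    )
    return diff, diff - z * se, diff + z * se
```

Applying the same function to each subgroup's outcome changes would give one row per subgroup, with the whole-cohort row computed from all patients.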
Fig 3.
Confusion matrices with classification metrics for predictions of treatment response using XGB.
Predictions of treatment response using XGB analysis are compared with the treatment response apparent in the trial outcome measure (A) and with the ground truth (B). Orange shading denotes treatment-responsive cells (as suggested by the trial outcome or the ground truth) and blue shading denotes non-treatment-responsive cells; bold orange/blue denotes correct treatment allocation according to the outcome, and light orange/blue denotes inappropriate treatment allocation. PPV: positive predictive value; NPV: negative predictive value.
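The metrics reported alongside a confusion matrix follow directly from its four cells. The sketch below shows the standard definitions; the function name is hypothetical, and the inclusion of accuracy, sensitivity and specificity alongside PPV and NPV is an assumption, since the caption names only PPV and NPV explicitly.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from the four confusion-matrix cells.

    tp/fn: truly responsive patients predicted responsive / non-responsive.
    fp/tn: truly non-responsive patients predicted responsive / non-responsive.
    """
    def ratio(num, den):
        # Return NaN rather than raising when a row or column is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),  # positive predictive value
        "npv": ratio(tn, tn + fn),  # negative predictive value
    }
```

The same function can be applied twice, once with the trial outcome as the reference (panel A) and once with the ground truth (panel B).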
Fig 4.
SHAP (SHapley Additive exPlanations) values for each feature in the ML model.
Each point represents the SHAP value of one variable for one of the 509 patients receiving treatment for whom a prediction of treatment response was made. Colours represent the value of the variable from low (white) to high (dark red). Variables are ordered by their impact on model outputs, from highest (top) to lowest (bottom), based on the sum of SHAP value magnitudes across all predictions.
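The ordering rule described here, ranking variables by the sum of SHAP value magnitudes across all predictions, can be sketched as follows. The function name and the row-per-patient input layout are assumptions for illustration; in practice the SHAP values would come from an explainer applied to the fitted model.

```python
def rank_features_by_shap(shap_values, feature_names):
    """Order features by the sum of absolute SHAP values, largest first.

    shap_values: list of rows, one per patient; each row holds one SHAP
    value per feature, in the same order as feature_names (assumed layout).
    """
    totals = [0.0] * len(feature_names)
    for row in shap_values:
        for j, value in enumerate(row):
            totals[j] += abs(value)  # magnitude, so sign does not cancel
    order = sorted(range(len(feature_names)), key=lambda j: totals[j], reverse=True)
    return [(feature_names[j], totals[j]) for j in order]
```

Summing magnitudes rather than signed values means a variable that pushes some predictions strongly up and others strongly down still ranks as influential.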
Fig 5.
Scatter plots of variable values versus SHAP values to interpret how specific variable ranges influence treatment predictions. A: Scatter plot of X values on the x-axis versus SHAP values (left y-axis).
SHAP values reflect the importance of each X value in predicting treatment response, with higher positive values indicating greater importance for predicting treatment responsiveness. The colour of each point indicates the prediction, with orange indicating a prediction of treatment-responsive and blue a prediction of non-responsive. Overlaid histograms show the proportion of predictions (right y-axis) for different X values: orange bars denote treatment-responsive predictions, and blue bars denote non-treatment-responsive predictions. B: Scatter plot of Y values on the x-axis versus SHAP values, filtered to include only instances where X is below 90. This plot demonstrates the importance of Y values in the model’s predictions, where more negative SHAP values suggest higher importance for predicting non-responsiveness to treatment. Histograms overlaid on the scatter plot represent the proportion of predictions for different Y values, with orange bars for treatment-responsive and blue bars for non-treatment-responsive predictions. C: Scatter plot of Z values on the x-axis versus SHAP values, with data filtered to include only Y values between 50 and 90, as well as X values below 90. SHAP values indicate the importance of Z values in the model’s predictions, considering the constraints on X and Y. The histograms show the proportion of predictions for Z values, with orange bars representing treatment-responsive predictions and blue bars representing non-treatment-responsive predictions.
Fig 6.
Scatter plot depicting the accuracy of XGB treatment response predictions in relation to the number of noisy variables added to the original seven clinical variables.
Orange points indicate accuracy compared with trial outcomes, while green points indicate accuracy compared with the ground truth.
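An experiment of this kind appends uninformative columns to the feature table before refitting the model. The sketch below illustrates that step only; the function name is hypothetical, and drawing the noise from a standard normal distribution is an assumption, as the caption does not state how the noisy variables were generated.

```python
import random


def add_noise_variables(rows, n_noise, seed=0):
    """Return a copy of rows with n_noise uninformative columns appended.

    Assumption: noise is drawn from a standard normal distribution; the
    source does not specify the distribution. The input is not modified.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    return [
        list(row) + [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        for row in rows
    ]
```

Repeating the model fit for increasing values of `n_noise` and recording accuracy against both references would yield one pair of points per setting.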