Fig 1.
Schematic diagram showing the predictive modeling process.
The STAR*D training dataset was used to create 30 subsamples with equal ratios of cases and controls, and 30 models were constructed using the entire training dataset. The 30 models were used to predict the outcome for the independent STAR*D test dataset and the RIS-INT-93 dataset, and the predicted outcomes were averaged across the 30 models.
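The subsampling and averaging steps in Fig 1 can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that each balanced subsample is drawn by downsampling the larger class to the size of the smaller one, and that one model is fitted per subsample. The function names `balanced_subsamples` and `average_predictions` are hypothetical.

```python
import random
from statistics import mean


def balanced_subsamples(labels, n_subsamples=30, seed=0):
    """Return index lists with equal numbers of cases (1) and controls (0).

    Assumption: the larger class is downsampled without replacement to the
    size of the smaller class in every subsample.
    """
    rng = random.Random(seed)
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(cases), len(controls))
    subsamples = []
    for _ in range(n_subsamples):
        idx = rng.sample(cases, n) + rng.sample(controls, n)
        rng.shuffle(idx)
        subsamples.append(idx)
    return subsamples


def average_predictions(per_model_probs):
    """Average predicted probabilities across models, one value per subject.

    per_model_probs is a list of lists: one list of probabilities per model,
    all in the same subject order.
    """
    return [mean(column) for column in zip(*per_model_probs)]
```

Each index list would be used to fit one model, and the per-model predictions for the test subjects would then be passed to `average_predictions`.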
Fig 2.
AUC for models containing the top 2 to 75 representative predictors from k-means clustering (k = 75), plotted against the number of predictors for each machine learning method in the STAR*D training and test datasets.
Remission status was used to define TRD using QIDS-C16 data.
Fig 3.
Receiver operating characteristic curves in the STAR*D training and test datasets using the full set of features, the top n features (n ~ 30), and the overlapping features, where remission status was used to define TRD (STAR*D remission status was defined using QIDS-C16 data, and RIS-INT-93 remission status was defined using HAM-D17).
Fig 4.
Permutation process to assess model robustness.
The outcome label of the STAR*D training dataset was randomly shuffled 1,000 times, and the AUC distribution of the 1,000 null models was plotted for each machine learning method: (A) XGBoost, (B) Random Forest, (C) L2-penalized logistic regression, and (D) GBDT. In all cases, the observed AUC outperforms the random noise from the 1,000 null models.
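The permutation procedure in Fig 4 can be sketched as below. This is a hedged illustration under stated assumptions: the AUC is computed on a separate evaluation set, the empirical p-value uses the common (1 + count) / (1 + permutations) convention, and `fit_predict_centroid` is a toy stand-in model rather than any of the four methods in the figure. All function names are hypothetical.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_predict_centroid(X_train, y_train, X_eval):
    """Toy model: project onto (case centroid - control centroid)."""
    dim = len(X_train[0])

    def centroid(label):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    w = [a - b for a, b in zip(centroid(1), centroid(0))]
    return [sum(wj * xj for wj, xj in zip(w, x)) for x in X_eval]


def permutation_test(X_train, y_train, X_eval, y_eval, fit_predict,
                     n_perm=1000, seed=0):
    """Refit on shuffled training labels to build a null AUC distribution."""
    rng = random.Random(seed)
    observed = auc(y_eval, fit_predict(X_train, y_train, X_eval))
    null_aucs = []
    for _ in range(n_perm):
        shuffled = list(y_train)
        rng.shuffle(shuffled)  # breaks the link between features and outcome
        null_aucs.append(auc(y_eval, fit_predict(X_train, shuffled, X_eval)))
    # Empirical p-value: share of null models at least as good as the observed one.
    p_value = (1 + sum(a >= observed for a in null_aucs)) / (n_perm + 1)
    return observed, null_aucs, p_value
```

Any model can be substituted by passing a different `fit_predict` callable with the same three arguments; a histogram of `null_aucs` with the observed AUC marked gives a plot of the kind the caption describes.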
Table 1.
Model performance (outcome defined by remission using QIDS-C16) in the STAR*D testing dataset and RIS-INT-93.
Fig 5.
Variable importance in statistical learning approaches for outcomes defined by (A) remission status and (B) responder status. In both cases, the outcomes were defined using QIDS-C16. SFHS: Short Form Health Survey (SF-12); WSAS: The Work and Social Adjustment Scale; *from PRISE: The Patient Rated Inventory of Side Effects, which collected symptoms the patient had experienced in the past week. Those symptoms may or may not have been caused by the treatment.
Table 2.
Predictors from PROC LOGISTIC for TRD phenotype defined using remitter criteria in the STAR*D training and testing datasets.