An interpretable and balanced machine learning framework for Parkinson’s disease prediction using feature engineering and explainable AI

doi:10.1371/journal.pone.0333418

Table 1.

Summary of Parkinson’s disease (PD) classification studies.

Table 2.

Summary of Parkinson’s disease (PD) classification studies (continued).

Table 3.

Summary of Parkinson’s disease (PD) classification studies (continued).

Fig 1.

The overview of the Parkinson’s dataset analysis and explainable artificial intelligence.

Table 4.

Detailed description of the dataset attributes, including acoustic features and target labels.

Fig 2.

The overall preprocessing steps applied to the Parkinson’s dataset.

Fig 3.

Heatmap displaying the correlation between features in the dataset.

Fig 4.

Feature selection using Featurewiz: Selecting key features while removing redundant ones.

Fig 5.

Sampling distributions of class labels in the Base, SMOTE, and NearMiss versions of the dataset.

Table 5.

Performance metrics for base models: Class-wise precision, recall, F1 score, accuracy, G-mean, MCC, and STD.
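
The G-mean and MCC columns named in this caption can be computed from a binary confusion matrix. The following is a minimal sketch under the standard definitions (G-mean as the geometric mean of sensitivity and specificity, MCC as the Matthews correlation coefficient); the function name `gmean_and_mcc` and the example counts are hypothetical and are not taken from the study.

```python
import math


def gmean_and_mcc(tp, fn, fp, tn):
    """Return (G-mean, MCC) for a binary confusion matrix.

    tp, fn, fp, tn are raw counts. The name and signature are
    illustrative assumptions, not the authors' code.
    """
    # Sensitivity: recall on the positive class.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: recall on the negative class.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # G-mean rewards balanced performance on both classes.
    g_mean = math.sqrt(sensitivity * specificity)

    # MCC uses all four cells; by convention it is 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return g_mean, mcc


# Hypothetical counts, chosen only to exercise the function.
g_mean, mcc = gmean_and_mcc(tp=45, fn=5, fp=10, tn=40)
print(f"G-mean = {g_mean:.3f}, MCC = {mcc:.3f}")
```

Both metrics are less sensitive to class imbalance than plain accuracy, which is presumably why they are reported alongside it for the base, NearMiss, and SMOTE variants.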

Fig 6.

Regression analysis results for base models evaluated in the study.

Table 6.

Performance metrics for NearMiss models: Class-wise precision, recall, F1 score, accuracy, G-mean, MCC, and STD.

Fig 7.

Regression analysis results for models using NearMiss undersampling.

Table 7.

Performance metrics for SMOTE models: Class-wise precision, recall, F1 score, accuracy, G-mean, MCC, and STD.

Fig 8.

Regression analysis results for models using SMOTE oversampling.

Table 8.

Performance metrics for ensemble methods (RF, ADB, XGB) using base, SMOTE, and NearMiss.

Fig 9.

Visual comparison of the performance of the tuned KNN model with SMOTE, showcasing the confusion matrix and ROC curve.

(a) Confusion Matrix for Tuned KNN with SMOTE. (b) ROC Curve for Tuned KNN with SMOTE.

Table 9.

Comparison of the proposed model’s performance with existing research approaches.

Fig 10.

Receiver operating characteristic (ROC) curves for the nine models under (a) original imbalanced data, (b) NearMiss undersampling, and (c) SMOTE oversampling.

These curves illustrate each model’s classification performance across data balancing techniques. (a) Base Models. (b) NearMiss Models. (c) SMOTE Models.

Fig 11.

Precision-recall curves (PRC) for the nine models across different data balancing strategies: (a) original imbalanced dataset, (b) NearMiss undersampling, and (c) SMOTE oversampling.

These curves highlight precision-recall trade-offs under each condition. (a) Base Models. (b) NearMiss Models. (c) SMOTE Models.

Fig 12.

Comparison of original and predicted data demonstrating model effectiveness in capturing patterns.

Fig 13.

SHAP summary scatter plot showing the effect of individual feature values on the model’s output.

Red indicates high feature values, blue indicates low.

Fig 14.

SHAP summary bar plot showing average feature importance.

PPE, Fo(Hz), D2, and spread2 were the most influential in the best model.

Fig 15.

SHAP waterfall plot showing feature-level contributions for a specific prediction.

Features such as DFA, RPDE, and Fo(Hz) reduce the prediction score, while PPE slightly increases it.

Fig 16.

LIME explanation for an individual instance.

Features like spread2, PPE, and D2 contribute significantly to the model’s prediction, corroborating the SHAP results.
