Interpretable machine learning for chronic kidney disease prediction: Insights from SHAP and LIME analyses

doi:10.1371/journal.pone.0343205

Fig 1.

Overview of the proposed methodology for CKD prediction.

Fig 2.

Class distribution of the CKD datasets: (A) Dataset 1 before applying SMOTE, (B) Dataset 2 before applying SMOTE, (C) Dataset 1 after applying SMOTE within training folds, (D) Dataset 2 after applying SMOTE within training folds.

SMOTE was applied exclusively during the training phase of each cross-validation fold to prevent data leakage.
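
The fold-wise resampling described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses only the Python standard library, a simplified SMOTE-style interpolation (`smote_like`, a hypothetical helper), unstratified folds, and synthetic toy data. The point it shows is that each held-out fold keeps only original samples, while oversampling touches the training part alone.

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def smote_like(X, y, rng, k_neighbors=3):
    """Simplified SMOTE: interpolate between a minority point and one of its
    nearest minority neighbours until both classes have equal counts."""
    counts = {c: y.count(c) for c in set(y)}
    minority = min(counts, key=counts.get)
    need = max(counts.values()) - counts[minority]
    pts = [x for x, c in zip(X, y) if c == minority]
    X_new, y_new = list(X), list(y)
    for _ in range(need):
        a = rng.choice(pts)
        # Rank the other minority points by squared distance to a.
        others = sorted((p for p in pts if p is not a),
                        key=lambda p: sum((u - v) ** 2 for u, v in zip(a, p)))
        b = rng.choice(others[:k_neighbors])
        t = rng.random()  # position of the synthetic point on the segment a-b
        X_new.append([u + t * (v - u) for u, v in zip(a, b)])
        y_new.append(minority)
    return X_new, y_new

def resampled_folds(X, y, k=5, seed=0):
    """Yield (X_train, y_train, X_test, y_test) per fold.
    Only the training part is oversampled; the test fold is left untouched."""
    rng = random.Random(seed)
    folds = kfold_indices(len(X), k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        X_tr, y_tr = smote_like([X[j] for j in train_idx],
                                [y[j] for j in train_idx], rng)
        yield X_tr, y_tr, [X[j] for j in test_idx], [y[j] for j in test_idx]

# Toy imbalanced data (assumption: 30 majority and 10 minority samples).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(30)] + \
    [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]
y = [0] * 30 + [1] * 10

for X_tr, y_tr, X_te, y_te in resampled_folds(X, y):
    print(len(y_te), y_tr.count(0), y_tr.count(1))
```

Resampling before the split would let synthetic points derived from a test sample enter the training set, which is the leakage the caption refers to.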

Table 1.

Conservative hyperparameter optimization for Dataset 1 (UAE Tawam Hospital).

Table 2.

Conservative hyperparameter optimization for Dataset 2 (UCI CKD).

Table 3.

Performance metrics for Dataset 1 (UAE Tawam Hospital) without SMOTE.

Table 4.

Performance metrics for Dataset 1 (UAE Tawam Hospital) with SMOTE.

Table 5.

Performance metrics for Dataset 2 (UCI CKD) without SMOTE.

Table 6.

Performance metrics for Dataset 2 (UCI CKD) with SMOTE.

Fig 3.

ROC curve comparison for Dataset 1 (UAE Tawam Hospital).

Panel A: without SMOTE; Panel B: with SMOTE. XGBoost demonstrates improved discrimination with SMOTE (AUC: 0.886 → 0.904).

Fig 4.

ROC curve comparison for Dataset 2 (UCI CKD).

Panel A: without SMOTE; Panel B: with SMOTE. XGBoost achieves the best performance (AUC = 0.948 ± 0.013) with SMOTE.

Fig 5.

SHAP feature importance comparison across datasets.

Panel A: Dataset 1 (UAE Tawam Hospital), where cardiovascular–renal markers are predominantly influential. Panel B: Dataset 2 (UCI CKD), emphasizing direct renal function indicators.

Fig 6.

SHAP summary plot comparison across datasets.

Panel A: Dataset 1 (UAE Tawam Hospital). Panel B: Dataset 2 (UCI CKD). Feature value distributions highlight clinically coherent patterns and consistent model behavior across both cohorts.

Fig 7.

Individual prediction explanations using SHAP waterfall plots.

Panel A: Dataset 1, representative Non-CKD prediction. Panel B: Dataset 2, representative Non-CKD prediction. The plots provide transparent, case-level clinical reasoning by showing how each feature contribution shifts the model output from the baseline toward the final decision.
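
The additive reading of a waterfall plot (baseline plus per-feature contributions equals the model output) can be checked on a small example. The sketch below computes exact Shapley values by enumerating feature subsets. It rests on several assumptions: `toy_model` is a hypothetical three-feature function, not a CKD model; "absent" features are replaced by a single fixed baseline vector, whereas SHAP explainers typically average over background data; and brute-force enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance.
    Features outside a coalition take their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(rest, size):
                phi[i] += weight * (value(set(S) | {i}) - value(set(S)))
    return phi

def toy_model(z):
    """Hypothetical model with one interaction term (illustration only)."""
    return 0.5 * z[0] - 0.2 * z[1] + 0.1 * z[0] * z[2]

x = [2.0, 1.0, 3.0]          # instance to explain (toy values)
baseline = [0.0, 0.0, 0.0]   # reference input (assumption)

phi = shapley_values(toy_model, x, baseline)
base_value = toy_model(baseline)

# Waterfall reading: start at the base value and add each contribution.
running = base_value
for i, contribution in enumerate(phi):
    running += contribution
    print(f"feature {i}: {contribution:+.3f} -> {running:.3f}")
print("model output:", toy_model(x))
```

The final running total matches the model output for the explained instance, which is the property that makes each bar in a waterfall plot interpretable as a shift from the baseline toward the prediction.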

Fig 8.

LIME explanation for a UCI dataset case.

Sample 50, showing a CKD prediction driven by low specific gravity and severe anemia, validating SHAP’s hematological marker hierarchy.

Fig 9.

LIME explanation for a Tawam dataset case.

Sample 49, showing a non-CKD prediction dominated by the protective effect of an excellent eGFR, consistent with SHAP’s cardiovascular-renal emphasis.

Fig 10.

Calibration summary for Dataset 1 (Tawam): Brier score (left) and expected calibration error (ECE, right), with and without SMOTE.

Table 7.

Calibration metrics (Brier score, ECE) computed from outer-fold out-of-fold (OOF) predictions (lower is better).
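
For reference, both calibration metrics can be computed from pooled out-of-fold probabilities in a few lines. This is a minimal sketch under stated assumptions: binary labels coded 0/1, probabilities for the positive class, and 10 equal-width bins for ECE (the binning used in the paper is not stated here). The function names and the toy values are illustrative only.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def expected_calibration_error(y_true, p_pred, n_bins=10):
    """Weighted average gap between the mean predicted probability and the
    observed positive rate, over equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, p_pred):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[b].append((y, p))
    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += len(members) / n * abs(observed - predicted)
    return ece

# Toy out-of-fold predictions (illustrative values, not from the study).
y_oof = [0, 0, 1, 1]
p_oof = [0.1, 0.4, 0.6, 0.9]
print(brier_score(y_oof, p_oof))
print(expected_calibration_error(y_oof, p_oof))
```

Both scores are 0 for perfectly calibrated, perfectly confident predictions, which is why lower values indicate better calibration.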

Fig 11.

Calibration summary for Dataset 2 (UCI): Brier score (left) and expected calibration error (ECE, right), with and without SMOTE.
