Fig 1.
Safety-oriented supervised machine learning framework.
Table 1.
Summary of software and implementation environment.
Fig 2.
Data cleaning pipeline and feature extraction strategy.
Fig 3.
ROC curve illustrating the discriminative ability of the LightGBM model for KSI crash risk prediction.
Fig 4.
Precision–Recall curve of the LightGBM model for KSI crash detection under class imbalance.
Table 2.
Quantitative threshold optimization based on sensitivity–predictability trade-off.
Table 3.
Classification performance of the LightGBM model at the optimized threshold (τ* = 0.35).
Fig 5.
Confusion matrix of the LightGBM model at the optimized decision threshold (τ* = 0.35), illustrating the trade-off between increased KSI detection and higher false-positive rates under a safety-oriented learning strategy.
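The trade-off described in this caption can be illustrated with a minimal sketch. It assumes that a crash is labelled KSI when its predicted probability is at or above the threshold; the function name `confusion_at_threshold` and the toy labels and scores are hypothetical and are not taken from the study data.

```python
def confusion_at_threshold(y_true, y_prob, tau=0.35):
    """Return (tn, fp, fn, tp), labelling a case KSI (1) when its score >= tau.

    The '>=' convention is an assumption; the source does not state it.
    """
    tn = fp = fn = tp = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= tau else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Toy labels (1 = KSI) and predicted probabilities, invented for illustration.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.40, 0.36, 0.80, 0.20, 0.30]

at_optimized = confusion_at_threshold(y_true, y_prob, tau=0.35)
at_default = confusion_at_threshold(y_true, y_prob, tau=0.50)
print("tau=0.35:", at_optimized)
print("tau=0.50:", at_default)
```

On this toy input, lowering the threshold from 0.50 to 0.35 recovers one more true KSI case at the cost of one additional false positive, which is the direction of the trade-off the confusion matrix is meant to show.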
Table 4.
Comparison of imbalance-handling strategies.
Table 5.
Comparative performance of machine learning models for KSI crash prediction at the safety-optimized decision threshold (τ* = 0.35).
Fig 6.
Calibration curve of the LightGBM model for KSI prediction.
The close alignment between predicted probabilities and observed frequencies indicates good probabilistic reliability, supporting risk-based interpretation and prioritization of traffic safety interventions.
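As a rough illustration of how a calibration curve of this kind is built, the sketch below groups predictions into equal-width probability bins and compares the mean predicted probability with the observed KSI frequency in each bin. The binning scheme, the function name `calibration_bins`, and the toy data are assumptions for illustration only; the source does not specify how its curve was computed.

```python
def calibration_bins(y_true, y_prob, n_bins=5):
    """Return one (mean_predicted, observed_frequency, count) tuple per non-empty bin.

    Uses equal-width bins on [0, 1]; this choice is an assumption.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    points = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        points.append((mean_pred, observed, len(members)))
    return points


# Toy labels (1 = KSI) and predicted probabilities, invented for illustration.
y_true = [0, 0, 0, 1, 1, 1]
y_prob = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9]

points = calibration_bins(y_true, y_prob, n_bins=2)
for mean_pred, observed, count in points:
    print(f"mean predicted={mean_pred:.2f}  observed={observed:.2f}  n={count}")
```

A well-calibrated model yields points close to the diagonal, where the mean predicted probability equals the observed frequency. In the toy data the low-probability bin is slightly under-confident (predicted 0.10, observed 0.25).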
Fig 7.
SHAP summary plot illustrating the encoded-feature contributions to KSI crash risk prediction.
Note: Feature colors indicate higher (red) and lower (blue) encoded values according to the STATS19 coding scheme. SHAP values represent conditional contributions relative to the model baseline and do not imply causality.
Table 6.
Interpretation of SHAP patterns for selected explanatory variables (non-causal, risk-oriented interpretation).