Safety-oriented and explainable machine learning for KSI crash risk prediction: Evidence from the United Kingdom

doi:10.1371/journal.pone.0347873

Fig 1.

Safety-oriented supervised machine learning framework.


Table 1.

Summary of software and implementation environment.


Fig 2.

Data cleaning pipeline and feature extraction strategy.


Fig 3.

ROC curve illustrating the discriminative ability of the LightGBM model for KSI crash risk prediction.


Fig 4.

Precision–Recall curve of the LightGBM model for KSI crash detection under class imbalance.


Table 2.

Quantitative threshold optimization based on sensitivity–predictability trade-off.


Table 3.

Classification performance of the LightGBM model at the optimized threshold (τ* = 0.35).


Fig 5.

Confusion matrix of the LightGBM model at the optimized decision threshold (τ* = 0.35), illustrating the trade-off between increased KSI detection and a higher false-positive rate under a safety-oriented learning strategy.
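The trade-off described in this caption can be illustrated with a minimal sketch. It assumes that predicted KSI probabilities and binary labels are available as plain lists and that a crash is flagged as KSI when its probability is at or above the threshold. The function name `confusion_at_threshold` and the toy data are hypothetical and not taken from the paper; only the 0.35 threshold comes from the caption.

```python
def confusion_at_threshold(y_true, y_prob, threshold=0.35):
    """Return (tn, fp, fn, tp), flagging prob >= threshold as KSI (assumed convention)."""
    tn = fp = fn = tp = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Toy, made-up data for illustration only (1 = KSI, 0 = non-KSI).
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_prob = [0.10, 0.40, 0.20, 0.38, 0.80, 0.36, 0.30, 0.05]

low = confusion_at_threshold(y_true, y_prob, 0.35)   # lowered threshold
high = confusion_at_threshold(y_true, y_prob, 0.50)  # conventional 0.5 cut-off
```

On this toy data, lowering the threshold from 0.50 to 0.35 turns one missed KSI case into a detection while adding two false positives, which is the kind of exchange the caption refers to.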


Table 4.

Comparison of imbalance-handling strategies.


Table 5.

Comparative performance of machine learning models for KSI crash prediction at the safety-optimized decision threshold (τ* = 0.35).


Fig 6.

Calibration curve of the LightGBM model for KSI prediction.

The close alignment between predicted probabilities and observed frequencies indicates good probabilistic reliability, supporting risk-based interpretation and prioritization of traffic safety interventions.
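As a rough illustration of how a calibration curve of this kind is typically built, the sketch below groups predicted probabilities into equal-width bins and compares the mean prediction in each bin with the observed KSI frequency. The binning scheme, the function name `calibration_bins`, and the toy data are assumptions for illustration; the paper's own procedure is not stated here.

```python
def calibration_bins(y_true, y_prob, n_bins=10):
    """Group predictions into equal-width probability bins.

    Returns (mean_predicted, observed_frequency, count) for each
    non-empty bin, ordered from low to high probability.
    """
    sums = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for label, prob in zip(y_true, y_prob):
        # A probability of exactly 1.0 is placed in the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        sums[index] += prob
        positives[index] += label
        counts[index] += 1
    return [
        (sums[i] / counts[i], positives[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


# Toy, made-up data for illustration only (1 = KSI, 0 = non-KSI).
y_true = [0, 0, 1, 1, 0, 1, 1]
y_prob = [0.05, 0.15, 0.10, 0.60, 0.70, 0.90, 0.95]
bins = calibration_bins(y_true, y_prob, n_bins=2)
```

A well-calibrated model yields bins whose mean predicted probability is close to the observed frequency, so the points fall near the diagonal of the plot.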


Fig 7.

SHAP summary plot illustrating the encoded-feature contributions to KSI crash risk prediction.

Note: Feature colors indicate higher (red) and lower (blue) encoded values according to the STATS19 coding scheme. SHAP values represent conditional contributions relative to the model baseline and do not imply causality.


Table 6.

Interpretation of SHAP patterns for selected explanatory variables (non-causal, risk-oriented interpretation).
