Table 1.
Variable definitions and transformations.
Table 2.
Summary statistics of main variables.
Table 3.
Predictive accuracy of FE + XGBoost.
Fig 1.
Model performance evaluation and overfitting diagnostics.
(a) Comparison of FE-OLS and FE + XGBoost models in terms of training- and test-set RMSE, MAE, and R² (see Table 3 for detailed values); (b) Scatter plot of predicted vs. actual log rural income for 2022–2023, showing strong alignment (Pearson r = 0.961); (c) Permutation-test results based on 500 random shuffles of training labels; the observed R² (0.924) lies far to the right of the null distribution (mean = −3.02 ± 2.06, p = 0.002), indicating that the predictive power is unlikely to have arisen by chance; (d) Rolling five-fold cross-validation RMSE for FE + XGBoost, demonstrating temporal robustness across forecast origins.
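The permutation test in panel (c) can be illustrated with a minimal sketch. It is not the authors' pipeline: a one-predictor least-squares line stands in for FE + XGBoost, the data are synthetic, and the names `fit_line`, `r2_score`, and `permutation_test` are hypothetical helpers. The sketch shuffles the training labels, refits, and scores each refit on the untouched test set. It assumes the add-one p-value convention, under which 500 shuffles give a smallest attainable p of 1/501 ≈ 0.002; the source does not state which convention was used.

```python
import random


def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(y_true, y_pred):
    """Coefficient of determination; negative when worse than the test mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_test(x_tr, y_tr, x_te, y_te, n_perm=500, seed=0):
    """Compare the observed test R^2 with R^2 from label-shuffled refits."""
    rng = random.Random(seed)

    def test_r2(labels):
        slope, intercept = fit_line(x_tr, labels)
        return r2_score(y_te, [intercept + slope * xi for xi in x_te])

    observed = test_r2(y_tr)
    shuffled = list(y_tr)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the link between predictors and labels
        null.append(test_r2(shuffled))
    exceed = sum(1 for value in null if value >= observed)
    p_value = (1 + exceed) / (1 + n_perm)  # add-one convention (assumption)
    return observed, null, p_value


# Synthetic data: earlier observations train the model, later ones test it.
data_rng = random.Random(42)
x = [i / 10 for i in range(120)]
y = [2.0 + 0.5 * xi + data_rng.gauss(0, 0.3) for xi in x]
x_train, y_train, x_test, y_test = x[:90], y[:90], x[90:], y[90:]

observed_r2, null_r2, p_value = permutation_test(x_train, y_train, x_test, y_test)
print(f"observed R^2 = {observed_r2:.3f}, p = {p_value:.4f}")
```

Because R² is measured against the test-set mean, refits on shuffled labels can score well below zero, which is why a null distribution with a negative mean is not unusual.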
Table 4.
Top-10 variables ranked by mean absolute SHAP value.
Fig 2.
Global SHAP insights for the FE + XGBoost model.
(a) SHAP summary plot displaying the signed contribution of each variable; (b) Mean absolute SHAP values ranking the ten most influential predictors; (c) SHAP interaction summary highlighting cross-feature complementarities (e.g., between education and cultural services) and substitution effects (e.g., between healthcare and industrialisation).
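The ranking in panel (b) rests on a simple aggregate: the mean of the absolute SHAP values of each feature across observations. The sketch below shows only that aggregation step. It assumes the SHAP values have already been computed and stored as one row per observation; the numbers are made up for illustration, and `rank_by_mean_abs_shap` is a hypothetical helper rather than a function from any SHAP library.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names, top_k=10):
    """Rank features by mean |SHAP| over all observations, largest first."""
    n_obs = len(shap_rows)
    scores = []
    for j, name in enumerate(feature_names):
        mean_abs = sum(abs(row[j]) for row in shap_rows) / n_obs
        scores.append((name, mean_abs))
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]


# Illustrative values only; they are not taken from the paper.
features = ["lnEDU", "lnHEA", "lnINF", "lnSOC"]
shap_rows = [
    [0.30, -0.10, 0.05, 0.02],
    [-0.20, 0.15, 0.10, -0.01],
    [0.25, -0.05, -0.09, 0.03],
]

ranking = rank_by_mean_abs_shap(shap_rows, features)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so a feature with large effects in both directions still ranks high.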
Table 5.
Turning points of key fiscal items based on SHAP analysis.
Fig 3.
Partial dependence plots of key fiscal expenditures and SHAP turning points.
(a) Education (lnEDU): Inverted-U shape with a saturation threshold around ¥1,800 per capita; (b) Healthcare (lnHEA): Concave pattern with diminishing marginal gains beyond ¥1,000; (c) Infrastructure (lnINF): Positive but tapering returns across the observed range; (d) Social security (lnSOC): Rapidly saturating yet non-negative income effect. Red dashed lines indicate SHAP-based turning points derived from derivative-sign analysis.
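One way to read "derivative-sign analysis" is to locate the point where the slope of a dependence curve switches from positive to non-positive. The sketch below illustrates that idea on a synthetic inverted-U curve in log expenditure. The source does not describe the exact procedure, so the finite-difference slopes, the linear interpolation of the zero crossing, and the helper name `turning_points` are all assumptions. The toy curve is built to peak at ¥1,800 purely to mirror panel (a); it does not reproduce the paper's estimate.

```python
import math


def turning_points(x, y):
    """Return x-locations where the slope of y changes from > 0 to <= 0."""
    slopes = [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]
    mids = [(x[i] + x[i + 1]) / 2 for i in range(len(x) - 1)]
    points = []
    for i in range(len(slopes) - 1):
        s0, s1 = slopes[i], slopes[i + 1]
        if s0 > 0 and s1 <= 0:
            # Linearly interpolate where the slope crosses zero.
            t = s0 / (s0 - s1)
            points.append(mids[i] + t * (mids[i + 1] - mids[i]))
    return points


# Synthetic inverted-U in log per-capita expenditure (illustration only).
peak_log = math.log(1800.0)
lo_log, hi_log = math.log(200.0), math.log(5000.0)
grid = [lo_log + k * (hi_log - lo_log) / 49 for k in range(50)]
curve = [-(g - peak_log) ** 2 for g in grid]

found = turning_points(grid, curve)
turning_level = math.exp(found[0])  # convert from log scale back to yuan
print(f"turning point ≈ ¥{turning_level:,.0f} per capita")
```

On real dependence curves the raw slopes are noisy, so some smoothing before the sign check would usually be needed; that step is omitted here.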
Table 6.
SHAP-based feature importance and rank shifts across economic-scale groups.
Fig 4.
Heterogeneity of SHAP dependence by economic capacity.
(a) Education expenditure (lnEDU): The turning point appears earlier in low-capacity cities (¥900) than in high-capacity ones (¥1,350), indicating faster saturation under fiscal constraints; (b) Healthcare expenditure (lnHEA): High-capacity cities show sharper diminishing returns (threshold ≈ ¥800), while low-capacity cities maintain mild positive effects across the range.
Table 7.
Robustness of model performance and education turning point.