Positive relationship between education level and risk perception and behavioral response: A machine learning approach

  • Zhipeng Wei,

    Roles Methodology, Writing – original draft, Writing – review & editing

    Affiliations Evidence-based Medicine Center, School of Basic Medical Sciences, Lanzhou University, Lanzhou, China, Center for Evidence-based Social Science, School of Public Health, Lanzhou University, Lanzhou, China, Innovation Laboratory of Evidence-based Social Science, Lanzhou University, Lanzhou, China

  • Zhichun Zhang,

    Roles Conceptualization, Resources

    Affiliations Evidence-based Medicine Center, School of Basic Medical Sciences, Lanzhou University, Lanzhou, China, Center for Evidence-based Social Science, School of Public Health, Lanzhou University, Lanzhou, China, Innovation Laboratory of Evidence-based Social Science, Lanzhou University, Lanzhou, China

  • Liping Guo,

    Roles Data curation, Investigation

    Affiliations Evidence-based Medicine Center, School of Basic Medical Sciences, Lanzhou University, Lanzhou, China, Center for Evidence-based Social Science, School of Public Health, Lanzhou University, Lanzhou, China, Innovation Laboratory of Evidence-based Social Science, Lanzhou University, Lanzhou, China

  • Wenjie Zhou,

    Roles Investigation, Validation

    Affiliations Center for Evidence-based Social Science, School of Public Health, Lanzhou University, Lanzhou, China, Innovation Laboratory of Evidence-based Social Science, Lanzhou University, Lanzhou, China, School of Information Resource Management, Renmin University of China, Beijing, China

  • Kehu Yang

    Roles Investigation, Resources

    yangkh-ebm@lzu.edu.cn

    Affiliations Evidence-based Medicine Center, School of Basic Medical Sciences, Lanzhou University, Lanzhou, China, Center for Evidence-based Social Science, School of Public Health, Lanzhou University, Lanzhou, China, Innovation Laboratory of Evidence-based Social Science, Lanzhou University, Lanzhou, China

Abstract

This paper aims to examine the influence mechanism of education level as a key situational factor in the relationship between risk perception and behavioral response, encompassing both behavioral intention and preparatory behavior. Utilizing non-parametric estimation techniques in machine learning, particularly the Random Forest and XGBoost algorithms, this study develops predictive models to analyze the impact of 27 influencing factors on behavioral responses following risk perception. The findings indicate that, while the model’s fit for preparatory behavior is 25.71% and its fit for behavioral intention is below 20%, the model effectively identifies key influencing factors. Further analysis employing SHAP values demonstrates that education level not only exerts a significant influence but also exhibits varying effects across different educational groups. Moreover, statistical testing corroborates the importance of education level in the relationship between risk perception and behavioral response, providing a robust scientific foundation for the development of risk management policies.

1. Introduction

Risk perception and behavioral response are vital elements of decision-making under uncertainty, especially in situations such as natural disasters, public health emergencies, and environmental threats. It is crucial to understand the mechanisms that affect these processes in order to create effective risk management strategies. Among the numerous factors that influence risk perception and behavioral response, education level has been identified as a significant contextual variable. Nevertheless, the specific mechanisms by which education level impacts these outcomes are still not well understood, particularly in complex real-world situations where multiple variables interact in dynamic ways.

Risk perception has long been a central focus in disaster management and behavioral science. According to Siegrist and Arvai [1], research on risk perception can be broadly classified into three primary approaches: the characteristics of hazards, the characteristics of risk perceivers, and the application of heuristics in risk judgments.

These various perspectives contribute to a deeper understanding of how individuals assess disaster risks based on factors such as the controllability, predictability, and potential losses associated with hazards. Empirical studies in both natural and human-made disasters demonstrate that individuals’ perceptions of risk are shaped not only by the intrinsic characteristics of the disaster itself but also by emotional and cultural influences. For example, Mitsushita et al. [2] discovered that the cognitive frameworks for evaluating disaster risks differ between laypeople and experts, with the former exhibiting stronger emotional responses, such as dread and fear, towards certain hazards.

Risk perception is a complex, multidimensional construct that encompasses both cognitive evaluations of risk (e.g., likelihood of occurrence) and emotional responses (e.g., dread, fear, or uncertainty). As demonstrated by Bodas et al. [3] in a cross-country study, individuals perceive risks associated with various disaster types (e.g., pandemics, extreme weather events, infrastructure failures) in diverse ways, and these perceptions, in turn, significantly influence their emergency response behaviors. Consequently, risk perception is not solely an intellectual assessment; rather, it is a highly subjective and context-dependent phenomenon, shaped by individual experiences as well as broader societal and cultural factors.

The relationship between education level and risk perception has garnered significant academic attention. A considerable body of research has indicated that individuals with higher levels of education typically possess a more comprehensive understanding of complex risks and are more inclined to adopt precautionary measures. Bodas et al. [3] demonstrated that individuals with higher educational attainment tend to exhibit greater awareness of disaster risks and show a heightened willingness to engage in preparatory actions for potential hazards. This phenomenon has been observed across diverse cultural contexts, with educated individuals being more proficient at processing risk-related information and taking proactive steps toward disaster preparedness.

However, the influence of education on risk perception is not always straightforward. In certain instances, individuals with lower levels of education may underestimate the risks associated with disasters due to limited access to or understanding of critical information. For example, Ge et al. [4] conducted a study on flood risk perception in Nanjing, China, and found that individuals with higher educational attainment demonstrated a more accurate understanding of flood risks and were more likely to adopt protective measures. In contrast, those with lower levels of education tended to downplay the severity of the threat, resulting in less engagement in preventive behaviors. This underscores the notion that while education can enhance awareness and understanding of risks, it may also contribute to disparities in how different demographic groups perceive and respond to hazards.

Risk perception plays a critical role in shaping individuals’ disaster preparedness behaviors. The Theory of Planned Behavior (TPB) has been extensively utilized to examine how risk perception influences preparedness intentions and actions. For instance, Ng [5] employed an extended TPB model to investigate disaster preparedness in a typhoon-prone district of Hong Kong, demonstrating that risk perception had a significant impact on individuals’ preparedness intentions. This relationship was further mediated by subjective norms and perceived behavioral control. Individuals with stronger perceptions of risk were more likely to engage in proactive measures to prepare for potential disasters, especially when they felt greater control over their ability to mitigate the associated risks. Similarly, Fang et al. [6] investigated the relationship between risk perception and resistance behaviors among residents living near chemical industry parks in China. Their findings revealed that individuals’ perceived risks played a pivotal role in determining their willingness to engage in protest or resistance activities. Notably, higher levels of social trust and public engagement were associated with lower perceived risks, which subsequently reduced the likelihood of active resistance. This highlights the importance of not only comprehending risk perception but also considering how social and community factors shape behavioral responses to perceived risks.

In recent years, machine learning (ML) techniques have become increasingly prevalent in the analysis of risk perception and disaster response behaviors. These methods provide novel opportunities to uncover complex patterns and relationships within large datasets, offering insights that traditional statistical approaches may fail to detect. ML algorithms, including decision trees and support vector machines (SVM), have been employed to predict individuals’ behaviors in response to disaster risks and to identify key factors such as education, age, and socioeconomic status that influence risk perception and preparedness actions. For instance, in the domain of pandemic risk perception, ML methods have been applied to analyze public attitudes and behaviors in relation to COVID-19. Vieira et al. [7] developed a Pandemic Risk Perception Scale and utilized ML techniques to model various risk dimensions, including infection risk, emotional health risk, and health system risk, as well as their influence on public preparedness behaviors. These approaches facilitate a more nuanced understanding of how individuals from diverse backgrounds (e.g., educational levels, socioeconomic status) perceive and respond to risks. ML can identify contextual variations in risk perception, allowing for the customization of disaster management strategies to specific groups. By integrating demographic data, past experiences, and risk communication, ML models can provide real-time predictions of public behavior, thereby enhancing the efficacy of risk communication and emergency preparedness efforts.

Despite significant progress in understanding the relationship between education level, risk perception, and behavioral responses, several gaps persist. Most research has concentrated on specific disaster types, such as natural hazards or pandemics, with limited cross-disaster comparisons. Future studies should explore risk perception across diverse hazard types to identify both commonalities and distinct factors influencing risk awareness and behavior.

While ML holds promise for analyzing large-scale data, its application in risk perception research remains nascent. Further investigation is needed to integrate ML with theoretical frameworks, such as the TPB, to develop comprehensive models that address the complex interactions between education, risk perception, and disaster preparedness. Additionally, more context-specific studies are required to examine how cultural and social factors, alongside education, influence risk perception and response behaviors.

This study aims to address these gaps by examining the impact of education level on risk perception and behavioral response using ML techniques. By constructing predictive models based on 27 factors, we identify key predictors of behavioral intention and preparatory behavior, with a particular emphasis on education level. Additionally, SHAP values are employed to interpret model outputs and reveal the mechanisms through which education influences these behaviors. Our findings contribute to the growing literature on the application of ML in risk research and provide a scientific foundation for the development of targeted risk management policies.

By integrating recent advancements in ML and risk research, this study provides a novel perspective on the role of education in shaping risk-related behaviors and highlights the importance of context-specific evidence in guiding policy and practice.

2. Methodology

ML techniques are widely used in environmental risk assessments [8,9], such as flood susceptibility mapping [10], and in community and behavioral risk evaluations, including identifying at-risk students [11]. These applications highlight ML’s potential to reveal complex relationships and improve decision-making in risk-related contexts.

Random Forest and XGBoost are ML algorithms used for regression and classification, both based on ensemble learning to improve performance by combining multiple models. Random Forest, an ensemble method built on decision trees, constructs numerous trees trained on random subsets of data [12]. Each tree makes independent predictions, which are aggregated through voting or averaging to yield the final prediction. This approach enhances robustness and accuracy while reducing overfitting. XGBoost (Extreme Gradient Boosting), based on gradient-boosted trees [13], iteratively builds decision trees that correct the residual errors of the previous tree, optimizing model performance by minimizing a loss function. Known for its high accuracy, efficiency, and ability to handle large datasets, XGBoost excels in predictive tasks.
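The contrast between the two ensemble strategies described above can be sketched in a few lines of standard-library Python. This is a toy illustration, not the Random Forest or XGBoost implementation used in the study: the base learner is a one-split tree (a "stump") on a single feature, and the function names, the data, and the `learning_rate` value are all assumptions made for the example. Bagging averages stumps trained on bootstrap samples, while boosting fits each new stump to the residuals left by the previous ones.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree (a 'stump') by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the largest value would leave the right side empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # only one distinct x value: fall back to the mean
        overall = sum(ys) / len(ys)
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Bagging: average stumps trained on bootstrap resamples of the data."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(tree(x) for tree in trees) / len(trees)

def boosted_stumps(xs, ys, n_rounds=25, learning_rate=0.5):
    """Boosting (squared loss): each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    trees = []
    for _ in range(n_rounds):
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        residuals = [r - learning_rate * tree(x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)

def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Made-up data: a simple nonlinear relationship.
xs = list(range(10))
ys = [float(x * x) for x in xs]
forest = bagged_stumps(xs, ys)
boosted_1 = boosted_stumps(xs, ys, n_rounds=1)
boosted_25 = boosted_stumps(xs, ys, n_rounds=25)
```

In this sketch the training error of the boosted model falls as rounds are added, because every new stump targets what the earlier stumps left unexplained. The bagged prediction is an average of subset means, so it always stays within the range of the observed outcomes.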

As shown in Table 1, behavioral intentions and preparatory behaviors serve as dependent variables, while 27 influencing factors are treated as independent variables to build ML predictive models [4,7,14–22], including Random Forest Regression and XGBoost Regression (XGBRegressor). Model performance is assessed using MAE (Mean Absolute Error), MSE (Mean Squared Error), and R², with the goal of identifying the optimal input for an interpretable model. SHAP (Shapley Additive Explanations) is employed to interpret model predictions, providing insight into how the model arrives at its outcomes [23]. Based on game theory, SHAP calculates the Shapley value for each feature, evaluating its contribution across various feature combinations. This allows us to rank feature importance and uncover key influencing factors. By visualizing feature SHAP values and their relationships with feature values, along with interaction plots, we can construct models that highlight the underlying trends.
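To make the game-theoretic idea concrete, the following minimal sketch computes exact Shapley values by enumerating every feature coalition. It is an illustration under stated assumptions, not the procedure used in the study: the three-feature `model`, the `instance`, and the all-zero `baseline` are hypothetical, and features outside a coalition are simply replaced by their baseline value. Brute-force enumeration is only feasible for a handful of features; SHAP implementations for tree ensembles rely on much more efficient algorithms.

```python
from itertools import combinations
from math import factorial

def shapley_values(value, features):
    """Exact Shapley values: weighted average of each feature's marginal contribution."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

# Hypothetical toy model with an interaction term; "residence" has no effect.
def model(risk, education, residence):
    return 2.0 * risk + 1.0 * education + 0.5 * risk * education + 0.0 * residence

instance = {"risk": 1.0, "education": 1.0, "residence": 1.0}
baseline = {"risk": 0.0, "education": 0.0, "residence": 0.0}

def coalition_value(coalition):
    """Model output when only the features in the coalition take their observed value."""
    inputs = {f: (instance[f] if f in coalition else baseline[f]) for f in instance}
    return model(**inputs)

phi = shapley_values(coalition_value, list(instance))
```

Here the interaction term is shared equally between `risk` and `education`, the irrelevant feature receives a value of zero, and the three values add up to the difference between the prediction for the instance and the prediction for the baseline.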

The data for this study were collected from a questionnaire survey conducted on Wenjuanxing between March 7 and March 9, 2024, yielding 4,507 responses. To ensure data quality, samples with completion times under 200 seconds or over 420 seconds, as well as cases with missing variables, were excluded, resulting in a final dataset of 2,239 valid samples.
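The exclusion rule described above can be expressed as a simple filter. The sketch below is only an illustration: the record layout (a `seconds` field and an `answers` dictionary) and the four example responses are hypothetical, and missing answers are assumed to be stored as `None`.

```python
def is_valid(response, min_seconds=200, max_seconds=420):
    """Keep a response only if its completion time is in range and no answer is missing."""
    in_time = min_seconds <= response["seconds"] <= max_seconds
    complete = all(value is not None for value in response["answers"].values())
    return in_time and complete

# Made-up responses covering each exclusion case.
responses = [
    {"seconds": 150, "answers": {"Q12": 3, "Q24": 1}},     # completed too quickly
    {"seconds": 300, "answers": {"Q12": 4, "Q24": 2}},     # kept
    {"seconds": 300, "answers": {"Q12": None, "Q24": 2}},  # missing variable
    {"seconds": 500, "answers": {"Q12": 2, "Q24": 1}},     # completed too slowly
]
valid = [r for r in responses if is_valid(r)]
```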

3. Results

3.1 Model training and evaluation

The dataset is split into training and validation sets in an 8:2 ratio. Model performance is evaluated using MAE, MSE, and R², as shown in Table 2. MAE and MSE measure the error between observed and predicted values, with smaller values indicating lower error. R² reflects the goodness of fit, with higher values signifying better model performance. Based on these metrics, the random forest model demonstrates superior performance and will be used in subsequent experiments.
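For readers who want the split and the three metrics spelled out, a minimal standard-library sketch follows. The shuffling seed, the rounding of the cut point, and the small `y_true` and `y_pred` vectors are assumptions chosen for illustration; they are not taken from the study.

```python
import random

def train_validation_split(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows and cut them into training and validation parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(round(len(rows) * train_fraction))
    return rows[:cut], rows[cut:]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Splitting 2,239 row indices in an 8:2 ratio.
train, valid = train_validation_split(range(2239))

# Made-up observed and predicted scores.
y_true = [3.0, 5.0, 2.0, 4.0]
y_pred = [2.5, 5.0, 3.0, 4.5]
```

With these made-up vectors the errors are 0.5, 0, 1, and 0.5, so MAE is 0.5, MSE is 0.375, and R² is 1 − 1.5/5.0 = 0.7.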

Although the selected variables explain only 25.71% of the variance in the preparatory behavior model (ytotscore) and less than 20% in the behavioral intention model (mtotscore), and therefore fall short of fully accounting for the output variables in the random forest model, the primary aim of this section is to identify specific scenarios by examining internal influencing factors within a general context, rather than to predict the output variables. Thus, the goodness of fit serves only as a criterion for model selection and does not impede the exploration of specific scenarios in subsequent analyses.

3.2 Results description

As shown in Fig 1, with preparatory behavior as the output variable, Fig 1(a) presents the global feature importance ranking based on the average absolute Shapley values. It is clear that among the top 20 ranked features, risk perception (xtotscore), place of residence (Q24), and education level (Q12) are particularly significant. To explore how these features influence preparatory behavior, Fig 1(b) displays a scatter plot of SHAP values for each sample and feature. Each point in the plot conveys three pieces of information: the vertical axis represents the feature, the color indicates the feature value (with red indicating higher values and blue lower values), and the horizontal axis shows the direction of influence on the predicted value. From the feature importance ranking and the influence of each feature on preparatory behavior, we draw the following conclusion.

Fig 1. Feature Impact Summary Based on SHAP Values.

Left: (a) Feature Importance Ranking; Right: (b) Influence of Features on Preparatory Behavior Effectiveness.

https://doi.org/10.1371/journal.pone.0321153.g001

Education level (Q12) plays a significant role in shaping preparatory behavior (ytotscore), serving as both a core predictive indicator and a potential contextual factor. Firstly, within the multivariate explanatory variables for preparatory behavior, education level ranks third, contributing 6.87% to the overall model, highlighting its strong explanatory power. Secondly, unlike other variables, which show concentrated SHAP values around zero, education level exhibits distinct dispersion. Specifically, individuals with higher educational attainment (red points) show a strong positive correlation with preparatory behavior, while those with lower educational attainment (blue points) display a negative correlation. This further emphasizes the importance of education level in the model and warrants further exploration of its influencing mechanisms [24]. Additionally, as noted by Wooldridge, the interaction of significant variables during evidence generation can notably affect the effect size, suggesting that the role of education level as a control variable can directly impact the outcome, making it a critical factor in shaping general evidence.
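A global ranking of this kind can be derived from per-sample SHAP values by averaging their absolute values for each feature. The sketch below shows that calculation on made-up numbers. It also expresses each feature's importance as a percentage of the total, which is one plausible way to obtain a contribution share; whether the 6.87% figure was computed this way is an assumption, not something the text states.

```python
def mean_abs_shap(shap_rows):
    """Average absolute SHAP value per feature across all samples."""
    features = list(shap_rows[0].keys())
    n = len(shap_rows)
    return {f: sum(abs(row[f]) for row in shap_rows) / n for f in features}

def importance_shares(importance):
    """Express each feature's importance as a percentage of the total."""
    total = sum(importance.values())
    return {f: 100.0 * value / total for f, value in importance.items()}

# Made-up SHAP values for three samples and three features.
shap_rows = [
    {"xtotscore": 0.6, "Q24": -0.3, "Q12": 0.1},
    {"xtotscore": -0.4, "Q24": 0.1, "Q12": -0.2},
    {"xtotscore": 0.5, "Q24": -0.2, "Q12": 0.0},
]
importance = mean_abs_shap(shap_rows)
ranking = sorted(importance, key=importance.get, reverse=True)
shares = importance_shares(importance)
```

Because absolute values are averaged, a feature whose SHAP values are spread widely in both directions ranks high even when its positive and negative effects would cancel in a plain mean.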

As shown in Fig 2, with behavioral intention as the output variable, Fig 2(a) presents the global feature importance ranking based on the average absolute Shapley values for each feature. Among the top 20 features, risk perception (xtotscore), place of residence (Q24), and transparency of research progress on earthquake relief and disaster prevention in the residential area (Q28_Row4) are notably significant. To explore how these features influence behavioral intention, Fig 2(b) displays a scatter plot of SHAP values for each sample and feature. Each point represents three elements: the vertical axis indicates the feature, the color reflects the feature value (with red representing higher values and blue lower values), and the horizontal axis shows the direction of influence on the predicted value. Based on the feature importance ranking and the impact of each feature on behavioral intention, the following conclusion can be drawn.

Fig 2. Feature Impact Summary Based on SHAP Values.

Left: (a) Feature Importance Ranking; Right: (b) Influence of Features on Behavioral Intention Effectiveness.

https://doi.org/10.1371/journal.pone.0321153.g002

Education level plays a crucial role in predicting behavioral intention and serves as a significant contextual factor. Although its explanatory power is somewhat reduced in the behavioral intention model, it remains moderate. Unlike many variables, which cluster around a SHAP value of zero, the distribution of education level shows distinct patterns.

3.3 Examination of contextual evidence

Based on ML model results, education level (Q12) emerges as a significant contextual factor. To validate this, a statistical test is conducted to assess its impact on preparatory behavior. The effect size is analyzed in two groups: one with education level as a control variable and one without. In the group where education level is included, the average effect size is 0.0733; in the group without it, the average effect size is 0.0764. Given education’s potential importance in specific contexts, we hypothesize a significant difference between these groups, which is tested using a t-test.

Before performing the t-test, it is essential to test for variance homogeneity, as this is a prerequisite. The homogeneity test shows a significant difference in variances between the two groups, statistically significant at the 1% level. Therefore, we use the t-test for unequal variances. The results indicate that the mean effect size in the group without education level as a control is significantly higher than in the group with it, with statistical significance at the 1% level. This strongly supports the conclusion that education level plays a crucial role in the relationship between risk perception and preparatory behavior.
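The unequal-variance t-test mentioned above is commonly known as Welch's t-test. A minimal sketch of its test statistic and its Welch–Satterthwaite degrees of freedom is given below. The two small samples of effect sizes are made up for illustration and are not the study's data, and the sketch stops short of a p-value, which would require the t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)  # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up effect sizes for the two groups.
with_control = [0.01, 0.02, 0.03, 0.04, 0.05]
without_control = [0.02, 0.04, 0.06, 0.08, 0.10]
t_stat, dof = welch_t(with_control, without_control)
```

A negative statistic here means the first group has the lower mean. Unlike the pooled-variance test, the degrees of freedom are generally not an integer and shrink as the two variances become more unequal.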

To assess the impact of risk perception on behavioral intention, the original evidence is divided into two groups: one that includes education level as a control variable and one that does not. The group considering education level has a mean effect size of 0.1727, while the other group has a mean effect size of 0.1725. If education level is a significant contextual factor, a notable difference should exist between the two groups. A t-test is conducted to examine this. Prior to the t-test, the homogeneity of variances must be verified. The results indicate a significant difference in variances at the 1% level, prompting the use of a t-test for unequal variances. The findings show that the mean effect size with education level as a control is significantly higher than without it.

4. Discussion

In summary, education level is a critical contextual factor for several reasons. First, individuals with higher education typically possess enhanced knowledge and information-processing skills [25], enabling them to better understand and assess risks, leading to more effective preparatory measures.

Second, education strengthens risk awareness, with more educated individuals being more vigilant and proactive in preparing for potential risks [26].

Third, higher education fosters logical thinking and critical analysis, allowing individuals to assess situations calmly, develop targeted response strategies, and implement effective preparatory behaviors.

Finally, individuals with higher education are more adept at acquiring and utilizing resources, leveraging information, expert advice, and social networks to manage risks.

5. Conclusion

This study examined the impact of education level on risk perception and behavioral response using advanced ML techniques, including Random Forest and XGBoost, with SHAP values for interpretability. The findings highlight education level as a key contextual factor, with higher education linked to more proactive preparatory behaviors and refined risk perceptions. Although the models’ predictive accuracy was limited, they effectively identified key factors and emphasized the role of education in shaping risk-related decisions. These insights underscore the importance of context-specific evidence in risk management and policy formulation. Future research should expand on these findings by incorporating larger datasets and exploring additional contextual variables to further refine our understanding of risk perception. This study contributes to the growing literature on ML in risk research, offering actionable insights to improve public preparedness and resilience.

References

  1. Siegrist M, Árvai J. Risk Perception: Reflections on 40 Years of Research. Risk Analysis. 2020;40(S1):2191–206.
  2. Mitsushita K, Murakoshi S, Koyama M. How are various natural disasters cognitively represented?: a psychometric study of natural disaster risk perception applying three-mode principal component analysis. Nat Hazards. 2022;116(1):977–1000.
  3. Bodas M, Peleg K, Stolero N, Adini B. Risk Perception of Natural and Human-Made Disasters-Cross Sectional Study in Eight Countries in Europe and Beyond. Front Public Health. 2022;10:825985. pmid:35252099
  4. Ge Y, Yang G, Wang X, Dou W, Lu X, Mao J. Understanding risk perception from floods: a case study from China. Nat Hazards (Dordr). 2021;105(3):3119–40. pmid:33424123
  5. Ng SL. Effects of Risk Perception on Disaster Preparedness Toward Typhoons: An Application of the Extended Theory of Planned Behavior. Int J Disaster Risk Sci. 2022;13(1):100–13.
  6. Fang X, Cao L, Zhang L, Peng B. Risk perception and resistance behavior intention of residents living near chemical industry parks: an empirical analysis in China. Nat Hazards. 2022;115(2):1655–75.
  7. Vieira KM, Potrich ACG, Bressan AA, Klein LL, Pereira BAD, Pinto NGM. A Pandemic Risk Perception Scale. Risk Anal. 2022;42(1):69–84. pmid:34374448
  8. Zhou Z, Zhou X, Qi H, Li N, Mi C. Near miss prediction in commercial aviation through a combined model of grey neural network. Expert Systems with Applications. 2024;255:124690.
  9. Zhou Z, Zhuo W, Cui J, Luan H, Chen Y, Lin D. Developing a deep reinforcement learning model for safety risk prediction at subway construction sites. Reliability Engineering & System Safety. 2025;257:110885.
  10. Ait Naceur H, Igmoullan B, Namous M. Machine learning-based optimization of flood susceptibility mapping in semi-arid zone. DYSONA - Applied Science. 2025;6(1):145–59.
  11. Pek RZ, Ozyer ST, Elhage T, Ozyer T, Alhajj R. The Role of Machine Learning in Identifying Students At-Risk and Minimizing Failure. IEEE Access. 2023;11:1224–43.
  12. Breiman L. Random forests. Machine Learning. 2001;45:5–32.
  13. Friedman JH. Greedy function approximation: A gradient boosting machine. Ann Statist. 2001;29(5):.
  14. Karanci AN, Aksit B, Dirik G. Impact of a community disaster awareness training program in Turkey: does it influence hazard-related cognitions and preparedness behaviors. Soc Behav Pers. 2005;33(3):243–58.
  15. Cheng H, Zhu L, Gou F, Zhai W. Unpacking risk perceptions of COVID-19 in China: insights for risk management and policy-making. Nat Hazards. 2023;120(1):529–46.
  16. Brilly M, Polic M. Public perception of flood risks, flood forecasting and mitigation. Nat Hazards Earth Syst Sci. 2005;5(3):345–55.
  17. Njome MS, Suh CE, Chuyong G, de Wit MJ. Volcanic risk perception in rural communities along the slopes of mount Cameroon, West-Central Africa. Journal of African Earth Sciences. 2010;58(4):608–22.
  18. Paton D, Smith L, Daly M, Johnston D. Risk perception and volcanic hazard mitigation: Individual and social perspectives. Journal of Volcanology and Geothermal Research. 2008;172(3–4):179–88.
  19. Grothmann T, Reusswig F. People at Risk of Flooding: Why Some Residents Take Precautionary Action While Others Do Not. Nat Hazards. 2006;38(1–2):101–20.
  20. Siegrist M, Gutscher H. Flooding risks: a comparison of lay people’s perceptions and expert’s assessments in Switzerland. Risk Anal. 2006;26(4):971–9. pmid:16948689
  21. Paek H, Hove T. Risk perceptions and risk characteristics. Oxford Research Encyclopedia of Communication. 2024.
  22. Heitz C, Spaeter S, Auzet A-V, Glatron S. Local stakeholders’ perception of muddy flood risk and implications for management approaches: A case study in Alsace (France). Land Use Policy. 2009;26(2):443–51.
  23. Lundberg S, Lee S. A unified approach to interpreting model predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017;4768–77.
  24. Wooldridge JM. Introductory econometrics: A modern approach. 2015.
  25. Koutna M, Janicko M. Trajectories in the Czech Labour Market: The role of information-processing skills and education. Ekonomicky Casopis. 2018;66(1):3–27.
  26. Bhuiya T, Klares III R, Conte MA, Cervia JS. Predictors of misperceptions, risk perceptions, and personal risk perceptions about COVID-19 by country, education and income. J Investig Med. 2021;69(8):1473–8. pmid:34380630