Abstract
The purpose of this study is to examine the relations between health determinants and overall health within the population of the European Union by using the structural equation model. The significance of this issue stems, among other things, from the fact that the factors determining health are essential for policymakers implementing effective and accurate interventions in society, particularly considering that not all such factors have yet been theoretically identified. We hypothesized that (1) there is a statistically significant positive relationship between socioeconomic status and the health of the European Union population, and (2) the structural equation model (SEM) is applicable for examining the complex interrelationships between socioeconomic status and health outcomes in the European Union population at the regional level. We used a EUROSTAT dataset covering 258 European regions at the NUTS 2 level for the year 2022 (or the nearest available year). We employed SEM because it can analyze multiple variables and latent constructs simultaneously, thereby minimizing measurement error and enhancing the validity of the findings. Our research has shown that a better social status of communities in European regions is associated with a higher level of health. Furthermore, a better economic situation of a region significantly improves the health of its inhabitants. In general, however, economic factors have a stronger impact on health than social status. These findings have both theoretical and practical significance, as they identify key modifiable socioeconomic determinants of health and provide valuable insights for shaping effective public health policies and targeted interventions aimed at reducing health inequalities across European regions.
Moreover, this study demonstrated the utility of SEM as a robust approach for examining complex relationships among health determinants including direct and indirect effects. By applying SEM, the research aligns methodologically with the growing body of literature in health sciences and contributes to a broader understanding of how socioeconomic factors influence health under varying regional conditions.
Citation: Jankowiak M, Rój J (2025) Is structural equation modeling applicable in a population health determinants assessment? An experience from the European Union. PLoS One 20(12): e0337042. https://doi.org/10.1371/journal.pone.0337042
Editor: Beata Calka, Military University of Technology Faculty of Civil Engineering and Geodesy: Wojskowa Akademia Techniczna im Jaroslawa Dabrowskiego Wydzial Inzynierii Ladowej i Geodezji, POLAND
Received: July 28, 2025; Accepted: November 1, 2025; Published: December 11, 2025
Copyright: © 2025 Jankowiak, Rój. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the Supporting Information file.
Funding: This publication was partly funded by funds from RIGE under the Regional Initiative for Excellence Programme for the project The Poznań University of Economics and Business for Economy 5.0: Regional Initiative – Global Effects (RIGE), received by J.R. There was no additional external funding received for this study.
Competing interests: The authors have declared that no competing interests exist.
Introduction
In recent years, “health” has become a subject of interest for researchers from disciplines other than medicine and psychology, due to an alarming increase in population health challenges, which suggests that some risk factors are not yet recognized or addressed by preventive strategies [1]. Moreover, no consensus has yet been reached on definitions of health and disease [2], while it is a known fact that the underlying processes leading to health are complex and therefore require investigation of interactions between several risk factors [1]. This is a significant problem because health is widely regarded as the most valuable asset, a ‘resource for living’ that allows people to function and participate in many activities in society and to pursue diverse life plans [3,4]. It implies that health holds dual significance: it is valuable on its own, and it also plays a crucial role in social progress and economic development. Therefore, health goals command a significant position on the United Nations 2030 Agenda for Sustainable Development [5].
The most cited definition of health is the one proposed by the World Health Organization (1948) that defines health as “a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity” [6]. Despite remaining unchanged since 1948, the WHO’s definition of health has faced significant and ongoing criticism [7] and this is mainly due to its utopian character, conceptual pluralism as well as an unmeasurable nature [8–10]. Although many researchers have attempted to propose more modern and coherent definitions of health, a clear alternative to the WHO’s definition and a consensus on health definitions has yet to be established [2].
This complexity of health is also reflected by the production function of health, which was first described by Auster, Leveson and Sarachek (1969), who examined health (measured by mortality rate) as a function of both medical care and non-medical inputs [11]. Also, Lalonde (1974) identified four key components of health and emphasized the need for a holistic understanding of health highlighting the significance of socioeconomic factors [12]. Subsequently, numerous researchers have incorporated these approaches into their studies, while also introducing various other variables to explain health status, which are referred to in the literature as the socioeconomic determinants of health [13,14].
The World Health Organization proposes three main groups of health determinants: individual characteristics and behaviors, the social and economic environment, and the physical environment. These determinants encompass key factors such as genetics, personal behaviors and coping skills, gender, income and social status, education, physical environment, social support networks and health services [15]. By shaping the conditions of daily life in which people are born, grow up, live, work, and age, these socioeconomic determinants affect people’s chances of achieving and maintaining good health. Therefore, many studies have focused on analyzing the relationship between a wide range of health outcomes and various socioeconomic elements, as well as on identifying these determinants as the main root cause of many health inequities, as discussed earlier by Rój & Jankowiak in 2021 [16]. Determining the importance and scope of the impact of individual determinants is crucial for achieving health justice, and especially for developing actions and programs aimed at eliminating health inequalities [17]. Recognizing health inequities as a social justice issue and an ethical imperative, the World Health Organization (WHO) established the Commission on Social Determinants of Health in March 2005. Before concluding its work, the Commission presented its final report to WHO in July 2008, in which it urged governments and society to address the social determinants of health and to create better social conditions for health, particularly among the most vulnerable people [18]. In 2010, WHO’s Commission on the Social Determinants of Health also developed a widely used conceptual framework to explain the complex relationships between social determinants and health outcomes [19,20].
The good thing is that these socioeconomic determinants are modifiable and can be influenced by social, political or economic processes as well as by culture and norms, law, and investment [21]. However, the literature also underlines that reducing inequities in their distribution requires effective interventions in all sectors and is therefore a significant challenge for health policies [22–24].
Thus, socioeconomic determinants of health have consequences for the economy, national security, business, and future generations [19]. Because the Covid-19 pandemic has disproportionately affected and exacerbated the existing social determinants of health at the individual, national, regional and global levels [25], this study focuses on these determinants in the context of their impact on health outcomes.
However, the problem with the determinants of health is that some of them are directly observable, such as blood pressure or cholesterol level, whereas others are latent, that is, not directly observed (difficult to measure directly or interconnected), such as lifestyle, socio-demography, and mental health condition, which complicates the process of determining health outcomes [26]. Therefore, there is growing interest in the application of structural equation models (SEM), whose use has recently increased exponentially in health and medical sciences [27,28], as this approach makes it possible to handle latent variables (e.g., patient satisfaction, quality of life), test theoretical models, and analyze direct, indirect, and mediating effects [29].
SEM is a set of statistical techniques used to measure and analyze the relations between variables, which can be latent and observed, or only between observed variables, and their effects on each other [27,30,31]. It includes, among other techniques, the Linear Regression Model (LRM), Factor Analysis (FA), Confirmatory Factor Analysis (CFA), and Path Analysis [27].
The essence of the SEM is that it examines linear causal relationships among multiple variables simultaneously, while latent factors reduce measurement error. Thus, it is possible to “determine to what degree unknown factors influence shared error among variables - which may affect the estimated parameters of the model” [32].
This ability to account for measurement error is its greatest advantage, as managing measurement error is one of the greatest limitations of most methods [30,31]. SEM also makes it possible to test complex relationships simultaneously in one model rather than using multiple models [33]. This makes SEM superior to other correlational methods such as conventional multiple regression analyses and, in effect, gives it greater statistical power, that is, a higher probability of rejecting a false null hypothesis. Moreover, SEM effectively manages missing data by using raw data rather than summary statistics. Therefore, it can be a useful method for a number of research designs, including those analyzing the complex nature of disease and health behaviors, as it allows examination of both direct and indirect, as well as unidirectional and bidirectional, relationships between measured and latent variables [32]. Indeed, several studies have demonstrated that a structural equation modeling approach is particularly suitable for modeling the interrelationship between observable and unobservable factors that describe health status, such as Boniface and Tefft [34], Chern et al. [35], and Stafford et al. [36].
Although Structural Equation Modeling (SEM) originated in the early 1900s, stemming from Spearman’s (1904) factor analysis and Wright’s (1918, 1921) invention of path analysis, the first introductory textbook on SEM was not published until 1984. As computer programming advanced, researchers increasingly adopted SEM techniques in their studies. Today, SEM is recognized as “the preeminent multivariate technique”; it is primarily and widely used in the social sciences, in which it has its roots, and is increasingly used in epidemiology, public health, and the medical sciences [32–34].
Therefore, this study has adopted structural equation modeling as a conventional technique that fulfills the requirements of the study. Several studies have used SEM to examine the impact of a range of socioeconomic determinants on health. However, these studies differ in the scope of analyzed determinants, health measures and spatial scope. Newton et al. (2024) used SEM to hypothesize a model of relationships between health determinants and outcomes within a region in the North of England using large-scale population survey data [37]. Wang et al. (2019) used SEM to analyze how housing affects health in China [38]. Wirayuda (2020) used SEM to find how health status and resources (HSR), sociodemographic (SD) and macroeconomic (ME) factors affect life expectancy (LE) in Bahrain [39]. Wirayuda et al. (2022) likewise sought to understand how SD, ME and HSR factors affect the LE of the population in Oman [40]. Truong and Asare (2021) examined the effect of socio-economic features of low-income communities on COVID-19 related cases in New York City [41]. Another study used an SEM approach to examine the various dimensions of health-related quality of life (HRQOL) and their relationship with sociodemographic characteristics, functional status and disease activity in patients with rheumatoid arthritis in Southern India [42]. Mosallanezhad et al. (2017) designed their research to evaluate the link between socioeconomic status, physical activity, independence and the health status of older people in Iran by using structural equation modeling [43].
Therefore, the aim of this study is to examine the relations between health determinants and outcomes within the European population by using the structural equation model. The hypotheses are as follows:
- There is a statistically significant positive relationship between socioeconomic status and the health status of the European Union population.
- The structural equation model (SEM) is applicable for examining the complex interrelationships between socioeconomic status and health outcomes in the European Union population at the regional level.
To verify these hypotheses, we used the EUROSTAT database [44], which determined the final range of socioeconomic variables adopted for the study and the year of research, which is 2022 (or the nearest available year). It was thus possible to derive the data at the NUTS 2 level, which allows the analysis of basic regions of European Union member states and Switzerland. Hence, our research fills an existing gap by providing more specific information on the spatial diversity of the European population in terms of the socioeconomic determinants of health. The novelty of this research also arises from its being the first application of the structural equation model to data of this range and spatial scope, which allows for the design of regional policies. This study thus contributes to the research area of socioeconomic health determinants and outcomes and should improve the understanding of the factors associated with the health of the European Union population. The results may also be beneficial for parties such as governments and policymakers, as they can support health and social policies.
This study is structured as follows: this Introduction section presents the theoretical arguments for the research; the Materials and Methods section describes the data and methods used; the Results section presents the results; the Discussion section presents the theoretical and practical implications; and the Conclusions section draws the lessons from this article.
Materials and methods
Scope of the study and dataset
The study quantitatively examined the influence of the socioeconomic status of European regional populations on their health outcomes. Because both socioeconomic status and health are difficult to evaluate using a single indicator, structural equation modeling (SEM) was implemented. In the SEM methodology used in the study, socioeconomic status and health outcomes are treated as latent variables that are not observed directly but are measured using aggregates of several commonly available statistical indicators.
Data for the year 2022 (or 2021 if 2022 was unavailable) were derived from the EUROSTAT database. Data were collected at the level of NUTS 2 (the Nomenclature of Territorial Units for Statistics), covering basic regions of the 27 member states of the European Union, Switzerland and Norway (countries of the European Free Trade Association), and Serbia (a European Union candidate country). The geographic scope of the study was determined by the accessibility of published data. Regions for which obtaining information on at least one indicator for one latent variable failed were excluded from statistical analysis. Finally, 258 basic regions were included in the SEM procedure. The underlying data from all of these 258 European regions, used for the descriptive statistics and the SEM analysis, are presented in S1 Appendix.
The study was conducted using aggregated data originating from official statistical publications. No individual human participants were included in the study; therefore, ethical approval and privacy protection measures were not required.
Measurable variables
Five manifest variables were used to assess the socioeconomic status of inhabitants of European regions. These variables were: level of education, Internet usage, unemployment, risk of poverty and regional gross domestic product.
Level of education (EDU) was measured using the EUROSTAT tertiary educational attainment indicator. This indicator, which shows the percentage of the population aged 25–64 who have completed tertiary education, is based on the EU Labour Force Survey.
Internet usage (INT) was assessed using the indicator of individuals regularly using the Internet, which shows the percentage of persons who use the Internet at least once a week.
Unemployment (UNEMP) was assessed using the EUROSTAT unemployment rate indicator. This indicator shows unemployed persons as a percentage of the economically active population.
Risk of poverty (POVERT) was evaluated using the EUROSTAT at-risk-of-poverty rate by NUTS 2 region. This indicator shows the persons with an equivalised disposable income below 60% of the national median equivalised disposable income as a percentage of the total population.
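The definition of the at-risk-of-poverty rate can be illustrated with a minimal sketch. The function name, its parameters and the income figures below are hypothetical and serve only to show the arithmetic; the official indicator is computed from weighted survey data, which this sketch ignores.

```python
from statistics import median


def at_risk_of_poverty_rate(regional_incomes, national_incomes, threshold_share=0.60):
    """Percentage of people in a region whose equivalised disposable income
    is below threshold_share of the national median (unweighted sketch)."""
    if not regional_incomes:
        raise ValueError("regional_incomes must not be empty")
    # The poverty line is defined relative to the NATIONAL median, not the regional one.
    poverty_line = threshold_share * median(national_incomes)
    below = sum(1 for income in regional_incomes if income < poverty_line)
    return 100.0 * below / len(regional_incomes)


# Hypothetical toy incomes, not EUROSTAT figures.
national = [8_000, 12_000, 15_000, 20_000, 24_000, 30_000, 45_000]
region = [8_000, 13_000, 15_000, 20_000]
rate = at_risk_of_poverty_rate(region, national)
print(f"At-risk-of-poverty rate: {rate:.1f}%")
```

Because the threshold is tied to the national median, two regions with the same income distribution can show different rates if they belong to different countries.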
Regional gross domestic product (REG-GDP) was measured using the EUROSTAT regional gross domestic product by NUTS 2 region indicator. In our dataset regional GDP was expressed in thousands of PPS (purchasing power standards) per inhabitant of a NUTS 2 region. Using PPS (a concept similar to purchasing power parity) instead of euros eliminates differences in price levels between the compared countries.
Health status of the residents of European regions was assessed using three manifest variables: life expectancy and mortality rates due to malignant neoplasms and ischaemic heart disease. Life expectancy at birth (LIFE-EXP) was taken from the EUROSTAT database by NUTS 2 region and indicates the mean number of years that a newborn child (both female and male) can expect to live.
Mortality rate due to neoplasms (CANCER) refers to the EUROSTAT death due to cancer by NUTS 2 region indicator and shows deaths caused by all malignant neoplasms per 100,000 inhabitants.
Mortality rate due to ischaemic heart disease (IH-DIS) is based on the EUROSTAT death due to ischaemic heart diseases by NUTS 2 region indicator which presents all deaths caused by reduced blood supply to the heart (mainly by myocardial infarction) per 100,000 inhabitants.
Descriptive statistics of the above-mentioned measurable variables are presented in Table 1. Normality of the variable distributions was assessed using the Kolmogorov-Smirnov test. Only three variables (EDU, INT and CANCER) were normally distributed (Kolmogorov-Smirnov p-value above 0.05).
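The normality check can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the procedure run in the study's software: the function name and the two samples are hypothetical, the normal distribution is fitted with the sample mean and standard deviation, and the p-value uses the asymptotic Kolmogorov distribution with a common small-sample correction. When the parameters are estimated from the same data, this p-value is conservative; the text does not state whether a correction for that was applied.

```python
from math import exp, log, sqrt
from statistics import NormalDist, mean, stdev


def ks_normality(sample):
    """Return (D, approximate p-value) of a one-sample Kolmogorov-Smirnov
    test against a normal distribution fitted to the sample itself."""
    n = len(sample)
    xs = sorted(sample)
    fitted = NormalDist(mean(xs), stdev(xs))
    # D is the largest gap between the empirical and the fitted CDF.
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    lam = (sqrt(n) + 0.12 + 0.11 / sqrt(n)) * d
    if lam < 0.3:  # the series converges slowly here; the p-value is ~1
        return d, 1.0
    series = sum((-1) ** (k - 1) * exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


# Hypothetical samples: evenly spaced normal quantiles vs. a right-skewed sample.
normal_like = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
skewed = [-log(1 - (i + 0.5) / 200) for i in range(200)]
d_norm, p_norm = ks_normality(normal_like)
d_skew, p_skew = ks_normality(skewed)
print(f"normal-like: D = {d_norm:.3f}, normal at 0.05 level: {p_norm > 0.05}")
print(f"skewed:      D = {d_skew:.3f}, normal at 0.05 level: {p_skew > 0.05}")
```

A p-value above 0.05 means that normality is not rejected, which is the criterion applied to the variables in Table 1.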
Structural model
The structural model consists of three latent variables. Two of them, the social status and the economic status, are exogenous. The third latent variable, the health status, is endogenous and is explained by the two exogenous variables.
The social status (SOCIO) is measured by three manifest variables: EDU, INT and UNEMP. The economic status (ECON) is measured by two manifest variables: REG-GDP and POVERT. The endogenous latent variable, the health status (HEALTH), is measured by three manifest variables: LIFE-EXP, CANCER and IH-DIS. The structural model is substantive in nature, based on the theoretical considerations presented in the Introduction section. The path diagram reflecting the structural model and its measurements is shown in Fig 1.
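The model described above can be written compactly in conventional SEM notation. The following is a sketch rather than the authors' own formulation: the symbols (ξ for exogenous latent variables, η for the endogenous one, γ for structural paths, λ for loadings, δ, ε and ζ for error terms) are standard notation assumed here for illustration, and intercepts are omitted on the assumption that the model is fitted to a covariance structure.

```latex
% Structural part: HEALTH is explained by SOCIO and ECON
\eta_{\mathrm{HEALTH}} = \gamma_{1}\,\xi_{\mathrm{SOCIO}} + \gamma_{2}\,\xi_{\mathrm{ECON}} + \zeta

% Measurement part: each manifest variable loads on one latent variable
\begin{aligned}
\mathrm{EDU}      &= \lambda_{1}\,\xi_{\mathrm{SOCIO}} + \delta_{1} \\
\mathrm{INT}      &= \lambda_{2}\,\xi_{\mathrm{SOCIO}} + \delta_{2} \\
\mathrm{UNEMP}    &= \lambda_{3}\,\xi_{\mathrm{SOCIO}} + \delta_{3} \\
\mathrm{REG\text{-}GDP} &= \lambda_{4}\,\xi_{\mathrm{ECON}} + \delta_{4} \\
\mathrm{POVERT}   &= \lambda_{5}\,\xi_{\mathrm{ECON}} + \delta_{5} \\
\mathrm{LIFE\text{-}EXP} &= \lambda_{6}\,\eta_{\mathrm{HEALTH}} + \varepsilon_{1} \\
\mathrm{CANCER}   &= \lambda_{7}\,\eta_{\mathrm{HEALTH}} + \varepsilon_{2} \\
\mathrm{IH\text{-}DIS}  &= \lambda_{8}\,\eta_{\mathrm{HEALTH}} + \varepsilon_{3}
\end{aligned}
```

With eight manifest variables the sample covariance matrix supplies 8 × 9 / 2 = 36 distinct variances and covariances, which is the information from which all free parameters have to be estimated.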
Abbreviations explained in the text.
Statistics
The model parameters were estimated using the asymptotically distribution-free (ADF) estimator. This estimator was chosen for its robustness to deviations from multivariate normality, which is important given the non-normal distribution of the majority of the observed variables. The model goodness of fit was evaluated using three measures: the chi-square statistic, the root-mean-square error of approximation (RMSEA) and the goodness-of-fit index (GFI). Calculations were done using STATISTICA software (TIBCO Software Inc. 2017, Statistica data analysis software system, version 13).
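For orientation, the ADF discrepancy function and two of the fit measures named above are commonly defined as shown below. This is the textbook form, given here as an assumption about what the software computes; the symbols are introduced for illustration, and some programs use N instead of N − 1.

```latex
% ADF (weighted least squares) discrepancy function
F_{\mathrm{ADF}}(\theta) = \bigl(s - \sigma(\theta)\bigr)^{\top} W^{-1} \bigl(s - \sigma(\theta)\bigr)

% Chi-square statistic and RMSEA point estimate
\chi^{2} = (N - 1)\,F_{\mathrm{ADF}}(\hat{\theta}), \qquad
\mathrm{RMSEA} = \sqrt{\max\!\left(\frac{\chi^{2} - df}{df\,(N - 1)},\ 0\right)}
```

Here s is the vector of distinct sample variances and covariances, σ(θ) is the corresponding model-implied vector, W is a consistent estimate of the asymptotic covariance matrix of s (built from fourth-order moments, which is why no normality assumption is needed), N is the sample size and df is the number of degrees of freedom.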
Results
The estimated model coefficients are presented in Table 2. In the structural model the direction of the effects of the exogenous socioeconomic status variables on the health status (treated as endogenous) is positive, as expected. The coefficient for the SOCIO → HEALTH path is 0.53 (p < 0.001), which indicates a significant positive impact of a good social status of a local population on its level of health. Similarly, the coefficient for the ECON → HEALTH path is positive and significant (0.85, p < 0.001), which confirms the relevant advantageous influence of a better economic situation on population health.
In the measurement model, the coefficients indicate relations between measurable and latent variables. All these coefficients are significant (p < 0.001). The relation of the economic status to the regional GDP is positive (coefficient 3.86), whereas its relation to the risk of poverty is negative (coefficient −2.01). Relations between the social status and all three of its measurable variables (education level, Internet usage and, surprisingly, unemployment rate) are positive (coefficients 6.51, 3.65 and 1.34, respectively).
Relations between the health status and its measurements depend on the nature of the measurable variable. For life expectancy the relation is positive (coefficient 2.75). Mortality rates due to neoplasms and ischaemic heart disease are related negatively (coefficients −19.59 and −102.71, respectively).
The goodness of fit of this model is not excellent. The chi-square statistic equals 121.5 and its ratio to the degrees of freedom (df = 18) is 6.75, well above the value of 3.00 or less that indicates a good fit. The RMSEA is 0.19 (90% CI 0.16–0.23), whereas a good fit requires a value below 0.10. The GFI is 0.85, which is below the 0.90 needed for a good fit. Unlike experimental research, our study is based on available published macroeconomic data, so the possibilities for model optimization are limited by the accessible information. The issue of improving the model fit is discussed further in the Discussion section.
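The reported fit values can be checked with a few lines of arithmetic. The sketch below assumes the common RMSEA point-estimate formula, sqrt(max(χ² − df, 0) / (df · (N − 1))); which formula and which sample size the software actually used is not stated in the text, so both the 258 analyzed regions and the 156 regions without missing values (reported in the Discussion) are tried.

```python
from math import sqrt


def rmsea(chi_square, df, n):
    """RMSEA point estimate; a negative noncentrality is truncated at zero."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


CHI_SQUARE, DF = 121.5, 18  # values reported in the text

ratio = CHI_SQUARE / DF
rmsea_all_regions = rmsea(CHI_SQUARE, DF, 258)     # all analyzed regions
rmsea_complete_cases = rmsea(CHI_SQUARE, DF, 156)  # regions with no missing values

print(f"chi-square / df = {ratio:.2f}")
print(f"RMSEA with N = 258: {rmsea_all_regions:.2f}")
print(f"RMSEA with N = 156: {rmsea_complete_cases:.2f}")
```

Under this formula the chi-square ratio reproduces the reported 6.75, and the reported RMSEA of 0.19 is matched by N = 156 rather than N = 258. This is only a consistency check under the assumed formula, not a statement about how the estimation was carried out.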
Discussion
In this section, we discuss the implications of our findings, compare them with previous research, and suggest potential directions for future studies and practical applications. First and foremost, the results obtained through structural equation modeling (SEM) confirm the hypothesis of a positive association between socioeconomic status and population health. Specifically, the analysis revealed that both social status and economic status exert a statistically significant and positive influence on health. Thus, the hypothesis about the applicability of the structural equation model (SEM) has been positively verified, as the complex interrelationships between socioeconomic status and health at the regional level in the European Union population have been identified.
These findings emphasize the utility of structural equation modeling as a powerful tool for estimating hypothetical relationships among latent constructs across multiple levels within a developed model, which is in line with other research [30,45–47]. They also show that SEM is a useful tool for identifying the direction and significance of relationships between constructs such as socioeconomic status and health disparities [48–50]. By applying SEM, this study aligns methodologically with the growing body of research in this field. However, the specific combination of the type and scope of variables, as well as the spatial coverage in individual studies [37–39,42,43], makes direct comparison of results difficult. Nevertheless, this study contributes to a broader understanding of how socioeconomic factors influence health under different conditions, in this case within the population of European regions. Most importantly, this study confirmed the existence and direction of the relationship between socioeconomic variables and health outcomes.
Social and economic status are complex concepts that may influence health both directly and indirectly through related factors. They encompass such factors as wealth, income and education, while also reflecting one’s position or rank within a specific social hierarchy. Our research has shown that a better social status of communities in European regions is associated with a higher level of health. Furthermore, a better economic situation of a region significantly improves the health of its inhabitants. In general, however, economic factors have a stronger impact on health than social status.
In detail, this research showed first that a higher level of education contributes to better health. This result is consistent with the widely recognized theoretical and empirical relationship between education and health. This positive link, originally conceptualized in Grossman’s health demand model, is a crucial connection in health economics, because education enhances individuals’ cognitive abilities, decision-making skills, and self-efficacy, while also fostering a sense of control that encourages healthier behaviors and lifestyle choices [51–54]. Numerous empirical studies have also indicated that educational attainment is linked to better health [55–57]. Second, greater access to the Internet is also linked to better health, which may result from improved access to health information and increased availability of medical services, as telemedicine enables remote consultations; this is especially important in rural or underdeveloped regions and is widely documented by empirical research [58]. Interestingly, unemployment also shows a positive relation with health, which may be surprising, as theory generally predicts the opposite [59,60]. Such a positive relation is nevertheless possible, as unemployment can be correlated with other social factors such as social support or demographic structure. The literature also points out that unemployed people may be more likely than employed people to visit physicians, take medications, or be admitted to general hospitals, which can positively influence their health [61]. In addition, unemployment may provide a temporary break from a stressful job, allowing for the restoration of mental balance, reflection on one’s professional life, and more time for physical activity and social relationships.
Sometimes employment can lead to occupational exposure to factors harmful to health, and to development of occupational diseases, which worsen the population health status. In addition, unemployment benefits may offset the negative health consequences of joblessness [62,63].
As for economic factors, our findings showed that higher poverty levels worsen health, while a higher level of economic development supports better health. This is consistent with both theoretical frameworks and prior empirical findings, as increases in absolute income levels often signal macroeconomic growth, which can expand access to public services such as education, healthcare, and social security, ultimately improving population health, although such growth can also result in environmental health poverty [64].
Regarding the health status construct, it is positively associated with life expectancy and negatively associated with mortality due to cancer and ischaemic heart disease, reflecting the multidimensional nature of health. Thus, the longer the life expectancy, the better the overall health status, whereas higher cancer mortality lowers the health assessment, and mortality from ischaemic heart disease also has a strong negative impact on overall health. This implies the need to address major health challenges such as cancer and cardiovascular diseases.
By identifying these variables, the findings highlight their role as risk factors in the emergence of health inequities. This research also demonstrates the value of structural equation modeling for the examination of health determinants and outcomes.
The results obtained have both theoretical and practical significance. By identifying the significance and direction of the health determinants, these findings contribute to research on factors influencing health. This is of particular importance in light of evidence from the literature [1] indicating that not all such factors have yet been fully identified, and given the value of health for both individuals and the economy. This study also contributes to the growing body of research employing structural equation modeling (SEM), reflecting the increasing interest in its application within health and medical sciences. That interest stems from the fact that SEM enables the examination of direct, indirect, and mediating effects and is therefore particularly well suited to capturing the complex and multidimensional nature of health-related factors.
The practical implications of these findings stem from the fact that they address the socioeconomic determinants of health, which are modifiable. At the same time, disparities in these determinants may lead to inequalities in health outcomes. Therefore, accurate identification and assessment of these determinants is essential for developing effective health policy recommendations, particularly those aimed at populations in EU countries. The detailed relationships identified in this study, along with the proposed model, have the potential to inform government decisions on addressing the predisposing factors that contribute to health inequalities. This study provides valuable insights for designing targeted health interventions. The nature of these determinants and their significance for health imply that, in practice, it is essential to ensure that public policy and health policy mutually inform and reinforce each other.
Therefore, these findings could have direct implications for public health practice and policy, as they offer a clear evidence-based foundation for designing interventions that address health inequalities. Based on the identified relationships between various determinants and population health, a number of public policy recommendations can be proposed. For example, health education should be integrated into school curricula from an early age, and adult learning programs should be expanded to provide free courses and training in areas such as healthy lifestyles, nutrition, stress management, and disease prevention. It would also be critical to improve digital infrastructure in rural and marginalized areas. Expanding access to the Internet enables the development of telemedicine services and ensures broader access to reliable online health information. Such information should also be adapted to the needs of older adults and those with lower levels of education, for example through simplified content and accessible formats. Furthermore, regional development policies should prioritize investment in infrastructure, education, and healthcare services in less developed areas to reduce disparities. It would also be worthwhile to integrate health policy with social policy and to establish cross-sectoral public health teams that bring together experts in health, education, technology, economics, and social policy. In addition, national programs for the prevention of cardiovascular diseases and cancer should be strengthened or further developed. These programs should emphasize early diagnosis, healthy lifestyle promotion, and control of risk factors. Public funding should support screening initiatives and educational campaigns addressing diet, smoking, physical inactivity, and other modifiable risks. Moreover, improving the quality and availability of regional data is essential for tailoring policies to specific local needs and enhancing their effectiveness.
This study is not without limitations. The most important one stems from the fact that the empirical research is based on publicly available data rather than on experimental research. As a result, the potential for model optimization is inherently constrained by the scope and level of detail of the available data. Other limitations follow from the weaknesses of SEM itself, which, like any method, has its limits. While latent variables are a closer approximation of a construct than a measured variable is, they may still fall short of being entirely pure indicators. Their variance may also include not only the true variance of the underlying concept but also measurement error shared among the indicators. Although the advantage of SEM lies in its ability to analyze multiple variables simultaneously, this benefit often requires larger sample sizes to maintain the reliability and validity of the results.
Despite these promising results, the model’s overall fit indices suggest room for improvement. The goodness of fit of the model is far from perfect. Apart from the chi-square test, whose limitations are discussed in the literature [65], the other fit indices (RMSEA and GFI) also showed at most mediocre goodness of fit. We faced challenges in improving the model fit, for two main reasons. The first reason is missing data. We included in the analysis 258 European regions for which data deficiencies were tolerable according to the study assumptions, but the number of regions with a complete dataset was substantially smaller: only 156 of them had no missing values. Missing values can reduce the sample size (through the exclusion of units whose data deficiencies are too large) and ultimately degrade the accuracy of model estimates [66]. Our research was not experimental (an experimental design would allow an appropriate sample size to be planned) but was restricted to published statistical data. The release of more complete datasets by national and international statistical offices (such as EUROSTAT) would enable further improvement of our model.
The second reason is probably more fundamental. Among the dozens of socioeconomic and health indices published by EUROSTAT, almost all are available only at the national level; only a few are available at the regional level. An analysis at the national level, where the study units are entire countries rather than regions, yields too small a sample for many complex statistical methods, including SEM. Restricting the choice of variables to only a few indices limits the construction of more complex models. Additionally, too narrow a choice of variables increases the risk of model misspecification because appropriate indices reflecting the real factors behind the theoretical considerations are lacking [67].
Because of its mediocre goodness of fit, our model has rather weak predictive ability. Nevertheless, it can play a role in exploring the substantive processes underlying theories of health determinants and in estimating “the operating model” [68]. Improving the model, on the basis of both better-quality statistical data and the development of new analytical techniques in structural modeling [69], can lead to a more complete explanation of real health determination processes.
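To make the RMSEA index referred to in the fit discussion more concrete, the following minimal Python sketch computes its usual point estimate from a model chi-square, its degrees of freedom, and the sample size. The chi-square and degrees-of-freedom values are hypothetical placeholders and are not results of this study; only the sample size of 258 regions is taken from the text. The verbal bands in describe_fit are a rule of thumb often quoted in the SEM literature and are assumed here for illustration. Some software uses N instead of N - 1 in the denominator, so values may differ slightly from a given package.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def describe_fit(value: float) -> str:
    """Map an RMSEA value to a rule-of-thumb label (assumed cut-offs)."""
    if value < 0.05:
        return "close"
    if value < 0.08:
        return "fair"
    if value <= 0.10:
        return "mediocre"
    return "poor"


# Hypothetical chi-square and df (NOT taken from the study);
# n = 258 is the number of regions reported in the text.
example_value = rmsea(chi_square=120.0, df=40, n=258)
print(f"RMSEA = {example_value:.3f} ({describe_fit(example_value)} fit)")
```

Because the chi-square statistic itself depends on the sample size, the same function should not be reused with a different N unless the model is re-estimated on that sample.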
Another limitation arises from the fact that the analysis was conducted using aggregated regional-level data, which cannot fully capture the complexity of SES-health interactions at the micro level. Micro-level analyses would allow for more precise policy recommendations and better identification of vulnerable subpopulations. Moreover, the use of regional averages may obscure important within-region variation, as relations observed at the group level do not necessarily reflect individual-level relationships. For example, the unexpected positive association between unemployment and health may be confounded by contextual factors such as welfare policies, demographic structure, or access to healthcare services, which vary significantly across EU regions. These factors may distort the interpretation of the results; caution is therefore needed when drawing conclusions from aggregated data.
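The point that group-level relations need not mirror individual-level ones can be illustrated with a small constructed example. The sketch below uses entirely invented numbers for three hypothetical regions (they are not EUROSTAT data and do not represent any result of this study): within every region the two variables are negatively related, yet the regional averages are positively related.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Invented individual-level values (x, y) for three hypothetical regions.
regions = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Correlation inside each region (individual level).
within_r = {name: pearson(xs, ys) for name, (xs, ys) in regions.items()}

# Correlation of the regional averages (aggregated level).
mean_x = [sum(xs) / len(xs) for xs, _ in regions.values()]
mean_y = [sum(ys) / len(ys) for _, ys in regions.values()]
between_r = pearson(mean_x, mean_y)

# Correlation over all individuals, ignoring region membership.
all_x = [x for xs, _ in regions.values() for x in xs]
all_y = [y for _, ys in regions.values() for y in ys]
pooled_r = pearson(all_x, all_y)

print("within regions:", within_r)
print("between regional means:", between_r)
print("pooled individuals:", pooled_r)
```

In this toy dataset the sign of the association reverses between the individual and the regional level, which is why conclusions drawn from regional averages should not be transferred directly to individuals.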
As for the direction of future studies, they would greatly benefit from more comprehensive datasets, ideally collected and made available by institutions such as EUROSTAT, which would allow a better-fitting model to be built from more data collected over time. Expanding the scope of such data would enable more accurate modeling and deeper insights into socioeconomic and health-related dynamics. Because the factors determining health are crucial for policymakers seeking to implement effective and accurate interventions in society, a promising direction for future research would be to conduct in-depth analyses at the level of individual countries, allowing for a more detailed understanding of how national contexts influence health outcomes. In light of the findings on the impact of unemployment, it would also be valuable to explore the extent to which factors such as social support, the amount and accessibility of unemployment benefits, demographic structure, and access to healthcare services mediate the relationship between unemployment, social status, and health. Moreover, future research could focus on refining the theoretical constructs used in the model by developing more precise indicators for complex concepts such as socioeconomic status or social support. Incorporating multidimensional measures and validating them across different contexts would enhance the robustness of future models, reduce the risk of misspecification, and improve the explanatory and predictive power of structural analyses.
Conclusions
This study is the first attempt to bring together the different aspects of health in European regions comprehensively using structural equation modeling (SEM). The findings showed that populations with higher social status across European regions tend to experience better health. Economic status also plays a significant role in enhancing population health, and economic conditions appear to have a stronger influence on health than social status. The findings thus highlight the significant role of socioeconomic factors (particularly education, economic development, and digital access) in shaping population health, while also revealing nuanced effects such as the potentially positive health impact of unemployment under certain conditions. In addition, by applying SEM, we demonstrated the model’s effectiveness in capturing complex interdependencies between social, economic, and health-related variables. Overall, this research contributes to a deeper understanding of how socioeconomic factors shape health disparities and supports the use of SEM in public health policy analysis. Although the model’s predictive power is limited by data constraints and only moderate goodness of fit, it still offers valuable insights into the mechanisms behind health disparities. The study underlines the strong need for more comprehensive and regionally detailed datasets to improve model accuracy and policy relevance. Despite its limitations, the research contributes to the growing body of literature using SEM in health sciences and provides a foundation for future studies aimed at refining models and informing targeted public health interventions. Ultimately, understanding and addressing the socioeconomic roots of health inequalities remains essential for effective policymaking in the European context.
Supporting information
S1 Appendix. Summary of data used in calculations.
https://doi.org/10.1371/journal.pone.0337042.s001
(XLSX)
References
- 1. Christoforou R, Lange S, Schweiker M. Individual differences in the definitions of health and well-being and the underlying promotional effect of the built environment. Journal of Building Engineering. 2024;84:108560.
- 2. van der Linden R, Schermer M. Health and disease as practical concepts: exploring function in context-specific definitions. Med Health Care Philos. 2022;25(1):131–40. pmid:34783971
- 3. Voigt K, Wester G. Relational equality and health. Soc Phil Pol. 2015;31(2):204–29.
- 4. McCartney G, Popham F, McMaster R, Cumbers A. Defining health and health inequalities. Public Health. 2019;172:22–30. pmid:31154234
- 5. United Nations. Development Agenda. [Cited 2025 February 12]. https://www.un.org/sustainabledevelopment/development-agenda/
- 6. WHO. Constitution of the World Health Organization. Geneva, Switzerland: World Health Organization; 1948.
- 7. Armitage R. The WHO’s definition of health: a baby to be retrieved from the bathwater? Br J Gen Pract. 2023;73(727):70–1. pmid:36702589
- 8. Huber M, Knottnerus JA, Green L, van der Horst H, Jadad AR, Kromhout D, et al. How should we define health?. BMJ. 2011;343:d4163.
- 9. Larson JS. The World Health Organization’s definition of health: Social versus spiritual health. Soc Indic Res. 1996;38(2):181–92.
- 10. Larson JS. The conceptualization of health. Med Care Res Rev. 1999;56(2):123–36. pmid:10373720
- 11. Auster R, Leveson I, Sarachek D. The production of health, an exploratory study. J Hum Resour. 1969;4:411–36.
- 12. Lalonde M. A new perspective on the health of Canadians. Public Health Agency of Canada. 1974. [Cited 2025 February 20]. http://www.phac-aspc.gc.ca/ph-sp/pdf/perspect-eng.pdf
- 13. Baciu A, Negussie Y, Geller A, Weinstein JN. The root causes of health inequity. In: National Academies of Sciences, Engineering, and Medicine, editor. Communities in action: Pathways to health equity. Washington, DC, USA: National Academies Press (US); 2017.
- 14. Naik Y, Baker P, Walker I, Tillmann T, Bash K, Quantz D, et al. The macro-economic determinants of health and health inequalities-umbrella review protocol. Syst Rev. 2017;6(1):222. pmid:29100497
- 15. World Health Organisation. Social Determinants of Health. Overview. [Cited 2025 February 24]. https://www.who.int/health-topics/social-determinants-of-health#tab=tab_1
- 16. Rój J, Jankowiak M. Socioeconomic determinants of health and their unequal distribution in Poland. Int J Environ Res Public Health. 2021;18(20):10856. pmid:34682597
- 17. Marmot M. Social determinants of health inequalities. Lancet. 2005;365(9464):1099–104. pmid:15781105
- 18. McCartney G, Collins C, Mackenzie M. What (or who) causes health inequalities: theories, evidence and implications?. Health Policy. 2013;113(3):221–7. pmid:23810172
- 19. National Academies of Sciences, Engineering, and Medicine, National Academy of Medicine, Committee on the Future of Nursing 2020–2030, Flaubert JL, Le Menestrel S, Williams DR. The Future of Nursing 2020-2030: Charting a Path to Achieve Health Equity. Flaubert JL, Le Menestrel S, Williams DR, editors. Washington (DC): National Academies Press (US); 2021.
- 20. Solar O, Irwin A. A conceptual framework for action on the social determinants of health: social determinants of health discussion paper 2. Geneva, Switzerland: World Health Organization; 2010.
- 21. Baciu A, Negussie Y, Geller A, Weinstein JN. The root causes of health inequity. In: National Academies of Sciences, Engineering, and Medicine, editor. Communities in action: Pathways to health equity. Washington, DC, USA: National Academies Press (US); 2017.
- 22. Mackenbach JP, Stirbu I, Roskam A-JR, Schaap MM, Menvielle G, Leinsalu M, et al. Socioeconomic inequalities in health in 22 European countries. N Engl J Med. 2008;358(23):2468–81. pmid:18525043
- 23. Kunst AE. Describing socioeconomic inequalities in health in European countries: an overview of recent studies. Rev Epidemiol Sante Publique. 2007;55(1):3–11. pmid:17321711
- 24. Rój J. Inequality in the distribution of healthcare human resources in Poland. Sustainability. 2020;12(5):2043.
- 25. WHO. Social determinants of health. EB148/24. 2021. https://apps.who.int/gb/e/e_eb148.html
- 26. Tun KT. Are the intermediators there? Structural equation modeling on social determinants of health in eastern Burma. Lund: Lund University; 2021.
- 27. Reyes-Carreto R, Godinez-Jaimes F, Guzmán-Martínez M. The Basics of Structural Equations in Medicine and Health Sciences. IntechOpen; 2021.
- 28. Dhinakaran M, Singh R, Chakravarthi MK, Pant B, Singh M, Bansal R. Structural Equation Modeling Method of Computation in Health Care Using IoT. In: 2022 11th International Conference on System Modeling & Advancement in Research Trends (SMART). 2022. 383–9.
- 29. Bhale U. Structural Equation Modeling (SEM) in Healthcare. Elsevier BV; 2024.
- 30. Hays RD, Revicki D, Coyne KS. Application of structural equation modeling to health outcomes research. Eval Health Prof. 2005;28(3):295–309. pmid:16123259
- 31. Gunzler DD, Perzynski AT, Carle AC. Structural Equation Modeling for Health and Medicine. Taylor & Francis Group; 2021.
- 32. Beran TN, Violato C. Structural equation modeling in medical research: a primer. BMC Res Notes. 2010;3:267. pmid:20969789
- 33. Christ SL, Lee DJ, Lam BL, Zheng DD. Structural equation modeling: a framework for ocular and other medical sciences research. Ophthalmic Epidemiol. 2014;21(1):1–13. pmid:24467557
- 34. Boniface DR, Tefft ME. The application of structural equation modelling to the construction of an index for the measurement of health-related behaviours. J Royal Statistical Soc D. 1997;46(4):505–14.
- 35. Chern J-Y, Wan TTH, Begun JW. A structural equation modeling approach to examining the predictive power of determinants of individuals’ health expenditures. J Med Syst. 2002;26(4):323–36. pmid:12118816
- 36. Stafford M, Sacker A, Ellaway A, Cummins S, Wiggins D, Macintyre S. Neighbourhood effects on health: A structural equation modelling approach. SCHM. 2008;128(1):109–20.
- 37. Newton D, Stephenson J, Azevedo L, Sah RK, Poudel AN, Richardson O. The impact of social determinants on health outcomes in a region in the North of England: a structural equation modelling analysis. Public Health. 2024;231:198–203. pmid:38703494
- 38. Wang S, Cheng C, Tan S. Housing determinants of health in urban China: A structural equation modeling analysis. Soc Indic Res. 2018;143(3):1245–70.
- 39. Wirayuda AAB, Al-Mahrezi A, Chan MF. Factors impacting life expectancy in Bahrain: evidence from 1971 to 2020 data. Int J Health Serv. 2022.
- 40. Wirayuda AAB, Jaju S, Alsaidi Y, Chan MF. A structural equation model to explore sociodemographic, macroeconomic, and health factors affecting life expectancy in Oman. Pan Afr Med J. 2022;41:75.
- 41. Truong N, Asare AO. Assessing the effect of socio-economic features of low-income communities and COVID-19 related cases: An empirical study of New York City. Glob Public Health. 2021;16(1):1–16. pmid:33222624
- 42. Bodhare T, Bele S, Nallasivan S, Anto JV. Determinants of health-related quality of life in south indian patients with rheumatoid arthritis: A structural equation modeling approach. Indian Journal Rheumatology Association. 2022;18(5).
- 43. Mosallanezhad Z, Sotoudeh GR, Jutengren G, Salavati M, Harms-Ringdahl K, Wikmar LN, et al. A structural equation model of the relation between socioeconomic status, physical activity level, independence and health status in older Iranian people. Arch Gerontol Geriatr. 2017;70:123–9. pmid:28131051
- 44. EUROSTAT. EUROSTAT. [Cited 2025 March 15]. https://ec.europa.eu/eurostat/data/database
- 45. Bollen KA, Pearl J. Eight myths about causality and structural equation models. Handbooks of Sociology and Social Research. Springer Netherlands; 2013. p. 301–28.
- 46. Bowen NK, Guo S. Structural equation modeling. Oxford University Press; 2011.
- 47. Rahman W, Ali Shah F, Rasli A. Use of structural equation modeling in social science research. ASS. 2015;11(4).
- 48. Newton D, Stephenson J, Azevedo L, Sah RK, Poudel AN, Richardson O. The impact of social determinants on health outcomes in a region in the North of England: a structural equation modelling analysis. Public Health. 2024;231:198–203.
- 49. Mosallanezhad Z, Sotoudeh GR, Jutengren G, Salavati M, Harms-Ringdahl K, Wikmar LN, Frändin K. A structural equation model of the relation between socioeconomic status, physical activity level, independence and health status in older Iranian people. Archives of Gerontology and Geriatrics. 2017;70:123–9.
- 50. Martinez SA, Beebe LA, Thompson DM, Wagener TL, Terrell DR, Campbell JE. A structural equation modeling approach to understanding pathways that connect socioeconomic status and smoking. PLoS One. 2018;13(2):e0192451. pmid:29408939
- 51. Grossman M. The Demand for Health: A Theoretical and Empirical Investigation. New York, NY, USA: Columbia University Press for the National Bureau of Economic Research; 1972.
- 52. Mirowsky J, Ross CE. Education, learned effectiveness and health. Lond Rev Educ. 2005;3:205–20.
- 53. Raghupathi V, Raghupathi W. The influence of education on health: an empirical assessment of OECD countries for the period 1995-2015. Arch Public Health. 2020;78:20. pmid:32280462
- 54. Feinstein L, Sabatés R, Anderson TM, Sorhaindo A, Hammond C. What are the effects of education on health. Measuring the effects of education on health and civic engagement: Proceedings of the Copenhagen Symposium. Paris, France: OECD; 2006. p. 171–354.
- 55. Zajacova A, Lawrence EM. The relationship between education and health: Reducing disparities through a contextual approach. Annu Rev Public Health. 2018;39:273–89. pmid:29328865
- 56. Deaton A. Policy implications of the gradient of health and wealth. Health Aff (Millwood). 2002;21(2):13–30. pmid:11900153
- 57. Grossman M. The relationship between health and schooling: What’s new?. 21609. Cambridge, MA, USA: National Bureau of Economic Research; 2015.
- 58. Rój J. Inequity in the Access to eHealth and its decomposition case of Poland. Int J Environ Res Public Health. 2022;19(4):2340. pmid:35206528
- 59. Norström F, Waenerlund A-K, Lindholm L, Nygren R, Sahlén K-G, Brydsten A. Does unemployment contribute to poorer health-related quality of life among Swedish adults? BMC Public Health. 2019;19(1):457. pmid:31035994
- 60. Picchio M, Ubaldi M. Unemployment and health: A meta-analysis. 15433. Institute of Labor Economics (IZA); 2022.
- 61. Jin RL, Shah CP, Svoboda TJ. The impact of unemployment on health: a review of the evidence. CMAJ. 1995;153(5):529–40. pmid:7641151
- 62. Shahidi FV, Muntaner C, Shankardass K, Quiñonez C, Siddiqi A. The effect of unemployment benefits on health: A propensity score analysis. Soc Sci Med. 2019;226:198–206. pmid:30861433
- 63. Cylus J, Avendano M. Receiving unemployment benefits may have positive effects on the health of the unemployed. Health Aff (Millwood). 2017;36(2):289–96.
- 64. Cui X, Chang CT. How income influences health: decomposition based on absolute income and relative income effects. Int J Environ Res Public Health. 2021;18(20):10738.
- 65. Zheng BQ, Bentler PM. Enhancing model fit evaluation in SEM: Practical tips for optimizing chi-square tests. Structural Equation Modeling: A Multidisciplinary Journal. 2024.
- 66. Allison PD. Missing data techniques for structural equation modeling. J Abnorm Psychol. 2003;112(4):545–57. pmid:14674868
- 67. Robitzsch A. Modeling model misspecification in structural equation models. Stats. 2023;6(2):689–705.
- 68. Zucchini W. An introduction to model selection. J Math Psychol. 2000;44(1):41–61. pmid:10733857
- 69. Wah JNK. Unraveling structural equation modeling: Key assumptions, model fit, and trends. CSB. 2025;70(2):4455–71.