Recent quantitative research on determinants of health in high income countries: A scoping review

Background: Identifying determinants of health and understanding their role in health production constitutes an important research theme. We aimed to document the state of recent multi-country research on this theme in the literature.
Methods: We followed the PRISMA-ScR guidelines to systematically identify, triage and review literature (January 2013 to July 2019). We searched for studies that performed cross-national statistical analyses aiming to evaluate the impact of one or more aggregate level determinants on one or more general population health outcomes in high-income countries. To assess in which combinations and to what extent individual (or thematically linked) determinants had been studied together, we performed multidimensional scaling and cluster analysis.
Results: Sixty studies were selected, out of an original yield of 3686. Life expectancy and overall mortality were the most widely used population health indicators, while determinants came from the areas of healthcare, culture, politics, socio-economics, environment, labor, fertility, demographics, life-style, and psychology. The family of regression models was the predominant statistical approach. Results from our multidimensional scaling showed that a relatively tight core of determinants has received much attention, as main covariates of interest or controls, whereas the majority of other determinants were studied in very limited contexts. We consider findings from these studies regarding the importance of any given health determinant inconclusive at present. Across a multitude of model specifications, different country samples, and varying time periods, effects fluctuated between statistically significant and not significant, and between beneficial and detrimental to health.
Conclusions: We conclude that the underlying mechanisms of population health are far from understood, and the present state of research on the topic leaves much to be desired. It is essential that future research considers multiple factors simultaneously and takes advantage of more sophisticated methodology with regard to quantifying health as well as analyzing determinants’ influence.


Introduction
Identifying the key drivers of population health is a core subject in public health and health economics research. Between-country comparative research on the topic is challenging. To be relevant for policy, it requires disentangling different interrelated drivers of "good health", each having different degrees of importance in different contexts.
"Good health" (physical and psychological, subjective and objective) can be defined and measured using a variety of approaches, depending on which aspect of health is the focus. A major distinction can be made between health measurements at the individual level or some aggregate level, such as a neighborhood, a region or a country. In view of this, a great diversity of specific research topics exists on the drivers of what constitutes individual or aggregate "good health", including those focusing on health inequalities, the gender gap in longevity, and regional mortality and longevity differences.
The current scoping review focuses on determinants of population health. Stated as such, this topic is quite broad. Indeed, we are interested in the very general question of what methods have been used to make the most of increasingly available region- or country-specific databases to understand the drivers of population health through inter-country comparisons. Existing reviews indicate that researchers thus far tend to adopt a narrower focus. Usually, attention is given to only one health outcome at a time, with further geographical and/or population [1,2] restrictions. In some cases, the impact of one or more interventions is at the core of the review [3-7], while in others it is the relationship between health and just one particular predictor, e.g., income inequality, access to healthcare, government mechanisms [8-13]. Some relatively recent reviews on the subject of social determinants of health [4-6,14-17] have considered a number of indicators potentially influencing health as opposed to a single one. One review defines "social determinants" as "the social, economic, and political conditions that influence the health of individuals and populations" [17] while another refers even more broadly to "the factors apart from medical care" [15].
In the present work, we aimed to be more inclusive, setting no limitations on the nature of possible health correlates, as well as making use of a multitude of commonly accepted measures of general population health. The goal of this scoping review was to document the state of the art in the recent published literature on determinants of population health, with a particular focus on the types of determinants selected and the methodology used. In doing so, we also report the main characteristics of the results these studies found. The materials collected in this review are intended to inform our (and potentially other researchers') future analyses on this topic. Since the production of health is subject to the law of diminishing marginal returns, we focused our review on those studies that included countries where a high standard of wealth has been achieved for some time, i.e., high-income countries belonging to the Organisation for Economic Co-operation and Development (OECD) or Europe. Adding similar reviews for other country income groups is of limited interest to the research we plan to do in this area.

Methods
In view of its focus on data and methods, rather than results, a formal protocol was not registered prior to undertaking this review, but the procedure followed the guidelines of the PRISMA statement for scoping reviews [18].

Search
We focused on multi-country studies investigating the potential associations between any aggregate level (region/city/country) determinant and general measures of population health (e.g., life expectancy, mortality rate).
Within the query itself, we listed well-established population health indicators as well as the six world regions, as defined by the World Health Organization (WHO). We searched only in the publications' titles in order to keep the number of hits manageable, and to keep the ratio of broadly relevant abstracts to all abstracts on the order of 10% (based on a series of time-focused trial runs). The search strategy was developed iteratively between the two authors and is presented in S1 Appendix. The search was performed by VV in PubMed and Web of Science on the 16th of July, 2019, without any language restrictions, and with a start date set to the 1st of January, 2013, as we were interested in the latest developments in this area of research.

Eligibility criteria
Records obtained via the search methods described above were screened independently by the two authors. Consistency between inclusion/exclusion decisions was approximately 90% and the 43 instances where uncertainty existed were judged through discussion. Articles were included subject to meeting the following requirements: (a) the paper was a full published report of an original empirical study investigating the impact of at least one aggregate level (city/region/country) factor on at least one health indicator (or self-reported health) of the general population (the only admissible "sub-populations" were those based on gender and/or age); (b) the study employed statistical techniques (calculating correlations, at the very least) and was not purely descriptive or theoretical in nature; (c) the analysis involved at least two countries or at least two regions or cities (or another aggregate level) in at least two different countries; (d) the health outcome was not differentiated according to some socio-economic factor and thus studied in terms of inequality (with the exception of gender and age differentiations); (e) mortality, in case it was one of the health indicators under investigation, was strictly "total" or "all-cause" (no cause-specific or determinant-attributable mortality).
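The dual-screening step above reports an agreement share and a set of records settled by discussion. As a minimal sketch of how such figures can be derived (not the authors' procedure; the function name `compare_screening` and the ten example decisions are hypothetical), one can compare the two reviewers' include/exclude decisions record by record:

```python
def compare_screening(decisions_a, decisions_b):
    """Return (share of identical decisions, ids of records needing discussion).

    Both arguments map a record id to True (include) or False (exclude).
    """
    if decisions_a.keys() != decisions_b.keys():
        raise ValueError("Both reviewers must screen the same records")
    if not decisions_a:
        raise ValueError("No records to compare")
    disagreements = sorted(
        record_id
        for record_id in decisions_a
        if decisions_a[record_id] != decisions_b[record_id]
    )
    agreement = 1 - len(disagreements) / len(decisions_a)
    return agreement, disagreements


# Hypothetical decisions for ten records (True = include, False = exclude).
reviewer_1 = {f"rec{i:02d}": i in (1, 2) for i in range(1, 11)}
reviewer_2 = {f"rec{i:02d}": i in (1,) for i in range(1, 11)}

agreement, to_discuss = compare_screening(reviewer_1, reviewer_2)
print(f"agreement: {agreement:.0%}, to discuss: {to_discuss}")
```

Simple percent agreement does not correct for agreement expected by chance, so it is best read as a descriptive figure only.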

Data extraction
The following pieces of information were extracted in an Excel table from the full text of each eligible study (primarily by VV, consulting with PB in case of doubt): health outcome(s), determinants, statistical methodology, level of analysis, results, type of data, data sources, time period, countries. The evidence is synthesized according to these extracted data (often directly reflected in the section headings), using a narrative form accompanied by a "summary-of-findings" table and a graph.

Search and selection
The initial yield contained 4583 records, reduced to 3686 after removal of duplicates (Fig 1). Based on title and abstract screening, 3271 records were excluded because they focused on specific medical condition(s) or specific populations (based on morbidity or some other factor), dealt with intervention effectiveness, with theoretical or non-health related issues, or with animals or plants. Of the remaining 415 papers, roughly half were disqualified upon full-text consideration, mostly due to using an outcome not of interest to us (e.g., health inequality), measuring and analyzing determinants and outcomes exclusively at the individual level, performing analyses one country at a time, employing indices that are a mixture of both health indicators and health determinants, or not utilizing potential health determinants at all. After this second stage of the screening process, 202 papers were deemed eligible for inclusion. This group was further dichotomized according to level of economic development of the countries or regions under study, using membership of the OECD or Europe as a reference "cut-off" point. Sixty papers were judged to include high-income countries, and the remaining 142 included either low- or middle-income countries or a mix of both these levels of development. The rest of this report outlines findings in relation to high-income countries only, reflecting our own primary research interests. Nonetheless, we chose to report our search yield for the other income groups for two reasons. First, to gauge the relative interest in applied published research for these different income levels; and second, to enable other researchers with a focus on determinants of health in other countries to use the extraction we made here.

PLOS ONE
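The screening counts reported in this section can be checked with simple arithmetic. The sketch below uses only the figures stated in the text; the dictionary keys and the function name `summarize_flow` are illustrative labels, not terms from the review.

```python
# Counts as reported in the "Search and selection" paragraph.
flow = {
    "records_identified": 4583,
    "after_deduplication": 3686,
    "excluded_on_title_abstract": 3271,
    "eligible_after_full_text": 202,
    "high_income": 60,
    "low_middle_or_mixed": 142,
}


def summarize_flow(flow):
    """Derive the intermediate counts implied by the reported numbers."""
    duplicates = flow["records_identified"] - flow["after_deduplication"]
    full_text = flow["after_deduplication"] - flow["excluded_on_title_abstract"]
    excluded_full_text = full_text - flow["eligible_after_full_text"]
    if flow["high_income"] + flow["low_middle_or_mixed"] != flow["eligible_after_full_text"]:
        raise ValueError("Income-group split does not add up to the eligible total")
    return {
        "duplicates_removed": duplicates,
        "full_text_assessed": full_text,
        "excluded_on_full_text": excluded_full_text,
        "share_excluded_on_full_text": excluded_full_text / full_text,
    }


summary = summarize_flow(flow)
print(summary)
```

The derived values (897 duplicates, 415 full texts, 213 full-text exclusions, a share of about 51%) agree with the "roughly half" stated in the text.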

Health outcomes
The most frequent population health indicator, life expectancy (LE), was present in 24 of the 60 studies. Apart from "life expectancy at birth" (representing the average life-span a newborn is expected to have if current mortality rates remain constant), also called "period LE" by some [19,20], we also encountered LE at 40 years of age [21], at 60 [22], and at 65 [21,23,24]. In two papers, the age-specificity of life expectancy (be it at birth or another age) was not stated [25,26].
While the majority of studies under review here focused on a single health indicator, 23 out of the 60 studies made use of multiple outcomes, although these outcomes were always considered one at a time, and sometimes not all of them fell within the scope of our review. An easily discernible group of indices that typically went together [25,37,41] was that of neonatal (deaths occurring within 28 days postpartum), perinatal (fetal or early neonatal / first-7-days deaths), and post-neonatal (deaths between the 29th day and completion of one year of life) mortality. More often than not, these indices were also accompanied by "stand-alone" indicators, such as infant mortality (deaths within the first year of life; the third most common index, found in 16 of the 60 studies), maternal mortality (deaths during pregnancy or within 42 days of termination of pregnancy), and child mortality rates. Child mortality has conventionally been defined as mortality within the first 5 years of life, thus often also called "under-5 mortality". Nonetheless, Pritchard & Wallace used the term "child mortality" to denote deaths of children younger than 14 years [42].
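The age windows behind these mortality indices overlap, which a short sketch can make explicit. The function below is illustrative only: the name `mortality_categories` is hypothetical, a year is simplified to 365 days, and perinatal mortality is left out because it also counts fetal deaths, which have no age at death.

```python
DAYS_PER_YEAR = 365  # simplifying assumption: leap years are ignored


def mortality_categories(age_at_death_days):
    """Return the live-birth mortality indices that a death at this age counts towards.

    age_at_death_days is the number of completed days of life (0 = day of birth).
    """
    if age_at_death_days < 0:
        raise ValueError("Age at death cannot be negative")
    categories = []
    if age_at_death_days < 7:
        categories.append("early neonatal")  # first 7 days of life
    if age_at_death_days < 28:
        categories.append("neonatal")  # within 28 days
    elif age_at_death_days < DAYS_PER_YEAR:
        categories.append("post-neonatal")  # 29th day up to one year
    if age_at_death_days < DAYS_PER_YEAR:
        categories.append("infant")  # within the first year
    if age_at_death_days < 5 * DAYS_PER_YEAR:
        categories.append("under-5")  # conventional child mortality
    return categories


for age in (3, 28, 400):
    print(age, mortality_categories(age))
```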
As previously stated, inclusion criteria did allow for self-reported health status to be used as a general measure of population health. Within our final selection of studies, seven utilized some form of subjective health as an outcome variable [25,43-48]. Additionally, the Health Human Development Index [49], healthy life expectancy [50], old-age survival [51], potential years of life lost [52], and disability-adjusted life expectancy [25] were also used.
We note that while in most cases the indicators mentioned above (and/or the covariates considered, see below) were taken in their absolute or logarithmic form, as a (typically annual) number, sometimes they were used in the form of differences, change rates, averages over a given time period, or even z-scores of rankings [19,22,40,42,44,53-57].

Regions, countries, and populations
Despite our decision to confine this review to high-income countries, some variation in the countries and regions studied was still present. Selection seemed to be most often conditioned on the European Union, or the European continent more generally, and the Organisation for Economic Co-operation and Development (OECD), though, typically, not all member nations (based on the instances where these were also explicitly listed) were included in a given study. Some of the stated reasons for omitting certain nations included data unavailability [30,45,54] or inconsistency [20,58], Gross Domestic Product (GDP) too low [40], differences in economic development and political stability with the rest of the sampled countries [59], and national population too small [24,40]. On the other hand, the rationales for selecting a group of countries included having similar above-average infant mortality [60], similar healthcare systems [23], and being randomly drawn from a social spending category [61]. Some researchers were interested explicitly in a specific geographical region, such as Eastern Europe [50], Central and Eastern Europe [48,60], the Visegrad (V4) group [62], or the Asia/Pacific area [32]. In certain instances, national regions or cities, rather than countries, constituted the units of investigation [31,51,56,62-66]. In two particular cases, a mix of countries and cities was used [35,57]. In another two [28,29], due to the long time periods under study, some of the included countries no longer exist. Finally, besides "European" and "OECD", the terms "developed", "Western", and "industrialized" were also used to describe the group of selected nations [30,42,52,53,67].
As stated above, it was the health status of the general population that we were interested in, and during screening we made a concerted effort to exclude research using data based on a more narrowly defined group of individuals. All studies included in this review adhere to this general rule, albeit with two caveats. First, as cities (even neighborhoods) were the unit of analysis in three of the studies that made the selection [56,64,65], the populations under investigation there can be more accurately described as general urban, instead of just general. Second, oftentimes health indicators were stratified based on gender and/or age, therefore we also admitted one study that, due to its specific research question, focused on men and women of early retirement age [35] and another that considered adult males only [68].
In most of the studies, the level of the data (and analysis) was national. The exceptions were six papers that dealt with Nomenclature of Territorial Units for Statistics (NUTS2) regions [31,62,63,66], otherwise defined areas [51] or cities [56], and seven others that were multilevel designs and utilized both country- and region-level data [57], individual- and city- or country-level [35], individual- and country-level [44,45,48], individual- and neighborhood-level [64], and city-region- (NUTS3) and country-level data [65]. Parallel to that, the data type was predominantly longitudinal, with only a few studies using purely cross-sectional data [25,33,43,45-48,50,62,67,68,71,72], albeit in four of those [43,48,68,72] two separate points in time were taken (thus resulting in a kind of "double cross-section"), while in another the averages across survey waves were used [56].
In studies using longitudinal data, the length of the covered time periods varied greatly. Although this was almost always less than 40 years, in one study it covered the entire 20th century [29]. Longitudinal data, typically in the form of annual records, was sometimes transformed before usage. For example, some researchers considered data points at 5- [34,36,49] or 10-year [27,29,35] intervals instead of the traditional 1, or took averages over 3-year periods [42,53,73]. In one study concerned with the effect of the Great Recession all data were in a "recession minus expansion change in trends"-form [57]. Furthermore, there were a few instances where two different time periods were compared to each other [42,53] or when data was divided into 2 to 4 (possibly overlapping) periods which were then analyzed separately [24,26,28,29,31,65]. Lastly, owing to data availability issues, discrepancies between the time points or periods of data on the different variables were occasionally observed [22,35,42,53-55,63].
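The transformations of annual records mentioned above (sampling at multi-year intervals, averaging over fixed periods, and differencing) can be sketched as follows. The function names and the life-expectancy values are hypothetical and serve only to show the mechanics; the reviewed studies may have defined their periods differently.

```python
def every_nth_year(series, step):
    """Keep one observation every `step` years, starting from the first year."""
    years = sorted(series)
    return {year: series[year] for year in years[::step]}


def period_means(series, width):
    """Average over consecutive, non-overlapping blocks of `width` years."""
    years = sorted(series)
    means = {}
    for start in range(0, len(years) - width + 1, width):
        block = years[start:start + width]
        means[(block[0], block[-1])] = sum(series[y] for y in block) / width
    return means


def differences(series, lag):
    """Change in the indicator relative to its value `lag` years earlier."""
    return {
        year: series[year] - series[year - lag]
        for year in sorted(series)
        if (year - lag) in series
    }


# Hypothetical annual life expectancy for one country.
life_expectancy = {2000: 77.0, 2001: 77.3, 2002: 77.6, 2003: 78.0, 2004: 78.1, 2005: 78.5}

five_yearly = every_nth_year(life_expectancy, 5)
three_year_means = period_means(life_expectancy, 3)
five_year_change = differences(life_expectancy, 5)
print(five_yearly, three_year_means, five_year_change)
```

Each transformation reduces the number of observations per country, which is worth keeping in mind when comparing sample sizes across studies.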

Health determinants
Together with other essential details, Table 1 lists the health correlates considered in the selected studies. Several general categories for these correlates can be discerned, including health care, political stability, socio-economics, demographics, psychology, environment, fertility, life-style, culture, and labor. All of these, directly or implicitly, have been recognized as holding importance for population health by existing theoretical models of (social) determinants of health [74-77].
It is worth noting that in a few studies there was just a single aggregate-level covariate investigated in relation to a health outcome of interest to us. In one instance, this was life satisfaction [44], in another, welfare system typology [45], but also gender inequality [33], austerity level [70,78], and deprivation [51]. Most often, though, attention went exclusively to GDP [27,29,46,57,65,71]. It was often the case that research had a more particular focus. Among others, minimum wages [79], hospital payment schemes [23], cigarette prices [63], social expenditure [20], residents' dissatisfaction [56], income inequality [30,69], and work leave [41,58] took center stage. Whenever variables outside of these specific areas were also included, they were usually identified as confounders or controls, moderators or mediators.
We visualized the combinations in which the different determinants have been studied in Fig 2, which was obtained via multidimensional scaling and a subsequent cluster analysis (details outlined in S2 Appendix). It depicts the spatial positioning of each determinant relative to all others, based on the number of times the effects of each pair of determinants have been studied simultaneously. When interpreting Fig 2, one should keep in mind that determinants marked with an asterisk represent, in fact, collectives of variables.
Distances between determinants in Fig 2 are indicative of determinants' "connectedness" with each other. While the statistical procedure called for higher dimensionality of the model, for demonstration purposes we show here a two-dimensional solution. This simplification unfortunately comes with a caveat. To use the factor smoking as an example, it would appear that it stands at a much greater distance from GDP than it does from alcohol. In reality, however, smoking was considered together with alcohol consumption [21,25,26,52,68] in just as many studies (five) as it was with GDP [21,25,26,52,59]. To compensate for this apparent shortcoming, we have emphasized the strongest pairwise links. Solid lines connect GDP with health expenditure (HE), unemployment rate (UR), and education (EDU), indicating that the effect of GDP on health, taking into account the effects of the other three determinants as well, was evaluated in 12 to 16 of the 60 studies included in this review. Tracing the dashed lines, we can also tell that GDP appeared jointly with income inequality, and HE together with either EDU or UR, in 8 to 10 of our selected studies. Finally, some weaker but still noteworthy "connections" between variables are displayed as well via the dotted lines.

The fact that all notable pairwise "connections" are concentrated within a relatively small region of the plot may be interpreted as low overall "connectedness" among the health indicators studied. GDP is the most widely investigated determinant in relation to general population health. Its total number of "connections" is disproportionately high (159) compared to its runner-up, HE (with 113 "connections"), and then subsequently EDU (with 90) and UR (with 86). In fact, all of these determinants could be thought of as outliers, given that none of the remaining factors have a total count of pairings above 52. This decrease in individual determinants' overall "connectedness" can be tracked on the graph via the change of color intensity as we move outwards from the symbolic center of GDP and its closest "co-determinants", to finally reach the other extreme of the ten indicators (welfare regime, household consumption, compulsory school reform, life satisfaction, government revenues, literacy, research expenditure, multiple pregnancy, Cyclically Adjusted Primary Balance, and residents' dissatisfaction; in white) whose effects on health were only studied in isolation.
Lastly, we point to the few small but stable clusters of covariates encircled by the grey bubbles on Fig 2. These groups of determinants were identified as "close" by both statistical procedures used for the production of the graph (see details in S2 Appendix).
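The pairwise counts that underlie Fig 2 can be illustrated with a small sketch. It assumes that a "connection" is one study in which two determinants were analyzed together, and that a determinant's total is the sum of its pair counts; this is a reading of the text, not the procedure documented in S2 Appendix. The study labels and determinant lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_counts(studies):
    """Count, for every pair of determinants, the studies that include both."""
    counts = Counter()
    for determinants in studies.values():
        for pair in combinations(sorted(set(determinants)), 2):
            counts[pair] += 1
    return counts


def total_connections(studies):
    """Sum of pair counts per determinant; 0 means it was only studied alone."""
    totals = Counter({d: 0 for dets in studies.values() for d in dets})
    for (first, second), n in pair_counts(studies).items():
        totals[first] += n
        totals[second] += n
    return totals


# Hypothetical studies and the determinants each one analyzed.
studies = {
    "study_A": ["GDP", "HE", "UR"],
    "study_B": ["GDP", "HE", "EDU"],
    "study_C": ["GDP", "smoking", "alcohol"],
    "study_D": ["life satisfaction"],
}

print(pair_counts(studies)[("GDP", "HE")])
print(total_connections(studies))
```

A count matrix of this kind would still need to be converted into dissimilarities before multidimensional scaling; that conversion is not specified in the main text and is therefore not sketched here.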

Statistical methodology
There was great variation in the level of statistical detail reported. Some authors provided too vague a description of their analytical approach, necessitating some inference in this section.
The issue of missing data is a challenging reality in this field of research, but few of the studies under review (12/60) explain how they dealt with it. Among the ones that do, three general approaches to handling missingness can be identified, listed in increasing level of

Purely cross-sectional data analyses were performed in eight studies [25,45,47,50,55,56,67,71]. These consisted of linear regression (assumed ordinary least squares (OLS)), generalized least squares (GLS) regression, and multilevel analyses. However, six other studies that used longitudinal data in fact had a cross-sectional design, through which they applied regression at multiple time-points separately [27,29,36,48,68,72].
Apart from these "multi-point cross-sectional studies", some other simplistic approaches to longitudinal data analysis were found, involving calculating and regressing 3-year averages of both the response and the predictor variables [54], taking the average of a few data-points (i.e., survey waves) [56] or using difference scores over 10-year [19,29] or unspecified time intervals [40,55].
Moving further in the direction of more sensible longitudinal data usage, we turn to the methods widely known among (health) economists as "panel data analysis" or "panel regression". Most often seen were models with fixed effects for country/region and sometimes also time-point (occasionally including a country-specific trend as well), with robust standard errors for the parameter estimates to take into account correlations among clustered observations [20,21,24,28,30,32,34,37,38,41,52,59,60,63,66,69,73,79,81,82]. The Hausman test [83] was sometimes mentioned as the tool used to decide between fixed and random effects [26,49,63,66,73,82]. A few studies considered the latter more appropriate for their particular analyses, with some further specifying that (feasible) GLS estimation was employed [26,34,49,58,60,73]. Apart from these two types of models, the first differences method was encountered once as well [31]. Across all, the error terms were sometimes assumed to come from a first-order autoregressive process (AR(1)), i.e., they were allowed to be serially correlated [20,30,38,58-60,73], and lags of (typically) predictor variables were included in the model specification, too [20,21,37,38,48,69,81]. Lastly, a somewhat different approach to longitudinal data analysis was undertaken in four studies [22,35,48,65] in which multilevel (linear or Poisson) models were developed.
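To make the idea of country fixed effects concrete, the toy sketch below computes the "within" estimator for a single predictor: each variable is demeaned by country before the slope is estimated, so that stable between-country differences drop out. It is a bare-bones illustration with made-up numbers and hypothetical function names, without time effects, robust standard errors, or AR(1) errors, and does not reproduce any reviewed study's specification.

```python
from collections import defaultdict


def within_slope(rows):
    """Fixed-effects (within) slope for rows of (unit, x, y)."""
    by_unit = defaultdict(list)
    for unit, x, y in rows:
        by_unit[unit].append((x, y))
    sxy = sxx = 0.0
    for observations in by_unit.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("No within-unit variation in the predictor")
    return sxy / sxx


def pooled_slope(rows):
    """Ordinary least squares slope that ignores the unit structure."""
    # Treating all rows as one unit reduces the within estimator to pooled OLS.
    return within_slope([("all", x, y) for _, x, y in rows])


# Made-up panel: within each country the outcome rises by 0.5 per unit of x,
# but the country with the lower x values has the higher baseline.
panel = [
    ("A", 1, 80.5), ("A", 2, 81.0), ("A", 3, 81.5),
    ("B", 4, 72.0), ("B", 5, 72.5), ("B", 6, 73.0),
]

print(within_slope(panel), pooled_slope(panel))
```

In this constructed example the pooled slope is negative while the within slope recovers 0.5, which shows how unobserved country characteristics can reverse the sign of a naive estimate.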

Findings
As the methods and not the findings are the main focus of the current review, and because generic checklists cannot discern the underlying quality in this application field (see also below), we opted to pool all reported findings together, regardless of individual study characteristics or particular outcome(s) used, and speak generally of positive and negative effects on health. For this summary we have adopted the 0.05-significance level and only considered results from multivariate analyses. Strictly birth-related factors are omitted since these potentially only relate to the group of infant mortality indicators and not to any of the other general population health measures.
It is important to point out that the above-mentioned effects could not be considered stable either across or within studies. Very often, statistical significance of a given covariate fluctuated between the different model specifications tried out within the same study [20,49,59,66,68,69,73,80,82], testifying to the importance of control variables and multivariate research (i.e., analyzing multiple independent variables simultaneously) in general. Furthermore, conflicting results were observed even with regards to the "core" determinants given special attention, so to speak, throughout this text. Thus, some studies reported negative effects of health expenditure [32,82], social expenditure [58], GDP [49,66], and education [82], and positive effects of income inequality [82] and unemployment [24,31,32,52,66,68]. Interestingly, one study [34] differentiated between temporary and long-term effects of GDP and unemployment, alluding to possibly much greater complexity of the association with health. It is also worth noting that some gender differences were found, with determinants being more influential for males than for females, or only having statistically significant effects for male health [19,21,28,34,36,37,39,64,65,69].
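The pooling rule described in this section (a 0.05 significance level, with effects expressed as positive or negative for health) can be sketched as a simple tally. The estimates below are invented for illustration, and the names `classify_effect` and `tally` are hypothetical; the `higher_is_better` flag is an assumption needed because a positive coefficient is good news for life expectancy but bad news for mortality.

```python
from collections import Counter


def classify_effect(coefficient, p_value, higher_is_better, alpha=0.05):
    """Label one estimate as beneficial, detrimental, or not significant."""
    if p_value >= alpha or coefficient == 0:
        return "not significant"
    raises_outcome = coefficient > 0
    return "beneficial" if raises_outcome == higher_is_better else "detrimental"


def tally(estimates, alpha=0.05):
    """Count labels per determinant across all pooled estimates."""
    summary = {}
    for determinant, coefficient, p_value, higher_is_better in estimates:
        label = classify_effect(coefficient, p_value, higher_is_better, alpha)
        summary.setdefault(determinant, Counter())[label] += 1
    return summary


# Invented estimates: (determinant, coefficient, p-value, higher outcome is better?)
estimates = [
    ("GDP", 0.8, 0.01, True),             # raises life expectancy
    ("GDP", 0.3, 0.20, True),             # not significant
    ("GDP", 0.4, 0.03, False),            # raises mortality
    ("unemployment", -0.5, 0.04, False),  # lowers mortality
]

print(tally(estimates))
```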

Discussion
The purpose of this scoping review was to examine recent quantitative work on the topic of multi-country analyses of determinants of population health in high-income countries.
Measuring population health via relatively simple mortality-based indicators still seems to be the state of the art. What is more, these indicators are routinely considered one at a time, instead of, for example, employing existing statistical procedures to devise a more general, composite, index of population health, or using some of the established indices, such as disability-adjusted life expectancy (DALE) or quality-adjusted life expectancy (QALE). Although strong arguments for their wider use were already voiced decades ago [84], such summary measures surface only rarely in this research field.
On a related note, the greater data availability and accessibility that we enjoy today does not automatically equate to data quality. Nonetheless, this is routinely assumed in aggregate level studies. We almost never encountered a discussion on the topic. The non-trivial issue of data missingness, too, goes largely underappreciated. With all recent methodological advancements in this area [85-88], there is no excuse for ignorance; and still, too few of the reviewed studies tackled the matter in any adequate fashion.
Much optimism can be gained considering the abundance of different determinants that have attracted researchers' attention in relation to population health. We took a visual approach with regard to these determinants and presented a graph that links spatial distances between determinants with frequencies of being studied together. To facilitate interpretation, we grouped some variables, which resulted in some loss of finer detail. Nevertheless, the graph is helpful in exemplifying how many effects continue to be studied in a very limited context, if any. Since in reality no factor acts in isolation, this oversimplification practice threatens to render the whole exercise meaningless from the outset. The importance of multivariate analysis cannot be stressed enough. While there is no "best method" to be recommended and appropriate techniques vary according to the specifics of the research question and the characteristics of the data at hand [89-93], in the future, in addition to abandoning simplistic univariate approaches, we hope to see a shift from the currently dominating fixed effects to the more flexible random/mixed effects models [94], as well as wider application of more sophisticated methods, such as principal component regression, partial least squares, covariance structure models (e.g., structural equations), canonical correlations, time-series, and generalized estimating equations.
Finally, there are some limitations of the current scoping review. We searched the two main databases for published research in medical and non-medical sciences (PubMed and Web of Science) since 2013, thus potentially excluding publications and reports that are not indexed in these databases, as well as older indexed publications. These choices were guided by our interest in the most recent (i.e., the current state-of-the-art) and arguably the highest-quality research (i.e., peer-reviewed articles, primarily in indexed non-predatory journals). Furthermore, despite holding a critical stance with regard to some aspects of how determinants-of-health research is currently conducted, we opted out of formally assessing the quality of the individual studies included. The reason for that is two-fold. On the one hand, we are unaware of the existence of a formal and standard tool for quality assessment of ecological designs. And on the other, we consider trying to score the quality of these diverse studies (in terms of regional setting, specific topic, outcome indices, and methodology) undesirable and misleading, particularly since we would sometimes have been rating the quality of only a (small) part of the original studies, namely the part that was relevant to our review's goal.
Our aim was to investigate the current state of research on the very broad and general topic of population health, specifically, the way it has been examined in a multi-country context. We learned that data treatment and analytical approach were, in the majority of these recent studies, ill-equipped or insufficiently transparent to provide clarity regarding the underlying mechanisms of population health in high-income countries. Whether due to methodological shortcomings or the inherent complexity of the topic, research so far fails to provide any definitive answers. It is our sincere belief that with the application of more advanced analytical techniques this continuous quest could come to fruition sooner.

Supporting information
S1 Checklist. Preferred reporting items for systematic reviews and meta-analyses extension for scoping reviews (PRISMA-ScR) checklist.