There is ongoing debate about whether education or socioeconomic status (SES) should be inputs into cardiovascular disease (CVD) prediction algorithms and clinical risk adjustment models. It is also unclear whether intervening on education will affect CVD, in part because there is controversy regarding whether education is a determinant of CVD or merely correlated due to confounding or reverse causation. We took advantage of a natural experiment to estimate the population-level effects of educational attainment on CVD and related risk factors.
Methods and findings
We took advantage of variation in United States state-level compulsory schooling laws (CSLs), a natural experiment that was associated with geographic and temporal differences in the minimum number of years that children were required to attend school. We linked census data on educational attainment (N = approximately 5.4 million) during childhood with outcomes in adulthood, using cohort data from the 1992–2012 waves of the Health and Retirement Study (HRS; N = 30,853) and serial cross-sectional data from 1971–2012 waves of the National Health and Nutrition Examination Survey (NHANES; N = 44,732). We examined self-reported CVD outcomes and related risk factors, as well as relevant serum biomarkers. Using instrumental variables (IV) analysis, we found that increased educational attainment was associated with reduced smoking (HRS β −0.036, 95%CI: −0.06, −0.02, p < 0.01; NHANES β −0.032, 95%CI: −0.05, −0.02, p < 0.01), depression (HRS β −0.049, 95%CI: −0.07, −0.03, p < 0.01), triglycerides (NHANES β −0.039, 95%CI: −0.06, −0.01, p < 0.01), and heart disease (HRS β −0.025, 95%CI: −0.04, −0.002, p = 0.01), and improvements in high-density lipoprotein (HDL) cholesterol (HRS β 1.50, 95%CI: 0.34, 2.49, p < 0.01; NHANES β 0.86, 95%CI: 0.32, 1.48, p < 0.01), but increased BMI (HRS β 0.20, 95%CI: 0.002, 0.40, p = 0.05; NHANES β 0.13, 95%CI: 0.01, 0.32, p = 0.05) and total cholesterol (HRS β 2.73, 95%CI: 0.09, 4.97, p = 0.03). While most findings were cross-validated across both data sets, they were not robust to the inclusion of state fixed effects. Limitations included residual confounding, use of self-reported outcomes for some analyses, and possibly limited generalizability to more recent cohorts.
This study provides rigorous population-level estimates of the association of educational attainment with CVD. These findings may guide future implementation of interventions to address the social determinants of CVD and strengthen the argument for including educational attainment in prediction algorithms and primary prevention guidelines for CVD.
Why was this study done?
- Heart disease is a leading cause of mortality in the US, and clinicians are increasingly interested in addressing its social and economic determinants.
- Education is highly correlated with heart disease, but this may be because education and heart disease have common causes like parental socioeconomic position and genetic factors.
- Even if there is an effect, the mechanisms linking education and heart disease are unclear.
What did the researchers do and find?
- This study leveraged a natural experiment—variation in US education policies—to examine the effects of education on heart disease and its risk factors.
- Increased education was consistently associated with improvements in several cardiovascular risk factors: smoking, high-density lipoprotein, and depression.
- Increased education was also associated with higher BMI and total cholesterol.
Citation: Hamad R, Nguyen TT, Bhattacharya J, Glymour MM, Rehkopf DH (2019) Educational attainment and cardiovascular disease in the United States: A quasi-experimental instrumental variables analysis. PLoS Med 16(6): e1002834. https://doi.org/10.1371/journal.pmed.1002834
Academic Editor: Kazem Rahimi, University of Oxford, UNITED KINGDOM
Received: February 25, 2019; Accepted: May 21, 2019; Published: June 25, 2019
Copyright: © 2019 Hamad et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data cannot be shared publicly because they include geographic identifiers on study subjects, and access is therefore restricted to those who meet criteria. Access to restricted NHANES data can be obtained by applying at https://www.cdc.gov/nchs/nhanes/index.htm, and access to restricted HRS data can be obtained by applying at http://hrsonline.isr.umich.edu.
Funding: Funding for this study was provided by the National Institutes of Health (UL1 TR001085 via a pilot grant from the Stanford Clinical and Translational Science Award to Spectrum, K08 HL132106 to RH, K01 AG047280 to DHR, and RF1 AG056164 to MMG) (https://www.nih.gov). This work was also supported by a grant from the American Educational Research Association, which receives funds for its AERA Grants Program from the National Science Foundation under NSF Grant #DRL-0941014 (https://www.aera.net). The HRS is sponsored by the National Institute on Aging (U01 AG009740) and is conducted by the University of Michigan. Publication was made possible in part by support from the UCSF Open Access Publishing Fund. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: CRP, C-reactive protein; CSL, compulsory schooling law; CVD, cardiovascular disease; HDL, high-density lipoprotein; HRS, Health and Retirement Study; IV, instrumental variable; LDL, low-density lipoprotein; NHANES, National Health and Nutrition Examination Survey; OLS, ordinary least squares; SES, socioeconomic status
Prior work has suggested that clinicians should incorporate patients’ educational attainment into clinical decision-making, and that patients’ educational attainment could improve the accuracy of clinical predictive models such as the Framingham risk score. Indeed, cardiovascular mortality is underestimated in individuals of low socioeconomic status (SES) using the Framingham score, reflecting its focus on biomedical rather than social risk factors. The 2019 guidelines from the American College of Cardiology and American Heart Association and the US Department of Health and Human Services have suggested using patients’ social factors in clinical prediction tools and in risk-adjusting physician panels when determining physician payments for performance [3,4]. Yet while numerous studies have linked low educational attainment to risk of cardiovascular disease (CVD), few provide population-level estimates, and many existing studies cannot rule out confounding by unmeasured factors such as genetic endowment or parental SES. These obstacles pose challenges for rigorously estimating the impact of education on CVD, hindering the ability to implement appropriate interventions. A recent review concluded that there is substantial disagreement in the education-health literature due to confounding, warranting additional research on this topic.
There are numerous hypothesized pathways linking education with CVD (Fig 1). Increased educational duration, quality, and credentials are thought to increase employment; augment psychosocial resources such as literacy, social capital, and decision-making [8–11]; and improve health behaviors like smoking [12,13]. Psychosocial resources and employment, in turn, may increase income and decrease stress. Each of these may then lead to reduced CVD.
Fig 1 abbreviations: CVD, cardiovascular disease; SES, socioeconomic status.
Given challenges in implementing randomized trials in this field, studies increasingly apply “quasi-experimental” methods to examine the links between education and health, taking advantage of natural experiments such as expansions in Head Start and other social policies. Although several studies randomized children to high-quality early childhood interventions and demonstrated improved cardiovascular health later in life, sample sizes were small, with limited long-term follow-up, and randomization of public education is typically not feasible or ethical. Population-level estimates of the effects of education on CVD are largely lacking.
In this study, we took advantage of a natural experiment in the form of US compulsory schooling laws (CSLs), state policies that legislate the number of years children must attend school. CSLs create differences by state and across time in the duration of education. Numerous studies have exploited this natural experiment to determine the impact of education on economic outcomes. Using CSLs as instrumental variables (IV) for educational attainment, these studies have found increased earnings and employment, and intergenerational impacts on social outcomes among children of those affected by CSLs [7,18,19]. A recent meta-analysis of studies that examined the health effects of CSLs highlighted the small number of US studies, which have focused primarily on the impact of education on mortality and fertility. Drawing on data from several countries, this meta-analysis found improvements in smoking, obesity, and mortality but insufficient evidence for other outcomes. The examination of other outcomes is critical to understand the pathways through which education may influence CVD, as this would inform subsequent interventions to reduce CVD disparities. While numerous studies on CSLs and CVD and related risk factors have been conducted in Europe [21–25], findings may not generalize to the US due to political- and sociocultural-based differences in the role of education. Two published studies have examined the effects of CSLs on CVD in the US context, with one study finding reductions in self-reported heart attack and diabetes risk, and the other finding reductions in self-reported diabetes and hypertension but no effect for “heart trouble” [26,27]. To our knowledge, no published studies in the US have examined the effects of CSLs on objective biomarkers of CVD.
In this study, we leveraged a natural experiment to test the hypothesis that educational duration affects CVD outcomes and related risk factors, examining multiple pathways through which education may affect CVD. We linked administrative data on CSLs with two large nationally representative US data sets and employed the quasi-experimental method of IV analysis. In addition to estimating rigorous population-level effects of education on CVD, this study contributes evidence on a specific educational policy, thereby guiding future implementation of social and educational interventions to address the social determinants of CVD.
This study involved the integration of several large data sets, with all analyses prespecified (see S1 Analytic Plan). As described below, we conducted a two-sample IV analysis. The first stage of the IV analysis was conducted among US-born individuals in the US Census 5% sample (N = approximately 5.4 million). We used the 1980 Census because demographic questions were comparable to and birth years of participants overlapped with those in the US Health and Retirement Study (HRS) and the National Health and Nutrition Examination Survey (NHANES), the data sets from which CVD outcomes were derived.
To estimate the second stage of the IV analysis, we linked first-stage census estimates with two data sets that included the outcomes of interest: HRS and NHANES. In other words, years of compulsory schooling and predicted years of educational attainment determined in census data were linked to each individual in HRS and NHANES based on his/her birth year, birth state, race, and sex. HRS is a longitudinal nationally representative US study of individuals age 50 or older and their spouses. The first survey wave was conducted in 1992, with biennial interviews subsequently. The second data set was NHANES, a serial cross-sectional US study conducted in 1971–1974 (NHANES I), 1976–1980 (NHANES II), 1988–1994 (NHANES III), and biennially since 1999. For both data sets, we included survey waves through 2012, the most recent data available at the onset of data analysis. We restricted the data sets to US-born individuals with data on state of birth and at least one CVD outcome. In NHANES, we also restricted the data set to white and black individuals, due to inconsistencies in categorization of other races/ethnicities across survey waves. Data on CSLs and state characteristics were compiled using federal reports for 1900–1950, and health outcomes included markers of CVD more prevalent in adulthood; we therefore restricted the data set to individuals born during 1900–1950 who were at least 18 when surveyed.
Final sample sizes were 30,853 (HRS) and 44,732 (NHANES), although the number of observations was smaller for outcomes not obtained for all participants in all waves (Table 1).
The primary predictor in ordinary least squares (OLS) models was self-reported educational attainment (continuous in census and HRS, categorical in NHANES). This was also the dependent variable in the first stage of the IV analyses, described below.
Outcomes included serum biomarkers, anthropometric measures, and self-reported outcomes of CVD and related risk factors previously correlated with education (Table 1). Each outcome represented one or more mechanistic pathways through which education might influence CVD. For example, diabetes and cholesterol in part reflect health behaviors such as nutrition and physical activity. Meanwhile, C-reactive protein (CRP) and telomere length measure inflammatory pathways and may capture chronic stress [29,30]. Studies suggest that socioeconomic disparities accelerate CVD by heightening stress responses [31,32]. Similarly, depression is a risk factor for mortality among patients with CVD and was operationalized as a score of 3 or more on the shortened 8-item Center for Epidemiologic Studies Depression scale. For outcomes that were heavily skewed and for which residuals were non-normally distributed (telomere length, CRP, and triglycerides), the natural logarithm was taken. Of note, higher levels of telomere length and high-density lipoprotein (HDL) and lower levels of other biomarkers are considered beneficial.
When possible, we chose outcomes that were similar across NHANES and HRS to cross-validate findings. For example, earlier waves of NHANES included 2-hour glucose testing, while later waves and HRS included hemoglobin A1c (also known as glycated or glycosylated hemoglobin). For consistency, we created a binary measure of whether the level exceeded the cutoff for diabetes (glucose ≥ 200, hemoglobin A1c ≥ 6.5). For CRP, HRS includes a variable for CRP that is constructed to be equivalent to that measured in NHANES. Most self-reported outcomes included similar wording across both surveys, e.g., “Has a doctor ever told you that you have high blood pressure or hypertension?”
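The harmonized diabetes indicator described above can be sketched as follows. This is an illustrative sketch only: the function name, the handling of missing values, and the units (mg/dL for glucose, percent for hemoglobin A1c) are assumptions not stated in the text; only the two cutoffs come from the paper.

```python
from typing import Optional

GLUCOSE_CUTOFF = 200.0  # 2-hour glucose cutoff from the text (units assumed: mg/dL)
HBA1C_CUTOFF = 6.5      # hemoglobin A1c cutoff from the text (units assumed: %)


def diabetes_indicator(glucose_2h: Optional[float] = None,
                       hba1c: Optional[float] = None) -> Optional[int]:
    """Return 1 if any available measure meets its cutoff, 0 if none does,
    and None when neither measure was collected (assumed missing-data rule)."""
    flags = []
    if glucose_2h is not None:
        flags.append(glucose_2h >= GLUCOSE_CUTOFF)
    if hba1c is not None:
        flags.append(hba1c >= HBA1C_CUTOFF)
    if not flags:
        return None
    return int(any(flags))
```

A respondent from an early NHANES wave would be coded from `glucose_2h` alone, and an HRS respondent from `hba1c` alone, so both surveys yield the same binary outcome.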
For some outcomes, however, parallel questions were not included in both HRS and NHANES, so we only included outcomes from a single data set. For example, NHANES did not include comparable questions on heart disease, and early waves did not include questions on depression, while HRS does not measure triglycerides. In HRS—which consists of repeated surveys of the same individuals over time—self-reported outcomes were coded as 1 if the respondent ever stated that they had the disease (parallel to NHANES question formats), and labs and anthropometric measures represent the first available value of the outcome to minimize survivorship bias.
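The two HRS coding rules in this paragraph can be illustrated with a short sketch. The function names and the use of `None` for a skipped wave are hypothetical conventions chosen for illustration, not the authors' code.

```python
from typing import Optional, Sequence


def ever_reported(wave_responses: Sequence[Optional[bool]]) -> Optional[int]:
    """Code a self-reported outcome as 1 if the respondent ever reported the
    disease in any wave, 0 if they answered and never reported it, and None
    if they never answered (None marks a skipped wave; assumed convention)."""
    answered = [r for r in wave_responses if r is not None]
    if not answered:
        return None
    return int(any(answered))


def first_available(wave_values: Sequence[Optional[float]]) -> Optional[float]:
    """Return the earliest non-missing lab or anthropometric value,
    with waves listed in chronological order."""
    for value in wave_values:
        if value is not None:
            return value
    return None
```

Taking the first available lab value, rather than the latest, means that a participant contributes a measurement even if they later drop out or die, which is the survivorship concern the text raises.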
We controlled for variables that may confound the relationship between exposure to CSL policies and CVD. These included race, gender, and birth year, as well as time-varying state-level characteristics to address potential state-level confounding: percentage black, urban, and foreign born; manufacturing jobs per capita; and inflation-adjusted manufacturing wages per manufacturing job. The state-level characteristics were compiled from Statistical Abstracts of the US and linearly interpolated for years between reports, and have been similarly included as covariates in prior CSL studies.
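Linear interpolation of a state-level covariate between report years can be sketched as below. The function name and the example values are hypothetical; the choice to refuse extrapolation outside the reported range is an assumption, since the text does not say how years outside the reports were handled.

```python
from typing import Dict


def interpolate_state_value(reports: Dict[int, float], year: int) -> float:
    """Linearly interpolate a state characteristic for `year` from the
    values published in report years (keys of `reports`)."""
    years = sorted(reports)
    if year < years[0] or year > years[-1]:
        # Assumed behavior: no extrapolation beyond the available reports.
        raise ValueError("year outside the range of available reports")
    if year in reports:
        return reports[year]
    for lo, hi in zip(years, years[1:]):
        if lo < year < hi:
            weight = (year - lo) / (hi - lo)
            return reports[lo] + weight * (reports[hi] - reports[lo])
    raise ValueError("no bracketing report years found")


# Made-up example: percentage urban reported as 10.0 in 1910 and 20.0 in 1920.
pct_urban_1915 = interpolate_state_value({1910: 10.0, 1920: 20.0}, 1915)
```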
We first tabulated HRS and NHANES participant characteristics. We then conducted two sets of analyses: (1) OLS, which is subject to confounding of the relationship between educational attainment and CVD, and (2) IV, which is intended to address this confounding.
In OLS models, we regressed each outcome on self-reported educational attainment in HRS and NHANES. The primary predictor variable in HRS was a continuous variable for self-reported educational attainment in years. For NHANES, the primary predictor was the “more than high school” category of educational attainment (reference: less than high school), as NHANES does not include a continuous variable for education in all survey waves. A similar analysis was carried out in HRS using a categorical education variable, for comparability. We adjusted for individual- and state-level characteristics described above.
Because the treatment (that is, educational quality) is at the state level, we clustered standard errors by state using the Huber-White heteroscedasticity-robust sandwich estimator to account for correlated observations. Notably, early waves of NHANES employ Fay’s replicate weights for variance estimation, while later waves of NHANES include probability sampling weights. To our knowledge, there is no established method to pool surveys that incorporate these different techniques for sample weighting, so we were unable to incorporate sample weights in our analysis. Regardless, the appropriateness of sample weighting is diminished when the goal of analysis is estimation of treatment effects rather than producing descriptive population statistics, so this is unlikely to introduce bias into the results.
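To make the clustering step concrete, the sketch below computes a cluster-robust (sandwich) standard error for the slope of a regression with a single predictor and an intercept. It is a simplified illustration under stated assumptions: it omits the covariates used in the study and applies no finite-sample correction, and the function name is hypothetical.

```python
from statistics import mean


def clustered_slope_se(x, y, clusters):
    """Return (slope, cluster-robust SE) for a simple regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Sum the score contributions (centered x times residual) within each cluster.
    cluster_scores = {}
    for xi, yi, g in zip(x, y, clusters):
        residual = yi - intercept - slope * xi
        cluster_scores[g] = cluster_scores.get(g, 0.0) + (xi - x_bar) * residual
    # Sandwich form: "bread" 1/Sxx on both sides of the "meat",
    # which is the sum of squared cluster-level scores.
    variance = sum(s ** 2 for s in cluster_scores.values()) / sxx ** 2
    return slope, variance ** 0.5
```

When every observation is its own cluster, this reduces to the ordinary heteroscedasticity-robust (HC0) estimator; grouping observations by state lets correlated residuals within a state add up before squaring, which is what changes the standard error.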
OLS models suffer from confounding by unobserved individual factors like genetic endowment or parental SES. Therefore, we next carried out the quasi-experimental method of IV analysis, a well-established technique in epidemiology and clinical medicine. As shown in S1 Fig, IV methods rely on the presence of a quasi-randomly determined exposure or “instrument” (Z), in this case CSLs, that is known to impact the predictor of interest (X, education). This perturbation in X caused by Z is then used to infer the effects of X on the relevant outcomes (Y). This method is particularly useful when X cannot be randomized, and when the relationship between X and Y may be confounded by unmeasured individual characteristics (U1) (see S1 Text for details).
In this study, the IV analysis leveraged the natural experiment created by CSLs to estimate effects of education that are unconfounded by unobserved individual factors. In particular, we employed two-sample IV analysis, in which the first and second stages were carried out in two different data sets [41,42]. Using a two-sample approach allowed for more precise estimation of the first stage, as the census sample size was much larger, thereby alleviating concerns of weak instrument bias resulting from instruments that explain only a small fraction of the variation in the endogenous variable. Two-sample IV analysis is also useful in situations in which a single data set does not include information on all three variables of interest, i.e., the outcome, the predictor (self-reported educational attainment), and the instrument (CSLs). In this case, early waves of NHANES did not include a continuous measure of educational attainment, highlighting the utility of the two-sample IV approach. Additional details on the two-sample IV analyses, including equations, are provided in S1 Text.
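The logic of the two-sample approach can be sketched in a few lines. This is a deliberately minimal illustration, assuming a single instrument and no covariates; the authors' actual specification (two instruments, individual- and state-level covariates, bootstrapped standard errors) is given in S1 Text, and the function names here are hypothetical.

```python
from statistics import mean


def ols_line(x, y):
    """Intercept and slope from a simple least-squares fit of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def two_sample_iv(z_first, x_first, z_second, y_second):
    """Estimate the effect of X on Y using Z as the instrument, with the
    first stage and second stage fitted in different samples."""
    # Stage 1 (e.g., census): regress education X on the instrument Z.
    a0, a1 = ols_line(z_first, x_first)
    # Link: predict education for the second sample from its instrument values.
    x_hat = [a0 + a1 * z for z in z_second]
    # Stage 2 (e.g., HRS or NHANES): regress outcome Y on predicted education.
    _, beta = ols_line(x_hat, y_second)
    return beta
```

Note that the second sample never needs an observed education variable; it only needs the instrument and the outcome, which is why the approach accommodates NHANES waves that lack a continuous education measure.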
We used two IVs to capture the number of minimum years of compulsory schooling in an individual’s state of birth [17,18,28]. The first was the difference between compulsory enrollment age in the state of birth when the respondent was 6 and minimum dropout age when the respondent was 14, and the second was the difference between compulsory enrollment age when the respondent was 6 and minimum work age when the respondent was 14. We assumed that individuals remained in their state of birth until age 18; prior studies have shown that cross-state migration was low during this period and that it was uncorrelated with the implementation of CSLs, so any measurement error (i.e., misclassification) would likely bias our results toward the null [17,45].
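The construction of the two instruments can be sketched as follows. The lookup table structure, the key names, and the example ages are hypothetical and made up for illustration; the sign convention (dropout or work age minus enrollment age, so that larger values mean more required schooling) is an assumption consistent with the description above.

```python
from typing import Dict, Tuple

# Hypothetical lookup: (state, calendar year) -> legal ages in force that year.
LawTable = Dict[Tuple[str, int], Dict[str, int]]


def csl_instruments(laws: LawTable, birth_state: str, birth_year: int) -> Tuple[int, int]:
    """Return (years required before dropout, years required before work)
    for a respondent, based on the laws in their state of birth."""
    at_age_6 = laws[(birth_state, birth_year + 6)]    # law when respondent was 6
    at_age_14 = laws[(birth_state, birth_year + 14)]  # law when respondent was 14
    enrollment_age = at_age_6["enrollment_age"]
    z_dropout = at_age_14["dropout_age"] - enrollment_age
    z_work = at_age_14["work_age"] - enrollment_age
    return z_dropout, z_work


# Made-up laws for a placeholder state "XX" (not real historical values).
example_laws = {
    ("XX", 1926): {"enrollment_age": 7, "dropout_age": 14, "work_age": 14},
    ("XX", 1934): {"enrollment_age": 7, "dropout_age": 16, "work_age": 15},
}
instruments = csl_instruments(example_laws, "XX", 1920)
```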
Robust standard errors were calculated using a bootstrapping technique, again clustered at the state level to account for correlated observations (see S1 Text for details).
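A generic cluster (block) bootstrap, in which whole states are resampled with replacement, can be sketched as below. This is not the authors' procedure, which is described in S1 Text; it is a simplified illustration for a single-predictor regression, and the function names, the number of replicates, and the seed are assumptions.

```python
import random
from statistics import mean, stdev


def fit_slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx


def cluster_bootstrap_se(x, y, clusters, n_boot=500, seed=0):
    """Bootstrap SE of the slope, resampling entire clusters so that
    within-cluster correlation is preserved in every replicate."""
    rng = random.Random(seed)
    groups = {}
    for xi, yi, g in zip(x, y, clusters):
        groups.setdefault(g, []).append((xi, yi))
    labels = sorted(groups)
    estimates = []
    for _ in range(n_boot):
        drawn = [rng.choice(labels) for _ in labels]  # sample clusters with replacement
        sample = [pair for g in drawn for pair in groups[g]]
        xs = [p[0] for p in sample]
        ys = [p[1] for p in sample]
        if len(set(xs)) < 2:
            continue  # slope is undefined when x does not vary
        estimates.append(fit_slope(xs, ys))
    return stdev(estimates)
```

Resampling states rather than individuals mirrors the clustering used for the OLS models: observations from the same state enter or leave a replicate together.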
Fixed effects analyses
Prior work has shown that IV estimates of the effects of education may be sensitive to the inclusion of fixed effects (i.e., indicator variables) for state of birth [27,28], which control for unobserved time-invariant state-level confounders but reduce statistical power. We conducted an additional set of OLS and IV analyses that included state fixed effects. The Durbin-Wu-Hausman test demonstrated that there were no systematic differences between OLS and fixed effects models (p > 0.05 for all outcomes), and fewer than 5% of coefficients for state of birth were statistically significant. Thus, we have little empirical evidence that state of birth was a confounder in these analyses. Nevertheless, we present fixed effects models alongside OLS models, given that state-level characteristics may still be considered confounders on theoretical grounds, although it should be noted that these models reduce power and the amount of variation in the exposure because fixed effects models only leverage variation in the exposure within rather than between states.
Less than 3% of covariates were missing. Complete case analysis is unlikely to introduce bias at such low levels of missingness [47–50]. We did not impute missing outcomes, as this is thought to add noise to subsequent estimates.
Multiple hypothesis testing
To account for the examination of multiple outcomes, we calculated adjusted p-values using the Dubey/Armitage-Parmar method, a modification of the Bonferroni method that accommodates correlated outcomes [52,53].
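The Dubey/Armitage-Parmar adjustment is commonly stated as p_adj,k = 1 − (1 − p_k)^(m^(1 − r̄_k)), where m is the number of outcomes and r̄_k is the mean correlation between outcome k and the other outcomes. The sketch below implements that formula; it assumes at least two outcomes and a supplied correlation matrix, and it is an illustration of the method as usually described rather than the authors' own code.

```python
from typing import List, Sequence


def dap_adjust(p_values: Sequence[float],
               corr: Sequence[Sequence[float]]) -> List[float]:
    """Adjust p-values for m correlated outcomes (requires m >= 2).
    corr[k][j] is the correlation between outcomes k and j."""
    m = len(p_values)
    adjusted = []
    for k, p in enumerate(p_values):
        others = [corr[k][j] for j in range(m) if j != k]
        mean_r = sum(others) / len(others)
        # Effective number of tests shrinks from m toward 1 as correlation rises.
        exponent = m ** (1.0 - mean_r)
        adjusted.append(1.0 - (1.0 - p) ** exponent)
    return adjusted
```

With uncorrelated outcomes the exponent equals m, giving a Šidák-type correction close to Bonferroni; with perfectly correlated outcomes the exponent equals 1 and the p-value is left unchanged.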
Participant characteristics were similar across HRS and NHANES, with slightly over half of the participants female and about three quarters white (Table 1). About two thirds of individuals had completed high school education or less. CVD measures were generally worse in HRS, which includes older individuals than NHANES. Because most of the outcomes were obtained using similar questions and laboratory methods, these differences likely represent age and cohort effects, which we account for in our models by adjusting for birth year.
Higher educational attainment was associated with improvements for all outcomes except total cholesterol (Table 2). Except for telomere length in HRS, all of these associations were robust to the adjustment of p-values for multiple hypothesis testing. Coefficients for NHANES were roughly comparable to those in HRS when using a comparable categorical variable for education as the primary exposure, although the HRS estimate for telomere length was no longer statistically significant at p < 0.05, likely due to the conversion of the primary predictor from continuous to categorical.
The F statistic for the first stage of IV models using census data was 793.7. This was above the standard cutoff of 10, indicating that CSLs are a strong instrument for education.
In HRS (Table 3), increased education was associated with reduced heart disease (β −0.025; 95%CI: −0.04, −0.002; p = 0.01), smoking (β −0.036; 95%CI: −0.06, −0.02; p < 0.01), and depression (β −0.049; 95%CI: −0.07, −0.03; p < 0.01); improved HDL (β 1.50; 95%CI: 0.34, 2.49; p < 0.01); and worsened total cholesterol (β 2.73; 95%CI: 0.09, 4.97; p = 0.03) and BMI (β 0.20; 95%CI: 0.002, 0.40; p = 0.05). The estimates for smoking, depression, and HDL were robust to the adjustment of p-values for multiple hypothesis testing.
In NHANES, increased education was associated with reduced smoking (β −0.032; 95%CI: −0.05, −0.02; p < 0.01) and triglycerides (β −0.039; 95%CI: −0.06, −0.01; p < 0.01) and improved HDL (β 0.86; 95%CI: 0.32, 1.48; p < 0.01), but higher BMI (β 0.13; 95%CI: 0.01, 0.32; p = 0.05). Each of these except for BMI was robust to the adjustment of p-values for multiple hypothesis testing.
Fixed effects models
When adjusting for state fixed effects, OLS estimates were similar to models without fixed effects in both HRS and NHANES (Table 4), with improvements in all outcomes except for total and low-density lipoprotein (LDL) cholesterol. Except for telomere length in HRS, all of these associations were robust to the adjustment of p-values for multiple hypothesis testing. As in OLS models without fixed effects, coefficients for NHANES were roughly comparable to those in HRS when using a comparable categorical variable for education as the primary exposure, although the HRS estimate for telomere length was again no longer statistically significant at p < 0.05, likely due to the conversion of the primary predictor from continuous to categorical.
For IV analyses (Table 5), in HRS the confidence intervals for each estimate included the null, although all (including total cholesterol) had point estimates suggesting improvement. When adjusting IV models for state fixed effects in NHANES, all confidence intervals included the null, and there was no consistent direction of effect estimates.
In this study, we exploited a natural experiment—variation in US CSLs—to estimate the effects of educational attainment on CVD in later life, examining several outcomes to better understand the different pathways through which education may influence CVD. This study provides population-level estimates of the effects of education on CVD for inclusion in clinical prediction and risk adjustment models. It also provides more rigorous estimates of the causal effect of educational attainment on CVD to inform future interventions to address this important social determinant, because correlational estimates like our OLS models may suffer from confounding or reverse causation. While the OLS models suggested improvements in virtually all health outcomes, IV models suggested improvements only for smoking, depression, heart disease, and HDL, and possibly worsened total cholesterol and BMI.
Overall, the evidence from HRS indicates that education is associated with reduced heart disease. Based on our exploration of pathways below, this may be driven by improvements in smoking, HDL, and depression. These findings suggest that the relationship between education and CVD risk factors may be causal, making it a potentially important predictor to target in clinical and policy interventions. Unfortunately, the self-reported measure of heart disease in HRS does not specify the type of medical condition included in “heart disease”—e.g., myocardial infarction, heart failure, or atrial fibrillation—which limits our ability to fully understand the pathways at play.
Our results also highlight the importance of incorporating social determinants into clinical prediction algorithms. For example, prior work has found that incorporating a marker of neighborhood-level social deprivation into a cardiovascular risk score greatly reduced socioeconomic disparities in identification of disease relative to the Framingham score; including neighborhood deprivation in CVD risk algorithms is therefore increasingly incorporated into guidelines in numerous international settings [55–57]. Yet there remains controversy over the incorporation of education into risk adjustment algorithms. For example, adjusting physician payment based on the distribution of social factors in their patient panels may encourage physicians to care for low-SES patients, because they will not be penalized for the generally worse outcomes that occur in this group of patients. Yet, it may lead to decreased quality of care and potentially increased disparities for low-SES patients, because standards of care will be different (i.e., probably lower) for these groups. On the other hand, not risk-adjusting may lead physicians to cherry-pick higher-SES (and likely healthier) patients to meet performance guidelines, thereby worsening disparities.
Insight into mechanisms
In terms of the mechanisms linking education and CVD, correlational OLS models suggested improvements in virtually all outcomes, yet IV models found that education was associated with only a handful, including worsening of some risk factors. Two of these (reduced smoking and improved HDL cholesterol) were observed in both HRS and NHANES. A recent meta-analysis of international CSLs also found improvements in smoking; no prior study to our knowledge has examined the effects of CSLs on HDL. As HDL is linked to physical activity, education may influence HDL by increasing physical activity. Alternately, these improvements may be due to improved medical care, because education is known to increase employment opportunities [7,18], and in the US employment is linked to health insurance and healthcare access. We were unable to reject the null hypothesis that there was no benefit for most outcomes linked to nutrition or healthcare access (e.g., LDL cholesterol, hyperglycemia); wide confidence intervals suggest that these analyses were underpowered, because there were fewer individuals who participated in biomarker testing.
In contrast, education was associated with increased total cholesterol in HRS and increased BMI in both samples, which may represent the health behavior pathway. This contradicts findings from a recent meta-analysis suggesting that CSLs in international settings lead to reduced obesity. While low-SES individuals in modern times tend to be more obese, the early 20th century was a time of an epidemiologic “nutrition transition” in the US, when higher-SES individuals were more likely to consume more obesogenic food, perhaps explaining our findings. Alternately, prior work suggests that education’s effects on reduced smoking may lead to increased obesity. Future studies should replicate these findings in more recent cohorts as they age. Of note, estimates of the effects on total cholesterol and BMI were not robust to the adjustment of p-values for multiple hypothesis testing, so these results should be interpreted cautiously.
For most outcomes in the stress and inflammatory pathway, we were unable to rule out the null hypothesis that education had no effect; the exception was reduced depression in HRS. This may reflect improvements in coping or social support that result from increased education, or it may be due to changes in foundational skills like literacy. For the null findings for the inflammatory outcomes, it may be that prior correlational studies suffered from confounding, e.g., due to differences in infectious exposures. It may also be that these analyses were underpowered, because sample sizes were smaller for biomarkers.
The null associations in some IV models may be due to the larger sample size required for this type of analysis to attain comparable power relative to OLS models. For example, our analyses for telomere length were conducted on 3,500–5,000 participants in each data set. While meta-analysis does not produce stable estimates when combining only two effect estimates, future studies could consider conducting meta-analyses across additional data sets for these outcomes. Alternately, the null IV findings may suggest that some of these associations are confounded in OLS analyses by unobserved individual factors, and that IV models are able to better adjust for this bias.
Of note, none of the IV associations remained statistically significant when adjusting for state fixed effects. Several prior studies have demonstrated similar sensitivity of CSL IV models to the inclusion of fixed effects [27,63]. One possible explanation is that the observed associations may be confounded by other unobserved state-level policies or characteristics, such as labor market conditions. Alternatively, it may be that state fixed effects greatly reduce variation in the exposure, which hinges on state and year differences in CSLs, so that these models have limited statistical power. In nearly all cases, the confidence intervals for estimates from fixed effects models included both the IV estimates from models without fixed effects and the OLS estimates. The Durbin-Wu-Hausman test we conducted suggests that fixed effects may not be warranted on empirical grounds, although there is disagreement on whether the Durbin-Wu-Hausman test should be used to justify the omission of fixed effects. Ultimately, these inconsistencies imply that the results of our main models should be interpreted cautiously and replicated in future studies, although attaining sample sizes larger than those of this study will be challenging in the absence of meta-analyses.
Strengths and limitations
This study has several strengths. It employed a natural experiment to produce rigorous estimates of the effects of education on CVD. It examined objectively measured biomarkers of disease in addition to self-reported health. Our examination of multiple outcomes allowed us to provide a more comprehensive picture of the mechanisms linking education and CVD. Additionally, analyses were replicated across two large nationally representative data sets, although differences in participant characteristics (e.g., NHANES was conducted during earlier years and included younger participants than HRS) mean that estimates across the two studies may not be directly comparable in spite of adjustment for relevant sociodemographic variables and birth year.
This study also has several limitations. First, self-reporting may have introduced measurement error or reporting bias that differed by educational attainment, although biomarkers are not subject to this bias. Future studies could link diagnostic codes from death certificates or healthcare claims data to examine a wider scope of objective measures of disease. Second, a limitation of all IV analyses is the inability to test the assumption that no other factors confound the instrument-outcome association; here, state-level characteristics may influence both CSLs and CVD. We attempted to minimize this potential confounding by adjusting for state-level characteristics. IV analyses also only provide estimates of a “local average treatment effect” for individuals whose exposure is affected by the instrument, i.e., those who increased their educational attainment as a result of CSL implementation. This limits the generalizability of the resulting estimates, but these estimates also provide evidence on a specific policy to inform future interventions. Properly used, IV models tend to account for confounding more robustly than other observational techniques, although future studies could incorporate other forms of quasi-experimental or matching techniques that do not suffer from similar limitations. Additionally, findings may not generalize to the effects of education on CVD in the 21st century. This study may also be limited by selection bias, in that participants may be different from those who did not survive long enough to participate in HRS; our inclusion of younger participants from NHANES helps to mitigate this concern. Relatedly, estimates from linear rather than survival models are subject to bias from differential follow-up time, because individuals with longer follow-up are more likely to have the event. Unfortunately, NHANES is cross-sectional and does not include data on age at diagnosis, precluding us from fitting survival models.
Finally, while our study examines biomarkers that may be along the pathway linking education and CVD, future studies could undertake more formal mediation analyses to examine the direct and indirect effects through which education influences CVD, similar to prior studies examining other outcomes [10,11,61].
This study employed quasi-experimental methods to provide rigorous estimates of the effect of education on CVD for the US context. Our findings support the established associations between education and reduced smoking, depression, and heart disease and improved HDL, suggesting that both health behaviors and stress are important mechanisms. Our study thereby contributes new knowledge on potential pathways through which education may influence CVD, and it adds to the evidence supporting broader implementation of interventions to target this key social determinant of health.
S1 Analytic Plan. Analytic plan.
S1 STROBE Checklist. STROBE checklist.
S1 Text. Supplemental methods.
S1 Fig. IV design.
CSL, compulsory schooling law; IV, instrumental variable; SES, socioeconomic status.
Disclaimer: Opinions reflect those of the authors and do not necessarily reflect those of the granting agencies.
- 1. Adler NE, Glymour M. Why we need to know patients’ education. JAMA Internal Medicine. 2017;177(8):1172–4. pmid:28604918
- 2. Brindle PM, McConnachie A, Upton MN, Hart CL, Smith GD, Watt GC. The accuracy of the Framingham risk-score in different socioeconomic groups: a prospective study. Br J Gen Pract. 2005;55(520):838–45. pmid:16281999
- 3. U.S. Department of Health and Human Services. Social Risk Factors and Performance Under Medicare’s Value-Based Purchasing Programs. Washington, D.C.: 2016.
- 4. Arnett DK, Blumenthal RS, Albert MA, Michos ED, Buroker AB, Miedema MD, et al. 2019 ACC/AHA Guideline on the Primary Prevention of Cardiovascular Disease. Journal of the American College of Cardiology. 2019:26029.
- 5. Grossman M. Chapter 10 Education and Nonmarket Outcomes. In: Hanushek E, Welch F, editors. Handbook of the Economics of Education. Volume 1: Elsevier; 2006. p. 577–633.
- 6. Grossman M. The Relationship between Health and Schooling: What’s New? Working Paper 21609. Cambridge, Massachusetts: National Bureau of Economic Research, 2015.
- 7. Oreopoulos P. Do dropouts drop out too soon? Wealth, health and happiness from compulsory schooling. Journal of Public Economics. 2007;91(11–12):2213–29.
- 8. Ross CE, Wu C-l. The Links Between Education and Health. American Sociological Review. 1995;60(5):719–45.
- 9. Reynolds AJ, Temple JA, Ou S-R. Preschool education, educational attainment, and crime prevention: Contributions of cognitive and non-cognitive skills. Children and Youth Services Review. 2010;32(8):1054–63. pmid:27667885
- 10. Heckman J, Pinto R, Savelyev P. Understanding the Mechanisms through Which an Influential Early Childhood Program Boosted Adult Outcomes. American Economic Review. 2013;103(6):2052–86. pmid:24634518
- 11. Nguyen TT, Tchetgen EJT, Kawachi I, Gilman SE, Walter S, Glymour M. Comparing alternative effect decomposition methods: the role of literacy in mediating educational effects on mortality. Epidemiology. 2016;27(5):670. pmid:27280331
- 12. Brunello G, Fort M, Schneeweis N, Winter-Ebmer R. The causal effect of education on health: What is the role of health behaviors? Vienna, Austria: Institute for Advanced Studies, 2012 Contract No.: Economic Series No. 280.
- 13. Vable AM, Glymour MM, Nguyen TT, Rehkopf D, Hamad R. Differential associations between state-level educational quality and cardiovascular health by race: early-life exposures and late-life health. Social Science and Medicine—Population Health. 2019 May 30;100418.
- 14. Sekikawa A, Horiuchi BY, Edmundowicz D, Ueshima H, Curb JD, Sutton-Tyrrell K, et al. A “natural experiment” in cardiovascular epidemiology in the early 21st century. Heart. 2003;89(3):255–7. pmid:12591821
- 15. Frisvold DE, Lumeng JC. Expanding Exposure: Can Increasing the Daily Duration of Head Start Reduce Childhood Obesity? Journal of Human Resources. 2011;46(2):373–402.
- 16. Campbell F, Conti G, Heckman JJ, Moon SH, Pinto R, Pungello E, et al. Early Childhood Investments Substantially Boost Adult Health. Science. 2014;343(6178):1478–85. pmid:24675955
- 17. Lleras‐Muney A. Were Compulsory Attendance and Child Labor Laws Effective? An Analysis from 1915 to 1939. Journal of Law and Economics. 2002;45(2):401–35.
- 18. Acemoglu D, Angrist JD. How large are the social returns to education? Evidence from compulsory schooling laws. Working Paper 7444. Cambridge, Massachusetts: National Bureau of Economic Research, 1999.
- 19. Oreopoulos P, Page ME, Stevens AH. The intergenerational effects of compulsory schooling. Journal of Labor Economics. 2006;24(4):729–60.
- 20. Hamad R, Elser H, Tran D, Rehkopf DH, Goodman SN. How and Why Studies Disagree About the Effects of Education on Health: A Systematic Review and Meta-analysis of Studies of Compulsory Schooling Laws. Social Science & Medicine. 2018;212:168–78.
- 21. Powdthavee N. Does Education Reduce the Risk of Hypertension? Estimating the Biomarker Effect of Compulsory Schooling in England. Journal of Human Capital. 2010;4(2):173–202.
- 22. Jürges H, Kruk E, Reinhold S. The effect of compulsory schooling on health—evidence from biomarkers. Journal of Population Economics. 2013;26(2):645–72.
- 23. Kemptner D, Jürges H, Reinhold S. Changes in compulsory schooling and the causal effect of education on health: Evidence from Germany. Journal of Health Economics. 2011;30(2):340–54. pmid:21306780
- 24. Ljungdahl S, Bremberg SG. Might extended education decrease inequalities in health?—a meta-analysis. The European Journal of Public Health. 2015;25(4):587–92. pmid:25618830
- 25. Brunello G, Fabbri D, Fort M. The causal effect of education on body mass: Evidence from Europe. Journal of Labor Economics. 2013;31(1):195–223.
- 26. Fletcher J. New evidence of the effects of education on health in the US: Compulsory schooling laws revisited. Social Science & Medicine. 2015;127:101–7.
- 27. Mazumder B. Does education improve health? A reexamination of the evidence from compulsory schooling laws. Economic Perspectives. 2008;32(2).
- 28. Glymour MM, Kawachi I, Jencks CS, Berkman LF. Does childhood schooling affect old age memory or mental status? Using state schooling laws as natural experiments. J Epidemiol Community Health. 2008;62(6):532–7. pmid:18477752
- 29. Hamad R, Walter S, Rehkopf DH. Telomere length and health: a two-sample genetic instrumental variables analysis. Experimental Gerontology. 2016;82:88–94. pmid:27321645
- 30. Grau AJ, Buggle F, Becher H, Werle E, Hacke W. The association of leukocyte count, fibrinogen and c-reactive protein with vascular risk factors and ischemic vascular diseases. Thrombosis Research. 1996;82(3):245–55. pmid:8732628
- 31. Cohen S, Janicki-Deverts D, Miller GE. Psychological stress and disease. JAMA. 2007;298(14):1685–7. pmid:17925521
- 32. Lichtman JH, Bigger JT, Blumenthal JA, Frasure-Smith N, Kaufmann PG, Lespérance F, et al. Depression and coronary heart disease: recommendations for screening, referral, and treatment: a science advisory from the American Heart Association Prevention Committee of the Council on Cardiovascular Nursing, Council on Clinical Cardiology, Council on Epidemiology and Prevention, and Interdisciplinary Council on Quality of Care and Outcomes Research: endorsed by the American Psychiatric Association. Circulation. 2008;118(17):1768–75. pmid:18824640
- 33. Radloff LS. The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement. 1977;1(3):385–401.
- 34. Crimmins E, Faul J, Kim JK, Guyer H, Langa K, Ofstedal MB, et al. Documentation of biomarkers in the 2006 and 2008 Health and Retirement Study. Ann Arbor, Michigan: Survey Research Center, University of Michigan, 2013.
- 35. Lleras-Muney A. The Relationship Between Education and Adult Mortality in the United States. The Review of Economic Studies. 2005;72(1):189–221.
- 36. Abadie A, Athey S, Imbens GW, Wooldridge J. When Should You Adjust Standard Errors for Clustering? Working Paper 24003. Cambridge, Massachusetts: National Bureau of Economic Research, 2017.
- 37. White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980;48(4):817–38.
- 38. Judkins DR. Fay’s method for variance estimation. Journal of Official Statistics. 1990;6(3):223–39.
- 39. Solon G, Haider SJ, Wooldridge JM. What Are We Weighting For? Journal of Human Resources. 2015;50(2):301–16.
- 40. McClellan M, McNeil BJ, Newhouse JP. Does more intensive treatment of acute myocardial infarction in the elderly reduce mortality? Analysis using instrumental variables. JAMA. 1994;272(11):859–66. pmid:8078163
- 41. Angrist JD, Krueger AB. The effect of age at school entry on educational attainment: an application of instrumental variables with moments from two samples. Journal of the American Statistical Association. 1992;87(418):328–36.
- 42. Inoue A, Solon G. Two-sample instrumental variables estimators. The Review of Economics and Statistics. 2010;92(3):557–61.
- 43. Hahn J, Hausman J. Weak instruments: Diagnosis and cures in empirical econometrics. The American Economic Review. 2003;93(2):118–25.
- 44. Lawlor DA. Commentary: Two-sample Mendelian randomization: opportunities and challenges. International Journal of Epidemiology. 2016;45(3):908–15. pmid:27427429
- 45. Card D, Krueger A. Does school quality matter? Returns to education and the characteristics of public schools in the United States. The Journal of Political Economy. 1992;100(1):1–40.
- 46. Davidson R, MacKinnon JG. Section 8.7. Durbin-Wu-Hausman Tests. Econometric theory and methods. New York: Oxford University Press; 2004. p. 338–41.
- 47. Bennett DA. How can I deal with missing data in my study? Aust N Z J Public Health. 2001;25(5):464–9. pmid:11688629
- 48. Dong Y, Peng C-YJ. Principled missing data methods for researchers. SpringerPlus. 2013;2:222. pmid:23853744
- 49. Langkamp DL, Lehman A, Lemeshow S. Techniques for Handling Missing Data in Secondary Analyses of Large Surveys. Academic Pediatrics. 2010;10(3):205–10. pmid:20338836
- 50. Allison PD. Missing data. In: Millsap RE, Maydeu-Olivares A, editors. Handbook of Quantitative Methods in Psychology. Thousand Oaks, California: Sage Publications; 2009. p. 72–89.
- 51. Von Hippel PT. Regression with missing Ys: An improved strategy for analyzing multiply imputed data. Sociological Methodology. 2007;37(1):83–117.
- 52. Blakesley RE, Mazumdar S, Dew MA, Houck PR, Tang G, Reynolds CF III, et al. Comparisons of methods for multiple hypothesis testing in neuropsychological research. Neuropsychology. 2009;23(2):255. pmid:19254098
- 53. Sankoh AJ, Huque MF, Dubey SD. Some comments on frequently used multiple endpoint adjustment methods in clinical trials. Statistics in Medicine. 1997;16(22):2529–42. pmid:9403954
- 54. Staiger D, Stock J. Instrumental variables regression with weak instruments. Econometrica. 1997;65(3):557–86.
- 55. Woodward M, Brindle P, Tunstall-Pedoe H. Adding social deprivation and family history to cardiovascular risk assessment: the ASSIGN score from the Scottish Heart Health Extended Cohort (SHHEC). Heart. 2007;93(2):172–6. pmid:17090561
- 56. Hippisley-Cox J, Coupland C, Brindle P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: prospective cohort study. BMJ. 2017;357:j2099. pmid:28536104
- 57. Tunstall-Pedoe H, Woodward M. By neglecting deprivation, cardiovascular risk scoring will exacerbate social gradients in disease. Heart. 2006;92(3):307–10. pmid:16166099
- 58. Chien AT, Chin MH, Davis AM, Casalino LP. Pay for performance, public reporting, and racial disparities in health care. Medical Care Research and Review. 2007;64(5_suppl):283S–304S. pmid:17881629
- 59. Popkin BM, Gordon-Larsen P. The nutrition transition: worldwide obesity dynamics and their determinants. International Journal Of Obesity. 2004;28:S2. pmid:15543214
- 60. Munafo MR, Tilling K, Ben-Shlomo Y. Smoking status and body mass index: a longitudinal study. Nicotine Tob Res. 2009;11(6):765–71. pmid:19443785
- 61. Nguyen TT, Tchetgen EJT, Kawachi I, Gilman SE, Walter S, Glymour MM. The role of literacy in the association between educational attainment and depressive symptoms. Social Science & Medicine—Population Health. 2017;3:586–93.
- 62. Valentine JC, Pigott TD, Rothstein HR. How many studies do you need? A primer on statistical power for meta-analysis. Journal of Educational and Behavioral Statistics. 2010;35(2):215–47.
- 63. Stephens M, Yang D-Y. Compulsory Education and the Benefits of Schooling. The American Economic Review. 2014;104(6):1777–92.
- 64. Glymour MM, Hamad R. Causal Thinking as a Critical Tool for Eliminating Social Inequalities in Health. American Journal of Public Health. 2018;108(5):623-. pmid:29617596