
Counterfactual analysis of differential comorbidity risk factors in Alzheimer’s disease and related dementias

  • Yejin Kim ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliations School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America, Institute for Stroke and Cerebrovascular Disease, University of Texas Health Science Center at Houston, Houston, Texas, United States of America

  • Kai Zhang,

    Roles Formal analysis, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing

    Affiliation School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America

  • Sean I. Savitz,

    Roles Conceptualization, Formal analysis, Investigation, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation Institute for Stroke and Cerebrovascular Disease, University of Texas Health Science Center at Houston, Houston, Texas, United States of America

  • Luyao Chen,

    Roles Data curation, Investigation, Software, Writing – original draft

    Affiliation School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America

  • Paul E. Schulz,

    Roles Conceptualization, Investigation, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Neurology, University of Texas Health Science Center at Houston, Houston, Texas, United States of America

  • Xiaoqian Jiang

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliations School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America, Institute for Stroke and Cerebrovascular Disease, University of Texas Health Science Center at Houston, Houston, Texas, United States of America


Alzheimer’s disease and related dementias (ADRD) is a multifactorial disease that involves several different etiologic mechanisms with various comorbidities. There is also significant heterogeneity in the prevalence of ADRD across diverse demographic groups. Association studies on such heterogeneous comorbidity risk factors are limited in their ability to determine causation. We aim to compare the counterfactual treatment effects of various comorbidities on ADRD in different racial groups (African Americans and Caucasians). We used 138,026 subjects with ADRD and 1:1 matched older adults without ADRD from nationwide electronic health records (EHRs), which extensively cover a large population’s long medical history in breadth. We matched African Americans and Caucasians based on age, sex, and high-risk comorbidities (hypertension, diabetes, obesity, vascular disease, heart disease, and head injury) to build two comparable cohorts. We derived a Bayesian network of 100 comorbidities and selected comorbidities with a potential causal effect on ADRD. We estimated the average treatment effect (ATE) of the selected comorbidities on ADRD using inverse probability of treatment weighting. Late effects of cerebrovascular disease significantly predisposed older African Americans to ADRD (ATE = 0.2715), but not their Caucasian counterparts; depression significantly predisposed older Caucasians to ADRD (ATE = 0.1560), but not African Americans. Our extensive counterfactual analysis using a nationwide EHR discovered different comorbidities that predispose older African Americans to ADRD compared to their Caucasian counterparts. Despite the noisy and incomplete nature of real-world data, counterfactual analysis of comorbidity risk factors can be a valuable tool to support risk factor exposure studies.

Author summary

Alzheimer’s disease (AD) is the sixth leading cause of death in the United States, affecting 6 million Americans aged 65 and older. AD risk develops over the long course of a lifetime and involves various etiologies such as genetic, vascular, and psychosocial factors, whose complex biological mechanisms are still under investigation. Putative risk factors include race/ethnicity, low educational attainment, socioeconomic status, and comorbidities (hypertension, diabetes). These risk factors may interact with each other and further increase the risk of AD. Most studies find that older African Americans are more likely than older Whites to develop AD. Comorbidity risk factors and socioeconomic status are believed to partially account for these differences, as they are more prevalent in African Americans. To disentangle the multifactorial effects of factors predisposing older adults to AD, we quantified the counterfactual effect of high-risk comorbidities mediating the AD risk using nationwide electronic health records. We particularly focused on differential counterfactual effects between matched African Americans and Caucasians. Our extensive counterfactual analysis discovered different comorbidities that predispose older African Americans to AD compared to their Caucasian counterparts. This differential risk between racial groups will contribute to developing targeted treatments for AD.


Alzheimer’s disease (AD) is the 6th leading cause of death in the United States and it is the only one of the top 10 leading causes of death that cannot be cured [1–3]. Alzheimer’s disease and related dementias (ADRD) is a multifactorial disease that involves several different etiologic mechanisms with highly heterogeneous phenotypes [4,5]. Moreover, prior studies suggest that there is significant heterogeneity in the prevalence of ADRD across diverse demographic groups [6,7]. For example, most studies find that older African Americans are more likely than older non-Hispanic Caucasians to be diagnosed with ADRD [7–10]. Comorbidity risk factors such as cardiovascular disease, diabetes, and obesity, as well as socioeconomic status, are believed to account for these differences, as they are more prevalent in African Americans [1,2].

Association studies on such multifactorial and heterogeneous comorbidity risk factors are limited in their ability to determine causation. For example, although obesity is associated with increasing ADRD risk [11], its effect may be mediated by comorbidities such as hypertension, cardiovascular disease, and diabetes [12–14]. Counterfactual analysis, on the other hand, estimates the outcome that an individual who had been exposed to a risk factor (factual) would have had under an alternative exposure scenario (counterfactual), i.e., if the individual had not been exposed. A confounder is a variable that affects both the exposure to the risk factor and the outcome. It is a major source of bias that can mislead us into concluding that the risk factor causes the outcome when it does not [15]. The gold standard to avoid such bias is to randomize the exposure in randomized clinical trials, but such a randomized study is not feasible in studying risk factors, particularly when the exposure is unethical (e.g., exposing subjects to putative risk factors) [16,17]. Alternatively, counterfactual analysis with observational data aims to reduce the bias by adjusting the distribution of conditions that affect the exposure to the risk factor, such as via propensity score matching or weighting [18].

The statistical inferences to estimate the causality and counterfactuals from observational data in medicine have been long discussed but not yet widely used [16,18–25]. For example, previous research proposes a framework for emulating randomized trials from big observational data [17,18,26]. This framework simulates randomized clinical trials via controlling baseline characteristics, identifying time zero (baseline) to the outcome, adjusting the confounders by matching, and estimating treatment effect using potential outcome models [17]. A challenge here is that latent confounders can lead to selection bias; the causal structure can delineate the relationship between these confounders and help reduce the selection bias [27]. Indeed, there is a separate line of causal analysis studies utilizing causal structure learning to investigate conditional independence among comorbid conditions [28–33], mainly with a few predetermined selected variables due to super-exponential complexity in structure learning. To date, comorbidity risk factor studies in ADRD often focus on the association [1,2], rather than causation [24,34]. Similar to the emulation of randomized clinical trials [17], the goal of this study was to investigate the counterfactual effect of comorbidity risk factors in ADRD, particularly focusing on racial heterogeneity (Fig 1).

Fig 1. Study overview.

Our goal is to assess the counterfactual effect of comorbidities that predispose each racial group to ADRD. We focused on African Americans and non-Hispanic Caucasians. (a) We used Cerner EHRs from more than 600 Cerner client hospitals. (b,c) To select cohort, we matched age and sex in ADRD and non-ADRD subjects. To make African American and Caucasian cohorts comparable, we matched race on the known ADRD risk factors (age, sex, hypertension, diabetes, obesity, heart disease, vascular disease, and head injury). Hidden confounders (such as socioeconomic status) were presented for clarification. (d) Age is the strongest risk factor in ADRD. We matched the age of ADRD and non-ADRD subjects. (e) To disentangle the multifactorial effect of comorbidities, we derived a Bayesian network of comorbidities and ADRD using constraint-based algorithms. (f) We used inverse probability of treatment weighting to estimate the counterfactual effect of comorbidities that have a direct edge to ADRD in each racial group’s Bayesian network. We performed the permutation test to validate the counterfactual effect. (g) We examined the difference in comorbidities paths of the Bayesian network.

One challenge to counterfactual comorbidity risk factor studies is the lack of data capturing comprehensive health conditions before ADRD onset. As ADRD is a heterogeneous disease with various etiologies, counterfactual analysis requires an extensive set of comorbidities to investigate how one disease might contribute to ADRD. Voluminous electronic health records (EHRs) from nationwide hospitals are a rich source of comprehensive data on the risk of ADRD. Nationwide EHRs also have larger sample sizes, even for minority populations, than clinical trials or observational studies, in which participation rates among minority populations are markedly low [35,36]. EHR data, however, are mainly collected for billing, not for scholarly study, and thus diagnosis billing codes in EHRs are sometimes incomplete and lack important details such as socioeconomic factors (e.g., education, literacy, life course exposures), which are among the main causes of racial disparities in ADRD [37,38]. Despite the potential limitation due to these unobserved confounders, EHRs can extensively cover a large population’s long medical history in breadth and provide us a unique opportunity to investigate the counterfactual effect of comorbidity risk factors for ADRD. We undertook this study to provide unbiased insights on the racial differences of comorbidity risk factors in ADRD.

Materials and study design


We utilized Cerner Health Facts, a large clinical database covering EHRs from more than 600 Cerner client hospitals, from 2000–2017, with a total of 49,826,000 inpatients and outpatients (Fig 1A) [39]. The Cerner Health Facts database contains a de-identified EHR and is subscribed by the University of Texas Health Science Center for research use [39]. These nationwide multi-center EHRs can increase generalizability of our findings.

Observation period

We summarized our study design in Table 1. We included subjects with observation after the age of 65 and observations longer than 6 months. Age is the strongest risk factor for ADRD. Non-ADRD subjects were either not old enough to have ADRD onset (e.g., average ADRD onset age was 79.99 for African Americans and 81.57 for Caucasians) or old enough but censored (e.g., median observation length was 1.0 years). To avoid bias caused by different age distributions in ADRD and non-ADRD subjects, we matched age in their observation period (Fig 1B, 1C Matching 1). That is, for the ADRD subjects, the observation window started from when any diagnosis code was first recorded and ended when the first ADRD onset was recorded (Fig 1D). For the non-ADRD subjects, we selected subjects that had the closest age at the observation starts and ends. We truncated non-ADRD observations after the age when matched ADRD observations ended (Fig 1D).

Table 1. Summary of the study design based on Target Trial framework [17].

We performed the analysis for African American and Caucasian counterparts, respectively.

Exposure to comorbidities

Comorbidities of interest were all the diseases (identified as diagnosis codes) that were diagnosed within the observation period and that might predispose subjects to ADRD onset. EHRs are inherently noisy and sparse. To accurately identify the most direct putative risk factors and better understand root causes, it is important to condense the sparse diagnosis codes into clinically meaningful comorbidities and disentangle the effects of multifactorial comorbidities. Our approach to addressing this challenge was to group ICD-9 or ICD-10 diagnosis codes by the PheWas hierarchy to increase the clinical relevance of the billing codes [40]. The PheWas code is a hierarchical grouping of ICD codes based on statistical co-occurrence, code frequency, and human review. For more detail, see reference [40]. We included 100 PheWas disease codes that appeared within the observation window in more than 5% of the subjects. We counted the occurrences of each disease code during the observation window and converted the counts to a logarithmic scale (i.e., log2(1+counts)) because the count distributions are skewed.
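The log2(1+counts) transformation described above can be sketched as follows. This is a minimal illustration, not the study's code: the function name `log_scaled_counts`, the code strings, and the toy subject are assumptions made for the example.

```python
import math
from collections import Counter


def log_scaled_counts(codes, vocabulary):
    """Map one subject's list of disease codes to log2(1 + count) per code.

    `codes` is the list of codes recorded in the observation window
    (repeats allowed); `vocabulary` is the fixed list of codes kept as
    features. Both are hypothetical inputs for this sketch.
    """
    counts = Counter(codes)
    return {code: math.log2(1 + counts.get(code, 0)) for code in vocabulary}


# Toy subject (illustrative code strings, not study data):
# one code recorded three times, one recorded once, one never recorded.
subject_codes = ["401.1", "401.1", "401.1", "250.2"]
features = log_scaled_counts(subject_codes, ["401.1", "250.2", "296.2"])
```

A code that never appears maps to 0, so absent comorbidities stay at zero after the transformation while heavily repeated codes are compressed.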


The outcome of interest was ADRD onset, which we detected as having either ADRD diagnosis code or medication. The ADRD diagnosis codes were PheWas codes for 290.11 (Alzheimer’s disease); 290.12 (Frontotemporal dementia, Pick’s disease, Senile degeneration of brain); 290.13 (Senile dementia); and 290.16 (Vascular dementia, Vascular dementia with delirium/delusions/depressed mood). The ADRD medications were acetylcholinesterase inhibitors (Donepezil, Galantamine, and Rivastigmine) or memantine. The definition of ADRD in EHRs can be controversial considering the fact that EHRs are for billing purposes. We compared our definition of ADRD onset with other potential definitions and discussed our rationale behind choosing our definition considering racial bias in S1 Text A.


We investigated and compared differences in the comorbidities predisposing each racial group to ADRD. Our approach to counterfactual effect analysis has three steps: i) select cohorts by matching sex, age, and known comorbidity risk factors; ii) disentangle the dependencies among comorbidities and ADRD with a Bayesian network to identify the core skeleton that accounts for increasing ADRD risk; and iii) measure and validate, by inverse probability of treatment weighting, the counterfactual effects of the identified comorbidities that have a direct relationship to ADRD. Code is publicly available at

Matching to obtain comparable racial cohorts

We matched African Americans and non-Hispanic Caucasians in terms of age, sex, and known comorbidity risk factors to create comparable racial cohorts (Fig 1C). Direct comparison on the risk of ADRD between African Americans and Caucasians would produce biases due to confounders in the large-scale heterogeneous EHRs [41,42]. For example, in Cerner Health Facts EHRs, 68.1% were female in African Americans, whereas 62.8% were female in Caucasians (Table 2). Careful cohort matching is needed to build comparable cohorts that are similar in terms of known risk factors. Theoretically, this cohort matching is not necessary for the Bayesian network (discussed in the next section) because the Bayesian network captures local interaction between variables, but we found that this cohort matching is helpful to the unconfoundedness assumption [16].

Table 2. Cohort demographics.

Distribution of comorbidities with a known risk before and after matching between African Americans and Caucasians. Standardized bias = difference in the mean of a given variable between African Americans and Caucasians divided by the standard deviation in African Americans.

We first matched ADRD subjects and non-ADRD subjects based on age and sex for each racial group. The incidence of comorbidities (with either known or unknown risk) differs by race. Our focus is to identify the effect of comorbidities with an unknown risk that disproportionately affect racial groups, because the differential effect of comorbidities with known risk is already established. The comorbidities that are known to increase ADRD risk disproportionately among racial groups were hypertension, diabetes, obesity, heart disease, vascular disease, and head injury (the specific definition of each comorbidity is in S1 Table) [1,2]. We therefore matched African Americans and Caucasians based on the comorbidities with known risk using propensity scores (Matching 2 in Fig 1C), so that the remaining potential risk factors are isolated. We also matched African Americans and Caucasians based on age and sex. We used nearest neighbor matching with radius and caliper [43]. We reported standardized bias to evaluate the balance between the two groups. The detailed cohort selection process with the structural equation is available in S1 Text B.
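Nearest neighbor matching with a caliper can be sketched as below. This is a simplified illustration under stated assumptions, not the study's implementation: the matching is greedy, 1:1, and without replacement, the propensity scores are taken as given, and the function name `caliper_match` and the caliper value are hypothetical.

```python
def caliper_match(treated_scores, control_scores, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    Each treated subject is paired with the closest still-unused control,
    but only if their score difference is within `caliper`; otherwise the
    treated subject stays unmatched. Returns (treated_idx, control_idx) pairs.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break
        # Closest remaining control; ties broken by the lower index.
        c_idx = min(available, key=lambda j: (abs(control_scores[j] - t_score), j))
        if abs(control_scores[c_idx] - t_score) <= caliper:
            pairs.append((t_idx, c_idx))
            available.remove(c_idx)
    return pairs


# Toy propensity scores (illustrative only).
group_a = [0.30, 0.62, 0.90]
group_b = [0.28, 0.33, 0.60, 0.10]
matched = caliper_match(group_a, group_b, caliper=0.05)
```

In this toy input the third subject in `group_a` has no control within the caliper and is dropped, which is how a caliper trades sample size for closer balance.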

Identify comorbidities with potential causal effect

Using the matched cohorts of African Americans and non-Hispanic Caucasians, we aimed to identify comorbidities that predispose each racial group to ADRD with potential causal effects. We derived the Bayesian network (Fig 1E), a directed acyclic graph of comorbidities and ADRD that has directed edges implying causation. Learning Bayesian networks is a principled approach to identifying and analyzing multifactorial effects, as it takes other confounders into account to determine possible causal effects by considering conditional independence. We built two Bayesian networks, one for African Americans and one for their Caucasian counterparts. Nodes were all comorbidities and ADRD. We set three tiers: tier1 = comorbidities with known risk, tier2 = comorbidities with unknown risk, and tier3 = ADRD. The comorbidities in tier1 were mostly chronic diseases that do not have a direct or immediate effect on ADRD (e.g., hypertension, diabetes) but have an indirect effect on ADRD mediated through other subsequent comorbidities. The PC algorithm is one of the principled causal structure learning algorithms (by Peter and Clark) that can be applied to find Bayesian networks (details in S1 Text C) [44,45]. PC has been implemented in various software libraries such as TETRAD [46], pcalg [46,47], bnlearn [48], and speed-up versions by Zarebavani and Zhang [49,50].

The causal structure, however, can vary by input data, which hinders the robustness of the structure. Bootstrapping and graph combination methods are usually adopted to increase the robustness of the inference results [51,52,53]. We used voting-based causal graph combination to obtain a robust and unbiased estimate of the causal graph, which significantly reduces false positives and increases the overall robustness of the graph learning [54]. We applied majority voting by randomly splitting the entire data into ten sub-datasets while withholding 10% of the original data each time. We applied the PC algorithm (with a significance level of 0.05) to each of the ten sub-datasets and aggregated the ten results. A directed edge was retained in the final ensembled causal graph if it appeared in more than half of all the causal graphs. We repeated the same procedure on each racial group and derived the final causal graphs of the two racial groups.
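The majority-voting step can be illustrated with a short sketch. It assumes each learned graph is already available as a set of directed edges (the PC runs themselves are not shown); the function name `majority_vote` and the node labels are hypothetical.

```python
from collections import Counter


def majority_vote(edge_sets, threshold=0.5):
    """Keep the directed edges that appear in more than `threshold` of the graphs.

    `edge_sets` is a list of graphs, each given as a set of (source, target)
    tuples. With threshold=0.5 an edge must appear in strictly more than
    half of the graphs to be retained.
    """
    votes = Counter(edge for edges in edge_sets for edge in set(edges))
    cutoff = threshold * len(edge_sets)
    return {edge for edge, n in votes.items() if n > cutoff}


# Toy graphs with placeholder node names (not study results).
graphs = [
    {("A", "ADRD"), ("B", "A")},
    {("A", "ADRD"), ("B", "A"), ("C", "ADRD")},
    {("A", "ADRD"), ("C", "ADRD")},
    {("B", "A")},
]
consensus = majority_vote(graphs)
```

Here the edge from C to ADRD appears in exactly half of the graphs and is therefore dropped, since the rule requires strictly more than half.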

After we obtained the robust causal graphs, we investigated whether the two causal graphs were sufficiently distinct. To compare the causal graphs, we used two metrics: the structural Hamming distance (SHD) and the graph edit distance (GED). The SHD measures the number of edges in which the two compared graphs do not coincide. The GED measures the length of the shortest graph edit path, which is a sequence of node and edge edit operations (including substitutions, deletions, and insertions) transforming graph G1 into a graph isomorphic to G2 [55].
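A minimal sketch of the SHD follows. Conventions differ on how a reversed edge is counted; this version counts each node pair whose edge state (absent, forward, or reverse) differs exactly once, which is an assumption and may not match the convention used in the study. The function name is hypothetical, and GED is not sketched here.

```python
def structural_hamming_distance(edges_a, edges_b):
    """Count node pairs on which two directed graphs disagree.

    Graphs are sets of (source, target) tuples without self-loops.
    A missing, extra, or reversed edge each contributes 1.
    """
    a, b = set(edges_a), set(edges_b)
    node_pairs = {frozenset(edge) for edge in a | b}

    def state(edges, pair):
        u, v = sorted(pair)
        return ((u, v) in edges, (v, u) in edges)

    return sum(state(a, pair) != state(b, pair) for pair in node_pairs)


# Toy graphs: one reversed edge (X-Y) and one extra edge (X->Z).
graph_1 = {("X", "Y"), ("Y", "Z")}
graph_2 = {("Y", "X"), ("Y", "Z"), ("X", "Z")}
distance = structural_hamming_distance(graph_1, graph_2)
```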

Treatment effects of identified comorbidities on ADRD risk

After we identified unique comorbidities that predispose older African Americans and their Caucasian counterparts to ADRD, respectively, we quantified the causal effect of the identified comorbidities on ADRD risk by measuring the average treatment effect on increasing (or decreasing) ADRD risk (Fig 1F). That is, we would like to answer this question: “If a subject who had been exposed to the comorbidity in fact did not have the comorbidity (counterfactual), would the subject have had a lower level of ADRD risk?” The treatment effect analysis identifies the difference in potential outcomes (e.g., ADRD risk) when the subject is exposed and not exposed to the comorbidity. Let Yi(1) denote subject i’s outcome (i.e., ADRD onset) when the subject has a certain comorbidity (Ti = 1) and Yi(0) denote subject i’s outcome when the subject does not have the comorbidity (Ti = 0). The treatment effect τi of having this comorbidity is then defined as τi = Yi(1)−Yi(0), based on Neyman-Rubin’s potential outcome models [56]. It is impossible to observe factual and counterfactual outcomes at the same time (i.e., a subject cannot both have and not have the comorbidity, so Yi(1) and Yi(0) are never observed together). An approach to mitigate this missing counterfactual outcome is to average the potential outcomes in the exposed and the unexposed respectively and estimate the average treatment effect (ATE) by E[Y(1)]−E[Y(0)]. The ATE is an unbiased estimate of the treatment effect (i.e., effect of comorbidity exposure) if the subjects are randomly assigned to either the exposure or the control group (such as in randomized clinical trials). However, in real-world data the exposure to the comorbidity is not random; subjects with and without the comorbidity differ systematically. The way to reduce the bias between the two groups is to match the subjects or weight their outcomes Y based on the likelihood of having the comorbidity T so that the likelihood distributions are similar [19,20].
Here we denote the likelihood of having the comorbidity T given the subject’s condition X as the propensity score. Several strategies to estimate the unbiased treatment effect include propensity score matching (to directly obtain the counterfactual outcome by identifying propensity-matched neighbors), propensity score stratification (to stratify subjects into groups with a similar level of propensity score and directly compare the outcomes within each group), and inverse probability of treatment weighting (IPTW) [19,20]. Specifically, we used IPTW [57], which down-weights over-sampled subjects and up-weights under-sampled subjects so that the two groups with and without the comorbidity are similar. The ATE can then be estimated in the standard IPTW form as

$$\widehat{\mathrm{ATE}} = \frac{1}{n}\sum_{i=1}^{n}\left[\frac{T_i Y_i}{e(X_i)} - \frac{(1-T_i)\,Y_i}{1-e(X_i)}\right],$$

where n is the number of subjects, Yi is subject i’s observed outcome, and e(Xi) is the propensity score, i.e., the probability of Ti = 1 given the subject’s features Xi. We are more interested in the treatment effect among those who already had the comorbidity (Ti = 1), which is the so-called average treatment effect among treated (ATT). We can estimate the ATT by

$$\widehat{\mathrm{ATT}} = \frac{1}{n_1}\sum_{i=1}^{n}\left[T_i Y_i - \frac{(1-T_i)\,e(X_i)\,Y_i}{1-e(X_i)}\right],$$

where n1 refers to the number of subjects with Ti = 1. We can similarly define the average treatment effect among untreated or control (ATC) among those who did not have the comorbidity (Ti = 0). In all, the treatment effect measures the amount of difference in outcome Y due to exposure T given the similarly weighted conditions X. For example, ATE = τ>0 means that the outcome when subjects are intervened to have the comorbidity is greater by τ than the outcome when subjects are intervened not to have the comorbidity, implying that the comorbidity exposure increases the outcome on average. Similarly, ATT = τ>0 means that the outcome of subjects already having the comorbidity is greater by τ than the outcome when those subjects are intervened not to have the comorbidity, implying that the subjects with the comorbidity would have decreased the outcome (ADRD onset) by τ had they been without the comorbidity.

We obtained the propensity score e(Xi) of having the comorbidity T given the subject’s features X using logistic regression. To avoid high variance of propensity scores due to overfitting, we used the self-normalized propensity estimator (or Hájek estimator) [58]. The subject’s features Xi used to infer propensity scores were all other remaining comorbidities that have a directed edge to T (the comorbidity of interest) and Y (ADRD) in the derived Bayesian network. Here we can also measure the ATE and ATT of all the pairs of comorbidities with directed edges in the Bayesian network to quantify the causal effect of one comorbidity on the others. We calculated the 95% confidence intervals of the ATE and ATT from 100 bootstrap resamples. We measured the ATE and ATT for each racial group. We used DoWhy, a publicly available package, to measure the treatment effects [59].
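A self-normalized (Hájek) IPTW estimate of the ATE can be sketched in a few lines. This is an illustrative stand-in, not DoWhy's API and not the study's code: the propensity scores are assumed to be already fitted (e.g., by logistic regression) and strictly between 0 and 1, and the function name `hajek_ate` and the toy data are hypothetical.

```python
def hajek_ate(treatment, outcome, propensity):
    """Self-normalized IPTW estimate of E[Y(1)] - E[Y(0)].

    treatment:  0/1 exposure indicators T_i
    outcome:    observed outcomes Y_i
    propensity: estimated e(X_i) = P(T_i = 1 | X_i), each in (0, 1)
    """
    # Inverse-probability weights for the exposed and the unexposed.
    w1 = [t / e for t, e in zip(treatment, propensity)]
    w0 = [(1 - t) / (1 - e) for t, e in zip(treatment, propensity)]
    # Normalizing by the sum of weights (rather than n) is the Hajek step.
    mean_exposed = sum(w * y for w, y in zip(w1, outcome)) / sum(w1)
    mean_unexposed = sum(w * y for w, y in zip(w0, outcome)) / sum(w0)
    return mean_exposed - mean_unexposed


# Toy data (illustrative only): four subjects with unequal propensities.
T = [1, 1, 0, 0]
Y = [1, 0, 1, 0]
e = [0.8, 0.2, 0.5, 0.5]
ate = hajek_ate(T, Y, e)
```

Dividing by the sum of the weights keeps each weighted mean inside the range of the observed outcomes, which is why the self-normalized form tends to be more stable when some propensity scores are close to 0 or 1.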

Confirm the estimated treatment effect via the permutation test.

We confirmed the estimated treatment effect via the permutation test (Fig 1F) [60,61]. The permutation test (or randomization test) assesses whether an ATE estimate is statistically significant by testing Fisher’s Sharp Null, H0: Yi(1) = Yi(0), ∀i, which states that there is no treatment effect for any subject [62]. A rejection of Fisher’s Sharp Null means there is a significant treatment effect [61]. We randomly shuffled the treatment variable (a binary indicator of whether the subject has the comorbidity or not) to make it independent of the outcome, and re-estimated the treatment effect for each random permutation. We set 100 repetitions and used DoWhy to implement the permutation tests [59].
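The shuffling procedure can be sketched generically as below. This is not DoWhy's implementation: for brevity the estimator is a plain difference in means rather than IPTW, the p-value follows the usual randomization-test convention (which may differ from what the study's software reports), and all names and the toy data are hypothetical.

```python
import random


def diff_in_means(treatment, outcome):
    """Unadjusted effect estimate: mean outcome of exposed minus unexposed."""
    exposed = [y for t, y in zip(treatment, outcome) if t == 1]
    unexposed = [y for t, y in zip(treatment, outcome) if t == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


def permutation_test(treatment, outcome, estimator, n_permutations=100, seed=0):
    """Re-estimate the effect after shuffling the treatment labels.

    Returns the observed estimate, the list of estimates under shuffling,
    and the share of shuffled estimates at least as extreme as the observed
    one (with the +1 correction commonly used for randomization tests).
    """
    rng = random.Random(seed)
    observed = estimator(treatment, outcome)
    shuffled = list(treatment)
    null_effects = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any link between exposure and outcome
        null_effects.append(estimator(shuffled, outcome))
    extreme = sum(abs(v) >= abs(observed) for v in null_effects)
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, null_effects, p_value


# Toy data with a perfect exposure-outcome link (illustrative only).
T = [1] * 10 + [0] * 10
Y = [1] * 10 + [0] * 10
observed, null_effects, p_value = permutation_test(T, Y, diff_in_means)
```

If the observed estimate lies far outside the spread of the shuffled estimates, the sharp null of no effect for any subject is implausible.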


Study subjects

We first built matched cohorts of ADRD and non-ADRD subjects. Of the 49,826,000 patients, there were 157,620 subjects with ADRD diagnosis codes; 163,320 subjects with ADRD medication codes; and 235,912 subjects with either the diagnosis or medication codes. After excluding subjects without diagnosis/medication codes, without timestamps, or with an observation length of less than 6 months, we selected 138,026 ADRD subjects and matched 138,026 non-ADRD subjects based on age and sex.

Cohort identification

We then matched African American and non-Hispanic Caucasian groups to build comparable cohorts of the two racial groups. After the extensive matching to reduce confounding effects, the final cohort was 7,662 ADRD and 7,418 non-ADRD for African Americans; 7,869 ADRD and 7,357 non-ADRD for non-Hispanic Caucasians. We calculated the standardized bias of each variable between African Americans and Caucasians to check whether the variables were balanced. A standardized bias < 0.10 was used as the cutoff value to confirm the balance after matching [63]. As a result, the age and comorbidity distributions were similar between the racial groups (Table 2, Fig 2); most variables had a standardized bias < 0.10 after matching.
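The balance check can be illustrated with the standardized bias as defined in the Table 2 caption: the difference in group means divided by the standard deviation in the reference group. The sketch below is an assumption-laden illustration, not the study's code; it uses the sample standard deviation (the source does not say which variant was used), and the function name and toy data are hypothetical.

```python
from statistics import mean, stdev


def standardized_bias(reference, comparison):
    """(mean of reference - mean of comparison) / SD of reference.

    `reference` plays the role of the group whose standard deviation is
    used as the denominator; it must contain at least two distinct values.
    """
    return (mean(reference) - mean(comparison)) / stdev(reference)


# Toy binary indicator of one comorbidity in two groups (illustrative only).
group_ref = [1, 0, 1, 1, 0, 1, 0, 1]
group_cmp = [1, 0, 1, 0, 0, 1, 0, 1]
bias = standardized_bias(group_ref, group_cmp)
balanced = abs(bias) < 0.10  # cutoff stated in the text
```

In this toy example the absolute bias exceeds 0.10, so the variable would be flagged as imbalanced and the matching would need tightening.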

Fig 2. Subject’s age and known risk factor distributions.

(a) Onset age distribution of ADRD patients after matching Caucasians to African Americans. The onset age of (original) Caucasians tends to be older than that of African Americans. The matched Caucasian follows a similar distribution. (b) Distribution of known risk factors after matching. The matched Caucasian follows a similar distribution with African Americans in terms of the confounding risk factors. Ca = Caucasians, Paired Ca = Caucasians that are matched to African Americans. AF = African Americans.

Bayesian Network and counterfactual effect of comorbidities

We derived the Bayesian network of comorbidities and ADRD for each racial group separately (Fig 3, S2 Table). The two causal structures from African Americans and their matched Caucasian cohort shared similar but also distinctive comorbidities. The SHD and GED of the two structures were 1,277 and 895, respectively. Among all the edges in the Caucasian causal graph, only 46.81% are present in the African American causal graph, and only 36.59% of the edges in the African American causal graph are present in the Caucasian causal graph. We focused on the comorbidities that have edges to ADRD in all ten bootstraps (Table 3) and measured the counterfactual treatment effect of the comorbidities on ADRD (Table 4, Fig 3). The identified comorbidities were grouped into three types: cerebrovascular disease, mental disorders, and inflammation/infection.

Fig 3.

Bayesian Networks developed from the Cerner EHR Dataset that illustrates the differential risks for ADRD in African Americans (a) and Caucasians (b). We highlighted the major identified variables that contribute to the risk for developing the late effects of CVD and how CVD contributes to the risk for ADRD in older African Americans (a). Cerebral artery occlusion and cerebral infarction lead to late effects of CVD, and the late effect of CVD leads to transient cerebral ischemia or ADRD directly (a). The full Bayesian network is in S2 Table.

Table 3. Comorbidities (PheWas disease code) with a direct edge to ADRD in Bayesian Network.

We selected edges that appeared more than nine times out of ten repetitions. More edges in Bayesian Network are at S2 Table.

Table 4. Treatment effect of comorbidity on ADRD onset (95% confidence interval).

ATE = Average treatment effect, ATT = Average treatment effect among treated (ATE among the subjects with the comorbidity), ATC = Average treatment effect among controls (ATE among the subjects without the comorbidity). IPTW = Inverse probability of treatment weighting.

Cerebrovascular disease (CVD).

In contrast to Caucasians, African Americans had an edge from the late effects of CVD (PheWas Code = 433.8; the specific diseases it covers are in S3 Table) to ADRD. Its ATE (by IPTW [64]) was 0.2715 (Table 4), which implies that having this comorbidity increases the risk for ADRD by 27.15 percentage points on average. The ATT of 0.1908 implies that African Americans with the late effects of CVD would have decreased the risk for ADRD by 19.08 percentage points if their CVD had been treated. We confirmed the ATE estimate via 100 repeated permutation tests [62]. The estimated ATE of the permuted variable was -0.0062 (p-value = 0.34), i.e., close to zero and far from the observed ATE of 0.2715, which supports rejection of Fisher’s Sharp Null; therefore, the treatment effect is present. We examined the Bayesian network of the late effects of CVD and highlighted a few paths to the late effects of CVD and ADRD (Fig 3A). In addition, the African American Bayesian network had an edge from transient cerebral ischemia (PheWas Code = 433.31; the specific diseases it covers are in S3 Table) to ADRD, which appeared in seven of the ten causal graphs (S2 Table). Transient cerebral ischemia had an ATE of 0.1421 and an ATT of 0.0242, implying a treatment effect similar to that of the late effects of CVD.

Mental disorders.

Both racial groups had multiple edges from mental disorders to ADRD: presenile dementia (290.1), other persistent mental disorders (290.3), memory loss (292.3), altered mental status (292.4), and Parkinson’s disease (332). These disorders might represent prodromal status (e.g., mild cognitive impairment) before ADRD onset.

The Caucasian counterparts had a direct edge from depression (296.2) to ADRD, whereas African Americans did not. The ATE (IPTW) of depression in the Caucasian counterparts was 0.1557 (Table 4), which implies that having depression increases the ADRD risk by 15.57 percentage points on average. Similarly, the ATT of 0.1154 implies that Caucasian counterparts with depression would have had an 11.54 percentage point lower risk for ADRD if their depression had been treated. When confirming the ATE via the permutation test, we found an estimate of -0.0005 (p-value = 0.48) for the permuted variable. We examined the Bayesian network of depression and highlighted a few paths to depression and ADRD (Fig 3B).
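The permutation check used for both comorbidities can be sketched as follows. This is a simplified illustration rather than the procedure used in the study: it permutes the exposure labels and re-estimates an unweighted risk difference instead of an IPTW estimate, and the names `risk_difference` and `permutation_check` and the simulated data are hypothetical. It reports the mean estimate under permutation, which should be near zero, together with a permutation p-value for the observed estimate.

```python
import numpy as np


def risk_difference(t, y):
    """Unweighted difference in outcome rates between exposed and unexposed."""
    return y[t == 1].mean() - y[t == 0].mean()


def permutation_check(t, y, n_perm=100, seed=0):
    """Compare the observed estimate with estimates under permuted exposure."""
    rng = np.random.default_rng(seed)
    t = np.asarray(t)
    y = np.asarray(y, dtype=float)
    observed = risk_difference(t, y)
    # Shuffling the labels breaks any link between exposure and outcome.
    placebo = np.array(
        [risk_difference(rng.permutation(t), y) for _ in range(n_perm)]
    )
    # Share of permuted estimates at least as extreme as the observed one.
    p_value = (1 + np.sum(np.abs(placebo) >= abs(observed))) / (n_perm + 1)
    return observed, placebo.mean(), p_value


# Synthetic data with a true risk difference of 0.20 (not study data).
rng = np.random.default_rng(1)
t = rng.binomial(1, 0.5, size=5_000)
y = rng.binomial(1, 0.1 + 0.2 * t)
observed, placebo_mean, p_value = permutation_check(t, y)
print(round(observed, 3), round(placebo_mean, 3), round(p_value, 3))
```

A placebo estimate close to zero alongside a clearly non-zero observed estimate is the pattern that supports a real effect.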


African Americans had edges from urinary tract infection (591) and acute upper respiratory infections (465) to ADRD. For urinary tract infection, ATE = 0.1149 with 95% confidence interval (CI) = (0.0918, 0.1371) means that urinary tract infection increases the ADRD risk by 11.49 percentage points on average; ATT = 0.0573 with CI = (0.0343, 0.0903) means that treating urinary tract infection decreases the ADRD risk by 5.73 percentage points. For acute upper respiratory infections, ATE = -0.2554 with CI = (-0.2984, -0.2373) means that acute upper respiratory infections decrease the ADRD risk by 25.54 percentage points on average; ATT = -0.2662 with CI = (-0.2827, -0.2327) means that treating acute upper respiratory infections increases the ADRD risk by 26.62 percentage points. Further investigation is needed into these mixed effects of inflammation, for example by considering the confounding effect of inflammation itself and/or of medications related to inflammation. For example, zileuton, a leukotriene biosynthesis inhibitor that is widely used for chronic inflammation (asthma), has been shown to significantly reduce neuroinflammation and tau phosphorylation in tau transgenic mice [65].
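The article reports 95% confidence intervals for the ATE and ATT but this section does not state how they were computed. As one common option, a percentile bootstrap can be sketched as below. Everything here is an assumption for illustration: the helper names `bootstrap_ci` and `risk_difference`, the unweighted estimator, and the simulated data are hypothetical and do not reproduce the study's intervals.

```python
import numpy as np


def risk_difference(t, y):
    """Unweighted difference in outcome rates between exposed and unexposed."""
    return y[t == 1].mean() - y[t == 0].mean()


def bootstrap_ci(t, y, estimator, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `estimator(t, y)`."""
    rng = np.random.default_rng(seed)
    t = np.asarray(t)
    y = np.asarray(y, dtype=float)
    n = len(t)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample subjects with replacement
        stats[b] = estimator(t[idx], y[idx])
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lower, upper


# Synthetic data with a true risk difference of 0.10 (not study data).
rng = np.random.default_rng(2)
t = rng.binomial(1, 0.5, size=4_000)
y = rng.binomial(1, 0.3 + 0.1 * t)
point = risk_difference(t, y.astype(float))
lower, upper = bootstrap_ci(t, y, risk_difference)
print(round(point, 3), (round(lower, 3), round(upper, 3)))
```

An interval that excludes zero, as in the estimates reported above, indicates that the sign of the effect is stable under resampling.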

Discussion and conclusions

The objective of this study was to investigate differential effects of comorbidities that increase the risk of ADRD in African Americans compared to non-Hispanic Caucasians using counterfactual analysis of a nationwide EHR. We matched African Americans and Caucasians based on age, sex, and high-risk common comorbidities. Using the learned Bayesian network, average treatment effect estimation, and permutation test confirmation, we identified that the late effects of CVD predispose older African Americans to ADRD, but we did not find the same strength of this counterfactual effect in Caucasians. Our counterfactual approach to identify ADRD risk factors has methodological innovation in that we first address causation (not association) in comorbidities by integrating an ensemble causal structure model and counterfactual treatment effect estimation to better understand the root cause of ADRD risk. Our findings have clinical implications in that the identified ADRD risk factors (late effects of CVD) and the factors leading towards the later stages of CVD could potentially be modifiable.

In our study, we found several significant differences between the two racial groups in the comorbidities contributing to the development of ADRD. African Americans had statistically significant counterfactual effects from CVD on ADRD risk, which implies that the late effects of CVD are more likely to predispose patients to ADRD in African Americans than in Caucasians. CVD in African Americans may lead to vascular dementia, Alzheimer’s, or both to a greater extent than in Caucasians. For the same disease burden, CVD may directly lead to the advancement of the pathology intrinsic to ADRD, or a subsequent transient ischemic attack (TIA) may accelerate the disease process towards ADRD. It is known that African Americans have higher incidences of CVD, small vessel disease, and vascular risk factors than Caucasians, but even when controlling for vascular risk factors, African Americans have a higher risk for vascular end organ complications compared with Caucasians. TIAs by themselves are manifestations of uncontrolled cerebrovascular disease, such as atherosclerosis of the extracranial and intracranial arterial circulation, and may be harbingers of advancing disease towards vascular dementia, Alzheimer’s, or both. We considered other possibilities. African Americans could be under-treated with antiplatelets and statins, leading to more TIAs and advancement towards ADRD. Although we do not yet understand the mechanisms by which TIAs would lead to ADRD, these findings suggest that TIAs may be a controllable risk factor whose management can reduce the risk of ADRD onset in African Americans. Note that the late effects of CVD could still be associated with ADRD in Caucasians; our findings indicate only that the late effects of CVD do not have a counterfactual treatment effect in Caucasians.

In our analysis of matched Caucasian counterparts, we found a very different set of comorbidity pathways leading to ADRD. Depression was one of the major factors that contributed to ADRD [66–68]. We identified several pathways leading to depression in the EHR dataset. For Caucasians, depression and ADRD are commonly observed together and share many symptoms in older adults. Depression has been associated with poor cognitive function [69] and increases the risk of ADRD [67,70,71]. The actual causal relationships between depression and ADRD are still controversial [67,72]. For African Americans, it has long been known that there are racial disparities in depression and quality of care [73]. In general, African Americans have poorer mental health than non-Hispanic Caucasians [74]. Nonetheless, depressed African Americans are less likely to utilize mental health care than non-Hispanic Caucasians [73,75,76], due to socio-cultural factors including racism, misdiagnosis, and clinician bias [77]. We believe that the observation of depression only in the Caucasian counterparts in our Bayesian network analysis is consistent with racial disparities in seeking mental health care. Therefore, we acknowledge that depression could be a more significant risk factor for ADRD in African Americans than we were able to detect.

There are several limitations of this study. EHR data are observations that cannot fully represent the entire population; for example, the population in the Cerner EHR is mostly middle class and privately insured. Another limitation is that ADRD is a progressive disease that develops slowly over several years to decades, and EHRs might not be able to fully capture such a long life-course progression. The median observation period was 1.0 years (min 0.5 years, max 11 years), implying that the non-ADRD subjects might be censored too early to observe their ADRD risk. Late ADRD onset age (due to social determinants) is another limitation of EHRs. In addition, there have been long debates on whether biological characteristics such as race or comorbidities should be evaluated as causes even if intervention on such variables is difficult [24]. From the methodological perspective, the cohort matching method can potentially cause bias because the unmatched patients were excluded from the analysis. Quantitative evaluation of the Bayesian network was not possible due to the lack of ground truth. We obtained the causal structure in a data-driven way without incorporating literature-based prior knowledge; a potential way to build a better causal structure is to incorporate prior knowledge mined from the literature, which might allow us to seamlessly harmonize existing knowledge and new data-driven causality. Notwithstanding these limitations, this study offers some important insights into racially differential comorbidity risk factors in ADRD. In addition, despite the noisy and incomplete nature of real-world data, counterfactual analysis of comorbidity risk factors can be a valuable tool to support risk factor exposure studies.

Supporting information

S1 Text.

A. Definition of ADRD in EHRs, B. Cohort matching methods, C. Disentangle the dependency among comorbidities and ADRD.


S1 Table.

Selected known comorbidities that we controlled to build a balanced cohort.


S2 Table.

Causal structure of African Americans and Caucasian counterparts.


S3 Table.

Definition of late effect of cerebrovascular disease and transient cerebral ischemia.



References

  1. Lines L, RTI International. Racial and Ethnic Disparities Among Individuals with Alzheimer’s Disease in the United States: A Literature Review. 2014.
  2. Glymour MM, Manly JJ. Lifecourse social conditions and racial and ethnic patterns of cognitive aging. Neuropsychol Rev. 2008;18: 223–254. pmid:18815889
  3. Alzheimer’s Statistics. In: [Internet]. [cited 11 Jun 2019]. Available:
  4. Qiu C, Kivipelto M, von Strauss E. Epidemiology of Alzheimer’s disease: occurrence, determinants, and strategies toward intervention. Dialogues Clin Neurosci. 2009;11: 111–128. pmid:19585947
  5. Iqbal K, Grundke-Iqbal I. Alzheimer’s disease, a multifactorial disorder seeking multitherapies. Alzheimers Dement. 2010;6: 420–424. pmid:20813343
  6. (cdc) USD of H&. HSCFDC, US Department of Health & Human Services; Centers for Disease Control (CDC). Racial and Ethnic Disparities in Health Status. PsycEXTRA Dataset. 2002. pmid:11808619
  7. Mayeda ER, Glymour MM, Quesenberry CP, Whitmer RA. Inequalities in dementia incidence between six racial and ethnic groups over 14 years. Alzheimers Dement. 2016;12: 216–224. pmid:26874595
  8. Bulatao RA, National Research Council of the National Academies; Committee on Population; Division of Behavioral and Social Sciences and Education, Anderson NB, National Research Council of the National Academies; Panel on Race, Ethnicity, and Health in Later Life. Understanding Racial and Ethnic Differences in Health in Late Life: A Research Agenda. PsycEXTRA Dataset. 2004. pmid:20669466
  9. Demirovic J, Prineas R, Loewenstein D, Bean J, Duara R, Sevush S, et al. Prevalence of Dementia in Three Ethnic Groups. Annals of Epidemiology. 2003. pp. 472–478. pmid:12875807
  10. Steenland K, Goldstein FC, Levey A, Wharton W. A Meta-Analysis of Alzheimer’s Disease Incidence and Prevalence Comparing African-Americans and Caucasians. Journal of Alzheimer’s Disease. 2015. pp. 71–76. pmid:26639973
  11. Ma Y, Ajnakina O, Steptoe A, Cadar D. Higher risk of dementia in English older individuals who are overweight or obese. Int J Epidemiol. 2020;49: 1353–1365. pmid:32575116
  12. Pegueroles J, Jiménez A, Vilaplana E, Montal V, Carmona-Iragui M, Pané A, et al. Obesity and Alzheimer’s disease, does the obesity paradox really exist? A magnetic resonance imaging study. Oncotarget. 2018;9: 34691. pmid:30410669
  13. Luchsinger JA, Gustafson DR. Adiposity and Alzheimer’s disease. Current Opinion in Clinical Nutrition and Metabolic Care. 2009. pp. 15–21. pmid:19057182
  14. Beydoun MA, Beydoun HA, Wang Y. Obesity and central obesity as risk factors for incident dementia and its subtypes: a systematic review and meta-analysis. Obes Rev. 2008;9: 204–218. pmid:18331422
  15. VanderWeele TJ, Shpitser I. On the definition of a confounder. Ann Stat. 2013;41: 196.
  16. Ohlsson H, Kendler KS. Applying Causal Inference Methods in Psychiatric Epidemiology: A Review. JAMA Psychiatry. 2020;77: 637–644. pmid:31825494
  17. Hernán MA, Robins JM. Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available. Am J Epidemiol. 2016;183: 758. pmid:26994063
  18. Prosperi M, Guo Y, Sperrin M, Koopman JS, Min JS, He X, et al. Causal inference and counterfactual prediction in machine learning for actionable healthcare. Nature Machine Intelligence. 2020;2: 369–375.
  19. Pearl J. Causality: Models, Reasoning, and Inference. Cambridge University Press; 2000.
  20. Pearl J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Elsevier; 2014.
  21. McGough JJ, Faraone SV. Estimating the Size of Treatment Effects: Moving Beyond P Values. Psychiatry. 6: 21.
  22. Parascandola M, Weed DL. Causation in epidemiology. J Epidemiol Community Health. 2001;55. pmid:11707485
  23. Zenil H, Kiani NA, Zea AA, Tegnér J. Causal deconvolution by algorithmic generative models. Nature Machine Intelligence. 2019;1: 58–66.
  24. Glymour MM, Spiegelman D. Evaluating Public Health Interventions: 5. Causal Inference in Public Health Research-Do Sex, Race, and Biological Factors Cause Health Outcomes? Am J Public Health. 2017;107: 81–85. pmid:27854526
  25. Ling Y, Upadhyaya P, Chen L, Jiang X, Kim Y. Heterogeneous Treatment Effect Estimation using machine learning for Healthcare application: tutorial and benchmark. 2021. Available:
  26. Chen Z, Zhang H, Guo Y, George TJ, Prosperi M, Hogan WR, et al. Exploring the feasibility of using real-world data from a large clinical data research network to simulate clinical trials of Alzheimer’s disease. NPJ digital medicine. 2021;4. pmid:33990663
  27. Leite WL, Stapleton LM, Bettini EF. Propensity Score Analysis of Complex Survey Data with Structural Equation Modeling: A Tutorial with Mplus. Structural Equation Modeling: A Multidisciplinary Journal. 2019. pp. 448–469.
  28. Kim Y, Jeong J-E, Cho H, Jung D-J, Kwak M, Rho MJ, et al. Personality Factors Predicting Smartphone Addiction Predisposition: Behavioral Inhibition and Activation Systems, Impulsivity, and Self-Control. PLoS One. 2016;11: e0159788. pmid:27533112
  29. Park YH, Kim Y, Yu H, Choi IY, Byun S-S, Kwak C, et al. Is lymphovascular invasion a powerful predictor for biochemical recurrence in pT3 N0 prostate cancer? Results from the K-CaP database. Sci Rep. 2016;6: 25419. pmid:27146602
  30. Faruqui SHA, Alaeddini A, Jaramillo CA, Potter JS, Pugh MJ. Mining patterns of comorbidity evolution in patients with multiple chronic conditions using unsupervised multi-level temporal Bayesian network. PLoS One. 2018;13: e0199768. pmid:30001371
  31. McNally RJ, Mair P, Mugno BL, Riemann BC. Co-morbid obsessive-compulsive disorder and depression: a Bayesian network approach. Psychol Med. 2017;47. pmid:28052778
  32. Upadhyaya P, Zhang K, Li C, Jiang X, Kim Y. Scalable Causal Structure Learning: New Opportunities in Biomedicine. 2021. Available:
  33. Wittmann L, Moergeli H, Martin-Soelch C, Znoj H, Schnyder U. Comorbidity in posttraumatic stress disorder: a structural equation modelling approach. Compr Psychiatry. 2008;49. pmid:18702929
  34. Glass TA, Goodman SN, Hernán MA, Samet JM. Causal inference in public health. Annu Rev Public Health. 2013;34: 61–75. pmid:23297653
  35. Olson NL, Albensi BC. Race- and Sex-Based Disparities in Alzheimer’s Disease Clinical Trial Enrollment in the United States and Canada: An Indigenous Perspective. J Alzheimers Dis Rep. 2020;4: 325–344. pmid:33024940
  36. Salazar CR, Hoang D, Gillen DL, Grill JD. Racial and ethnic differences in older adults’ willingness to be contacted about Alzheimer’s disease research participation. Alzheimer’s & Dementia: Translational Research & Clinical Interventions. 2020. pmid:32399482
  37. 2018 Alzheimer’s disease facts and figures. Alzheimer’s & Dementia. 2018. pp. 367–429.
  38. Yaffe K, Falvey C, Harris TB, Newman A, Satterfield S, Koster A, et al. Effect of socioeconomic disparities on incidence of dementia among biracial older adults: prospective study. BMJ. 2013;347: f7051. pmid:24355614
  39. Cerner - Cerner Health Facts ® - Data Sets - SBMI Data Service - The University of Texas Health Science Center at Houston (UTHealth) School of Biomedical Informatics. [cited 24 Feb 2019]. Available:
  40. Denny JC, Bastarache L, Ritchie MD, Carroll RJ, Zink R, Mosley JD, et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat Biotechnol. 2013;31: 1102–1110. pmid:24270849
  41. Terry RD, Katzman R, Bick KL. ALZHEIMER DISEASE. Alzheimer Disease & Associated Disorders. 1995. p. 121. pmid:7546597
  42. Chui HC, Gatz M. Cultural diversity in Alzheimer disease: the interface between biology, belief, and behavior. Alzheimer Dis Assoc Disord. 2005;19: 250–255. pmid:16327354
  43. Caliendo M, Kopeinig S. Some practical guidance for the implementation of propensity score matching. Journal of Economic Surveys. 2008. pp. 31–72.
  44. Spirtes P, Glymour C, Scheines R. Causation, Prediction, and Search. Lecture Notes in Statistics. 1993.
  45. Pearl J. An Introduction to Causal Inference. CreateSpace; 2015.
  46. Scheines R, Spirtes P, Glymour C, Meek C, Richardson T. The TETRAD Project: Constraint Based Aids to Causal Model Specification. Multivariate Behav Res. 1998;33: 65–117. pmid:26771754
  47. Kalisch M, Mächler M, Colombo D, Maathuis MH, Bühlmann P. Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software. 2012.
  48. Scutari M, Denis J-B. Bayesian Networks: With Examples in R. CRC Press; 2021.
  49. Zarebavani B, Jafarinejad F, Hashemi M, Salehkaleybar S. cuPC: CUDA-Based Parallel PC Algorithm for Causal Structure Learning on GPU. IEEE Transactions on Parallel and Distributed Systems. 2020. pp. 530–542.
  50. Zhang K, Tian C, Zhang K, Johnson T, Jiang X. A Fast PC Algorithm with Reversed-order Pruning and A Parallelization Strategy. arXiv [cs.LG]. 2021. Available:
  51. Tillman RE, Danks D, Glymour C. Integrating Locally Learned Causal Structures with Overlapping Variables. NIPS. Citeseer; 2008. pp. 1665–1672.
  52. Claassen T, Heskes T. Causal discovery in multiple models from different experiments. 2010. Available:
  53. Triantafillou S, Tsamardinos I. Constraint-based Causal Discovery from Multiple Interventions over Overlapping Variable Sets. J Mach Learn Res. 2015;16: 2147–2205.
  54. Sinha M, Tadepalli P, Ramsey SA. Voting-based integration algorithm improves causal network learning from interventional and observational data: An application to cell signaling network inference. PLoS One. 2021;16: e0245776. pmid:33556096
  55. Sanfeliu A, Fu K-S. A distance measure between attributed relational graphs for pattern recognition. IEEE Trans Syst Man Cybern. 1983;SMC-13: 353–362.
  56. Rubin DB. Causal Inference Using Potential Outcomes. Journal of the American Statistical Association. 2005. pp. 322–331.
  57. Austin PC. An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behav Res. 2011;46: 399–424. pmid:21818162
  58. Swaminathan A, Joachims T. The Self-Normalized Estimator for Counterfactual Learning. Adv Neural Inf Process Syst. 2015;28. Available:
  59. Sharma A, Kiciman E. DoWhy: A Python package for causal inference. 2019. Available:
  60. Rosenbaum PR. Conditional Permutation Tests and the Propensity Score in Observational Studies. Journal of the American Statistical Association. 1984. pp. 565–574.
  61. Branson Z, Miratrix LW. Randomization Tests that Condition on Non-Categorical Covariate Balance. Journal of Causal Inference. 2019;7. pmid:32405450
  62. Rubin DB. Randomization Analysis of Experimental Data: The Fisher Randomization Test Comment. Journal of the American Statistical Association. 1980. p. 591.
  63. Harder VS, Stuart EA, Anthony JC. Propensity score techniques and the assessment of measured covariate balance to test causal associations in psychological research. Psychol Methods. 2010;15: 234. pmid:20822250
  64. Rosenbaum PR. Model-Based Direct Adjustment. Journal of the American Statistical Association. 1987. pp. 387–394.
  65. Giannopoulos PF, Chiu J, Praticò D. Learning Impairments, Memory Deficits, and Neuropathology in Aged Tau Transgenic Mice Are Dependent on Leukotrienes Biosynthesis: Role of the cdk5 Kinase Pathway. Mol Neurobiol. 2019;56: 1211–1220. pmid:29881943
  66. Jorm AF. History of depression as a risk factor for dementia: an updated review. Aust N Z J Psychiatry. 2001;35: 776–781. pmid:11990888
  67. Ownby RL, Crocco E, Acevedo A, John V, Loewenstein D. Depression and risk for Alzheimer disease: systematic review, meta-analysis, and metaregression analysis. Arch Gen Psychiatry. 2006;63: 530–538. pmid:16651510
  68. Diniz BS, Butters MA, Albert SM, Dew MA, Reynolds CF. Late-life depression and risk of vascular dementia and Alzheimer’s disease: systematic review and meta-analysis of community-based cohort studies. British Journal of Psychiatry. 2013. pp. 329–335. pmid:23637108
  69. Yaffe K, Blackwell T, Gore R, Sands L, Reus V, Browner WS. Depressive symptoms and cognitive decline in nondemented elderly women: a prospective study. Arch Gen Psychiatry. 1999;56: 425–430. pmid:10232297
  70. Kokmen E, Beard CM, Chandra V, Offord KP, Schoenberg BS, Ballard DJ. Clinical risk factors for Alzheimer’s disease: A population-based case-control study. Neurology. 1991. pp. 1393–1393. pmid:1891088
  71. Green RC, Cupples LA, Kurz A, Auerbach S, Go R, Sadovnick D, et al. Depression as a risk factor for Alzheimer disease: the MIRAGE Study. Arch Neurol. 2003;60: 753–759. pmid:12756140
  72. Varghese M, Muliyala K. The complex relationship between depression and dementia. Annals of Indian Academy of Neurology. 2010. p. 69. pmid:20436753
  73. McGuire TG, Miranda J. New evidence regarding racial and ethnic disparities in mental health: policy implications. Health Aff. 2008;27: 393–403. pmid:18332495
  74. Williams DR. The Health of U.S. Racial and Ethnic Populations. The Journals of Gerontology: Series B. 2005. pp. S53–S62. pmid:16251591
  75. Bailey RK, Mokonogho J, Kumar A. Racial and ethnic differences in depression: current perspectives. Neuropsychiatr Dis Treat. 2019;15: 603–609. pmid:30863081
  76. Alegría M, Chatterji P, Wells K, Cao Z, Chen C-N, Takeuchi D, et al. Disparity in depression treatment among racial and ethnic minority populations in the United States. Psychiatr Serv. 2008;59: 1264–1272. pmid:18971402
  77. Hankerson SH, Suite D, Bailey RK. Treatment disparities among African American men with depression: implications for clinical practice. J Health Care Poor Underserved. 2015;26: 21–34. pmid:25702724