
Assessment of medical management in Coronary Type 2 Diabetic patients with previous percutaneous coronary intervention in Spain: A retrospective analysis of electronic health records using Natural Language Processing

  • Carlos González-Juanatey ,

    Roles Conceptualization, Funding acquisition, Investigation, Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Lucus Augusti, Lugo, Spain

  • Manuel Anguita-Sánchez,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Reina Sofía, Córdoba, Spain

  • Vivencio Barrios,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Ramón y Cajal, Madrid, Spain

  • Iván Núñez-Gil,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Clínico Universitario San Carlos, Madrid, Spain

  • Juan José Gómez-Doblas,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Virgen de la Victoria, Málaga, Spain

  • Xavier García-Moll,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Santa Creu i Sant Pau, Barcelona, Spain

  • Carlos Lafuente-Gormaz,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario de Albacete, Albacete, Spain

  • María Jesús Rollán-Gómez,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Río Hortega, Valladolid, Spain

  • Vicente Peral-Disdie,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Son Espases, Palma de Mallorca, Spain

  • Luis Martínez-Dolz,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario La Fe, Valencia, Spain

  • Miguel Rodríguez-Santamarta,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario de León, León, Spain

  • Xavier Viñolas-Prat,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Santa Creu i Sant Pau, Barcelona, Spain

  • Toni Soriano-Colomé,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Vall d’Hebron, CIBERCV, Barcelona, Spain

  • Roberto Muñoz-Aguilera,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Infanta Leonor, Madrid, Spain

  • Ignacio Plaza,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Infanta Sofía, Madrid, Spain

  • Alejandro Curcio-Ruigómez,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario de Fuenlabrada, Madrid, Spain

  • Ernesto Orts-Soler,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital General Universitario de Castellón, Castellón, Spain

  • Javier Segovia,

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario Puerta de Hierro, Madrid, Spain

  • Claudia Maté,

    Roles Data curation, Investigation, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Savana, Madrid, Spain

  • SAVANA Research Group ,

    SAVANA Research Group are: Alberto Porras, Miren Taberna, Stephanie Marchesseau, Carlos Del Rio-Bermudez, Ignacio Salcedo, Enrique Josué Álvarez, Víctor Fanjul, Florinda Meléndez, and Natalia Polo.

    Affiliation Savana, Madrid, Spain

  • Ángel Cequier

    Roles Supervision, Validation, Writing – review & editing

    Affiliation Hospital Universitario de Bellvitge and Universidad de Barcelona, IDIBELL, Barcelona, Spain



Introduction and objectives

Patients with type 2 diabetes (T2D) and stable coronary artery disease (CAD) previously revascularized with percutaneous coronary intervention (PCI) are at high risk of recurrent ischemic events. We aimed to provide real-world insights into the clinical characteristics and management of this clinical population, excluding patients with a history of myocardial infarction (MI) or stroke, using Natural Language Processing (NLP) technology.


Methods

This is a multicenter, retrospective study based on the secondary use of 2014–2018 real-world data captured in the Electronic Health Records (EHRs) of 1,579 patients (0.72% of the T2D population analyzed; n = 217,632 patients) from 12 representative hospitals in Spain. To access the unstructured clinical information in EHRs, we used the EHRead® technology, based on NLP and machine learning. The major adverse cardiovascular events (MACE) considered were MI, ischemic stroke, urgent coronary revascularization, and hospitalization due to unstable angina. The association between MACE rates and the variables included in this study was evaluated following univariate and multivariate approaches.


Results

Most patients were male (72.13%), with a mean age of 70.5±10 years. Regarding T2D, most patients had non-insulin-dependent T2D (61.75%) and a high prevalence of comorbidities. The median (Q1–Q3) duration of follow-up was 1.2 (0.3–4.5) years. Overall, 35.66% of patients suffered at least one MACE during follow up. In a Cox Proportional Hazards regression model, several independent factors were associated with MACE during follow up: CAD duration (p < 0.001), COPD/Asthma (p = 0.021), heart valve disease (p = 0.031), multivessel disease (p = 0.005), insulin treatment (p < 0.001), statin treatment (p < 0.001), and clopidogrel treatment (p = 0.039).


Conclusions

Our results showed high rates of MACE in a large real-world series of PCI-revascularized patients with T2D and CAD with no history of MI or stroke. These data represent a potential opportunity to improve the clinical management of these patients.


Introduction

Type 2 diabetes (T2D) has reached epidemic proportions globally due to a steady increase in life expectancy, a high prevalence of obesity and sedentary lifestyle, and pervasive unhealthy eating habits [1]. In 2019, the global prevalence of diabetes was estimated to be around 9%. By 2045, the disease is expected to affect 700 million people [2].

In T2D patients, progressive atherosclerotic disease leads to a twofold increased risk of cardiovascular diseases (CVD), including myocardial infarction (MI), stroke, peripheral vascular disease, and coronary artery disease (CAD) [3–5]. In addition, admission hyperglycemia is a strong predictor of short- and long-term adverse outcomes in patients with acute MI [6]. Notably, CAD is the main cause of mortality in patients with T2D, and diabetes leads to a 2- to 4-fold increased risk of death due to heart disease [7, 8]. Indeed, approximately one third of patients undergoing percutaneous coronary intervention (PCI) are diabetic [8]. PCI procedures, particularly with drug-eluting stents, have proven successful in managing stable angina and improving quality of life in patients with diabetes and CAD [9].

Patients with T2D and stable CAD who have been revascularized with PCI are at high risk of ischemic events. In these patients, international guidelines recommend the use of antiplatelet therapy to improve cardiovascular outcomes following intervention [10–12]. However, evidence supporting the long-term use of dual antiplatelet regimens in patients with T2D and CAD but without a history of MI or stroke has been inconclusive. In this context, the THEMIS-PCI trial was part of a recent large phase 3 randomized, double-blinded, placebo-controlled trial designed to evaluate the efficacy and safety of ticagrelor 60 mg bid added to background acetylsalicylic acid (ASA) therapy for the prevention of major adverse cardiovascular events (MACE) in this population [13]. This trial showed that the incidence of ischemic cardiovascular events over a 3.3-year follow-up was significantly lower in the ticagrelor group (7.7%) than in the placebo group (8.6%). On the other hand, in a recent population-based cohort study comparing the risk of MACE with ticagrelor vs clopidogrel in patients with acute coronary syndrome (ACS) and previous PCI, ticagrelor was not associated with a reduction in MACE in the year following revascularization as compared with clopidogrel [14].

In summary, the available evidence indicates that a) T2D is an important risk factor for CAD and that diabetic status may worsen clinical outcomes after PCI and other revascularization procedures [15–18], b) revascularized patients are at high risk of ischemic events, and c) further research on how treatment affects MACE in these patients is warranted. Thus, a thorough and updated clinical characterization of these patients in real-world settings is critical to design early intervention strategies, improve prognosis, and ultimately reduce cardiovascular events.

The present study aimed to provide real-world insights into the clinical characteristics and management of patients with CAD, T2D, and a previous history of PCI (but no prior MI or stroke) in Spain by analyzing readily available information in the Electronic Health Records (EHRs) of the Spanish National Healthcare System.

Materials and methods

The present study was classified as a ‘non-post-authorization study’ by the Spanish Agency of Medicines and Health Products (AEMPS) and was approved by the Institutional Review Board of each participating hospital. This study was conducted in compliance with legal and regulatory requirements and followed generally accepted research practices described in the Helsinki Declaration in its latest edition, Good Pharmacoepidemiology Practices, and applicable local regulations. Patient consent was not required in this study since data were retrospectively captured from patients’ EHRs in an anonymized and aggregated form and were irreversibly dissociated.

Data source and study design

This was a real-world, multicenter, and retrospective study based on the secondary use of the unstructured data captured in the EHRs of 12 representative hospitals from 6 major regions (namely Madrid, Catalonia, Valencia, Balearic Islands, Castilla-La Mancha, and Castilla León) within the Spanish National Healthcare System (Fig 1). Data were collected between January 1, 2014 and December 31, 2018 from all available services and departments in each participating site (including inpatient hospital, outpatient hospital, and emergency room).

Fig 1. Study design and timeline.

The Index Date (i.e., Baseline) was defined as the timepoint when diagnostic criteria for both T2D and CAD were first identified in patients who underwent PCI. All available EHRs prior to January 2014 were considered to extract information regarding the clinical history of patients (dotted blue line). The follow-up period ranged from the Index Date to the end of the study period or the last data point available. Unstructured data from patients’ EHRs was extracted and organized with the EHRead® technology. See the Methods section for further details. *Estimated prevalence data for T2D calculated over the total patient population at midpoint of the study period minus both patients who died in the hospital during the study period (N = 41,747) and patients without follow-up data in the 12 months prior to midpoint (N = 1,460,161).

A cross-sectional analysis of the study variables (including demographic and clinical characteristics, comorbidities, medical therapy, and treatment) was performed at the Index Date (Fig 1), defined as the timepoint when mentions of both T2D and CAD were first found in EHRs. The rates and incidence of MACE (MI, ischemic stroke, urgent coronary revascularization, and hospitalization due to unstable angina) were analyzed during follow up, which comprised the time between baseline and the end of the study period (Fig 1).

Study population and eligibility criteria

Inclusion criteria.

To be eligible for inclusion in the study, patients had to fulfil all of the following inclusion criteria:

  • ≥18 years old
  • Diagnosis of both T2D and CAD
  • Revascularization with PCI
  • Documented ongoing use of glucose-lowering drugs (oral hypoglycemic agents) for at least 6 months
  • Available follow-up information spanning at least 6 months

Exclusion criteria.

Patients were excluded from the study if they met any of the following criteria:

  • Prior MI
  • Prior stroke (except Transient Ischemic Attack)
  • Prior intracranial bleeding
  • Gastrointestinal bleeding within the last 6 months
  • Renal failure requiring dialysis
  • History of liver cirrhosis or liver cancer

Extracting the unstructured free text from EHRs

To access the unstructured clinical information in EHRs, we used Savana’s EHRead® technology [19–24]. Based on NLP and machine learning, this technology facilitates the extraction of information from patients’ EHRs and the subsequent standardization of extracted clinical concepts to a common terminology. The terminology used by EHRead is based on SNOMED Clinical Terms (SNOMED CT) and includes more than 300,000 medical concepts, acronyms, and laboratory parameters. These concepts are then organized by EHR section (medical history, laboratory results, prescriptions, procedures, diagnoses, etc.), hospital service, and other specifications.

EHRead’s ability to correctly identify patient records containing key variables associated with the study disease was assessed according to previously published procedures [21], summarized in S1 File. Briefly, the evaluation of EHRead’s performance consists of a comparison between EHRead’s reading output and a corpus of EHRs annotated by expert physicians (‘gold standard’). The result of this comparison is expressed in terms of the standard metrics of precision (P), recall (R), and their harmonic mean, the F1-score. As shown in S1 Table in S1 File, our evaluation yielded an F1-score of ≥90% for most analyzed variables, showing near-optimal performance in identifying records that contain T2D, CAD, and related variables.
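The three metrics named above can be illustrated with a minimal sketch. It assumes that the gold standard and the system output are each reduced to a set of record IDs flagged as containing a given variable; the function name and the IDs are hypothetical and are not taken from the EHRead evaluation itself.

```python
def precision_recall_f1(gold, predicted):
    """Compare two sets of record IDs: gold-standard vs. system output."""
    true_pos = len(gold & predicted)  # records flagged by both
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical record IDs, for illustration only
gold = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
predicted = {1, 2, 3, 4, 5, 6, 7, 8, 9, 11}
p, r, f1 = precision_recall_f1(gold, predicted)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

In this toy case nine of the ten flagged records are correct and nine of the ten gold-standard records are found, so all three metrics equal 0.90.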

Data analyses

Frequency and summary tables were used to display information for categorical and continuous variables, respectively. The association between MACE rates and the variables included in this study was evaluated following two different approaches. In the univariate approach, a model was fitted to the study population for each variable at baseline. The association between MACE and categorical variables was assessed with Fisher’s exact tests. Independent-samples, two-tailed t-tests with Welch’s adjustment for unequal variances were performed to assess statistical differences between patients with and without MACE for each continuous numeric variable. Mann-Whitney U tests were performed instead if the normality assumption (Shapiro-Wilk test) was not met. Differences were considered significant when p < 0.05 in two-tailed tests.

In the multivariate survival approach, a data-driven variable and model selection was performed. First, variables with a high proportion of missing values (all laboratory tests had >20% missing values) or with zero variance (CAD and stroke) were excluded. Then, multicollinearity was assessed and variables with a high variance inflation factor (VIF) were excluded (atrial fibrillation; VIF > 5). The remaining variables (VIF < 2) were used to fit a Cox proportional hazards (PH) survival model in the study population. Then, using the Akaike information criterion (AIC), variable selection and model evaluation were performed in a stepwise manner until reaching a model with the optimal set of explanatory variables. As in the univariate approach, differences were considered significant when p < 0.05 in two-tailed tests.
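As an illustration of the univariate step for categorical variables, the following is a minimal sketch of a two-sided Fisher’s exact test on a 2×2 table (for example, MACE yes/no against a comorbidity present/absent). The function name and the counts are hypothetical; the study does not state which software was used, and a statistical package would normally be preferred in practice.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table at least as unlikely as the
    # observed one (small tolerance guards against float ties)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-7))


# Hypothetical counts: rows = factor present/absent, columns = MACE yes/no
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

The two-sided p-value here follows the common convention of summing all tables whose probability does not exceed that of the observed table; other conventions exist and can give slightly different values.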


Results

EHRs from 2,185,060 patients were processed from 12 participating hospitals between January 1, 2014 and December 31, 2018. The estimated prevalence of T2D in the hospital population was 9.96% (n = 217,632). The target population (patients diagnosed with T2D, CAD, and documented PCI revascularization with no previous history of MI or stroke) comprised a total of 1,579 patients (0.72% of the T2D population; Fig 1). The demographic and clinical characteristics of patients at time of inclusion in the study are shown in Table 1. Most patients were male (72.13%; n = 1,139), with a mean age of 70.5±10 years.

Table 1. Demographics, substance use, vital signs, and comorbidities at baseline.

Cardiovascular diseases (other than CAD) and endocrine/metabolic disorders were the most common comorbidities in the target population (Table 1); 88.41% (n = 1,369) of patients suffered from hypertension, 68.59% (n = 1,083) angina, 41.74% (n = 659) valvular disease, 30.53% (n = 482) atrial fibrillation/atrial flutter, and 22.86% (n = 361) heart failure. A diagnosis of hyperlipidemia was found in 40.41% (n = 638) of the patients and tobacco use in 25% (n = 394). Regarding respiratory disorders, COPD/Asthma was present in 17.04% of patients (n = 269) and sleep apnea in 12.86% (n = 203). Chronic kidney disease was diagnosed in 17.04% (n = 269) of the patients, peripheral artery disease in 17.48% (n = 276), diabetic retinopathy in 7.47% (n = 118), and diabetic neuropathy in 2.6% (n = 41).

As shown in Table 2, most patients had non-insulin-dependent T2D (61.75%; n = 975). Regarding CAD type, more than half of patients had been diagnosed with multivessel disease (MVD; 58.52%; n = 924), a third of the population had single-vessel CVD (33.69%; n = 532), and 1.37% (n = 20) of patients had a diagnosis of left main artery coronary disease (LMACD). Table 2 also shows the time elapsed since PCI, CABG, and coronary angiography were last performed at time of analysis.

We also used the NLP system to extract unstructured information on laboratory results captured by physicians in their clinical notes. Despite the relatively high proportion of patients with missing unstructured laboratory information in their EHRs, we obtained data for several laboratory parameters for more than half of the study sample (S2 Table in S1 File). At inclusion in the study, the median (Q1-Q3) HbA1c was 7.1% (6.4–8), HDL was 40 mg/dl (33–47), and LDL 77 mg/dl (62–96.6).

The pharmacological treatments prescribed for the management of T2D and CAD are shown in Fig 2A and 2B, respectively. As for oral hypoglycemic agents, the most used single-drug treatments were metformin (79.1%; n = 1,249), sulfonylureas (23.12%; n = 365), and DPP4i (19.38%; n = 306). Insulin treatment was documented in 26.16% (n = 413) of patients (Fig 2A). As depicted in Fig 2B, the most common cardiovascular treatments were statins (90.5%; n = 1,429), ACE inhibitors or ARBs (86.26%; n = 1,362), and beta blockers (76.69%; n = 1,211). As for oral antiplatelet agents, the most prescribed were ASA (85.88%; n = 1,356) and dual antiplatelet therapy (50.28%; n = 794).

Fig 2. T2D- and CAD-related medication at baseline.

Percentage of patients prescribed different medications for T2D (A) and CAD (B). Numbers within bars represent number of patients. *Any fixed combination of two of the above oral hypoglycemic agents (i.e., two or more active substances combined in one single prescription). **Others include clopidogrel, prasugrel, ticagrelor, and other lipid-lowering drugs. ***DAPT refers to ASA plus another antiplatelet drug. #Vitamin K antagonists include warfarin (n = 2; 0.13%) and acenocoumarol (n = 185; 11.72%). Non-vitamin K antagonists include heparins (n = 121; 7.66%), direct thrombin inhibitors (n = 34; 2.15%), direct factor Xa inhibitors (n = 23; 1.46%), and fondaparinux (n = 14; 0.89%). ACE = angiotensin-converting enzyme inhibitors; ARB = angiotensin II receptor blockers; ASA = acetylsalicylic acid; DAPT = dual antiplatelet therapy; DPP4i = dipeptidyl peptidase 4 inhibitors; SGLT2i = sodium-glucose cotransporter 2 inhibitors; GLP1 = Glucagon-like peptide-1; FA = Fast acting; IA = Intermediate acting; LA = Long acting.

The median (Q1–Q3) follow-up duration was 1.2 (0.3–4.5) years. During this period, we documented the cumulative incidence and rates of MACE (MI, ischemic stroke, hospitalization for unstable angina, and urgent coronary revascularization). Because the MACE definition had to be aligned with the nature of the data source and methodology used, all-cause and CV death could not be included in the analysis (EHRs only capture in-hospital death). S3 Table in S1 File shows the overall incidence rates and cumulative incidence of MACE; 35.66% (n = 563) of patients suffered at least one MACE during follow up. The probability of suffering any MACE, as well as each MACE subtype, over the follow-up period is shown in Fig 3.
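To make the two quantities in this paragraph concrete, here is a minimal sketch of a crude cumulative incidence (proportion of patients with an event) and an incidence rate per 1,000 person-years. The per-patient follow-up values are hypothetical, the function names are illustrative, and the crude proportion ignores censoring; the only figures taken from the study are the 563 events among 1,579 patients.

```python
def cumulative_incidence(follow_up):
    """Crude proportion of patients with at least one event."""
    events = sum(1 for _, had_event in follow_up if had_event)
    return events / len(follow_up)


def incidence_rate_per_1000(follow_up):
    """Events per 1,000 person-years of follow-up."""
    events = sum(1 for _, had_event in follow_up if had_event)
    person_years = sum(years for years, _ in follow_up)
    return 1000 * events / person_years


# Hypothetical patients: (years at risk, had a MACE during follow up)
cohort = [(1.0, True), (2.0, False), (0.5, True), (4.5, False), (2.0, False)]
print(cumulative_incidence(cohort))      # 2 events / 5 patients
print(incidence_rate_per_1000(cohort))   # 2 events / 10 person-years

# Check against the reported figure: 563 of 1,579 patients
reported_pct = round(100 * 563 / 1579, 2)
print(reported_pct)
```

The last line reproduces the 35.66% reported above from the stated counts.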

Fig 3. Probability of MACE over time during the follow-up period.

Probability for any MACE event (black), myocardial infarction (red), ischemic stroke (green), hospitalization for unstable angina (orange), and urgent revascularization (blue) during follow up. The number of patients at risk (same for all categories) across the follow-up period is indicated below.
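The source does not state which estimator produced the curves in Fig 3. One common choice for time-to-event curves with censored follow-up is the complement of the Kaplan-Meier survival estimate, sketched below under that assumption. The function name and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, event-free probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1/True if the event occurred at that time, 0/False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored both leave the risk set
    return curve


# Hypothetical follow-up (years) and event indicators
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(1 - s, 2))  # probability of an event by time t
```

Patients censored at a given time are counted as at risk at that time and removed afterwards, which is the usual convention.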

Finally, we sought to determine the clinical characteristics associated with MACE during follow up. In a univariate analysis, T2D and CAD durations (both p = 0.001), heart failure (p = 0.004), MVD (p = 0.005), diabetic retinopathy (p = 0.012), and COPD/asthma (p = 0.014) showed a statistically significant association with MACE in the follow-up period (S4 Table in S1 File). Regarding treatments, MACE was associated with prescription of insulin (p < 0.001), antiplatelet agents (p = 0.001), diuretics (p = 0.006), and statins (p = 0.002; S4 Table in S1 File).

Using a Cox Proportional Hazards regression model, we found 7 independent predictors of MACE occurrence during follow up: CAD duration (HR = 0.77; 95% CI = 0.72–0.84; p < 0.001), COPD/Asthma (HR = 1.28; 95% CI = 1.04–1.59; p = 0.021), heart valve disease (HR = 1.22; 95% CI = 1.02–1.46; p = 0.031), MVD (HR = 1.27; 95% CI = 1.07–1.51; p = 0.005), insulin treatment (HR = 1.53; 95% CI = 1.26–1.85; p < 0.001), statin treatment (HR = 0.62; 95% CI = 0.48–0.81; p < 0.001), and clopidogrel treatment (HR = 0.83; 95% CI = 0.70–0.99; p = 0.039) (Table 3).
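A hazard ratio and its confidence interval are linked through the model coefficient: HR = exp(β) and the 95% CI is exp(β ± 1.96·SE). The sketch below works backwards from two of the reported HRs and CIs to an approximate standard error and a Wald-type p-value. It assumes the intervals are Wald intervals on the log scale, which the source does not state, and the function name is illustrative; because the published values are rounded, the recovered p-values are only approximate.

```python
from math import erfc, log, sqrt


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.96):
    """Approximate beta, SE, z, and two-sided p from an HR and its 95% CI."""
    beta = log(hr)
    # On the log scale the CI spans 2 * z_crit standard errors
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = erfc(abs(z) / sqrt(2))  # two-sided p under a standard normal
    return beta, se, z, p


# Values as reported in the text (rounded in the source)
_, _, z_statins, p_statins = wald_from_hr(0.62, 0.48, 0.81)
_, _, z_copd, p_copd = wald_from_hr(1.28, 1.04, 1.59)
print(f"statins: z={z_statins:.2f}, p={p_statins:.4f}")
print(f"COPD/asthma: z={z_copd:.2f}, p={p_copd:.4f}")
```

Under these assumptions the statin estimate gives p below 0.001 and the COPD/asthma estimate gives p between 0.01 and 0.05, in line with the reported significance levels.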

Table 3. Multivariate model of factors associated with the occurrence of MACE during follow up.


Discussion

Using NLP and machine learning techniques, we were able to access and analyze the free-text clinical information in the EHRs of a large series of patients with T2D and CAD with no history of MI or stroke who underwent PCI revascularization in Spain. Our results provide a thorough characterization of these patients, including demographics, disease characteristics, comorbidities, medical management, and clinical factors associated with the occurrence of MACE.

Our study sample was extracted from a target population of 217,632 T2D patients. This number represents an estimated prevalence of diagnosed T2D of 9.96%. This estimate, calculated in the hospital population, is slightly higher than those previously reported in the general Spanish population yet falls within the 7%–14% age-adjusted range reported over the last ten years using classic epidemiological methods [25–30].

The patients in our series were predominantly male, older than 70 years, and almost 60% of them suffered from MVD. Most patients were being treated with cardiovascular prevention medication (statins: 90.5% of patients, ACEi/ARBs: 86.2%, beta blockers: 76.6%, and antiplatelet therapy: 85.8%), while 50.2% were treated with dual antiplatelet therapy. These demographics and treatment data are aligned with the recent observational study using 2013–2014 data from the Diabetes Collaborative Registry linked to Medicare administrative claims (ATHENA study) [31]. In this cohort, the distribution of cardiovascular prevention medication was as follows: statins 84.2%, ACEi/ARBs 80%, beta blockers 79.2%, at least one antiplatelet agent 91.3%, and dual antiplatelet therapy in 32% of the patients. In addition, a drug-eluting stent was implanted in more than half of the patients (thus comparable to the 60% in the THEMIS-PCI trial) and the time since the most recent PCI was 3.2 years, again very similar to the THEMIS-PCI trial patients [13]. These findings indicate that the PCI population represents a very-high-risk patient group among those with CAD and concomitant T2D and suggest similar levels of care documented in both clinical trials and our real-world study.

The reported treatment data must be interpreted considering existing clinical recommendations. Current clinical guidelines recommend the use of antiplatelet therapies in patients with T2D and prior CV disease, but not in those with low CV risk [11]. However, guidelines are less clear in their recommendations for the use of antiplatelet therapies in patients with T2D and established CV disease without previous ischemic events. Recent clinical trials have addressed whether antiplatelet therapies reduce the incidence of ischemic events in patients with T2D. The ASCEND trial (A Study of Cardiovascular Events iN Diabetes) assessed the absolute benefits of ASA in patients with T2D and no evident CV disease and showed that the prevention of serious vascular events was largely counterbalanced by bleeding risk [32]. On the other hand, dual antiplatelet regimens demonstrated a clear benefit in patients with T2D and a previous history of MI exceeding 1 year. In the PEGASUS-TIMI 54 trial (Prevention of Cardiovascular Events in Patients With Prior Heart Attack Using Ticagrelor Compared to Placebo on a Background of Aspirin-Thrombolysis In Myocardial Infarction 54), patients with T2D had a greater absolute reduction in the risk of major cardiovascular events (cardiovascular death, MI or stroke) than patients without T2D when treated with a combination of ticagrelor and ASA [33]. Finally, the THEMIS trial was designed to evaluate the efficacy and safety of ticagrelor 60 mg bid added to ASA therapy for the prevention of major CV events in patients with T2D and established CAD and without a history of previous MI or stroke [34]. This study showed that the incidence of ischemic events was lower in patients receiving ASA and ticagrelor than in those receiving ASA and placebo. However, the incidence of major bleeding events was significantly higher in the ticagrelor group than in the placebo group [34]; in spite of these results, a significant net clinical benefit was demonstrated in the pre-specified subgroup analysis of patients who underwent PCI (THEMIS-PCI) [13]. Thus, long-term dual antiplatelet therapy could be beneficial in T2D patients with stable CAD and previous PCI who have low bleeding risk and high ischemic risk.

Available evidence indicates that T2D leads to impaired microvascular function [35] and increased risk for MACE in patients with stable manifestations of atherothrombosis. In addition, although adjunctive procedures such as thrombus aspiration prior to PCI may improve ST-elevation MI (STEMI) outcomes in hyperglycemic patients [36], revascularized patients with T2D and stable CAD are still at high risk of ischemic events. Indeed, hyperglycemic STEMI patients can experience adverse cardiovascular events such as restenosis and no-reflow despite PCI [37, 38]. Notably, a recent study pointed to the involvement of the miR33/SIRT1 protein pathway in the inflammatory and coagulative processes of hyperglycemic coronary thrombi, which was in turn associated with rehospitalization and mortality in these patients at 1-year follow up [39]. Here, at least one of the MACE considered (MI, ischemic stroke, hospitalization due to unstable angina, and urgent coronary revascularization) was documented in over a third of patients during follow up. The 5-year cumulative incidence of MACE was 35.6% (rate of 225.7 per 1,000 person-years), with an incidence of MI of 17.5% (rate of 63.7 per 1,000 person-years) and of ischemic stroke of 5% (rate of 16.9 per 1,000 person-years). Along these lines, the ATHENA study analyzed two cohorts of patients with T2D, namely patients at high cardiovascular risk (THEMIS-like cohort; n = 56,040) and patients at high cardiovascular risk or taking P2Y12 inhibitors (CAD-T2D cohort; n = 69,790). In ATHENA, the event rates per 100 person-years (THEMIS-like vs. CAD-T2D cohorts) for the composite outcome were 16.34 (95% CI: 16.31–16.37) vs. 17.64 (17.61–17.67), for MI 5.2 (5.19–5.23) vs. 5.5 (5.49–5.52), and for ischemic stroke 5.3 (5.37–5.41) vs. 6.1 (6.11–6.15) [31]. Patients in the THEMIS-like cohort and the broader CAD-T2D population had substantial cardiovascular event rates, in turn indicating these patients are at an increased cardiovascular risk [13]. However, the high rates of ischemic events (MI and ischemic stroke) in the REACH and ATHENA studies, as well as in the present study, are likely attributable to the clinical profile of the patients enrolled, with a high proportion of elderly patients with multiple comorbidities. In this regard, it is important to note that the nature of our study methodology did not allow us to provide realistic mortality data since we only had access to in-hospital mortality.

In our study, we found that the risk of MACE in T2D-THEMIS-like patients according to the multivariate model was associated with the extent and severity of atherothrombosis (MVD) and with other risk factors such as history of heart valve disease and respiratory disorders (COPD/asthma). These risk factors have previously been shown to affect outcomes in the general population and in diabetic patients at high risk for ischemic events, specifically in patients with T2D and CAD with no history of MI or stroke and previous PCI revascularization [40–43]. Finally, the multivariate approach also revealed that treatments such as clopidogrel and statins were associated with a lower risk of MACE after adjusting for other factors in this clinical population; further research is warranted to confirm these findings.


Limitations

The results of the present study should be interpreted in light of the following limitations. First, our results are based on data captured directly from the unstructured, free-text narratives in patients’ EHRs. These findings are limited by the availability and accuracy of EHRs, and by the information that physicians actually record in their routine practice. In this context, it is difficult to differentiate between “true zero” values, missing data, or unspecified information. Second, unlike classical studies or clinical trials, this was a retrospective analysis based on real-world data. The availability of an electronic record for a given patient does not guarantee that all desired variables will be present. Similarly, we captured information from patients with both single and multiple hospital visits, which may contribute to the heterogeneity of the data. Third, regarding laboratory results and other variables with substantial missing datapoints, it should be noted that physicians might not explicitly include this information as free text, but instead capture general assessments or an overall conclusion regarding the patients’ health status. Finally, the MACE included in the analyses were limited by the information available in EHRs. Accordingly, all-cause and cardiovascular mortality were not reported since we only had access to in-hospital mortality.


Conclusions

This retrospective, observational, and real-world study represents the first attempt to combine NLP and machine learning to explore the unstructured information from EHRs in such a large series of PCI-revascularized patients with T2D and CAD with no history of MI or stroke in Spain. Using a multicenter approach, we were able to collect large amounts of patients’ longitudinal information, describe the clinical profile of these patients, and establish associations between MACE and clinical variables. Our results showed substantial rates of cardiovascular events in THEMIS-PCI-like Spanish patients. Regarding current management and risk factors for cardiovascular events, we replicated previously published findings based on traditional research approaches while offering new insights and hypotheses that could be explored in clinical trials and routine clinical practice studies.


  1. Ogurtsova K, da Rocha Fernandes JD, Huang Y, Linnenkamp U, Guariguata L, Cho NH, et al. IDF Diabetes Atlas: Global estimates for the prevalence of diabetes for 2015 and 2040. Diabetes Res Clin Pract. 2017;128:40–50. pmid:28437734
  2. Saeedi P, Petersohn I, Salpea P, Malanda B, Karuranga S, Unwin N, et al. Global and regional diabetes prevalence estimates for 2019 and projections for 2030 and 2045: Results from the International Diabetes Federation Diabetes Atlas, 9th edition. Diabetes Res Clin Pract. 2019;157:107843. pmid:31518657
  3. Rao Kondapally Seshasai S, Kaptoge S, Thompson A, Di Angelantonio E, Gao P, Sarwar N, et al. Diabetes mellitus, fasting glucose, and risk of cause-specific death. N Engl J Med. 2011;364(9):829–41. pmid:21366474
  4. Shah AD, Langenberg C, Rapsomaniki E, Denaxas S, Pujades-Rodriguez M, Gale CP, et al. Type 2 diabetes and incidence of cardiovascular diseases: a cohort study in 1·9 million people. The Lancet Diabetes & Endocrinology. 2015;3(2):105–13.
  5. Cavender MA, Steg PG, Smith SC, Eagle K, Ohman EM, Goto S, et al. Impact of Diabetes Mellitus on Hospitalization for Heart Failure, Cardiovascular Events, and Death. Circulation. 2015;132(10):923–31. pmid:26152709
  6. Paolisso P, Foà A, Bergamaschi L, Angeli F, Fabrizio M, Donati F, et al. Impact of admission hyperglycemia on short and long-term prognosis in acute myocardial infarction: MINOCA versus MIOCA. Cardiovasc Diabetol. 2021;20(1):192. pmid:34560876
  7. Aronson D, Edelman ER. Coronary artery disease and diabetes mellitus. Cardiol Clin. 2014;32(3):439–55. pmid:25091969
  8. Berry C, Tardif J-C, Bourassa MG. Coronary Heart Disease in Patients With Diabetes: Part II: Recent Advances in Coronary Revascularization. Journal of the American College of Cardiology. 2007;49(6):643–56. pmid:17291929
  9. Mavromatis K, Samady H, King SB 3rd. Revascularization in patients with diabetes: PCI or CABG or none at all. Curr Cardiol Rep. 2015;17(3):565. pmid:25676827
  10. Guía ESC 2019 sobre diabetes, prediabetes y enfermedad cardiovascular, en colaboración con la European Association for the Study of Diabetes (EASD). Revista Española de Cardiología. 2020;73(5):404.e1–.e59.
  11. Fihn SD, Blankenship JC, Alexander KP, Bittl JA, Byrne JG, Fletcher BJ, et al. 2014 ACC/AHA/AATS/PCNA/SCAI/STS Focused Update of the Guideline for the Diagnosis and Management of Patients With Stable Ischemic Heart Disease. Circulation. 2014;130(19):1749–67. pmid:25070666
  12. Neumann F-J, Sousa-Uva M, Ahlsson A, Alfonso F, Banning AP, Benedetto U, et al. 2018 ESC/EACTS Guidelines on myocardial revascularization. European Heart Journal. 2018;40(2):87–165.
  13. Bhatt DL, Steg PG, Mehta SR, Leiter LA, Simon T, Fox K, et al. Ticagrelor in patients with diabetes and stable coronary artery disease with a history of previous percutaneous coronary intervention (THEMIS-PCI): a phase 3, placebo-controlled, randomised trial. The Lancet. 2019;394(10204):1169–80. pmid:31484629
  14. Turgeon RD, Koshman SL, Youngson E, Har B, Wilton SB, James MT, et al. Association of Ticagrelor vs Clopidogrel With Major Adverse Coronary Events in Patients With Acute Coronary Syndrome Undergoing Percutaneous Coronary Intervention. JAMA Intern Med. 2020;180(3):420–8. pmid:31930361
  15. Sohrabi B, Ghaffari S, Habibzadeh A, Chaichi P. Outcome of diabetic and non-diabetic patients undergoing successful percutaneous coronary intervention of chronic total occlusion. J Cardiovasc Thorac Res. 2011;3(2):45–8. pmid:24250951
  16. Yang Y, Park G-M, Han S, Kim Y-G, Suh J, Park HW, et al. Impact of diabetes mellitus in patients undergoing contemporary percutaneous coronary intervention: Results from a Korean nationwide study. PLOS ONE. 2018;13(12):e0208746. pmid:30532214
  17. Bittl JA. Percutaneous coronary interventions in the diabetic patient: where do we stand? Circ Cardiovasc Interv. 2015;8(4):e001944. pmid:25788342
  18. Verma S, Farkouh ME, Yanagawa B, Fitchett DH, Ahsan MR, Ruel M, et al. Comparison of coronary artery bypass surgery and percutaneous coronary intervention in patients with diabetes: a meta-analysis of randomised controlled trials. The Lancet Diabetes & Endocrinology. 2013;1(4):317–28. pmid:24622417
  19. Izquierdo JL, Ancochea J, Soriano JB. Clinical Characteristics and Prognostic Factors for Intensive Care Unit Admission of Patients With COVID-19: Retrospective Study Using Machine Learning and Natural Language Processing. J Med Internet Res. 2020;22(10):e21801. pmid:33090964
  20. Del Rio-Bermudez C, Medrano IH, Yebes L, Poveda JL. Towards a symbiotic relationship between big data, artificial intelligence, and hospital pharmacy. Journal of Pharmaceutical Policy and Practice. 2020;13(1):75. pmid:33292570
  21. Canales L, Menke S, Marchesseau S, D’Agostino A, Del Rio-Bermudez C, Taberna M, et al. Assessing the Performance of Clinical Natural Language Processing Systems: Development of an Evaluation Methodology. JMIR Med Inform. 2021;9(7):e20492. pmid:34297002
  22. Gomollón F G JP, Guerra I. (…) and Montoto C. Clinical Characteristics and Prognostic Factors for Crohn’s Disease Relapses using Natural Language Processing and Machine Learning–a Pilot Study. European Journal of Gastroenterology & Hepatology. In press.
  23. Izquierdo JL, Almonacid C, González Y, Del Rio-Bermudez C, Ancochea J, Cárdenas R, et al. The impact of COVID-19 on patients with asthma. European Respiratory Journal. 2020; in press.
  24. Ancochea J, Izquierdo JL, Medrano IH, Porras A, Serrano M, Lumbreras S, et al. Evidence of gender differences in the diagnosis and management of COVID-19 patients: an analysis of Electronic Health Records using Natural Language Processing and machine learning. Journal of Women’s Health. 2020; in press. pmid:33416429
  25. Lopez-Bastida J, Boronat M, Moreno JO, Schurer W. Costs, outcomes and challenges for diabetes care in Spain. Global Health. 2013;9:17. pmid:23635075
  26. López Rey MJ, Docampo García M. Change over time in prevalence of diabetes mellitus (DM) in Spain (1999–2014). Endocrinol Diabetes Nutr. 2018;65(9):515–23. pmid:30078493
  27. Soriguer F, Goday A, Bosch-Comas A, Bordiú E, Calle-Pascual A, Carmena R, et al. Prevalence of diabetes mellitus and impaired glucose regulation in Spain: the Di@bet.es Study. Diabetologia. 2012;55(1):88–93. pmid:21987347
  28. Goday A, Delgado E, Díaz Cadórniga F, De Pablos P, Vázquez JA, Soto E. Epidemiología de la diabetes tipo 2 en España. Endocrinología y Nutrición. 2002;49(4):113–26.
  29. Huerta JM, Tormo M-J, Chirlaque M-D, Gavrila D, Amiano P, Arriola L, et al. Risk of type 2 diabetes according to traditional and emerging anthropometric indices in Spain, a Mediterranean country with high prevalence of obesity: results from a large-scale prospective cohort study. BMC Endocrine Disorders. 2013;13(1):7. pmid:23388074
  30. Wild S, Roglic G, Green A, Sicree R, King H. Global Prevalence of Diabetes. Diabetes Care. 2004;27(5):1047. pmid:15111519
  31. Wittbrodt E, Bhalla N, Andersson Sundell K, Gao Q, Dong L, Cavender MA, et al. Assessment of the high risk and unmet need in patients with CAD and type 2 diabetes (ATHENA): US healthcare resource utilization, cost and burden of illness in the Diabetes Collaborative Registry. Endocrinology, Diabetes & Metabolism. 2020;3(3):e00133. pmid:32704557
  32. Bowman L, Mafham M, Stevens W, Haynes R, Aung T, Chen F, et al. ASCEND: A Study of Cardiovascular Events iN Diabetes: Characteristics of a randomized trial of aspirin and of omega-3 fatty acid supplementation in 15,480 people with diabetes. Am Heart J. 2018;198:135–44. pmid:29653635
  33. Bonaca MP, Bhatt DL, Cohen M, Steg PG, Storey RF, Jensen EC, et al. Long-Term Use of Ticagrelor in Patients with Prior Myocardial Infarction. New England Journal of Medicine. 2015;372(19):1791–800.
  34. Steg PG, Bhatt DL, Simon T, Fox K, Mehta SR, Harrington RA, et al. Ticagrelor in Patients with Stable Coronary Disease and Diabetes. New England Journal of Medicine. 2019;381(14):1309–20. pmid:31475798
  35. Gallinoro E, Paolisso P, Candreva A, Bermpeis K, Fabbricatore D, Esposito G, et al. Microvascular Dysfunction in Patients With Type II Diabetes Mellitus: Invasive Assessment of Absolute Coronary Blood Flow and Microvascular Resistance Reserve. Front Cardiovasc Med. 2021;8:765071. pmid:34738020
  36. Sardu C, Barbieri M, Balestrieri ML, Siniscalchi M, Paolisso P, Calabrò P, et al. Thrombus aspiration in hyperglycemic ST-elevation myocardial infarction (STEMI) patients: clinical outcomes at 1-year follow-up. Cardiovasc Diabetol. 2018;17(1):152. pmid:30497513
  37. Marfella R, Siniscalchi M, Esposito K, Sellitto A, De Fanis U, Romano C, et al. Effects of stress hyperglycemia on acute myocardial infarction: role of inflammatory immune process in functional cardiac outcome. Diabetes Care. 2003;26(11):3129–35. pmid:14578250
  38. Singh K, Hibbert B, Singh B, Carson K, Premaratne M, Le May M, et al. Meta-analysis of admission hyperglycaemia in acute myocardial infarction patients treated with primary angioplasty: a cause or a marker of mortality? European Heart Journal—Cardiovascular Pharmacotherapy. 2015;1(4):220–8. pmid:27532445
  39. D’Onofrio N, Sardu C, Paolisso P, Minicucci F, Gragnano F, Ferraraccio F, et al. MicroRNA-33 and SIRT1 influence the coronary thrombus burden in hyperglycemic STEMI patients. J Cell Physiol. 2020;235(2):1438–52. pmid:31294459
  40. Cano-García M, Millán-Gómez M, Sánchez-González C, Alonso-Briales JH, Muñoz-Jiménez LD, Carrasco-Chinchilla F, et al. Impacto de la revascularización coronaria percutánea de lesiones coronarias graves en ramas secundarias. Revista Española de Cardiología. 2019;72(6):456–65. pmid:29859894
  41. Ho CH, Chen YC, Chu CC, Wang JJ, Liao KM. Postoperative Complications After Coronary Artery Bypass Grafting in Patients With Chronic Obstructive Pulmonary Disease. Medicine (Baltimore). 2016;95(8):e2926. pmid:26937939
  42. Lin WC, Chen CW, Lu CL, Lai WW, Huang MH, Tsai LM, et al. The association between recent hospitalized COPD exacerbations and adverse outcomes after percutaneous coronary intervention: a nationwide cohort study. Int J Chron Obstruct Pulmon Dis. 2019;14:169–79. pmid:30655664
  43. Almagro P, Lapuente A, Pareja J, Yun S, Garcia ME, Padilla F, et al. Underdiagnosis and prognosis of chronic obstructive pulmonary disease after percutaneous coronary intervention: a prospective study. Int J Chron Obstruct Pulmon Dis. 2015;10:1353–61. pmid:26213464