Agreement between cardiovascular disease risk assessment tools: An application to the United Arab Emirates population

Introduction Evidence regarding the performance of cardiovascular disease (CVD) risk assessment tools is limited in the United Arab Emirates (UAE). Therefore, we assessed the agreement between various externally validated CVD risk assessment tools in the UAE. Methods A secondary analysis of the Abu Dhabi Screening Program for Cardiovascular Risk Markers (AD-SALAMA) data, a large population-based cross-sectional survey conducted in Abu Dhabi, UAE during the period 2009 until 2015, was performed in July 2019. The analysis included 2,621 participants without type 2 Diabetes and without history of cardiovascular diseases. The CVD risk assessment tools included in the analysis were the World Health Organization for Middle East and North Africa Region (WHO-MENA), the systematic coronary risk evaluation for high risk countries (SCORE-H), the pooled cohort risk equations for white (PCRE-W) and African Americans (PCRE-AA), the national cholesterol education program Framingham risk score (FRAM-ATP), and the laboratory Framingham risk score (FRAM-LAB). Results The overall concordance coefficient was 0.50. The agreement between SCORE-H and PCRE-W, PCRE-AA, FRAM-LAB, FRAM-ATP and WHO-MENA, were 0.47, 0.39, 0.0.25, 0.42 and 0.18, respectively. PCRE-AA classified the highest proportion of participants into high-risk category of CVD (16.4%), followed by PCRE-W (13.6%), FRAM-LAB (6.9%), SCORE-H (4.5%), FRAM-ATP (2.7%), and WHO-MENA (0.4%). Conclusions We found a poor agreement between various externally validated CVD risk assessment tools when applied to a large data collected in the UAE. This poses a challenge to choose any of these tools for clinical decision-making regarding the primary prevention of CVD in the country.


Introduction
Cardiovascular disease (CVD) is a leading cause of morbidity and mortality, contributing to 203 million disability-adjusted life years and 31% of deaths worldwide [1]. The number of CVD-related deaths is expected to increase from 17.7 million in 2015 to 23.6 million in 2030 [1]. In the United Arab Emirates (UAE), CVD contributes to 40% of all deaths and its incidence is increasing in young adults mainly because of an increase in the burden of its known risk factors such as abdominal obesity (71.5%), dyslipidemia (74.0%), hypertension (43.0%), and diabetes (32.4%) [2,3].
Primary prevention of CVD is considered as the most beneficial and cost-effective intervention strategy because of the high burden and treatment cost of CVD. Therefore, international clinical guidelines recommend incorporating CVD risk assessment tools into routine clinical practice for accurate identification and subsequent evidence-based treatment of individuals at a high risk of developing CVD [4].
At present, various externally validated CVD risk assessment tools are being used for primary and secondary prevention of CVD in the UAE. The list of these tools includes, but is not limited to, the systematic coronary risk evaluation (SCORE) charts used by the European Society of Cardiology and the European Atherosclerosis Society guidelines [5], the Pooled Cohort risk equation (PCRE) used by the American College of Cardiology/American Heart Association guidelines [6], the Framingham risk score for hard CHD used by the National Cholesterol Education Program-Adult Treatment Panel-III (FRAM-ATP) [7], the laboratory Framingham risk score for CVD (FRAM-LAB) [8], and the World Health Organization risk charts for Eastern Mediterranean and North Africa Region [9]. Except the WHO-MENA CVD risk chart, no other tool has yet been developed, validated, or calibrated in the UAE [10,11]. In addition, there is insufficient data on the performance and agreement between different externally validated CVD risk assessment tools that are currently used in the UAE. To the best of our knowledge, no study has been yet conducted in the UAE to explore this issue. Therefore, we conducted this analysis to assess the agreement between a variety of externally validated CVD risk assessment tools in the UAE. We also classified subjects into the high-risk category according to the thresholds of each risk assessment tool and then compared the percentages of high-risk subjects between different risk assessment tools.

Study design
We performed a secondary analysis of the Abu Dhabi Screening Program for Cardiovascular Risk Markers (AD-SALAMA) data. The AD-SALAMA is a large population-based cross-sectional survey, designed and conducted by the Cardiac Sciences Institute of Sheikh Khalifa Medical City (SKMC) in Abu Dhabi, UAE. The primary objective of this survey was to estimate the burden of major risk factors of CVD, to identify individuals at high risk for developing CVD, and to improve primary prevention of CVD. The study population included adult (25-75 years) men and women of different ethnic backgrounds who were working at various governmental and non-governmental entities in Abu Dhabi. Subjects were randomly invited to attend an educational event about cardiovascular disease risk factors where they were asked to voluntarily participate in the study.

Data collection and measurements
After obtaining the permission from participating entities, trained nurses of SKMC screened volunteer employees at either mobile clinics or visiting each entity on scheduled days. A total of 6069 subjects were screened during the period 2009 until 2015 at different governmental and non-governmental entities. Demographic and clinical data included age (years), sex, smoking, diabetes, height, weight, body mass index (BMI), systolic and diastolic blood pressure, hypertension, and use of hypertensive drugs. Blood samples were also collected to measure total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), triglyceride (TG), and glucose levels.

Ethical considerations
The study adhered to the ethical principles of the Declaration of Helsinki and was approved by Sheikh Khalifa Medical City (SKMC) human ethics research committee. Both verbal and written Informed consents were obtained from the study participants before data collection.

Data extraction
We computed the agreement in a sub-population of AD-SALAMA subjects without diabetes, without established CVD and those aged between 40 and 70 years. The reason for excluding diabetic subjects from the analysis is because they are already considered as high-risk subjects by most guidelines and also in order to make a fair comparison between different risk assessment tools as some of them do not use type 2 Diabetes as a predictor. The subjects with established CVD were excluded because we are investigating primary not secondary prevention. Therefore, we excluded 521 with type 2 diabetes mellitus, 1708 aged less than 30 or more than 70 years, and 1219 subjects with missing values in predictor's variables. This led to 2621 subjects with complete data on risk factors required for computing CVD risks. None of the subjects considered for the final analysis were under lipid lowering therapy.

CVD risk assessment tools
The CVD risk assessment tools included in the analysis were the WHO risk chart for the Eastern Mediterranean and North Africa region (WHO-MENA) [2], SCORE for high-risk countries (SCORE-H) [5], PCRE for White (PCRE-W), PCRE-AA (PCRE for African American) [6], FRAM-ATP (used in NCEP-ATP-III) [7], and the laboratory Framingham risk score for CVD (FRAM-LAB) [8], Details on the geographical location, age range, risk factors used for the derivation of these risk assessment tools, the corresponding guidelines, and the types of outcomes are reported in Table 1.

Statistical analysis
Descriptive statistics for quantitative and qualitative variables were provided to describe the characteristics of the population under study. Predicted 10-year CVD risks were computed using the original equations provided in their corresponding published papers. CVD risk equations derived from original publications were calculated using an online application developed by the corresponding author. The online application is based on the R Package Shiny and can be freely accessed through the following link [https://comparecvdandguide lines.shinyapps.io/internal-private-test/]. Overall and pairwise Lin's concordance correlation coefficients (CCC) were calculated to measure the degree of agreement overall and between each pair of risk assessment tools [12,13]. Lin's concordance coefficient quantifies the level of agreement between risk tools and ranges between -1 and 1 with the upper limit considered as a perfect agreement. The rule of thumb for a good level of agreement is a value of CCC over 0.9 [14]. We also calculated the percentage of subjects assigned to the high-risk category according to the threshold provided in the corresponding original guideline for each risk assessment tool. The thresholds used were 20% for WHO-MENA, 5% for SCORE-H, 7.5% for PCRE-W and PCRE-AA, 20% for FRAM-ATP, and 20% for FRAM-LAB [5][6][7][8]10]. In this analysis, we also assessed discrepancies between different risk assessment tools by reporting misclassification rates. This was investigated by calculating the percentage of low-risk subjects who were re-classified as high-risk according to other tools. In order to do this, the European risk assessment tool SCORE-H was used as a reference and was compared against all other risk assessment tools. The current analysis involves individuals with complete data on all risk factors required for the calculation of different risk scores. As sensitivity analysis, we used multiple imputation to make sure that missing data excluded from the current analysis does not affect the validity of the results. We re-run the whole analysis by imputing missing data using the "Multivariate Imputation by Chained Equations" method implemented in the mice package in R. All statistical analysis and data manipulation were performed in December 2019 using the R software version 3.5.2.

Characteristics of participants
Of the total number of subjects (N = 2,621), 558 (21%) were females and 221 (8.4%) were hypertensive. The demographic and clinical characteristics of this population are summarized overall and by gender in Table 2. Summaries of the 10-year risks of experiencing a CVD event according to each risk assessment tool discussed earlier are presented in Table 3. Except the WHO-MENA, all other risk assessment tools assigned significantly (p < 0.001) higher 10-year CVD risk scores to men compared to women. This difference in the average risk between men and women was expected because the CVD risk assessment tools used in this study assigned, in their original equations, a higher risk for men compared to women.

Agreement between risk assessment tools
Despite high values of Pearson's correlation coefficients between different risk assessment tools, except the WHO-MENA (Fig 1), we found low-to-moderate overall agreement between different risk assessment tools with an overall CCC equal to 0.50. This illustrates the structural difference between the correlation and agreement concepts. In fact, agreement is used for variables that are meant to measure the same construct, while correlation is used for variables that measure completely different constructs. When compared with SCORE-H, the CCCs for FRA-M-ATP, PCRE-AA, PCRE-W, FRAM-LAB, and WHO-MENA, were 0.42, 0.39, 0.47, 0.25 and 0.18, respectively. More information on all pairwise Lin's concordance coefficients is available in Table 4.

Identification of high CVD risk subjects according to different risk assessment tools
When each of the risk tools was used with thresholds specified in the guidelines, PCRE-AA classified the highest proportion of participants under the high-risk category (16.4%, 95%  (Fig 2). Furthermore, among the 2502 participants who were classified as low-risk subjects according to SCORE-H, 12.47% (n = 312), 9.63% (n = 241), 3.20% (n = 80), 0.96% (n = 24), and 0.12% (n = 3) were re-classified as high-risk subjects according to PCRE-AA, PCRE-W, FRAM-LAB, FRAM-ATP, and WHO-MENA, respectively (Fig 3).

Sensitivity analysis
To check whether the exclusion of patients with diabetes from the current analysis led to any bias, we analyzed the data by including these patients. The average risk estimates increased for all risk tools due to the inclusion of diabetic patients. However, the overall agreement (0.49) between risk tools remained similar to the one estimated in patients without diabetes (0.50).
We also re-run the whole analysis by imputing missing data using the "Multivariate Imputation by Chained Equations" method implemented in the mice package in R. The number of imputed datasets was set to 5 and the predictive mean matching was selected as the imputation method. The estimated overall agreement (CCC) in the 5 imputed data sets were very close to . Furthermore, all other quantities estimated in the complete case analysis (current analysis) were also very similar in the 5 data sets.

Discussion
In this study, we assessed the agreement between six CVD risk assessment tools that are commonly used in the UAE population. These risk assessment tools were compared by analyzing the data generated from a large screening program in Abu Dhabi, UAE. The analysis showed a low agreement between risk assessment tools in estimating the 10-year risk of CVD. The probability of classifying subjects into high-risk category of CVD varied substantially between risk assessment tools. As mentioned earlier, we found low overall agreement (50%) and low-to-moderate pairwise agreement (from 17% to 78%) between various tools. To the best of our knowledge, this is the first study comparing the agreement between CVD risk assessment tools in the UAE. Nonetheless, studies conducted in different populations have also yielded similar results regarding the agreement between CVD risk assessment tools. A systematic review of 20 studies reported wide variations in the level of agreement (from 22% to 100%) between various calculators [15]. Another study showed 14% to 44% agreement between different CVD risk assessment tools in the Peruvian population [16] . The analysis of hypothetical data from eight countries demonstrated the overall agreement of 64% between 25 CVD risk assessment tools [17]. We also noted that the WHO-MENA and the European-based SCORE-H had the lowest pairwise agreement with the US-based FRAM, PCRE, and ATP-III tools. Similar results were also reported in the Peruvian population and the systematic review [15,16].
In this study, the estimated 10-year risk of CVD varied in average from 1.1% to 7.6% between tools. FRAM-LAB and PCRE estimated a higher risk of CVD, whereas SCORE-H, FRAM-ATP-III, and WHO-MENA estimated a lower risk. Consistent with our findings, the average CVD risk estimation was the highest using FRAM and the lowest using WHO in the Indian population [18]. Conversely, PCRE showed the highest CVD risk, followed by FRAM, and SCORE estimated the lowest risk in Northern Iran [19]. Like CVD risk estimation, the probability of assigning participants to the high-risk category for CVD also varied significantly between risk assessment tools in our study. As such, the proportion of subjects classified into the high-risk category as per PCRE was 6 times higher than those as per FRAM-ATP-III and WHO-MENA. Similar to our results, the highest proportion of individuals were classified to be at high risk of CVD as per PCRE, followed by FRAM (non-lab) and SCORE in the Peruvian and Iranian population [16,19]. According to another study in India, the FRAM tool assigned more individuals to high CVD risk category compared to PCRE [18,20]. Despite the inconsistencies in CVD risk estimation among tools, the WHO tool has consistently produced the worst CVD risk estimation and classification in Malaysian, Indian, Peruvian, and the UAE populations [16,18,21].
A vast number of validation studies has confirmed the inconsistency in CVD risk evaluation between tools. The FRAM tool over-estimated the risk of fatal and non-fatal CVD when used in non-US populations [15,22,23]. Likewise, the PCRE has been reported to over-estimate the risk of CVD in Chinese and under-estimate the risk in South Asian American populations [22,24] . Consequently, these tools assign more individuals to lipid-lowering therapy compared with other tools. In contrast, both SCORE-H and WHO tools tend to under-estimate the risk of CVD in various populations worldwide, making a small proportion of individuals eligible to lipid-lowering therapy [15,23]. These disagreements suggest that clinicians should be wary of estimation differences presented by each tool before using them in clinical practice. In addition, it also means that there is a need of validating and/or recalibrating CVD tools in target populations to enhance the accuracy of these tools.
Many factors may have been responsible for the poor agreement between CVD risk assessment tools in our study. One such factor is the risk classification threshold, which differs among tools. For instance, the threshold to assign subjects to the high-risk category of CVD is 20% for FRAM-LAB, WHO-MENA, and FRAM-ATP, 7.5% for PCRE and 5% for SCORE-H [5][6][7][8]10]. The tools also differ from each other with regard to their CVD endpoints. FRAM--LAB and PCRE incorporate the risk for both fatal and non-fatal CVD events in their risk equations. SCORE-H estimates the risk of fatal CVD events only, FRAM-ATP-III predict the occurrence of hard CHD events, and the WHO tool includes both fatal and non-fatal myocardial infarction and/or stroke [5][6][7][8]10]. As CHD is a subset of CVD, the tools measuring only CHD have lower estimated risks than those measuring total CVD. This may explain why SCORE-H and ATP-III predicted lower CVD risks than those predicted by FRAM and WHO in our study. But since the agreement is based on the ranks rather than the absolute values of the risks, this argument might be not plausible. Poor agreement between CVD risk assessment tools can also occur because of source cohorts from which the individuals risk assessment tools are generated. The SCORE tool is derived from multiple European cohorts, whereas FRAM, NCEP-ATP-III, and PCRE are US-based tools generated from Framingham Heart Study, CARDIA, ARIC, and CHS cohorts [5][6][7][8]10]. These baseline cohorts differ vastly with respect to burden of underlying risk factors and CVD outcomes, therefore, resulting probably in a poor agreement of risk estimation and classification among tools. In our study, this may have been one of many reasons for a low level of agreement between the European SCORE-H tool and the US-based tools and moderate agreement among US-based tools. In comparison, WHO-MENA is based on a hypothetical cohort generated from the research evidence in Eastern Mediterranean countries [2]. Therefore, the WHO tool has shown poor risk estimation, correlation, and agreement with other tools in this study and previous research [15,23]. Type of risk factors included in risk equations are also responsible for the variation in risk prediction across CVD tools. In addition to other factors, FRAM-LAB, FRAM-ATP-III, and PCRE include hypertension (HTN) treatment, HDL cholesterol, and total cholesterol in their equations, whereas WHO does not include HTN treatment and HDL cholesterol and SCORE does not include HTN treatment in its risk equation [5][6][7][8]10].

Limitations
We used a cross-sectional data to compare the agreement between risk assessment tools. Therefore, we could not estimate the actual 10-year risk of CVD in the UAE. The low observed level of disagreement might be due to the heterogeneity in the primary outcomes between risk assessment tools. In fact, the FRAM-LAB and PCRE estimate the risk for both fatal and nonfatal CVD events whereas the SCORE-H estimates the risk of fatal CVD events and FRA-M-ATP-III estimates the risk for hard CHD. However, we believe that the agreement measure is not sensitive to this issue as it is based on the rank of the risks rather than their absolute values.

Conclusion
Our analysis has shown poor agreement between six commonly used tools for estimating and classifying the risk of CVD in the UAE population. This demonstrates the difficulty of choosing any of these tools for clinical decision-making and public health interventions with regard to the primary prevention of CVD in the country. Applying these risk assessment tools in clinical practice could lead to a considerable variability in identifying subjects at high risk of CVD and consequently those eligible for therapeutic interventions. Owing to the high burden of CVD in the UAE, there is an urgent need to improve the accuracy of the existing CVD risk assessment tools either by recalibrating and validating them or by generating newer tools based on the longitudinal data from this region.