Metabolic syndrome in Xinjiang Kazakhs and construction of a risk prediction model for cardiovascular disease risk

Background The prevalence of metabolic syndrome (MetS) and cardiovascular diseases (CVD) is high among Kazakhs in Xinjiang. Because MetS may significantly predict the occurrence of CVD, including CVD-related indicators from the metabolic network may improve the predictive ability of a CVD-risk model for Kazakhs in Xinjiang. Methods The study included 2,644 subjects who were followed for 5 years or longer. CVD cases were identified via medical records of the local hospitals from April 2016 to August 2017. Factor analysis was performed in 706 subjects (267 men and 439 women) with MetS to extract CVD-related potential factors from 18 biomarkers tested in a routine health check-up; these factors were combined into a synthetic predictor (SP). We evaluated the predictive ability of the CVD-risk model, which used age and SP in a logistic regression discrimination, by internal validation (n = 384; men = 164, women = 220) and external validation (n = 219; men = 89, women = 130): the probability of CVD was calculated for each participant and assessed with receiver operating characteristic curves. Results According to the diagnostic criteria of JIS, the prevalence of MetS in Kazakhs was 30.9%. Seven potential factors with a similar pattern were obtained from men and women and comprised the CVD predictors. When predicting CVD in the internal validation, the area under the curve (AUC) was 0.857 (95%CI 0.807–0.898) for men and 0.852 (95%CI 0.809–0.889) for women. In the external validation, the AUC to predict CVD was 0.914 (95%CI 0.832–0.963) for men and 0.848 (95%CI 0.774–0.905) for women. These results suggest that SP might serve as a useful tool for identifying CVD in Kazakhs, especially Kazakh men. Conclusions Seven potential factors were extracted from 18 biomarkers in Kazakhs with MetS, and SP may be used for CVD risk assessment.

Introduction completed in 2,644 permanent residents older than 18 years (1,085 men and 1,559 women). Between April 2016 and August 2017, 2,286 subjects from the baseline population who participated for 5 years or more in the follow-up survey were evaluated for the incidence of CVD; the follow-up rate was 86.46%. Of those in the baseline population, 706 were first diagnosed with MetS according to the Joint Interim Statement [6], with a prevalence of 30.88% (age-adjusted prevalence 28.61%), including 267 men and 439 women. Exclusion criteria at baseline identified 281 subjects with CVD, and the remaining 2,005 were analyzed. Of the 706 subjects with MetS, 191 (82 men and 109 women) had CVD diagnosed by physicians and formed the case group. Controls (n = 384; men = 164, women = 220) were randomly selected from individuals without MetS or CVD in the case-control study design. These subjects were used for internal validation. Another Kazakh research site, Halabula Township, was the source of 243 external validation controls, selected in August 2017 at the end of the information collection period. Of these, 24 were excluded because of CVD or lack of information at baseline, and 219 (89 men, 130 women) were included for analysis. By the end of the study, CVD had developed in 30 of the 219. The cumulative incidence rate was 13.7% (12.4% for men and 14.6% for women).
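The proportions quoted above can be recomputed from the counts. The following is a minimal sketch with illustrative variable names; it assumes that the MetS prevalence is taken over the 2,286 followed-up subjects, since that is the denominator that reproduces the reported 30.88%.

```python
def pct(numerator, denominator):
    """Percentage rounded to two decimals."""
    return round(100.0 * numerator / denominator, 2)

# Counts taken from the text; the choice of denominators is an assumption.
follow_up_rate = pct(2286, 2644)        # followed-up / baseline
mets_prevalence = pct(706, 2286)        # MetS cases / followed-up (assumed)
cumulative_incidence = pct(30, 219)     # new CVD cases / external cohort

print(follow_up_rate, mets_prevalence, cumulative_incidence)
```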

Methods
All participants filled out questionnaires covering demographic data and history of disease. The physical examination included height, weight, waist circumference (WC), hip circumference, and blood pressure; height and weight were measured with participants wearing light clothing. The body adiposity index (BAI) was calculated as $\mathrm{BAI} = \text{hip circumference (cm)} / \text{height (m)}^{1.5} - 18$. After each subject had rested for 15 min, blood pressure was measured three times with a mercury sphygmomanometer and the average value was calculated. Each participant signed an informed consent form. The study was approved by the Institutional Ethics Review Board (IERB) of the First Affiliated Hospital of Shihezi University (IERB No.SHZ2010LL01). Outcome information was obtained directly from local hospital discharge records and medical insurance records.
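As an illustration of the two derived measurements described above, the sketch below computes the BAI from hip circumference and height and averages three blood pressure readings. The function names and the example values are hypothetical and are not taken from the study data.

```python
def body_adiposity_index(hip_cm, height_m):
    """BAI = hip circumference (cm) / height (m)^1.5 - 18."""
    return hip_cm / height_m ** 1.5 - 18.0

def mean_reading(readings):
    """Average of repeated blood pressure readings (mmHg)."""
    return sum(readings) / len(readings)

# Hypothetical participant: hip 100 cm, height 1.70 m, three SBP readings.
example_bai = body_adiposity_index(100.0, 1.70)
example_sbp = mean_reading([128, 132, 130])
print(round(example_bai, 2), example_sbp)
```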

Definition of MetS and CVD
Based on the diagnostic criteria recommended by the JIS [6], MetS was defined as the presence of three or more of the following five risk factors: 1) WC ≥ 85 cm for men and ≥ 80 cm for women; 2) triglycerides (TG) ≥ 1.70 mmol/L; 3) HDL-C < 1.00 mmol/L in men and < 1.30 mmol/L in women; 4) SBP ≥ 130 mmHg or DBP ≥ 85 mmHg; 5) FPG ≥ 5.6 mmol/L.
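The five-component rule can be written as a short classification function. This is a sketch only: it reads each threshold for WC, TG, blood pressure, and FPG as "greater than or equal to", uses hypothetical function and argument names, and does not model drug treatment for any component, which the text does not mention.

```python
def mets_components(sex, wc_cm, tg, hdl_c, sbp, dbp, fpg):
    """Return which of the five risk factors are present (lipids and glucose in mmol/L)."""
    is_male = sex == "male"
    return {
        "waist": wc_cm >= (85 if is_male else 80),
        "triglycerides": tg >= 1.70,
        "low_hdl": hdl_c < (1.00 if is_male else 1.30),
        "blood_pressure": sbp >= 130 or dbp >= 85,
        "glucose": fpg >= 5.6,
    }

def has_mets(sex, wc_cm, tg, hdl_c, sbp, dbp, fpg):
    """MetS is present when three or more of the five risk factors are met."""
    flags = mets_components(sex, wc_cm, tg, hdl_c, sbp, dbp, fpg)
    return sum(flags.values()) >= 3

# Hypothetical participants, not study data.
man = has_mets("male", wc_cm=90, tg=1.8, hdl_c=1.2, sbp=135, dbp=80, fpg=5.0)
woman = has_mets("female", wc_cm=78, tg=1.5, hdl_c=1.2, sbp=120, dbp=80, fpg=5.8)
print(man, woman)
```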
The following events were identified as end-point events: 1) hospitalization for CVD during follow-up, coronary intervention (cardiac catheterization or coronary bypass), angina (or nitroglycerin use after cohort entry), and CVD death (ICD-9 codes 390-495); 2) hospitalization, according to hospital medical records, for any of the following: coronary artery atherosclerosis, coronary heart disease, unstable angina, myocardial infarction, heart failure, stroke, transient cerebral ischemia, and peripheral vascular disease (abdominal aortic aneurysm, peripheral vascular surgery or carotid endarterectomy); 3) CVD events recorded in hospitalization records and questionnaires. If a subject had two or more events of the same class, the first occurrence was taken as the outcome event.

Statistical analysis
Descriptive analysis. For patients with MetS, Student's t test (for continuous variables) was used to evaluate significant differences between men and women for 18 biomarkers.
Development of the synthetic predictor. Exploratory factor analysis (EFA), using the principal component algorithm and varimax rotation on the correlation matrix, was performed to extract independent factors of MetS from the above 18 manifest biomarkers, separately for the male and female MetS groups. Factors were retained if their eigenvalue was >1 and they together accounted for 75% of the total variation. Only variables that shared at least 15% of the factor variance, corresponding to a factor loading of at least 0.45, were used for further analytical interpretation. After EFA, each latent factor was named according to its clinical significance, and the synthetic predictor (SP) [19] was created using a weighted approach: $\mathrm{SP} = \gamma_1 F_1 + \gamma_2 F_2 + \cdots + \gamma_n F_n$, where $F_1, F_2, \ldots, F_n$ were the extracted independent factors with specific clinical significance from the 18 manifest biomarkers, and $\gamma_1, \gamma_2, \ldots, \gamma_n$ denoted their risks to CVD, which were the partial regression coefficients in the LRD models described below.
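The extraction step can be illustrated with a small NumPy sketch: principal-component loadings are taken from the correlation matrix, components with eigenvalue >1 are kept, and the loadings are varimax-rotated. This is not the SPSS procedure used in the study. The data are synthetic (six variables generated from two latent factors), and the function names are hypothetical.

```python
import numpy as np

def pca_loadings(X, min_eigenvalue=1.0):
    """Principal-component loadings from the correlation matrix of X (n x p)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # sort largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > min_eigenvalue                 # eigenvalue > 1 rule
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals.sum()       # share of total variance
    return loadings, explained

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a p x k loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        grad = loadings.T @ (rotated ** 3 - rotated @ col_ss / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt                           # nearest orthogonal matrix
        if s.sum() < criterion * (1 + tol):         # no further improvement
            break
        criterion = s.sum()
    return loadings @ rotation

# Synthetic data: two latent factors, three noisy indicators each.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
X = np.repeat(latent, 3, axis=1) + 0.5 * rng.normal(size=(500, 6))

loadings, explained = pca_loadings(X)
rotated = varimax(loadings)
salient = np.abs(rotated) >= 0.45                   # loading threshold from the text
print(explained.round(3), salient.sum(axis=0))
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) is unchanged; only the distribution of loadings across factors becomes easier to interpret.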
The logistic regression discrimination (LRD), on the basis of the extreme case-control design, was performed using the model $P = e^{B} / (1 + e^{B})$, where $P$ was the probability of CVD and $B$ denoted the discriminant vector estimated by logistic regression: $B = \beta_0 + \beta_1\,\mathrm{age} + \gamma_1 F_1 + \gamma_2 F_2 + \cdots + \gamma_n F_n$, or equivalently $B = \beta_0 + \beta_1\,\mathrm{age} + \mathrm{SP}$.
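To show how a predicted probability follows from age and the factor scores, the sketch below assumes the standard logistic link, P = 1 / (1 + exp(-B)). All coefficients and factor scores in the example are hypothetical placeholders chosen for easy arithmetic; they are not the fitted values from this study.

```python
import math

def synthetic_predictor(factor_scores, gammas):
    """SP = gamma_1 * F_1 + ... + gamma_n * F_n."""
    return sum(g * f for g, f in zip(gammas, factor_scores))

def cvd_probability(age, sp, beta0, beta_age):
    """Logistic probability for B = beta0 + beta_age * age + SP."""
    b = beta0 + beta_age * age + sp
    return 1.0 / (1.0 + math.exp(-b))

# Hypothetical values: two factor scores and made-up coefficients.
sp = synthetic_predictor(factor_scores=[1.0, -0.5], gammas=[0.4, 0.8])
p = cvd_probability(age=50, sp=sp, beta0=1.0, beta_age=-0.02)
print(sp, p)
```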
Following the logistic regression discrimination, the results for the internal validation population and the external validation population were evaluated separately. The probability of CVD was calculated for each participant, and the ROC curve was analyzed against the actual CVD follow-up outcome. The sensitivity and specificity of the predicted probability were calculated at each cut-off point, and the cut-off point with the best combination of sensitivity and specificity was determined.
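The cut-off search can be sketched in plain Python. The example assumes that the "best" cut-off is the one maximizing Youden's index (sensitivity + specificity - 1) and computes the AUC as the proportion of case-control pairs ranked correctly. The outcomes and probabilities are made-up illustrative values, and the study itself used MedCalc rather than this code.

```python
def roc_table(outcomes, probabilities):
    """Sensitivity, specificity and Youden's index at every observed cut-off."""
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    rows = []
    for cut in sorted(set(probabilities)):
        # A participant is classified as CVD when the probability is >= cut.
        tp = sum(1 for y, p in zip(outcomes, probabilities) if y == 1 and p >= cut)
        tn = sum(1 for y, p in zip(outcomes, probabilities) if y == 0 and p < cut)
        sens, spec = tp / positives, tn / negatives
        rows.append((cut, sens, spec, sens + spec - 1.0))
    return rows

def auc(outcomes, probabilities):
    """AUC as the share of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(outcomes, probabilities) if y == 1]
    neg = [p for y, p in zip(outcomes, probabilities) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Made-up follow-up outcomes (1 = CVD) and predicted probabilities.
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
probabilities = [0.05, 0.10, 0.15, 0.20, 0.22, 0.30, 0.35, 0.40, 0.60, 0.80]

best_cut, best_sens, best_spec, best_j = max(
    roc_table(outcomes, probabilities), key=lambda row: row[3]
)
model_auc = auc(outcomes, probabilities)
print(best_cut, best_sens, best_spec, best_j, model_auc)
```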
All statistical analyses for the predictive models were performed using SPSS version 19.0 for Windows. The area under the curve (AUC) for the receiver operating characteristic (ROC) curve analysis, together with the sensitivity, specificity, and cutoff P values, were determined using MedCalc software. A two-sided P value <0.05 was considered statistically significant.

Results
In our study, the prevalence of MetS was 30.88% (the standardized prevalence rate was 28.61%) at baseline. A total of 359 CVD events were diagnosed, with a prevalence of 13.87%. Table 1 shows the distribution of age and 18 biomarkers between men and women with MetS. All variables were significantly different between genders. Of these biomarkers, age, weight, WC, SBP, DBP, FBG, FMN, ALT, AST, TBIL, IBIL, ALB, UA, CREA and BUN were higher in men than in women, while BAI, HDL-C, APOA and α-HBDH were higher in women than in men. S1 Table shows the baseline population distribution of age and the 18 biomarkers between male and female groups, indicating that all variables except FMN, α-HBDH and MS were significantly different between men and women. S2 Table shows significant correlations for most biomarkers. In all subjects, weight, WC, SBP, DBP, HDL-C, APOA, ALT, AST, α-HBDH, TBIL, IBIL, ALB, UA, and CREA were significantly correlated with most variables analyzed. The BAI, FBG, FMN and CREA were correlated with approximately half of the variables analyzed. The Kaiser-Meyer-Olkin test values in the male and female groups were 0.636 and 0.632 respectively, and Bartlett's sphericity test for both groups had a significance of P<0.001. This suggested that the variables were not independent and were strongly correlated, making them suitable for EFA. Table 2 shows the explained variance, cumulative variance, and loading of the first 7 factors. The first 7 factors explained 74.02% and 73.74% of the total variance for men and women, respectively. Following the criteria of interpretation mentioned in the statistical analysis section, 7 independent factors with their specific clinical significance were retained and named for each of the 2 groups. For the male group, the first factor was named the obesity factor (OF). Although the factors ranked slightly differently by explained variance in men and women, their factor patterns were similar.
S3 Table shows the standardized scoring coefficients of each factor for the male and female groups. Figs 1 and 2 show the AUC with sensitivity, specificity, and the cut-off points of P values (criterion) for the LRD model from the internal validation design using age and SP as discriminant factors (for men: $B = 0.733 - 0.039\,\mathrm{Age} + \mathrm{SP}$; for women: $B = 0.077 - 0.016\,\mathrm{Age} + \mathrm{SP}$). In males, the AUC was 0.857 (95% confidence interval [CI]: 0.807-0.898), with a Youden's index value of 0.622 (Sen = 80.49%, Spe = 81.71%), and an optimal cut-off of 0.30. In females, the AUC was 0.852 (95% CI: 0.809-0.889), with a Youden's index value of 0.544 (Sen = 88.07%, Spe = 66.36%), and an optimal cut-off of 0.23. Figs 3 and 4 show the AUC with sensitivity, specificity, and cut-off points of P value (criterion) for the LRD model from the external validation design using age and SP as discriminant factors (for men: $B = -7.55 + 0.079\,\mathrm{Age} + \mathrm{SP}$; for women: $B = -5.104 + 0.051\,\mathrm{Age} + \mathrm{SP}$). In males, the AUC was 0.914 (95% CI: 0.832-0.963), with a Youden's index value of 0.703 (Sen = 81.82%, Spe = 88.48%), and an optimal cut-off of 0.16. In females, the AUC was 0.848 (95% CI: 0.774-0.905), with a Youden's index value of 0.552 (Sen = 89.47%, Spe = 65.77%), and an optimal cut-off of 0.09. The CVD model built from SP and age showed good predictive performance, and its predictive ability was better in Kazakh men than in Kazakh women (Tables 3 and 4).
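The reported Youden's index values can be recomputed from the sensitivity and specificity pairs above, using J = sensitivity + specificity - 1. The sketch below is a consistency check only, with illustrative names; it reads the external-validation value for men as 0.703.

```python
# (sensitivity %, specificity %, reported Youden's index), as quoted in the text.
reported = {
    "internal, men": (80.49, 81.71, 0.622),
    "internal, women": (88.07, 66.36, 0.544),
    "external, men": (81.82, 88.48, 0.703),
    "external, women": (89.47, 65.77, 0.552),
}

def youden_index(sens_pct, spec_pct):
    """Youden's J from sensitivity and specificity given in percent."""
    return (sens_pct + spec_pct) / 100.0 - 1.0

recomputed = {
    group: round(youden_index(sens, spec), 3)
    for group, (sens, spec, _) in reported.items()
}
for group, value in recomputed.items():
    print(group, value, reported[group][2])
```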

Discussion
The standardized prevalence rate of MetS in Kazakhs was 28.61%, which was much higher than the national MetS prevalence rate of 16.5% [24]. Studies have shown that individuals with MetS are more likely to develop CVD [25,26], and a higher prevalence of MetS suggests that the risk of developing CVD in the Kazakh population may be higher.
Note: Factors were named as obesity factor (OF), hepatic function factor (HFF), lipid factor (LF), enzyme metabolic factor (EMF), blood pressure factor (BPF), renal metabolic factor (RMF), and glucose metabolism factor (GMF). Bold indicates that the absolute value of the factor loading was >0.45.
Based on the MetS multi-etiology hypothesis [27], this study used EFA to extract, separately by sex, 7 principal components from 18 risk factors; these represent 7 potential factors for MetS occurrence in Kazakhs. Of the 7 potential factors, the first 3 factors for men and women were OF (WC & weight & BAI), HFF (TBIL & IBIL), and LF (HDL-C & APOA). Among these, obesity and dyslipidemia are classic components of MetS [28,29]. Body mass index (BMI) and WC are commonly used as indicators of obesity, and some studies have suggested that body fat content is a major risk factor for MetS and that the BAI is better than the BMI in estimating body fat content [30,31]. In this study, weight, WC, and BAI were used as the obesity factor, providing more comprehensive results than the study by W Zhang [32], which used BMI and WC as the obesity factor. The main characteristic of blood lipids in this study was the high prevalence of decreased HDL-C levels [33]. Therefore, we extracted HDL-C and APOA as the blood lipid factor in MetS, and the contribution of these 2 lipid variables was greater than the contribution of the 4 variables reported in a previous study [34], suggesting that the lipid factor extracted in this study better reflects the lipid characteristics of Kazakhs with MetS. HFF was a dominant factor in Kazakh MetS patients, ranking second in men and first in women, which differs from the results in the Han Chinese population [34]. HFF may be an important factor in the assessment of MetS in Kazakhs. The contribution of EMF (ALT & AST & α-HBDH) was not high in our study, but these enzymes are important indicators for the diagnosis of non-alcoholic fatty liver disease. Non-alcoholic fatty liver disease has been shown to increase the risk of developing MetS, and its prevalence is higher in Kazakhs than in local Han Chinese [35,36]. Therefore, ALT, AST and α-HBDH were selected as the liver enzyme factor for MetS in Kazakhs.
In our study, the contribution of RMF (UA & CREA & BUN) was higher in men (9.00%) than in women (7.04%). In Wen-chao Zhang's study [15], RMF was not extracted for men, but the female RMF contribution rate (6.81%) was similar to the rate in this study. This may be because CREA was not included in that study, leading to weak RMF aggregation in men. In this study, blood pressure and glucose metabolism were the 6th and 7th factors in men and the 4th and 7th factors in women, each contributing less than 10%. Consistent with the findings of A Ghosh [37], the contribution of the blood pressure and glycemic factors was the smallest. In studies of diabetic subjects [38,39], blood pressure and blood glucose were the 1st and 2nd factors, suggesting that the contribution of each factor to MetS differs between study populations. Albumin is a common biochemical indicator and fluctuates in the human body due to disease. In this study, albumin was incorporated into LF in men and HFF in women. Because it was highly correlated with the routine physical examination indexes included in this study, it was retained for the accuracy of the factor analysis.
In this study, a CVD predictive model was constructed using age and SP as predictors in the Kazakh population, with a predictive ability higher than that of the Framingham Risk Score (FRS; AUC approximately 0.80) [40]. This was also higher than the predictive ability achieved using a Chinese evaluation method for 10-year risk of ischemic CVD (hereinafter referred to as the "Chinese evaluation method") in the optimal model (for men: AUC = 0.796; for women: AUC = 0.791) and the simple model (for men: AUC = 0.792; for women: AUC = 0.783) [41]. The Chinese evaluation method used only 8 risk factors as model predictors, which is the same as the number of predictive factors in our study. However, the potential factors in this study include obesity, blood pressure abnormalities, dyslipidemia, and glucose metabolism disorders. These cover the risk factors for CVD more comprehensively and enhance the predictive ability of the model. Li Jiqing's findings showed that the male AUC in training samples was lower than that in the current study (0.837 vs 0.857) [16]. However, the AUC in females was better than the AUC in this study (0.897 vs 0.852). Although that study still used the Chinese evaluation method, it also incorporated electrocardiographic information that was highly correlated with CVD. These data directly reflect CVD status and can significantly improve the predictive and diagnostic capabilities of the model. Current studies that use MetS as a predictor of CVD include only its classic components [42,43] and overlook the metabolic network of liver enzymes, renal function, bilirubin, and other components that affect CVD. However, Zhenxin Zhu's study [20] incorporated 8 factors of the MetS network into a CVD prediction model, with an AUC as high as 0.994 and 0.998 for men and women, respectively. We speculate that the present predictive model covers the risk factors of CVD more fully, which is key to improving predictive ability.
Therefore, predictors should cover both the risk factors and the incidence of CVD.
External validation is important for assessing the feasibility of the CVD model. The classic FRS model shows a large prediction error for Hispanic, Asian, and elderly American populations and significantly overestimates risk in a Han Chinese population [44], but has a better predictive ability for the Mongolian population [45]. Wang Fang's external validation of the Chinese evaluation method in a Chinese health check-up group obtained an AUC of 0.717, which was significantly lower than the AUC in the optimal model and simple model [46]. The predictive power may have been reduced because the outcome measured was carotid atherosclerotic plaque. When external validation was applied to the model established here, the AUC was 0.914 (95% CI: 0.832-0.963) for men and 0.848 (95% CI: 0.774-0.905) for women, both approximately 0.85 or higher. This suggests that the CVD model can be applied to local Kazakhs, especially for CVD prediction in Kazakh men. The model based on age and SP can be used to predict the incidence of CVD in the local Kazakh population, which is conducive to early screening in primary health institutions, and can thus help reduce the overall incidence of CVD by controlling risk factors.
Our study had the following advantages. First, the study is based on the disease characteristics of local Kazakhs. The prevalence rates of MetS, hypertension, and obesity in Kazakhs are significantly higher than those in other ethnic groups residing in the same area. MetS was used to construct a CVD prediction model for the Kazakh population, and similar studies have not been reported. Second, the results of this study are based on indicators used in routine physical examinations. The indicators are all derived from national health checkup projects supported by the Chinese government. These indicators are easy to determine and are cost effective. Third, this study used EFA to extract 7 potential factors from 18 routine physical examination indicators to form an SP, which strengthened the predictive ability of the model. This was not a simple summation of factors; it also reflected the correlation between the CVD outcome and the potential factors. Finally, the model underwent internal and external validations to illustrate its applicability. Therefore, this model can be applied to local Kazakh residents for early prevention of CVD and for screening of high-risk individuals.
This study had some limitations. First, routine physical examination indicators cannot cover all risk factors for CVD, which limits the predictive ability of the model. Second, the sample size was not large enough. Because the Kazakh lifestyle is mainly nomadic, data were not collected for CVD patients who did not go to the hospital or those whose hospitalization records were missing. Therefore, this study may underestimate the cumulative incidence of CVD. Subsequent research can supplement follow-up information, increase the number of routine physical examination indicators, extract potential factors that cover more comprehensive information, and improve the predictive ability of the model.
Supporting information
S1