Cumulative Effect and Predictive Value of Genetic Variants Associated with Type 2 Diabetes in Han Chinese: A Case-Control Study

Background
Genome-wide association studies (GWAS) have identified dozens of single nucleotide polymorphisms (SNPs) associated with type 2 diabetes risk. We have previously confirmed the associations of genetic variants in HHEX, CDKAL1, VEGFA and FTO with type 2 diabetes in Han Chinese. However, the cumulative effect and predictive value of these GWAS-identified SNPs on the risk of type 2 diabetes in Han Chinese are largely unknown.
Methodology/Principal Findings
We conducted a two-stage case-control study consisting of 2,925 cases and 3,281 controls to examine the association of 30 SNPs identified by GWAS with type 2 diabetes in Han Chinese. Significant associations were found for proxy SNPs at KCNQ1 [odds ratio (OR) = 1.41, P = 9.91 × 10⁻¹⁶ for rs2237897], CDKN2A/CDKN2B (OR = 1.30, P = 1.34 × 10⁻¹⁰ for rs10811661), CENTD2 (OR = 1.28, P = 9.88 × 10⁻⁴ for rs1552224) and SLC30A8 (OR = 1.19, P = 1.43 × 10⁻⁵ for rs13266634). We further evaluated the cumulative effect on type 2 diabetes of these 4 SNPs, in combination with 5 SNPs at HHEX, CDKAL1, VEGFA and FTO reported previously. Individuals carrying 12 or more risk alleles had a nearly 4-fold increased risk of developing type 2 diabetes compared with those carrying fewer than 6 risk alleles [adjusted OR = 3.68, 95% confidence interval (CI): 2.76–4.91]. Adding the genetic factors to clinical factors slightly improved the prediction of type 2 diabetes, with the area under the receiver operating characteristic curve increasing from 0.76 to 0.78. Although small, the difference was statistically significant (P < 0.0001).
Conclusions/Significance
We confirmed associations of SNPs in KCNQ1, CDKN2A/CDKN2B, CENTD2 and SLC30A8 with type 2 diabetes in Han Chinese. The utilization of genetic information may improve the accuracy of risk prediction in combination with clinical characteristics for type 2 diabetes.


Introduction
Type 2 diabetes is a major health problem that affects more than 300 million individuals worldwide [1], and its prevalence is continuously increasing in many countries, especially in China [2]. Genetic factors contribute to the pathogenesis of type 2 diabetes. Identifying the relevant genetic variants is critical for risk prediction and for targeting preventive interventions. Success in identifying type 2 diabetes-associated genetic variants has led to suggestions that they may be useful in predicting an individual's risk of developing the disease.
Here, we tested the association of 39 GWAS-identified SNPs from 30 genes with type 2 diabetes in a two-stage case-control study consisting of 2,925 cases and 3,281 controls in Han Chinese. We further evaluated the joint effect of the associated genetic variants and the performance of these SNPs in risk prediction for type 2 diabetes.

Ethics statement
Written informed consent was obtained from every participant and this study was approved by the Ethical Committee of Nanjing Medical University.

Study subjects
The study was a two-stage (i.e. the discovery stage and the replication stage) case-control study including a total of 6,206 participants, as previously described [28]. All participants were derived from two community-based cross-sectional surveys. Subjects were considered to be type 2 diabetes cases if they had a history of type 2 diabetes or if their fasting blood glucose (FBG) was ≥ 7.0 mmol/l. Individuals with no history of diabetes, hypertension, coronary heart disease, stroke or cancer and with FBG < 5.6 mmol/l were selected as controls. The discovery stage contained 1,200 type 2 diabetes cases and 1,200 controls, while the replication stage contained 1,725 type 2 diabetes cases and 2,081 controls. All participants were unrelated and of Chinese Han ancestry residing in Jiangsu Province, China.
… minor allele frequency (MAF) < 0.05 in Han Chinese based on the HapMap database. Among the remaining 61 SNPs, where multiple SNPs in the same region were in strong linkage disequilibrium (LD) (r² ≥ 0.8), those frequently reported or residing in a functional region were given priority. As a result, 48 SNPs from 34 genes were selected. As 9 SNPs (i.e. rs7756992, rs6931514, rs4712524, rs4712523 in CDKAL1, rs9472138 in VEGFA, rs1111875, rs7923837, rs5015480 in HHEX, rs8050136 in FTO) had been reported in our previous studies [28–30], the other 39 SNPs from 30 genes were included in this study (S1 Table).
In the discovery stage, 39 SNPs were genotyped in 1,200 type 2 diabetes cases and 1,200 controls using the TaqMan OpenArray Genotyping System (Life Technologies, Carlsbad, USA). DNA samples with standardized concentration were loaded and amplified on 48-sample arrays according to the manufacturer's manual. On every chip, equal numbers of cases and controls and two no-template controls (NTCs) were assayed simultaneously. The overall call rate was 98.7%. In the replication stage, the 10 SNPs with P < 0.05 in the discovery stage were further genotyped with the iPLEX Sequenom MassARRAY platform (Sequenom, Inc). For quality control, two NTCs were included in each 384-sample plate and genotyping was conducted blindly. The overall call rate at this stage was 99.5%, with a call rate > 99.0% for each SNP. The concordance rate for each SNP, calculated from 150 duplicate samples, was 100%.

Statistical analyses
Hardy-Weinberg equilibrium was assessed using a likelihood ratio test. Student's t test was used to compare mean values of clinical characteristics between cases and controls. Unconditional logistic regression analysis was used to examine the association between each SNP and type 2 diabetes risk under the additive genetic model, with adjustment for gender, age, and body mass index (BMI) as covariates. Conditional analyses of each SNP were conducted using logistic regression under the additive genetic model, with adjustment for age, sex, BMI and any of the other SNPs from the same gene. Two methods were used to calculate the genetic score. The first treated each risk allele equally and combined them based on the count of risk alleles (each SNP was coded as 0, 1, or 2 for the number of risk alleles carried). The second assessed the effects of the SNPs using a risk score analysis with a linear combination of the SNP genotypes weighted by their individual OR. The former was used for description and association, while the latter was used for prediction. We reported the predictive value in terms of both the area under the receiver operating characteristic curve (AUC) and classification rates. Improvement in prediction was assessed after adding the weighted genetic scores to the environmental risk factors. All statistical analyses were performed using PLINK 1.07 and Stata software (version 11.1; StataCorp LP, College Station, Texas).
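The two genetic scores described above can be illustrated with a minimal sketch. It uses the per-allele ORs reported in the abstract for the four SNPs confirmed in this study. The weights are taken as ln(OR), a common convention; the text states only that genotypes were "weighted by their individual OR", so this choice is an assumption. The function names and the example participant are hypothetical.

```python
import math

# Per-allele odds ratios reported in the abstract for the four confirmed SNPs.
ODDS_RATIOS = {
    "rs2237897": 1.41,   # KCNQ1
    "rs10811661": 1.30,  # CDKN2A/CDKN2B
    "rs1552224": 1.28,   # CENTD2
    "rs13266634": 1.19,  # SLC30A8
}


def _check(genotypes):
    """Each SNP must be coded as 0, 1 or 2 risk alleles."""
    for snp, n_risk in genotypes.items():
        if n_risk not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1 or 2")


def count_score(genotypes):
    """Unweighted score: total number of risk alleles carried."""
    _check(genotypes)
    return sum(genotypes.values())


def weighted_score(genotypes, odds_ratios=ODDS_RATIOS):
    """Weighted score: risk-allele counts weighted by ln(OR) (assumed weighting)."""
    _check(genotypes)
    return sum(math.log(odds_ratios[snp]) * n for snp, n in genotypes.items())


# Hypothetical participant, for illustration only.
example = {"rs2237897": 2, "rs10811661": 1, "rs1552224": 2, "rs13266634": 0}
print(count_score(example))               # 5
print(round(weighted_score(example), 3))
```

With equal weights, every risk allele contributes the same amount; with ln(OR) weights, alleles at loci with larger effects (here KCNQ1) contribute more to the score.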

Results
Clinical characteristics of the 6,206 participants are shown in S2 Table. There were no significant differences in the distribution of gender between cases and controls in either stage. Overall, those with type 2 diabetes were older than the controls (P < 0.01). The levels of BMI, FBG, triglyceride (TG), and total cholesterol (TC) were significantly higher, whereas the level of high density lipoprotein-cholesterol (HDL-C) was significantly lower, in type 2 diabetes cases than in controls in both stages and in the combined analysis (P < 0.001). All SNPs in controls were in Hardy-Weinberg equilibrium (P > 0.05).
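As an illustration of the Hardy-Weinberg check in controls, the following is a minimal sketch of a likelihood ratio (G) test for a single biallelic SNP. It is not the PLINK or Stata implementation used in the study; the function name and the genotype counts are hypothetical, and the P value assumes a chi-square distribution with 1 degree of freedom.

```python
import math


def hwe_lrt(n_aa, n_ab, n_bb):
    """Likelihood ratio test of Hardy-Weinberg equilibrium.

    Takes the three genotype counts and returns (G statistic, P value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # G = 2 * sum(O * ln(O / E)); empty genotype classes contribute zero.
    g = 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    # Survival function of chi-square with 1 df: P(X > g) = erfc(sqrt(g / 2)).
    return g, math.erfc(math.sqrt(g / 2.0))


# Hypothetical control genotype counts, for illustration only.
g_stat, p_value = hwe_lrt(25, 50, 25)
print(g_stat, p_value)  # counts in exact HWE proportions give G = 0 and P = 1
```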
In the discovery stage with 1,200 cases and 1,200 controls, 10 of the 39 SNPs, from 8 genes, showed suggestive associations with type 2 diabetes risk (P < 0.05) (Table 1). The 10 suggestive SNPs were further genotyped in the replication stage with an additional 1,725 cases and 2,081 controls. As shown in Table 2, 6 SNPs (rs13266634 from SLC30A8, rs10811661 from CDKN2A/CDKN2B, rs2237897, rs2237892 and rs2237895 from KCNQ1, rs1552224 from CENTD2) were consistently associated with type 2 diabetes risk, with P values less than 0.05. After combining the two stages, all 6 SNPs were significantly associated with type 2 diabetes susceptibility after Bonferroni correction (P < 1.28 × 10⁻³). As 3 SNPs from KCNQ1 were significantly associated with type 2 diabetes in the study, we used logistic regression to determine the independent effects of these SNPs and found that SNP rs2237897 conferred independent risk for type 2 diabetes, as shown in S3 Table.
To investigate whether genes affected type 2 diabetes additively, we examined the joint effect of risk alleles of SNPs. We included the above-mentioned 4 SNPs from 4 genes (i.e. rs13266634 from SLC30A8, rs10811661 from CDKN2A/CDKN2B, rs2237897 from KCNQ1, rs1552224 from CENTD2) and 5 additional SNPs (i.e. rs7756992 from CDKAL1, rs9472138 from VEGFA, rs1111875 and rs7923837 from HHEX, rs8050136 from FTO) that we previously reported [28–30]. All these SNPs showed independently significant associations with type 2 diabetes. Thus, a total of 9 SNPs were included to analyze the cumulative effect of genetic factors on type 2 diabetes. A genotype score ranging from 0 to 18 was constructed on the basis of the number of risk alleles for each participant who had genotyping information for all 9 SNPs. The mean (± SD) genotype score was 8.22 (± 1.88) among 2,853 subjects with type 2 diabetes and 7.58 (± 1.90) among 3,210 controls (P < 0.001).
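The Bonferroni threshold quoted above (P < 1.28 × 10⁻³) is numerically consistent with dividing a nominal significance level of 0.05 by the 39 SNPs tested. That the threshold was derived with exactly this divisor is an inference from the numbers, not something the text states:

```latex
\alpha_{\mathrm{Bonferroni}} = \frac{\alpha}{m} = \frac{0.05}{39} \approx 1.28 \times 10^{-3}
```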
The proportions of type 2 diabetes cases and controls grouped by the number of risk alleles that they carried are shown in Fig. 1 and S4 Table. The percentage of individuals with type 2 diabetes, and the risk of developing the disease, increased with the number of risk alleles carried. Individuals carrying 12 or more risk alleles of the 9 SNPs (6.80% of type 2 diabetes cases and 3.64% of controls) had a nearly 4-fold increased risk of developing type 2 diabetes compared with the reference group carrying 0-5 risk alleles (6.76% of type 2 diabetes cases and 12.99% of controls) (Fig. 2 and S4 Table). Subjects in the upper quartile of the risk score had a 2.22-fold increased type 2 diabetes risk compared with those in the lowest quartile (adjusted OR = 2.22, 95% CI: 1.91-2.56, P for trend: 5.2 × 10⁻³¹) (S4 Table).
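As a rough check on the size of this effect, an unadjusted (crude) odds ratio can be computed directly from the percentages reported above, since the group totals cancel. This is only an illustrative calculation and not the adjusted estimate reported in the study (adjusted OR = 3.68), which comes from logistic regression with covariates:

```latex
\mathrm{OR}_{\text{crude}}
  = \frac{6.80\% \,/\, 6.76\%}{3.64\% \,/\, 12.99\%}
  = \frac{6.80 \times 12.99}{6.76 \times 3.64}
  \approx 3.59
```

The crude value is close to the adjusted one, which is in line with the description of a nearly 4-fold increase in risk.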
We evaluated the predictive value of these genetic variants in our study population of 2,925 type 2 diabetes cases and 3,281 controls. In all samples, the AUC for clinical characteristics (age and BMI) was 0.76 (95% CI: 0.74-0.77), while the AUC for the weighted genetic score based on 9 SNPs was 0.60 (95% CI: 0.58-0.62). When we added the weighted genetic score to the regression model of age and BMI, the AUC increased slightly to 0.78 (95% CI: 0.77-0.79). Although small, the difference was statistically significant (P < 0.0001) (Fig. 3). The correctly classified rates were 71.07% and 72.01% before and after adding the weighted genetic score to the clinical model of age and BMI, respectively (S5 Table).
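To make the AUC values above concrete, the sketch below computes an AUC as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. This is the standard rank-based interpretation of the AUC and not the Stata routine used in the study; the function name and the scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """AUC as P(case score > control score), counting ties as 1/2.

    Simple O(n * m) pairwise comparison; fine for illustration,
    slow for very large samples.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk-allele counts, for illustration only.
cases = [9, 8, 10, 7]
controls = [7, 6, 8, 5]
print(auc(cases, controls))  # 0.875
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, which is why a genetic score with an AUC of 0.60 adds only modestly to a clinical model that already reaches 0.76.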

Discussion
In the present study, we conducted a two-stage case-control study with a total of 6,206 Han Chinese subjects in order to investigate the associations of 39 SNPs from 30 genes previously identified through GWAS for type 2 diabetes. We confirmed the associations of genetic variations in SLC30A8, CDKN2A/CDKN2B, KCNQ1 and CENTD2 with type 2 diabetes. The cumulative effect analysis of 9 SNPs showed that the risk of type 2 diabetes was up to 3.68-fold higher in subjects carrying 12 or more risk alleles than in those carrying 5 or fewer risk alleles. These genetic variants also showed potential utility in risk prediction for type 2 diabetes.
Our study is the first to confirm the association between genetic variants in CENTD2 and type 2 diabetes susceptibility in Han Chinese. CENTD2 (also known as ARAP1) was initially identified as significantly associated with type 2 diabetes susceptibility in populations of European descent in 2010, and its risk allele was associated with reduced beta-cell function in nondiabetic subjects [18]. However, two subsequent studies, one in lean Han Chinese individuals and one in Japanese, failed to replicate the association [33,34]. Our study confirmed the association of rs1552224 in CENTD2 with type 2 diabetes in a relatively large study of Chinese. Strawbridge et al. reported that the polymorphism of CENTD2 was associated with fasting proinsulin levels in 10,701 nondiabetic adults of European ancestry and that the proinsulin-raising allele was associated with lower fasting glucose, improved β-cell function, and lower risk of type 2 diabetes [35]. CENTD2 was also significantly associated with increased plasma glucose values and decreased glucose-stimulated insulin release, suggesting that the diabetogenic effect of this locus is mediated through impaired pancreatic beta-cell function [36].
Among the confirmed SNPs in our study, the strongest association with type 2 diabetes was observed for SNP rs2237897 from the KCNQ1 gene, as each risk allele increased the odds of type 2 diabetes by 41%. Another SNP, rs2237892, which was in moderate LD with rs2237897 in the Chinese population (r² = 0.67), showed a similar effect size. SNP rs2237895 (r² of 0.24 with rs2237897) also showed a significant but relatively weak association. The associations of these SNPs with type 2 diabetes were identified from GWAS in Japanese or Chinese [11,12,15] and have been replicated in Korean [37], Pakistani [38], Indian [39], Scandinavian [40] and Chinese [22,27,41,42,43] populations. The polymorphisms were also found to be associated with increased fasting glucose and impaired insulin secretion, according to the homeostasis model assessment of beta-cell function [11,40,41]. Based on all of this evidence, KCNQ1 seems to have been robustly confirmed as a type 2 diabetes susceptibility gene in Han Chinese and may confer type 2 diabetes risk by impairing beta-cell function. In addition, the current study also replicated the significant associations of two other genes (CDKN2A/CDKN2B and SLC30A8) with type 2 diabetes susceptibility, which was consistent with the observations from other studies in Han Chinese [20–27].
One potential application of identifying genetic variants associated with type 2 diabetes is the use of genetic information to help predict risk of the disease, which may facilitate preventive interventions for those at the highest risk. A few studies on predicting type 2 diabetes based on genetic polymorphisms have been conducted. Weedon et al. reported that the AUC for 3 polymorphisms was 0.58 [44], while Vaxillaire et al. showed that the AUCs for a combination of 3 SNPs were 0.55 or 0.56 [45]. Van Hoek et al. reported that the AUC was 0.58 (0.56-0.61) when including the 9 significantly associated genetic variants, and that the difference between the AUCs for clinical characteristics (age, sex, and BMI) with and without the genetic polymorphisms was significant (P < 0.0001) [46]. Hu evaluated the predictive value of 11 SNPs in a Chinese population and found that the AUC for the number of risk alleles was 0.621 (95% CI: 0.604-0.639) [21]. Our study showed that the AUC of 9 SNPs was 0.60 and that the difference between the AUCs for clinical characteristics (age and BMI) with and without the genetic polymorphisms was significant. These results are similar to previous reports. Despite the limited predictive value of genetic variants, Lango et al. previously reported that patients with high genetic risk had lower BMI and an earlier age of onset than those with relatively low genetic risk [47]. These findings highlight the potential usefulness of genetic variants for risk prediction of type 2 diabetes, especially in non-obese, young-onset case subjects.
The strengths of our study included the large size of the study population, the two-stage case-control design, and the relatively systematic study strategy for type 2 diabetes-related genetic variants. However, we did not include established type 2 diabetes-associated genes with larger effect sizes, such as TCF7L2. The discriminative accuracy of predictive genetic testing in complex diseases depends on the number of genes involved, the risk allele frequencies, and the size of the associated risks. Although the case-control design is efficient for discovering disease-associated SNPs, it is suboptimal for evaluating predictive performance. Further studies should aim at genotyping more genetic susceptibility variants and predicting new cases in prospective studies.
In conclusion, in addition to 4 genes reported previously, we further confirmed that 4 of 30 genes were associated with risk of type 2 diabetes in Han Chinese. These genes may improve the accuracy of risk prediction in combination with clinical characteristics.
Supporting Information
S5 Table. Comparison of prediction of type 2 diabetes with and without weighted genetic score using classification rate. (DOC)