Combined Effects of 19 Common Variations on Type 2 Diabetes in Chinese: Results from Two Community-Based Studies

Background Many susceptibility loci for type 2 diabetes mellitus (T2DM) have recently been identified in Caucasians through genome wide association studies (GWAS). We aimed to determine the association of 11 known loci with T2DM and impaired glucose regulation (IGR), individually and in combination, in Chinese. Methods/Principal Findings Subjects were enrolled in: (1) a case-control study including 1825 subjects with T2DM, 1487 with IGR and 2200 with normal glucose regulation; and (2) a prospective cohort with 734 non-diabetic subjects at baseline. The latter was followed up for 3.5 years, during which 67 subjects developed T2DM. Nineteen single nucleotide polymorphisms (SNPs) were selected for replication in both studies. We found that CDKAL1 (rs7756992), SLC30A8 (rs13266634, rs2466293), CDKN2A/2B (rs10811661) and KCNQ1 (rs2237892) were associated with T2DM, with odds ratios from 1.21 to 1.35. In the prospective study, the fourth quartile of risk scores based on the combined effects of the risk alleles had a 3.05-fold (95% CI, 1.31–7.12) higher risk for incident T2DM as compared with the first quartile, after adjustment for age, gender, body mass index and diabetes family history. This combined effect was confirmed in the case-control study after the same adjustments. The addition of the risk scores to the model of clinical risk factors modestly improved discrimination for T2DM, by 1.6% in the case-control study and 2.9% in the prospective study. Conclusions/Significance Our study provided further evidence for these GWAS-derived SNPs as genetic susceptibility loci for T2DM in Chinese and extended this association to IGR.


Introduction
Type 2 diabetes mellitus (T2DM) is one of the fastest growing diseases, with a major impact on morbidity and premature mortality worldwide. A rapid increase in the prevalence of T2DM and impaired glucose regulation (IGR) has also been observed in China in recent decades [1]. T2DM is a complex disorder characterized by impaired insulin sensitivity and pancreatic β-cell dysfunction, and it involves complicated interactions between genetic variants and environmental factors. Multiple genes have been found to be involved in the pathogenesis of T2DM. Since the first GWAS was published [6], several genome wide association studies (GWAS) and replication studies on common genetic variants in T2DM have been reported in several large white populations [2][3][4][5]. Several new candidate genes (TCF7L2, SLC30A8, HHEX, CDKAL1, CDKN2A/2B, IGF2BP2, KCNQ1, etc.) have been identified in relation to an increased risk for T2DM. Some studies have suggested that these genetic polymorphisms may be involved in the process of insulin production and/or secretion [4,7,8].
IGR includes impaired fasting glucose (IFG) and/or impaired glucose tolerance (IGT). IGR is also known as intermediate hyperglycemia or pre-diabetes and is characterized by high blood glucose concentrations, insulin resistance and impaired insulin secretion. Previous studies have shown that 5–10% of IGT subjects develop diabetes each year, although some of them revert spontaneously to normal glucose tolerance [9,10]. It would be worthwhile to determine whether common genetic variations play any role in the pathogenesis of IGR and whether IGR shares the same genetic risk background with T2DM [11,12].
Despite a moderate effect of individual genetic factors on T2DM and a premature testing for inherited susceptibility based on common risk alleles, the genetic assessment for persons at high risk for T2DM has received much consideration [13]. It is important to understand whether a combination of the major genetic factors would contribute more to T2DM or may be used to stratify high-risk populations [14][15][16][17][18].
Given the differences in genetic background (ethnicity, geographic ancestry, linkage disequilibrium pattern and risk allele frequencies) [19] and risk factor profiles (body composition and insulin secretion/resistance patterns), it is necessary to replicate the genetic association studies in a Chinese population to clarify the roles of those susceptibility genes. In the present study, we aimed to verify the associations of 19 single nucleotide polymorphisms (SNPs) in 11 genes (PPARG, IGF2BP2, CDKAL1, SLC30A8, CDKN2A/2B, HHEX, EXT2, KCNJ11, KCNQ1, MTNR1B and TCF2) with the risk of T2DM and IGR in a Chinese population, and then to investigate the combined effect of these genes on the risk of T2DM in both a case-control study and a prospective cohort.

Ethics statement
This study was approved by the Institutional Review Board of the Ruijin Hospital, Shanghai Jiao Tong University School of Medicine and was in accordance with the principle of the Helsinki Declaration II. The written informed consent was obtained from each participant.

Study population
Case-control study: The participants were recruited from an ongoing glucose survey in Baoshan District of Shanghai from 2004 to 2008. The study population, design and protocols of this case-control study have been previously described [20,21]. In brief, we first invited all registered permanent residents aged 40 or above, by poster advertisement and by mail, to participate in a screening examination. We then collected information on lifestyle, medical history and the use of medications using a questionnaire, and performed anthropometrical measurements, 75-g oral glucose tolerance tests (OGTT), and blood and urine sampling. Eventually, we enrolled in the genetic study 5512 subjects who had completed the OGTT, which included 2200 subjects with normal glucose regulation (NGR, 844 males and 1356 females), 1487 subjects with IGR (595 males and 892 females) and 1825 T2DM patients (802 males and 1023 females).
Prospective study: Nine hundred and forty-four individuals in the Baoshan District who were non-diabetic at baseline in 2005 were invited to participate in the follow-up examination in 2008. After excluding the subjects without DNA samples (n = 190) or without information on glucose metabolism status (n = 20), the remaining 734 subjects were selected for the genetic analysis.

Clinical examination and biochemical analysis
Height, weight, and waist and hip circumferences were measured by experienced physicians. Blood pressure was measured at the non-dominant arm in a seated position after a ten-min rest using an automated electronic device (OMRON Model1 Plus, Japan). Three measurements were taken one min apart, and the average of the three was used in the analysis. The fasting and 2-h OGTT plasma glucose, serum triglycerides, total cholesterol, high-density lipoprotein and low-density lipoprotein cholesterol were determined using an automated biochemical instrument (Beckman CX-7 Biochemical Autoanalyser, Brea, CA, USA). Fasting serum insulin was measured by radioimmunoassay (Sangon Company, Shanghai, China).

Definitions
IGR was defined as IFG (fasting plasma glucose ≥5.6 mmol/l and <7.0 mmol/l) and/or IGT (2-h OGTT plasma glucose ≥7.8 and <11.1 mmol/l). T2DM was diagnosed at fasting plasma glucose ≥7.0 mmol/l and/or 2-h OGTT plasma glucose level ≥11.1 mmol/l and/or treatment with antidiabetic medication (oral agents or insulin injection). A fasting plasma glucose level less than 5.6 mmol/l and a 2-h OGTT plasma glucose level less than 7.8 mmol/l were defined as NGR. The insulin resistance index of the homeostasis model assessment (HOMA-IR) was calculated as fasting plasma insulin (in milliunits per milliliter) × fasting plasma glucose (in millimoles per liter) / 22.5, and β-cell function (HOMA-β) was assessed as fasting plasma insulin (in milliunits per milliliter) × 20 / (fasting plasma glucose − 3.5) (in millimoles per liter).
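The definitions above can be expressed as a short Python sketch. This is only an illustration of the stated thresholds and formulas, not the software used in the study; the function names are hypothetical, glucose values are assumed to be in mmol/l, and insulin is assumed to be supplied in the units the HOMA formulas expect.

```python
def classify_glucose(fpg, ogtt_2h, on_antidiabetic_treatment=False):
    """Classify glucose status from fasting and 2-h OGTT plasma glucose (mmol/l)."""
    # T2DM: FPG >= 7.0 and/or 2-h glucose >= 11.1 and/or antidiabetic treatment
    if on_antidiabetic_treatment or fpg >= 7.0 or ogtt_2h >= 11.1:
        return "T2DM"
    # IGR: IFG (5.6 <= FPG < 7.0) and/or IGT (7.8 <= 2-h glucose < 11.1)
    if fpg >= 5.6 or ogtt_2h >= 7.8:
        return "IGR"
    # NGR: FPG < 5.6 and 2-h glucose < 7.8
    return "NGR"


def homa_ir(fasting_insulin, fpg):
    """HOMA-IR = fasting insulin x fasting glucose / 22.5."""
    return fasting_insulin * fpg / 22.5


def homa_beta(fasting_insulin, fpg):
    """HOMA-beta = fasting insulin x 20 / (fasting glucose - 3.5)."""
    if fpg <= 3.5:
        # The formula is undefined (or negative) at or below 3.5 mmol/l
        raise ValueError("fasting glucose must exceed 3.5 mmol/l")
    return fasting_insulin * 20.0 / (fpg - 3.5)
```

For instance, a subject with a fasting glucose of 6.0 mmol/l and a 2-h glucose of 7.0 mmol/l would be classified as IGR through the IFG criterion alone.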
Genomic DNA was extracted from peripheral blood leukocytes with a standard phenol/chloroform-based method. All the selected SNPs were genotyped with the SNaPshot® Multiplex System (Applied Biosystems) following the manufacturer's protocol. In our study, the call rate ranged from 94% (rs2466293) to 99% (rs3740878) in the case-control study, and from 97% (rs1801282) to 99% (rs564398) in the prospective cohort. There was no significant difference in SNP calling between the case and the control groups. The average concordance rate in the duplicate samples (n = 256) was 99.7%, and all the SNPs were in accordance with Hardy-Weinberg equilibrium (all P ≥ 0.01, Table S1).

Risk score
The risk score was calculated on the basis of SNPs that were significantly associated with T2DM in the present case-control study. We assumed an additive genetic model [28] for each SNP, applying a linear weighting of 0, 1, and 2 to genotypes containing 0, 1, or 2 risk alleles, respectively. Three logistic regression models with different adjustments were used to investigate the effect of risk scores on T2DM and IGR in the case-control analysis and on incident diabetes in the prospective analysis. Multiplicative interactions between conventional risk factors and the risk scores were tested using the likelihood ratio test. To measure the discriminative improvement attributable to the risk score, we plotted receiver-operating characteristic (ROC) curves for a logistic regression model including conventional risk factors and for a model including conventional risk factors and the genetic risk score [29]. The conventional model included age (continuous), gender, family history of diabetes (yes or no) and BMI (continuous).
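The additive scoring described above amounts to counting risk alleles across the selected SNPs. A minimal Python sketch follows; the SNP labels and risk alleles in the usage note are hypothetical placeholders rather than the alleles used in this study, and genotypes are assumed to be coded as two-letter strings.

```python
def risk_allele_count(genotype, risk_allele):
    """Additive coding: 0, 1 or 2 copies of the risk allele in a two-letter genotype."""
    return sum(1 for allele in genotype if allele == risk_allele)


def genetic_risk_score(genotypes, risk_alleles):
    """Sum risk-allele counts over all scored SNPs.

    genotypes    : dict mapping SNP label -> genotype string such as "CT"
    risk_alleles : dict mapping SNP label -> risk allele such as "C"
    Returns None when any scored SNP is missing, so that participants with
    incomplete genotype information can be excluded from the calculation.
    """
    score = 0
    for snp, risk_allele in risk_alleles.items():
        genotype = genotypes.get(snp)
        if genotype is None or len(genotype) != 2:
            return None
        score += risk_allele_count(genotype, risk_allele)
    return score
```

With four scored SNPs, the score ranges from 0 to 8. For example, hypothetical genotypes "CC", "CT", "TT" and "AG" with risk alleles C, C, C and G give a score of 2 + 1 + 0 + 1 = 4.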

Statistical analysis
Deviation from Hardy-Weinberg equilibrium for genotypes at each locus was assessed using the Chi-square test. A multiple logistic regression model was used to investigate the individual effect of these genes on IGR and T2DM. These analyses were based on additive, recessive and dominant models, and adjusted for age, gender and BMI. The statistical analyses were performed using SAS version 8.1 (SAS Institute, Cary, NC). To avoid potentially spurious results in our association replications, the conservative Bonferroni correction was used to ensure a stringent threshold for any positive result. P < 0.0026 (0.05 divided by 19, the total number of SNPs studied) was considered significant. LD estimation of the SNPs was obtained using Haploview version 3.32 (http://www.broad.mit.edu/mpg/haploview/). The current sample size, the minor allele frequencies observed in the present study and the previously reported odds ratios (ORs) for T2DM were used for statistical power estimation (Table S1).
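As an illustration of two of the steps above, the following Python sketch computes the Bonferroni threshold and a one-degree-of-freedom Chi-square test for Hardy-Weinberg equilibrium from genotype counts. It is a simplified sketch with hypothetical function names, not a reproduction of the SAS procedures used in the study.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are the observed counts of the three genotypes.
    Returns (chi-square statistic, P value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("the test is undefined for a monomorphic locus")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With 19 SNPs and an overall alpha of 0.05, `bonferroni_threshold(0.05, 19)` is approximately 0.00263, which corresponds to the P < 0.0026 criterion stated in the text.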

The clinical characteristics of the study subjects
The case-control study had a total of 5512 subjects, including 2200 subjects (39.9%) with NGR, 1487 (27.0%) with IGR and 1825 (33.1%) T2DM patients. The characteristics of the participants were shown in Table 1.
In the prospective study, of the 734 non-diabetic subjects at baseline, 67 subjects turned to T2DM in 3.5 years. The clinical characteristics of the prospective study subjects were shown in Table S2.

Individual effects of polymorphisms on IGR and T2DM
The case-control study. The characteristics of the 19 risk loci and their associations with IGR and T2DM are shown in Table 2. Three genetic models (additive, recessive and dominant) were used to study the associations between the SNPs and IGR or T2DM. SNPs rs10811661 (CDKN2A/2B) and rs2466293 (SLC30A8) were associated with increased risk of both IGR and T2DM. SNP rs7756992 (CDKAL1) was associated with T2DM, but not IGR. SNPs rs13266634 (SLC30A8) and rs2237892 (KCNQ1) were nominally associated with IGR and statistically significantly associated with T2DM (P < 0.0001), whereas two SNPs in IGF2BP2 (rs1470579 and rs4402960), rs5215 (KCNJ11) and rs7501939 (TCF2) were nominally associated with T2DM. SNPs rs1111875 (HHEX) and rs10830963 (MTNR1B) were associated only with the risk of IGR, not T2DM. All analyses were adjusted for age, gender and BMI.
The prospective study. The genotype frequencies and individual risk for incident diabetes were shown in Table 3. The risk allele of SNPs rs10811661 (CDKN2A/2B), rs13266634 (SLC30A8) and rs2466293 (SLC30A8) increased the risk of incident T2DM by 94%, 88% and 152%, respectively, in the recessive model after adjustment for the effect of age, gender and BMI. The risk allele C of rs1387153 (MTNR1B) was associated with the increased risk of T2DM by 85% in the dominant model.

Genetic risk score and risk of type 2 diabetes
The risk score was calculated based on SNPs rs7756992 (CDKAL1), rs2466293 (SLC30A8), rs10811661 (CDKN2A/2B) and rs2237892 (KCNQ1), which were statistically significantly associated with T2DM in the case-control study. The risk score was calculated by summing the number of risk alleles for each participant who had genotyping information for these 4 SNPs (534 participants were excluded from the calculation because of incomplete genotype information). We included SNP rs2466293 of SLC30A8 in the risk score because it was reported to be associated with T2DM in Chinese in our previous study [30] and the correlation between SNPs rs13266634 and rs2466293 was moderate (r² = 0.49) (Table S3). The risk scores were significantly higher in T2DM and IGR than in NGR. The mean risk scores for T2DM, IGR and NGR were 4.45, 4.24 and 3.99, respectively (P < 0.0001), after adjustment for age, gender, BMI, diabetes family history, current smoking and alcohol intake in the case-control analysis. Similarly, the mean risk score was 4.81 for the incident diabetic patients and 4.33 for the non-diabetics in the prospective study, and the difference reached statistical significance (P = 0.02) after adjustment for the same factors as above. The subjects with T2DM or IGR had more risk alleles than those with NGR (both P ≤ 0.0003) (Figure 1A). Also, the T2DM incidence increased significantly with the number of risk alleles (Figure 1B).
In the case-control study, we performed the logistic regression analysis for the association between the risk scores and T2DM and IGR in both continuous and category patterns (Table 4).
Compared with the first quartile of risk scores, the fourth quartile had 1.53- and 2.29-fold higher risk of IGR and T2DM, respectively, after adjustment for age, gender, BMI, current smoking and alcohol intake (Model 2). The category analysis, in which the risk score was classified by quartiles (0-3, 4, 5, 6-8), yielded similar results (Table 4). Furthermore, possible interactions between genetic risk scores and the clinical risk factors in the case-control association study were explored in stratified analysis and by adding interaction terms to the logistic regression models (Table S4). We stratified the study subjects by quartiles of BMI (≤22.9, 23.0-25.0, 25.1-27.4, ≥27.5), quartiles of HOMA-β (≤39.1, 39.2-69.4, 69.5-114.6, ≥114.7) and family history (yes or no). The P values for interaction are shown in Table S4. In each stratum, an increased risk score was associated with the prevalence of T2DM (all P for trend <0.05, Table S4).
In the prospective study, the same models were used (Table 5). The fourth quartile of risk scores had a 3.05-fold (95% CI, 1.31-7.12) higher risk for incident T2DM as compared with the first quartile, after adjustment for age, gender, BMI, diabetes family history, current smoking and alcohol intake (Model 2, Table 5).
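The quartile grouping of the risk score (0-3, 4, 5, 6-8) and the comparison of a higher quartile with the first can be sketched in Python as follows. Note the simplification: the study used adjusted logistic regression, whereas this sketch computes only a crude (unadjusted) odds ratio with a Woolf-type 95% confidence interval from a 2 × 2 table, and the function names are hypothetical.

```python
import math


def risk_score_category(score):
    """Map a risk score (0-8) to the quartile groups 0-3, 4, 5 and 6-8."""
    if not 0 <= score <= 8:
        raise ValueError("risk score must be between 0 and 8")
    if score <= 3:
        return 1
    if score == 4:
        return 2
    if score == 5:
        return 3
    return 4


def crude_odds_ratio(cases_high, controls_high, cases_ref, controls_ref, z=1.96):
    """Unadjusted odds ratio of a higher quartile versus the reference quartile.

    Returns (odds ratio, lower 95% limit, upper 95% limit).
    """
    cells = (cases_high, controls_high, cases_ref, controls_ref)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (cases_high * controls_ref) / (controls_high * cases_ref)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

A crude estimate of this kind will generally differ from the adjusted estimates reported in Tables 4 and 5, since it ignores age, gender, BMI and the other covariates.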

Discriminative improvement attributable to the risk score
The combined effect of genetic and clinical risk factors on T2DM is shown in Figure 2. The area under the ROC curve was 0.714 for clinical risk factors alone and 0.730 for the combined genetic risk score and clinical risk factors (both P < 0.0001). Thus, discrimination increased by only 1.6% as compared with clinical factors alone in the case-control study (Figure 2A). In the prospective study, adding the genetic risk score improved discrimination for incident T2DM by 2.9% as compared with clinical factors alone (Figure 2B; the area under the ROC curve was 0.634 for clinical factors and 0.663 for combined risk factors, P = 0.002 and <0.0001, respectively).
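The improvement figures above are differences between two areas under the ROC curve, expressed in percentage points. The Python sketch below shows one standard way to compute such an area from predicted risks (the rank-based, Mann-Whitney formulation) and the resulting gain; it illustrates the quantity being compared and is not the procedure used to produce Figure 2. The function names are hypothetical.

```python
def roc_auc(case_scores, noncase_scores):
    """Area under the ROC curve as the probability that a randomly chosen case
    has a higher predicted risk than a randomly chosen non-case (ties count 0.5)."""
    if not case_scores or not noncase_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


def auc_gain_percentage_points(auc_clinical, auc_combined):
    """Improvement in discrimination, in percentage points of AUC."""
    return (auc_combined - auc_clinical) * 100.0
```

Applied to the reported areas, the gain is (0.730 − 0.714) × 100 = 1.6 percentage points in the case-control study and (0.663 − 0.634) × 100 = 2.9 percentage points in the prospective study.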

Discussion
In our present study, the findings support the individual associations of CDKN2A/2B (rs10811661), SLC30A8 (rs13266634 and rs2466293), CDKAL1 (rs7756992) and KCNQ1 (rs2237892) with not only T2DM but also IGR in a case-control study. We also confirmed the predictive effect of CDKN2A/2B (rs10811661), SLC30A8 (rs13266634 and rs2466293) on the incident T2DM in the 3.5 year follow-up study. Furthermore, we found that the combination of the risk alleles demonstrated a more robust association with T2DM and IGR than a single one after adjustment for the common clinical risk factors, such as age, gender, BMI and diabetes family history in both case-control and prospective studies. The combined genetic risk scores only had a discriminative improvement of 2.9% for incidence of T2DM as compared with clinical risk factors alone.
We observed a significant association between T2DM and SLC30A8 (rs13266634 and rs2466293) and CDKN2A/2B (rs10811661) in not only the case-control study but also the prospective study. These findings are consistent with what has been observed in a large Caucasian population in Denmark [31] and in several populations in Asia [30,[32][33][34]. Recently, we verified that the SLC30A8 gene is a susceptibility locus for T2DM in the Chinese population [30]. We confirmed that rs13266634 was associated with T2DM and reported that rs2466293 (one of the tagger SNPs of SLC30A8) was nominally associated with T2DM. Moreover, SNP rs13266634 is correlated with glucose-stimulated insulin secretion. A similar observation has also been confirmed in a Japanese population [32]. In a case-control replication study of 6719 Asians, including a Chinese cohort from Hong Kong and two Korean cohorts, these candidate genes made a critical contribution to T2DM, as compared with Caucasians [33]. Functional studies [35,36] found that zinc transporter 8 (ZnT8) is required for normal insulin crystallization and for insulin processing and secretion. The R allele of rs13266634 (W325R) may increase T2DM risk [35]. Further studies that focus on small-molecule activators targeting ZnT8 may thus represent an interesting means to treat insulin secretory deficiency in T2DM.
SNP rs10811661 is located 125 kilobases upstream of the CDKN2A/2B gene. Given the prior knowledge on SNP function, we assumed that SNP rs10811661 might exert its effect on transcription directly, or indirectly through an unknown locus that is in high LD with this variant. The CDKN2A/2B genes are expressed in adipocytes and pancreatic islets [6]. CDKN2A/2B encodes p16INK4a, a tumor suppressor influencing pancreatic β-cell proliferation [37,38]. It is possible that a causal variant situated in CDKN2A/2B increases the susceptibility to T2DM through β-cell mass reduction and subsequent impairment of insulin release in states of increased insulin demand.
The KCNQ1 gene is regarded as a confirmed risk locus for T2DM in Chinese [39,40]. In the present study, we confirmed that SNP rs2237892 in KCNQ1 was associated with T2DM and IGR, with odds ratios of 1.35 and 1.17, respectively, in the case-control analysis, but not in the prospective study. The predictive effect of the KCNQ1 gene for incident diabetes and the potential mechanism of this gene in the pathogenesis of T2DM remain to be explored.
We found some evidence of a combined effect of those risk alleles on T2DM in both the case-control and prospective studies. These results are consistent with those reported by Scott et al. [3] and other groups [14][15][16][17][18]. Scott et al. examined the combined effect of ten risk variants in a GWAS of Europeans and found a fourfold variation in T2DM risk from the lowest to the highest predicted risk groups. However, they pointed out that the predictions based on their data might be biased owing to a likely overestimation of ORs because of enrichment for familial T2DM and exclusion of individuals with impaired glucose tolerance or impaired fasting glucose. The risk score in our study improved case-control discrimination beyond what the clinical risk factors could provide, but the magnitude of this improvement was small. This was consistent with other studies performed in prospective populations that reported the joint effect of multiple risk loci and the combined prediction of incident T2DM [14][15][16][17][18]. Lyssenko et al. [16] suggested that the addition of genotyping data from the known DNA variants to clinical risk factors, including a family history of diabetes, had a minimal, albeit statistically significant, effect on the prediction of future T2DM, and that the assessment of genetic risk factors is more meaningful in early life. However, a replication in a larger prospective population would be more convincing in affirming whether combinations of risk alleles from these variants provide a better predictive and diagnostic potential in Chinese.
We included subjects with IGR (impaired fasting glucose and impaired glucose tolerance) in the present study. Few studies have addressed the association of these GWAS variations with IGR [11]. IFG and/or IGT predispose to diabetes; however, whether IGR and T2DM share the same spectrum of genetic variations is not well characterized. In our study, we found that the majority of the SNPs associated with T2DM also conferred risk of IGR. Our study provided evidence that IGR might have a similar background of susceptibility genetic variations. However, because IGR includes IFG and IGT, which may have different genetic etiologies [41,42], more prospectively designed association studies with large sample sizes and more SNPs are needed in the near future.
Our present study has strengths and limitations. The main strength was that we explored the combined effect of those susceptibility genes in both a case-control study with a moderate sample size and a 3.5-year follow-up study. We speculated that the joint effect of the genetic variations validated in our study provided a stronger association with the risk of T2DM and IGR. This study extended the knowledge about the genetic factors and the pathogenesis of T2DM beyond the Caucasian population. Some limitations should also be addressed. The sample size for the prospective study was relatively small and the cases of incident T2DM were limited. Only two of the SNPs that were found to be significantly associated with T2DM in the case-control analysis were validated in the prospective study.
In conclusion, our study affirmed the associations of SNPs in the CDKN2A/2B, SLC30A8, KCNQ1 and CDKAL1 genes with the risk of IGR and T2DM in a case-control study, and stronger associations were found when the risk alleles were combined. Our study provided further evidence that these GWAS-derived genetic susceptibility variations are also important for T2DM in Chinese and extended the association of these variations to IGR.