Validation of Type 2 Diabetes Risk Variants Identified by Genome-Wide Association Studies in Han Chinese Population: A Replication Study and Meta-Analysis

Background Several genome-wide association studies (GWAS) involving European populations have successfully identified risk genetic variants associated with type 2 diabetes mellitus (T2DM). However, the effects conferred by these variants in the Han Chinese population have not yet been fully elucidated. Methods We analyzed the effects of 24 risk genetic variants with reported associations from European GWAS in 3,040 Han Chinese subjects in Taiwan (including 1,520 T2DM cases and 1,520 controls). The discriminative power of the prediction models with and without genotype scores was compared. We further meta-analyzed the association of these variants with T2DM by pooling all candidate-gene association studies conducted in Han Chinese. Results Five risk variants in the IGF2BP2 (rs4402960, rs1470579), CDKAL1 (rs10946398), SLC30A8 (rs13266634), and HHEX (rs1111875) genes were nominally associated with T2DM in our samples. The odds ratio was 2.22 (95% confidence interval, 1.81-2.73, P<0.0001) for subjects in the highest genetic score quartile (score>34) as compared with subjects in the lowest quartile (score<29). The incorporation of the genotype score into the predictive model increased the C-statistic from 0.627 to 0.657 (P<0.0001). These estimates are very close to those observed in European populations. Gene-environment interaction analysis showed a significant interaction between rs13266634 in the SLC30A8 gene and age on T2DM risk (P<0.0001). Further meta-analysis pooling 20 studies in Han Chinese confirmed the association of 10 genetic variants in the IGF2BP2, CDKAL1, JAZF1, SLC30A8, HHEX, TCF7L2, EXT2, and FTO genes with T2DM. The effect sizes conferred by these risk variants in Han Chinese were similar to those observed in Europeans, but the allele frequencies differ substantially between the two populations. Conclusion We confirmed the association of 10 variants identified by European GWAS with T2DM in the Han Chinese population. The incorporation of genotype scores into the prediction model led to a small but significant improvement in T2DM prediction.


Introduction
Type 2 diabetes mellitus (T2DM) is a complex disease influenced by both genetic and environmental factors. The heritability of T2DM is relatively strong, with an estimated h² of 31-69% [1]. Previous genetic studies have suggested the involvement of multiple genes with modest effects in the pathogenesis of T2DM [2]. This notion was supported by several genome-wide association studies (GWAS) for T2DM in European populations [3][4][5][6][7]. These GWAS showed associations of approximately 40 risk variants with T2DM in European populations. Further large-scale meta-analyses confirmed these associations and estimated their relative contributions in European populations [8]. These discoveries greatly advanced our understanding of the genetic architecture of T2DM and provided valuable tools for the prediction of personal T2DM risk in European populations.
The prevalence of diabetes mellitus has increased rapidly in Chinese populations in recent decades. In 2013, the prevalence of diabetes and prediabetes was estimated to be 11.6% and 50.1%, respectively, suggesting that there were 113.9 million Chinese adults with diabetes and 493.4 million with prediabetes [9]. This dramatic surge in T2DM prevalence poses a serious threat to the public health of Chinese populations. Since diabetes can be effectively prevented by lifestyle or pharmacological intervention in high-risk patients, it is important to identify high-risk subjects for preventive measures. Given the strong heritability of diabetes, genetic information is expected to offer additional benefits for the identification of high-risk subjects. Previous studies incorporating genetic scores into T2DM prediction models have demonstrated the benefit of such an approach [10,11]. However, this approach has not yet been validated in the Han Chinese population. Given the heterogeneous genetic structures of European and Chinese populations, it is essential to confirm the association and the predictive value of these genetic variants in the Chinese.
In this case-control study, we genotyped 24 risk variants identified from European GWAS in 3,040 Han Chinese. The associations of these variants with T2DM were analyzed in our sample and were further validated by a meta-analysis pooling 20 case-control association studies of Han Chinese. The discriminative power of the prediction models with and without genotype information was then compared.

Study participants
A total of 760 T2DM patients were recruited from the metabolism clinics of the National Taiwan University Hospital (NTUH) and another 760 T2DM patients were recruited from the metabolism clinic of the Yunlin branch of NTUH. T2DM was diagnosed according to the criteria of the American Diabetes Association [12] or the use of anti-diabetic therapy. Patients with ages of onset below 35 years were excluded. In addition, 760 glucose-tolerant healthy controls were recruited from the health check-up service of NTUH and another 760 controls were recruited from a community screening for metabolic syndrome in Yunlin County, Taiwan. Glucose tolerance was defined as fasting plasma glucose < 126 mg/dl or 2-hour plasma glucose < 200 mg/dl during a 75-g oral glucose tolerance test (OGTT). Written informed consent was obtained from each participating subject, and the study was approved by the institutional review board of the National Taiwan University Hospital.

Selection of SNPs and genotyping
Twenty-four genetic risk variants were selected from GWAS or well-established candidate-gene association studies for T2DM in European populations [3][4][5][6][7]. In view of the low risk allele frequency and the negative T2DM association of rs7903146 in the TCF7L2 gene observed in previous research in Chinese populations [13,14], we genotyped another SNP in this gene, rs290487, which has been reported to be associated with T2DM in Chinese [13]. Genotype data for rs7903146 were retrieved from our previous study [13]. Genotyping was performed using the GenomeLab SNPstream genotyping platform (Beckman Coulter) and its accompanying SNPstream software suite. The concordance rate based on this platform was 99.62% [15].

Statistical analyses
Hardy-Weinberg equilibrium (HWE) test was performed for each SNP in the control group before marker-trait association analysis. Tests for the associations of each SNP and haplotype with type 2 diabetes were conducted using logistic regression. Nominal two-sided P-values were reported. Multivariate analysis with age, gender, and BMI as covariates was performed using multivariate logistic regression. The odds ratio (OR) and 95% confidence interval (CI) associated with each risk allele were also estimated. Pairwise gene-gene and gene-environment interactions were analyzed by logistic regression. The significance of interaction was adjusted for multiple testing using the Bonferroni method.
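The HWE check described above can be illustrated with a short sketch. The text does not state which HWE test was used, so the one-degree-of-freedom chi-square goodness-of-fit test below is an assumption, and the function name `hwe_chi_square` and the genotype counts in the usage line are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes (AA, AB, BB) in controls
    and returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        # Monomorphic marker: expected counts equal observed counts.
        return 0.0, 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical genotype counts in a control group, for illustration only.
chi2, p_value = hwe_chi_square(410, 720, 388)
```

A marker would typically be flagged when the p-value falls below a pre-specified threshold; the threshold used in this study is not stated here.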
To test the cumulative effects of genetic variants on T2DM risk, a weighted genetic score was calculated for each participant, with each risk allele weighted by its beta-coefficient from the logistic regression model. All participants were divided into four equal groups according to their genetic score (<29, 29-31, 31-34, >34). The OR and 95% CI for each group were estimated using the lowest quartile group (score<29) as the reference group. The statistical power of this study for each SNP was estimated using the Genetic Power Calculator (http://pngu.mgh.harvard.edu/~purcell/gpc/) assuming a diabetes prevalence of 8% [37]. Meta-analysis under a fixed-effect model was used to estimate the pooled odds ratio (OR) using the Comprehensive Meta-Analysis software (Biostat, Englewood, NJ). Cochran's Q test and I² were used to assess heterogeneity between the individual studies. The Z test was used to determine the significance of the pooled OR.
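The fixed-effect pooling, Cochran's Q, I², and Z test named above can be sketched as follows. This is a minimal inverse-variance illustration, not the Comprehensive Meta-Analysis implementation; it assumes each study reports an OR with a symmetric (on the log scale) 95% CI, and the function name `fixed_effect_meta` and the input values are hypothetical.

```python
import math

def fixed_effect_meta(odds_ratios, ci_lowers, ci_uppers):
    """Inverse-variance fixed-effect pooling of per-study odds ratios."""
    betas, weights = [], []
    for or_, lo, hi in zip(odds_ratios, ci_lowers, ci_uppers):
        beta = math.log(or_)
        # Standard error recovered from the width of the 95% CI on the log scale.
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
        betas.append(beta)
        weights.append(1.0 / se ** 2)

    w_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)

    # Z test for the pooled log odds ratio (two-sided normal p-value).
    z = pooled_beta / pooled_se
    p_z = math.erfc(abs(z) / math.sqrt(2))

    # Cochran's Q and I^2 for between-study heterogeneity.
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "or": math.exp(pooled_beta),
        "ci_low": math.exp(pooled_beta - 1.96 * pooled_se),
        "ci_high": math.exp(pooled_beta + 1.96 * pooled_se),
        "z": z,
        "p": p_z,
        "q": q,
        "df": df,
        "i2": i2,
    }

# Hypothetical per-study estimates for one SNP, for illustration only.
result = fixed_effect_meta([1.15, 1.22, 1.08], [1.02, 1.05, 0.93], [1.30, 1.42, 1.25])
```

Q is compared against a chi-square distribution with `df` degrees of freedom; that p-value is omitted here to keep the sketch free of external dependencies.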

Characteristics of study subjects and SNP information
Twenty-four SNPs were successfully genotyped in 1,502 unrelated T2DM cases and 1,518 glucose-tolerant controls except for rs10811661 in the CDKN2A/B gene, for which genotyping failed in all samples. The demographic and biochemical characteristics of the study participants are shown in Table 1. Basic information on these SNPs is summarized in Table 2. All SNPs were in Hardy-Weinberg equilibrium (Table 2). The average call rate was 99.08%.

Gene-gene interactions and interactions between genetic variants and other known T2DM risk factors
We next explored potential gene-gene interactions and interactions with other risk factors of T2DM. No significant gene-gene interaction was found using pairwise interaction testing. Analyses of interaction between genetic variants and other known T2DM risk factors showed a significant interaction between rs13266634 in the SLC30A8 gene and age on T2DM risk (P for interaction<0.0001, adjusted P<0.0001). As shown in Figure 4A, the OR associated with the risk C allele was attenuated with advanced age, ranging from 1.73 in the lowest age quartile (age<48) to 0.88 in the highest age quartile (age>69). A suggestive interaction between BMI and rs10946398 in the CDKAL1 gene (P for interaction = 0.006, adjusted P = 0.084) was also found. The increased risk associated with the C allele at rs10946398 was attenuated in subjects with larger BMI (Figure 4B). However, we did not observe significant interactions between the genetic score and other T2DM risk factors including age, sex, and BMI.

Discussion
In this study, we confirmed the association of 10 genetic risk variants identified from European GWAS in Han Chinese. The incorporation of genetic information improves the prediction of T2DM. The effect sizes conferred by risk variants are similar but the allele frequencies differ substantially between Han Chinese and European populations.
Previous GWAS in Han Chinese identified several candidate variants associated with T2DM. Tsai et al. reported genetic variants in the PTPRD and SRR genes associated with T2DM in a GWAS conducted in Han Chinese in Taiwan [38]. Another GWAS in Han Chinese by Shu et al. found that genetic variants near the CDC123/CAMK1D, SPRY2, and C2CD4B genes are associated with T2DM [28]. However, the loci discovered in the two GWAS did not overlap with each other. Therefore, instead of testing these variants, we attempted to validate the association of established risk loci in Europeans in our population. Given the heterogeneous genetic structure of different ethnic populations, it is necessary to validate the relative contribution of T2DM variants identified from Caucasian GWAS in Han Chinese. Here, we confirmed the association of genetic variants in the IGF2BP2, CDKAL1, JAZF1, SLC30A8, HHEX, TCF7L2, EXT2, and FTO genes with T2DM in the Han Chinese population. Interestingly, the effect sizes conferred by these variants were similar between Han Chinese and European populations despite marked differences in allele frequencies, suggesting that the biological actions of these variants are the same across different ethnic groups. The effect sizes of several uncommon or rare variants, including rs10490072 in the BCL11A gene, rs7578597 in the THADA gene, and rs7903146 in the TCF7L2 gene, are relatively large (OR: 1.34, 1.30, and 1.45, respectively). Therefore, an aggregate of all SNPs was used for our prediction model of T2DM instead of using only common variants.
Figure 4. Odds ratios of type 2 diabetes associated with the risk C allele at rs13266634 in the SLC30A8 gene according to age groups (A). Odds ratios of type 2 diabetes associated with the C allele at rs10946398 in the CDKAL1 gene according to body mass index groups (B). doi:10.1371/journal.pone.0095045.g004
We found that the addition of genetic information to clinical predictors slightly improved the prediction of T2DM in Han Chinese. The increment of 0.030 in the C-statistic (from 0.627 to 0.657) is consistent with those observed in Europeans [10,11]. Similarly, Xu et al. reported a 1.6% increase in the C-statistic for T2DM prediction in another Chinese population using a set of 19 risk variants. Collectively, these studies confirmed a small improvement in the prediction of T2DM by incorporating genetic information. It should be noted that such an increment in the C-statistic, albeit statistically significant, may not be of clinical significance. However, we also found that the OR of T2DM in subjects in the highest genetic score quartile was 2.22 as compared with those in the lowest genetic score quartile. This estimate is in concordance with the 2.60-fold increased risk associated with higher genetic scores in a European population as reported by Meigs et al. [11].
An interesting interaction was found between age and an exonic variant in the SLC30A8 gene. The SLC30A8 gene encodes a zinc transporter specifically expressed in pancreatic beta-cells. We found that the effect conferred by the risk allele was diminished with aging. The underlying mechanism is currently unknown. Zinc deficiency has been shown to develop with advanced age when the ability to transport zinc is disrupted [39]. Therefore, the reduced zinc transporter capability associated with aging may mask the genetic effect of SLC30A8 mutation. However, further replication is needed to verify this observation.
Our study has both strengths and limitations that need to be addressed. First, this study provides the largest and most up-to-date meta-analysis of T2DM genetic associations in the Han Chinese population. However, this study is still insufficiently powered to validate the association of variants in the ADAM30, NOTCH2, THADA, ADAMTS9, WFS1, VEGFA, LOC387761, and TSPAN8/LGR5 genes with T2DM, probably owing to their low allele frequencies and small effects in Han Chinese. Second, variants identified by recent GWAS in East Asians were not genotyped [40][41][42][43]. With the rapidly expanding knowledge of T2DM genetics, further incorporation of new genetic variants is warranted to enhance prediction. Third, this study could not provide an accurate estimate of disease incidence because of the case-control design. Therefore, the net improvement in reclassification could not be estimated.
In summary, this study affirmed the association of 10 genetic loci with T2DM in Han Chinese. Carriers with higher genetic risk scores have a 2.2-fold increase in T2DM risk, and the addition of genetic information to clinical factors led to a ~3% increment in the discriminative power for prediction of T2DM. These data, together with previous studies, support the usefulness of genetic testing for T2DM prediction.

Supporting Information
Figure S1 Forest plots for meta-analyses showing odds ratios of type 2 diabetes conferred by risk variants identified from European genome-wide association studies in Han Chinese. (DOC)
Figure S2 Comparison of odds ratios associated with risk alleles (A) or minor allele frequencies (B) between Chinese and European populations. (TIF)
Checklist S1 PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist and flow diagram for meta-analysis. (DOC)