Evaluation of Common Type 2 Diabetes Risk Variants in a South Asian Population of Sri Lankan Descent

Introduction Most studies seeking common variant associations with type 2 diabetes (T2D) have focused on individuals of European ancestry. These discoveries need to be evaluated in other major ancestral groups, to understand ethnic differences in predisposition, and establish whether these contribute to variation in T2D prevalence and presentation. This study aims to establish whether common variants conferring T2D-risk in Europeans contribute to T2D-susceptibility in the South Asian population of Sri Lanka.
Methodology Lead single nucleotide polymorphisms (SNPs) at 37 T2D-risk loci attaining genome-wide significance in Europeans were genotyped in 878 T2D cases and 1523 normoglycaemic controls from Sri Lanka. Association testing was performed by logistic regression adjusting for age and sex and by the Cochran-Mantel-Haenszel test after stratifying according to self-identified ethnolinguistic subgroup. A weighted genetic risk score was generated to examine the combined effect of these SNPs on T2D-risk in the Sri Lankan population.
Results Of the 36 SNPs passing quality control, sixteen showed nominal (p<0.05) association in Sri Lankan samples, fifteen of those directionally-consistent with the original signal. These association findings were robust to analyses that accounted for membership of ethnolinguistic subgroups. Overall, the odds ratios for 31 of the 36 SNPs were directionally-consistent with those observed in Europeans (p = 3.2×10−6). Allelic odds ratios and risk allele frequencies in Sri Lankan subjects were not systematically different to those reported in Europeans. Genetic risk score and risk of T2D were strongly related in Sri Lankans (per allele OR 1.10 [95%CI 1.08–1.13], p = 1.2×10−17).
Conclusion Our data indicate that most T2D-risk variants identified in Europeans have similar effects in South Asians from Sri Lanka, and that systematic differences in common variant associations are unlikely to explain inter-ethnic differences in prevalence or presentation of T2D.


Introduction
Type 2 diabetes (T2D) is a major global health concern that is currently estimated to affect 336 million people worldwide [1]. It is widely accepted that T2D is a complex disorder and that individual risk reflects the influence of environmental factors on a background of genetic predisposition. Over the past three decades the prevalence of T2D in South Asians has shown a particularly dramatic increase [2,3], prompted by profound changes in socioeconomic factors and lifestyle. Compared to their European counterparts, South Asians tend to be diagnosed with diabetes earlier and at a lower BMI, and to display a more rapid decline in glycaemic control over time [3]. The increased prevalence of T2D extends to South Asian groups living outside their native countries, which suggests that there may be an underlying biological predisposition in addition to environmental and lifestyle factors [2,4].
The advent of large-scale genome-wide association studies (GWAS) has led to the identification of over 70 genetic loci that contribute to T2D risk [5-17]. Most of the early studies were conducted in Europeans, but increasingly, similar approaches are being deployed in samples of South Asian, East Asian and African origin [11,13,18-22]. These studies have revealed novel signals [4,13,23], but have also shown appreciable overlap with associations first discovered in European groups [19-22]. Here, we extend these studies to South Asians from the island of Sri Lanka. As elsewhere in South Asia, the incidence of T2D is increasing and it is predicted that by 2030 approximately 14% of the adult population will have the condition, many of them undiagnosed [1]. Relatively little is known about genetic predisposition to T2D in this country. In this study, we determined whether a set of T2D-risk variants reaching genome-wide significance in Europeans carry the same disease risk in South Asians from Sri Lanka.

Study Samples
Cases and controls were ascertained from two independent collections of South Asian subjects from Sri Lanka. T2D cases (n = 1001, 44% male) were recruits to the Sri Lankan Young Diabetes Study (SLYDS), consecutively ascertained from private and government diabetes clinics [24]. Age at diabetes diagnosis was between 16 and 40 years and all participants were under the age of 45 years at recruitment. Of the 1001, 965 had DNA samples available for genotyping. Within these individuals, T2D was defined by a Glutamic Acid Decarboxylase Autoantibody (GADA) titre ≤14 units/ml and an interval of at least six months between diagnosis and the initiation of insulin therapy [24]. Twenty-nine samples were excluded due to missing GADA data, 48 for having insulin treatment within 6 months of diagnosis and 10 for a positive diagnosis of mitochondrial diabetes (mt3243 A>G), leaving 878 T2D cases available for inclusion [24].
Control subjects were participants in the Sri Lankan Diabetes Cardiovascular Study (SLDCS), a cross-sectional epidemiological study that used a multi-stage random cluster sampling technique to recruit 4388 subjects across seven Sri Lankan provinces [25]. DNA collection was initiated partway through the study and DNA samples were available for 1769 subjects. Of these, 1523 individuals who were confirmed as normoglycaemic based on oral glucose tolerance data (interpreted according to then-current ADA and WHO criteria), and under the age of 80 years, were included in this study [26].
At the time of recruitment, participants in both studies were classified according to the major ethnolinguistic and religious groups in Sri Lanka, using categories specified, for example, in the national Census, and based on a combination of spoken language, religion/cultural identification and surname [27]. The majority were Sinhalese (86%) or Tamil (5.4%), with the rest categorised as Muslim (8.3%) or Burgher (0.26%) or having other designations (0.04%).
Participant collection was approved by the Ethical Review Committee of the University of Colombo. All participants provided informed written consent [24,25].

Genotyping and quality control
We genotyped the lead single nucleotide polymorphisms (SNPs) at 37 T2D-risk loci that had reached genome-wide significance in Europeans from studies published as of mid-2010 [28,29]. We genotyped 2401 individuals (878 cases, 1523 controls) using Applied Biosystems TaqMan SNP genotyping assays on an Applied Biosystems 7900HT system. Seventy-four samples with a low (<80%) overall call rate were removed from further analysis, leaving 2327 individuals (830 cases, 1497 controls) for final analysis. The average genotyping call rate for these 2327 individuals was 97%.

Statistical Analysis
All 37 SNPs were in Hardy-Weinberg equilibrium (HWE) except for the SNP rs2237892 in the KCNQ1 locus (p<0.001 in controls), which was removed from subsequent analyses. First, we used logistic regression to assess the association between each individual SNP and T2D status assuming a log-additive model. All associations were adjusted for age and sex. In addition, we reanalysed the case-control data after stratifying for self-identified ethnic subgroup using the Cochran-Mantel-Haenszel (CMH) test. The individual SNP association analyses were undertaken in PLINK v1.07 [30,31].
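The stratified analysis described above can be illustrated with a minimal sketch of the Cochran-Mantel-Haenszel statistic for a series of 2×2 allele-count tables, one per ethnolinguistic subgroup. This is not the PLINK implementation: the function name `cmh_test`, the table layout and the absence of a continuity correction are assumptions made for illustration, and the counts in the usage note are invented.

```python
import math

def cmh_test(strata):
    """Cochran-Mantel-Haenszel test over 2x2 tables (hypothetical helper).

    Each stratum is a tuple (a, b, c, d):
      a = risk alleles in cases,    b = other alleles in cases,
      c = risk alleles in controls, d = other alleles in controls.
    Returns (pooled Mantel-Haenszel odds ratio, chi-square statistic, p-value).
    Assumption: no continuity correction is applied.
    """
    diff = 0.0      # sum over strata of (observed a - expected a)
    variance = 0.0  # sum over strata of the hypergeometric variance of a
    or_num = 0.0    # numerator of the Mantel-Haenszel odds ratio
    or_den = 0.0    # denominator of the Mantel-Haenszel odds ratio
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        expected = (a + b) * (a + c) / n
        diff += a - expected
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    chi2 = diff * diff / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return or_num / or_den, chi2, p_value
```

For example, `cmh_test([(120, 80, 150, 150), (30, 20, 25, 35)])` would pool two invented subgroups into a single odds ratio and test statistic. Because each subgroup contributes only its own observed-minus-expected count, differences in allele frequency between subgroups do not by themselves create an association.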
Next we tested the significance of genetic risk scores (GRS) that combine information from all 36 T2D-associated SNPs using logistic regression. The SNPs were coded as 0, 1 or 2, corresponding to the number of alleles that increase T2D risk in Europeans, except for the X chromosome SNP rs5945326 at DUSP9, where male genotypes were coded as 0 or 1, and female genotypes as 0, 0.5 or 1 (to reflect random X inactivation). These analyses were performed using Stata/SE version 10.1 for Windows (StataCorp, College Station, TX). To create the GRS, we used individuals with genotypes available from at least 29 of the 36 T2D SNPs (i.e. 80% of the SNPs genotyped), and accounted for the varying effect sizes of each SNP using equation 1, where w is the natural log of the per allele T2D odds ratio (OR) reported in Europeans.
We used this weighted GRS as the independent variable and T2D status as the dependent variable in logistic regression analyses. We also stratified individuals into quintiles of GRS.
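A minimal sketch of one common way to build such a weighted score is given here. It is not a reconstruction of equation 1: the rescaling to allele-count units, the handling of missing genotypes and the name `weighted_grs` are all assumptions made for illustration. Only the weights (natural log of the European per-allele OR) and the 80% genotype threshold come from the text.

```python
import math

def weighted_grs(genotypes, european_ors, min_fraction=0.8):
    """Weighted genetic risk score for one individual (hypothetical helper).

    genotypes    : risk-allele dosage per SNP (0, 1 or 2; fractional values
                   are allowed), with None marking a missing genotype.
    european_ors : per-allele odds ratios reported in Europeans, same order.
    Returns None if fewer than min_fraction of the SNPs were genotyped.

    Assumption: the weighted sum is divided by the sum of weights over the
    genotyped SNPs and multiplied by the total number of SNPs, so that the
    score is in units comparable to a count of risk alleles.
    """
    n_snps = len(european_ors)
    weights = [math.log(odds_ratio) for odds_ratio in european_ors]
    called = [(g, w) for g, w in zip(genotypes, weights) if g is not None]
    if len(called) < math.ceil(min_fraction * n_snps):
        return None  # too many missing genotypes to score this individual
    weighted_sum = sum(g * w for g, w in called)
    weight_total = sum(w for _, w in called)
    return weighted_sum / weight_total * n_snps
```

With 36 SNPs the default threshold works out to 29 genotyped SNPs, matching the inclusion rule above. When every weight is equal the score reduces to the plain count of risk alleles, which is why a per-unit odds ratio from this kind of score can be read approximately as a per-allele effect.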
To compare effect size estimates for Sri Lankan case-control samples with those observed in Europeans, we compiled odds ratio (OR) estimates for each locus for European case-control data from the literature [6,7,14-16,29,33-36] (Table S1). To minimise inflation of these estimates in initial genome-wide association discovery samples (the "winner's curse"), we used OR values from replication samples wherever possible.

Power Calculations
Quanto was used to calculate power under assumptions of a log-additive model, a disease prevalence of 10%, and a significance threshold (α) of 0.05. Power was calculated for each individual SNP using allelic odds ratios (for European case-control comparisons) collated from the literature and effect allele frequencies from the CEU component of HapMap (Table S1).

Results
Individual SNP associations for the 36 SNPs in 830 T2D patients and 1497 controls are summarised in Table 1. Nominal associations with T2D (p≤0.05) were observed for 16 of the 36 SNPs tested when adjusted for age and sex. Given some case-control imbalance with respect to self-identified ethnolinguistic subgroups (e.g. Sinhala, Tamil), case-control analyses were repeated in stratified samples using the CMH test. The results obtained were broadly comparable, with highly-correlated odds ratios (Table 1). In the CMH analysis, a total of 17 loci were nominally associated with T2D, fourteen of them overlapping with the non-stratified analysis. Two loci (IRS1 and JAZF1) were no longer associated (p<0.05) in the CMH test, but three others (PROX1, PPARG and FTO) became significant for the first time. Subsequent analyses were performed (unless otherwise stated) on the combined dataset.
Given the substantial differences in prevalence and presentation of T2D between European and South Asian populations, we sought evidence for consistent differences in effect size or allele frequency between the present study and previous reports from European studies (see methods). We found significant correlations between Sri Lankan and European samples for both the allelic odds ratio point estimates (Figure 1; r = 0.50, p = 1.8×10−3) and risk allele frequencies (Figure 2; r = 0.64, p = 2.3×10−5) but no suggestion of systematic differences in either. Allelic OR point estimates were greater in Sri Lankans than Europeans at 19 of 36 loci, and risk allele frequencies at 20 of 36.
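Counts of this kind (for example, 19 of 36 loci with a larger odds ratio in one population) can be assessed with an exact binomial sign test against a 50:50 expectation. The sketch below is an illustration only: the text does not state which test was used for such comparisons, and the function name `sign_test` is hypothetical.

```python
from math import comb

def sign_test(k, n, two_sided=True):
    """Exact binomial sign test for k 'successes' out of n loci (hypothetical helper).

    Under the null hypothesis each locus is equally likely to fall on either
    side, so the count follows a Binomial(n, 0.5) distribution.
    """
    total = 2 ** n
    upper = sum(comb(n, i) for i in range(k, n + 1)) / total  # P(X >= k)
    lower = sum(comb(n, i) for i in range(0, k + 1)) / total  # P(X <= k)
    if two_sided:
        return min(1.0, 2 * min(upper, lower))
    return upper
```

Calling `sign_test(19, 36)` shows that a split this close to even is far from significant, which is in line with the absence of a systematic difference. The same function with `two_sided=False` can be applied to counts of directionally-consistent loci, although it is not guaranteed to reproduce a p-value obtained with a different test.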
As expected, individuals carrying greater numbers of (weighted) T2D risk-increasing alleles had increased T2D risk (Figure 3), with an allelic OR of 1.10 (95%CI: 1.08–1.13, p = 1.2×10−17) per unit of the weighted genetic risk score. Individuals in the highest quintile of the genetic risk score had more than three-fold higher odds of T2D (3.44 [95%CI: 2.56–4.63], p = 2.6×10−16) when compared to individuals in the lowest quintile.
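The quintile comparison can be sketched as follows. This is a simplified, unadjusted illustration: the odds ratios reported here come from logistic regression with covariate adjustment, whereas the sketch uses raw 2×2 counts with a Woolf (log-scale) confidence interval. The function names and the rank-based quintile assignment are assumptions.

```python
import math

def assign_quintiles(scores):
    """Label each score 1..5 by splitting the ranked scores into five groups."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, index in enumerate(order):
        labels[index] = rank * 5 // len(scores) + 1
    return labels

def quintile_odds_ratios(scores, is_case, reference=1):
    """Unadjusted OR (with 95% CI) of each quintile against the reference quintile.

    Returns {quintile: (odds_ratio, ci_lower, ci_upper)}.
    Assumption: every quintile contains at least one case and one control;
    otherwise the odds ratio is undefined and a ZeroDivisionError is raised.
    """
    labels = assign_quintiles(scores)
    counts = {q: [0, 0] for q in range(1, 6)}  # [cases, controls] per quintile
    for q, case in zip(labels, is_case):
        counts[q][0 if case else 1] += 1
    ref_cases, ref_controls = counts[reference]
    results = {}
    for q in range(1, 6):
        if q == reference:
            continue
        cases, controls = counts[q]
        odds_ratio = (cases * ref_controls) / (controls * ref_cases)
        # Woolf standard error of the log odds ratio.
        se = math.sqrt(1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls)
        log_or = math.log(odds_ratio)
        results[q] = (odds_ratio, math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
    return results
```

Comparing every group with the lowest quintile, as in Figure 3, gives a picture of the gradient in risk across the score distribution without assuming that the per-unit effect is constant.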

Discussion
In this study, we have shown that established T2D-risk variants, most of them first identified in European samples, show strong enrichment of association in T2D cases and controls of South Asian origin from Sri Lanka. This pattern of enrichment, along with the absence of any systematic difference in risk-allele frequency or odds ratios between Sri Lankan and European samples, has several important corollaries.
Firstly, these data provide further evidence for the transethnic consistency in allelic patterns of association for T2D, building on similar findings seen in a variety of ethnic groups including other samples of South Asian origin [17,19-22,38]. These patterns of transethnic consistency are compatible with a model in which the (often unknown) causal variants driving these association signals are themselves common, a model which is also increasingly supported by fine mapping data [17,39]. However, definitive confirmation of this model will require comprehensive identification of variants in these regions (e.g. via genome sequencing studies that are ongoing) such that the contribution of variants of all frequencies to disease predisposition can be directly tested.
Secondly, despite differences in both the prevalence and presentation of T2D between Sri Lanka and Europe, we observed no systematic differences in either risk allele frequency or effect size. We therefore conclude that these ethnic differences in epidemiological and physiological patterns cannot be attributed to differences in common variant predisposition.
Though the general patterns are clear, the relatively modest sample sizes available in this study limit the inferences that can be made at any individual locus. None of the variants tested reached stringent genome-wide significance, and only about half of the loci reached nominal significance (i.e. p≤0.05). The patterns of association seen even amongst those variants not reaching nominal significance (of the twenty SNP associations with p>0.05, fifteen are directionally consistent with the associations reported in Europeans) indicate that many of these are likely to be false negatives reflecting the limited power of our study. Many of these loci have very modest odds ratios and would have required much larger sample sizes to be detected than were available to us. Indeed, amongst the 15 loci with no formal evidence (p>0.05) of association in our study, but displaying directional consistency with data from Europeans, are several that show evidence of association in other South Asian case-control studies. For example, variants at the GCKR and CDC123 loci were not associated with T2D in the present study, but have strong associations in far larger meta-analyses of South Asian samples [13].
In summary, we have shown that common T2D risk variants identified in Europeans confer similar risk in Sri Lankans, adding further to the evidence that South Asians and Europeans share many common variants that contribute to T2D risk.

Supporting Information
Table S1 Summary of the reported allele frequency and odds ratio in Europeans for the T2D SNPs investigated. (DOCX)
Figure 3. The combined impact of the 36 T2D-associated SNPs on T2D risk in T2D cases and controls of South Asian origin from Sri Lanka. Subjects were grouped into quintiles of the weighted genetic risk score. Circles represent the T2D odds ratio (adjusted for age, sex and ethnic group) when comparing each quintile group to the group in the lowest quintile (Q1). The capped lines represent the 95% CI of the T2D odds ratios. doi:10.1371/journal.pone.0098608.g003