Adaptive Human CDKAL1 Variants Underlie Hormonal Response Variations at the Enteroinsular Axis

  • Chia Lin Chang ,

    Affiliation Department of Obstetrics and Gynecology, Chang Gung Memorial Hospital Linkou Medical Center, Chang Gung University, Kweishan, Taoyuan, Taiwan

  • James J. Cai,

    Affiliation Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas, United States of America

  • Shang Yu Huang,

    Affiliation Department of Obstetrics and Gynecology, Chang Gung Memorial Hospital Linkou Medical Center, Chang Gung University, Kweishan, Taoyuan, Taiwan

  • Po Jen Cheng,

    Affiliation Department of Obstetrics and Gynecology, Chang Gung Memorial Hospital Linkou Medical Center, Chang Gung University, Kweishan, Taoyuan, Taiwan

  • Ho Yen Chueh,

    Affiliation Department of Obstetrics and Gynecology, Chang Gung Memorial Hospital Linkou Medical Center, Chang Gung University, Kweishan, Taoyuan, Taiwan

  • Sheau Yu Teddy Hsu

    Affiliation Reproductive Biology and Stem Cell Research Program, Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, California, United States of America


Recent analyses have identified positively selected loci that explain differences in immune responses, body forms, and adaptations to extreme climates, but variants that describe adaptations in energy-balance regulation remain underexplored. To identify variants that confer adaptations in energy-balance regulation, we explored the evolutionary history and functional associations of candidate variants in 207 genes. We screened single nucleotide polymorphisms in genes that had been associated with energy-balance regulation for unusual genetic patterns in human populations, followed by studying associations among selected variants and serum levels of GIP, insulin, and C-peptide in pregnant women after an oral glucose tolerance test. Our analysis indicated that 5′ variants in CDKAL1, CYB5R4, GAD2, and PPARG are marked with statistically significant signals of gene–environment interactions. Importantly, studies of serum hormone levels showed that variants in CDKAL1 are associated with glucose-induced GIP and insulin responses (p<0.05). On the other hand, a GAD2 variant exhibited a significant association with glucose-induced C-peptide response. In addition, simulation analysis indicated that a type 2 diabetes risk variant in CDKAL1 (rs7754840) was selected in East Asians ∼6,900 years ago. Taken together, these data indicated that variants in CDKAL1 and GAD2 were targets of prior environmental selection. Because the selection of the CDKAL1 variant overlapped with the selection of a cluster of GIP variants in the same population ∼11,800 to 2,000 years ago, we speculate that these regulatory genes at the human enteroinsular axis could be highly responsive to environmental selection in recent human history.


Because of the recency of our common ancestry, it is well accepted that the investigation of positively selected single nucleotide polymorphisms (SNPs) in human populations may explain arrays of phenotypic variation in humans. Indeed, recent studies have identified hundreds of loci that exhibit evidence of positive selection in human genomes, and uncovered variants that describe variations in appearance, physiological adaptations, and pathological responses to diseases, thereby providing a better understanding of the molecular mechanisms that underlie these selections [1], [2], [3], [4]. However, genetic variants that confer adaptations in energy-balance regulation remain underexplored. Because such variants were likely subject to selection pressures that fluctuated over time, a moderate signal of selection could be all that remains as a clue to selection. Accordingly, we screened gene region SNPs for unusual patterns in derived allele frequency and linkage disequilibrium (LD) in genes that had been associated with energy-balance regulation (i.e., in past clinical investigations, genome-wide association (GWA) studies, and quantitative trait loci analyses) [5], [6].

In these studies, we identified common 5′ variants in five genes—glucose-dependent insulinotropic polypeptide (GIP), Cdk5 regulatory associated protein 1-like 1 (CDKAL1), cytochrome b5 reductase 4 (CYB5R4), glutamate decarboxylase 2 (GAD2), and peroxisome proliferator-activated receptor gamma (PPARG)—as top metabolic modifier candidates. With the exception of GAD2 variant rs2236418, none of the candidate variants had been implicated in earlier GWA studies although CDKAL1, CYB5R4, and PPARG loci have been associated with diabetes- and/or obesity-related traits [7], [8], [9], [10], [11], [12]. It is possible that these variants operate under physiological conditions that have not been specifically investigated. Consistent with the hypothesis, our recent studies have shown that three regulatory GIP variants were positively selected ∼8,100 (11,800 to 2,000) years ago, and are associated with variations in glucose metabolism and glucose-induced GIP response in pregnant women; moreover, a coding GIP variant (rs2291725) affects the bioactivity of GIP [5], [6].

To gain a better understanding of whether and how CDKAL1, CYB5R4, GAD2, and PPARG variants contribute to adaptations in energy-balance regulation, we further explored their evolutionary history and potential genotype–phenotype relationships in humans.

Materials and Methods

Ethics Statement and Patients

The study was approved by the institutional ethics committee review board of Chang Gung Memorial Hospital Linkou Medical Center, and was conducted in accordance with the guidelines in The Declaration of Helsinki. A total of 131 unrelated healthy women with normal pregnancy were recruited, and all patients gave written informed consent to participate in the study. Related patients (i.e., blood relatives) were excluded to ensure that the observed associations were not due to confounding effects from ancestry. The screening glucose challenge test for gestational diabetes mellitus (GDM) was performed as previously described [13]. Specifically, patients were given a 50-gram glucose solution after overnight fasting at 23–29 weeks of gestation, and a blood sample was taken at 1 hour after glucose intake. The mean age and body mass index (BMI) of patients were 30.7±0.4 years and 25.4±0.3, respectively.


Genomic DNA samples of subjects were extracted and purified from anticoagulated blood with the DNeasy Blood & Tissue Kit (Qiagen). Genotyping of SNPs was performed using the Applied Biosystems TaqMan Validated SNP Genotyping Assays. The genotyping analysis had a >96% success rate and >99% reproducibility.

Measurements and statistical analysis of circulatory levels of GIP, insulin, and C-peptide

Blood samples were collected from patients 1 hr after administration of the oral glucose tolerance test (OGTT). Serum GIP, insulin, and C-peptide levels were measured using specific ELISA kits from Millipore, Mercodia, and Calbiotech, respectively. Patients' serum hormone profiles were analyzed by the χ² test, the Student's t-test, or linear regression analysis using GraphPad Prism 5. All p-values were two-sided. The statistical significance cutoff value was 0.05.

FST estimation and analysis of haplotypes

To identify adaptive SNPs, we screened for common gene region variants that have divergent allele frequencies in HapMap populations in 207 candidate genes [14]. To improve the odds of identifying potential genotype–phenotype relationships, we focused on SNPs with a >30% minor allele frequency (MAF) in the overall Eurasian (CEU [U.S. residents with northern and western European ancestry] and ASN [pooled samples of Chinese from Beijing [CHB] and Japanese from Tokyo [JPT]]) populations. The population genetic differentiation statistic FST among HapMap II populations (i.e., CEU, ASN, and YRI [Yoruba from Ibadan]) was computed using the PGEToolbox and SPSmart [15], [16]. Putative positively selected SNPs (i.e., those in the top 10%) were further analyzed using pairwise population comparisons. The use of FST is advantageous given that FST measures the proportion of total genetic variance that is caused by differences among populations, and does not require assumptions about the structure of human populations and SNP ascertainment bias. In addition, LD blocks and haplotype plots were analyzed using HaploView 4.1 [17].
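As a rough illustration of what FST measures, the sketch below computes Wright's FST = (H_T − H_S)/H_T for a single biallelic SNP from population allele frequencies. This is a textbook estimator chosen for clarity; it is an assumption that it approximates what PGEToolbox and SPSmart report, since those tools may use a different estimator. The function name and the equal population weights are illustrative. The frequencies are the derived-allele frequencies of PPARG rs2920502 quoted in the Results.

```python
def fst_biallelic(freqs, sizes=None):
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    freqs: derived-allele frequency in each population.
    sizes: optional relative sample sizes (equal weights if omitted).
    """
    if sizes is None:
        sizes = [1.0] * len(freqs)
    total = float(sum(sizes))
    weights = [s / total for s in sizes]
    # Expected heterozygosity of the pooled population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Weighted mean of within-population expected heterozygosity.
    h_s = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))
    if h_t == 0.0:
        return 0.0  # monomorphic everywhere: no differentiation to measure
    return (h_t - h_s) / h_t


# Derived-allele frequencies of PPARG rs2920502 as reported in the text.
yri, ceu, asn = 0.026, 0.357, 0.753
fst_yri_ceu = fst_biallelic([yri, ceu])
fst_yri_asn = fst_biallelic([yri, asn])
print(f"YRI vs CEU: {fst_yri_ceu:.3f}")
print(f"YRI vs ASN: {fst_yri_asn:.3f}")
```

Under these assumptions the YRI–ASN comparison yields a much larger value than the YRI–CEU comparison, which matches the direction of the pairwise result reported for this variant.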

Analysis of EHH and iHS statistics

We computed the extended haplotype homozygosity (EHH) statistic for SNPs as previously described [18], and the EHH plots were generated as described [4]. The EHH curve depicts the decay of identity of haplotypes that carry alleles of a core SNP as a function of distance between tested SNPs and the core SNP. When an allele rises rapidly in frequency due to positive selection, it tends to exhibit a high haplotype homozygosity that extends further than expected in a neutral model. For the analysis of EHH decay, haplotypes in the genomic region centered on select SNPs were downloaded from the HapMap project website.
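The EHH calculation can be sketched as follows, assuming phased haplotypes are represented as equal-length strings of 0/1 alleles and that distance is counted in markers rather than base pairs. Both are simplifications introduced here, and the toy haplotypes are invented for illustration. EHH is taken as the probability that two randomly chosen chromosomes carrying the core allele are identical across the whole interval.

```python
from collections import Counter


def ehh(haplotypes, core_index, core_allele, distance):
    """EHH of `core_allele` out to `distance` markers on each side of the core.

    haplotypes: list of equal-length strings, one character per SNP.
    Returns the fraction of carrier pairs that are identical over the window.
    """
    lo = max(0, core_index - distance)
    hi = min(len(haplotypes[0]), core_index + distance + 1)
    carriers = [h[lo:hi] for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        return 0.0  # fewer than two carriers: no pairs to compare
    counts = Counter(carriers)
    identical_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return identical_pairs / (n * (n - 1) // 2)


# Toy data (hypothetical): the core SNP is at index 2.
haps = ["00100", "00100", "00100", "10101", "01000", "11010", "10001"]
for d in range(3):
    print(d, ehh(haps, 2, "1", d), ehh(haps, 2, "0", d))
```

In this toy set the derived allele ("1") keeps an EHH of 1.0 one marker out and drops to 0.5 two markers out, while the ancestral allele ("0") falls to 0.0 immediately, which is the kind of allele-dependent decay the EHH plots are meant to reveal.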

The integrated Haplotype Score (iHS) statistic detects whether the area under the EHH curve for a selected allele is greater than that for a neutral allele; it is a measure of recent positive selection for variants that have not yet reached fixation [4]. To test whether the observed iHS deviated significantly from the expected neutral values, we used a coalescent model and generated 10,000 replicated haplotype sets using the coalescent simulator ms [19]. We adopted simulation parameters compatible with the sample size of the population and the length of the genomic region that was analyzed. The simulation was conditioned based on the same recombination rate and the number of segregating sites within tested regions.
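One common way to turn a set of neutral replicates into a significance statement is an empirical p-value: the fraction of simulated scores at least as extreme as the observed one. The sketch below assumes that approach; the text does not spell out the exact test that was used. The Gaussian draws are only a stand-in for iHS values computed on ms output, and the observed score of 2.5 is hypothetical.

```python
import random


def empirical_p_value(observed, simulated):
    """Two-sided empirical p-value with the usual +1 correction."""
    exceed = sum(1 for s in simulated if abs(s) >= abs(observed))
    return (exceed + 1) / (len(simulated) + 1)


# Stand-in for 10,000 neutral replicates (NOT real ms output).
rng = random.Random(0)
null_scores = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

p = empirical_p_value(2.5, null_scores)  # 2.5 is a hypothetical observed score
print(f"empirical p = {p:.4f}")
```

The +1 in numerator and denominator keeps the estimate away from zero, so with 10,000 replicates the smallest reportable p-value is about 0.0001.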

Estimation of the age of selected allele

When a novel allele is positively selected, the allele and its linked haplotype quickly rise to a high frequency. As the allele ages, the length of the haplotype shortens over time due to recombination and mutation. By modelling this process and measuring the decay of the haplotype that carries the positively selected allele, the age of the allele can be estimated [20]. The breakdown of the intactness of the haplotype surrounding the selected allele was modeled using a Poisson process as described earlier [21]. We obtained the recombination map from the FTP site of the HapMap project. The rs7754840 block consists of 70 SNPs, and is represented by 11 distinct haplotypes in ASN; among these, haplotypes 1, 6, and 8–10 carry the derived allele C of rs7754840 (Fig. S1 in File S1, upper left panel). Among these derived haplotypes, haplotype 1, with a frequency of 0.259, is apparently the ancestral haplotype, from which haplotypes 6 and 8–10 (with frequencies of 0.029, 0.018, 0.015, and 0.015, respectively) were derived. The proportion of haplotype 1 (the ancestral haplotype) among all derived haplotypes is P' = 0.77.

The recombination map correlates the increment of physical distance (Mb) with that of genetic distance (cM), which is derived from the estimates of LD between HapMap SNPs [22]. We took the mutation rate of the region to be μ = 1.66×10⁻⁶ for the analysis, based on the haploid mutation rate of 1.1×10⁻⁸ per base per generation [23] and the probability of ascertaining HapMap SNPs (∼10⁻³). We assumed that the most common haplotype is the ancestral haplotype and used P' as an approximation of P.
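The two inputs described above can be checked with a few lines of arithmetic. The sketch assumes that the regional mutation rate is the product of the per-base rate, the 151-kb block length reported for the rs7754840 LD block, and the ascertainment probability. That product reproduces the quoted 1.66×10⁻⁶, but the decomposition itself is an inference, not something the text states. Variable names are illustrative.

```python
# Frequencies of the ASN haplotypes that carry the derived C allele of
# rs7754840, keyed by haplotype number as given in the text.
derived_haplotype_freqs = {1: 0.259, 6: 0.029, 8: 0.018, 9: 0.015, 10: 0.015}

# Share of the presumed ancestral haplotype (haplotype 1) among them.
p_prime = derived_haplotype_freqs[1] / sum(derived_haplotype_freqs.values())

# Regional mutation rate (assumed decomposition, see lead-in).
per_base_rate = 1.1e-8      # per base per generation [23]
block_length_bp = 151_000   # length of the rs7754840 LD block
ascertainment = 1e-3        # probability of ascertaining a HapMap SNP
mu = per_base_rate * block_length_bp * ascertainment

print(f"P' = {p_prime:.2f}")
print(f"mu = {mu:.2e}")
```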

The probability that a haplotype remains ancestral (i.e., that the haplotype that carries the derived allele retains the status of high frequency right after being selected) is $P = e^{-G(r+\mu)}$, where G is the number of generations, r is the recombination rate per generation, and μ is the mutation rate per generation. For a given allele in the derived haplotype, the haplotype-decay approach estimates the number of generations G in terms of P (the probability that a given haplotype does not change from its ancestor) [20], [21].
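Assuming the survival probability has the exponential form that a Poisson model implies (no recombination or mutation event on the haplotype in G generations, at a combined per-generation rate of r + μ), the relation can be inverted to give G. The lines below sketch that algebra; they are not a transcription of the cited method [20], [21].

```latex
\begin{align}
P &= e^{-G\,(r+\mu)} \\
\ln P &= -G\,(r+\mu) \\
G &= -\frac{\ln P}{r+\mu}
\end{align}
```

Because r is typically orders of magnitude larger than μ for a block of this size, the estimate of G is driven almost entirely by the recombination fraction and by how well P' approximates P.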

In these analyses, the hypothetical demographic history assumed that the ASN population is one panmictic population that underwent a size reduction, followed by a period of constant size. The population had an ancestral population size of N1, which at time T2 instantaneously shrank to size N2. It remained constant at size N2 until time T1, at which point it began expanding exponentially until the present time. The population size at the present time is N3 (i.e., the bottleneck model) [24].
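The bottleneck history described above can be written as a piecewise population-size function of time before the present. The sketch below is one reading of that description; the numerical values of N1, N2, N3, T1, and T2 are hypothetical placeholders, since the text does not give the values that were used.

```python
import math


def population_size(t, n1, n2, n3, t1, t2):
    """Population size t generations before the present (requires 0 < t1 < t2).

    t >= t2      : ancestral size n1
    t1 <= t < t2 : constant bottleneck size n2
    t < t1       : exponential growth from n2 (at t1) to n3 (at present)
    """
    if t >= t2:
        return float(n1)
    if t >= t1:
        return float(n2)
    growth_rate = math.log(n3 / n2) / t1  # per-generation exponential rate
    return n3 * math.exp(-growth_rate * t)


# Hypothetical parameter values, for illustration only.
params = dict(n1=10_000, n2=1_000, n3=100_000, t1=400, t2=1_200)
for t in (0, 200, 400, 800, 1_200, 2_000):
    print(t, round(population_size(t, **params)))
```

Time is measured backwards from the present, so the exponential phase is evaluated as a decay from N3; this is the same convention coalescent simulators such as ms use.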


Common 5′ variants in CDKAL1, CYB5R4, GAD2, and PPARG exhibit signals of selection

Studies of allele frequency of genic region SNPs in 207 candidate genes showed that, in addition to a cluster of 37 linked GIP SNPs [6], common variants in the 5′ proximate region of CDKAL1 (rs9368197, position -1305 nt), CYB5R4 (rs1325471, position -1695 nt), GAD2 (rs2236418, position -243nt), and PPARG (rs2920502, position −154 nt) exhibit highly skewed population frequency in YRI, CEU, and ASN populations (Table 1) [5], [14]. The differences in derived allele frequency between African (YRI) and Eurasian (i.e., CEU and ASN) populations are in excess of 45–70%. For example, the derived allele of PPARG rs2920502 increased from 2.6% in YRI to 35.7% and 75.3% in CEU and ASN, respectively.

Table 1. Gene region candidate SNPs that exhibited signs of selection.

Analysis of FST statistics indicated that the frequency of the derived CYB5R4 rs1325471 and GAD2 rs2236418 alleles in CEU, and the frequency of the derived PPARG rs2920502 allele in ASN are significantly different from those of YRI (Fig. 1; Table 1, p<0.05). Consistently, analysis of LD showed that the genomic regions surrounding CDKAL1 rs9368197, GAD2 rs2236418, and PPARG rs2920502 are characterized by long LD blocks in CEU and ASN (Fig. S2, a–c in File S1). On the other hand, there is no appreciable difference in LD block characteristics surrounding CYB5R4 rs1325471 among populations (Fig. S2d in File S1).

Figure 1. Differential distribution of CYB5R4, GAD2, and PPARG variants in human populations.

The distribution of the FST for common SNPs across the human genome and the FST for rs1325471, rs2236418, and rs2920502. For rs1325471 and rs2236418, FST estimations between YRI and CEU (a–b) are shown on the x-axis. For rs2920502, comparisons between YRI and ASN (c) are shown on the x-axis. The black arrows indicate the corresponding values of FST for the candidate variant. The y-axis represents the frequency of SNPs with a given FST estimate. Significant FST, which is defined as higher than the 95th percentile of FST of all common SNPs of HapMap populations, is indicated by an asterisk.

Measurements of the iHS statistic using coalescent simulations in HapMap populations with a >10% derived allele frequency showed that the derived CDKAL1 rs9368197 allele exhibits significant EHH as compared to the ancestral allele in CEU and ASN [4], [18](Fig. 2a). The derived CDKAL1 rs9368197 allele-associated haplotypes spanned approximately 150 kb, and the frequency of two 90-kb-long derived haplotypes increased from negligible in YRI to more than 53% in ASN (Fig. S3 in File S1, upper panel). Likewise, the ancestral CYB5R4 rs1325471 allele exhibited EHH in the YRI (Fig. 2b) whereas the ancestral and derived alleles at rs2236418 of GAD2 exhibited EHH in CEU and YRI, respectively (Fig. 2c). By contrast, rs2920502 of PPARG did not exhibit appreciable EHH (Fig. 2d).

Figure 2. Haplotype diversity surrounding CDKAL1, CYB5R4, and GAD2 variants is allele-dependent in select populations.

Plots of iHS over the distance between selected variants and neighboring SNPs at increasing distances in YRI, CEU, and ASN populations (rs9368197 [a], rs1325471 [b], rs2236418 [c], rs2920502 [d], rs7754840 [e], and rs7756992 [f]). A significant difference in EHH decay between the derived allele and the ancestral allele is indicated by an asterisk. The candidate variant is positioned at the center of the plots.

Because two intronic CDKAL1 variants (rs7754840 and rs7756992) that are positioned close to rs9368197 have been associated with type 2 diabetes and GDM in GWA studies [25], [26], [27], [28], we also analyzed the genetic patterns surrounding these two variants (Fig. 2, e-f) [8], [9], [25], [26]. Although the allele frequencies of these variants were similar among HapMap populations (Table 1), the derived allele of rs7754840, but not rs7756992, exhibited a significant EHH in ASN (Fig. 2e).

Taken together, these data indicated that directional selection resulted in the spread of the derived rs9368197 and rs7754840 alleles of CDKAL1, the ancestral rs2236418 allele of GAD2, and the derived rs2920502 allele of PPARG in CEU and/or ASN populations. On the other hand, the ancestral CYB5R4 rs1325471 allele and the derived GAD2 rs2236418 allele were selected in YRI populations.

Common CDKAL1 variants rs9368197 and rs7754840 impart variations in glucose-induced insulin and GIP response, respectively, in pregnant women

Because pregnancy represents a critical life stage that subjects individuals to excessive metabolic load, and its success has a major impact on reproductive fitness, we chose to analyze the relationships among candidate variants and serum levels of GIP, insulin, and C-peptide in pregnant women with the hope that it would provide a sensitive model to uncover potential genotype–phenotype relationships.

The frequency of candidate SNPs in the study population was similar to that in ASN (Table 2), and was in Hardy–Weinberg equilibrium. Measurements of serum hormone levels showed that levels of insulin, but not GIP or C-peptide, in patients who are homozygous for the derived T allele of rs9368197 (i.e., CDKAL1−1305T/T, N = 36) are significantly lower than those in patients carrying an ancestral rs9368197 allele (i.e., CDKAL1−1305G/G and CDKAL1−1305G/T, N = 89) (Fig. 3a; Table 2).

Figure 3. Genotypes of rs9368197 and rs7754840 in CDKAL1 impart a difference in glucose-induced insulin and GIP response, respectively, after glucose challenge tests.

a) Measurements of serum insulin levels in patients during the 23rd to the 29th weeks of pregnancy at 1 hr after the 50-gram glucose challenge test. Levels of insulin in patients who were homozygous for the derived T allele of rs9368197 (28.8±3.1 mU/L, N = 36) were significantly lower than those of patients carrying the ancestral G allele (i.e., heterozygotes and ancestral allele homozygotes, 42.8±3.7 mU/L, N = 89). b) Measurements of serum GIP levels in patients during the 23rd to the 29th weeks of pregnancy at 1 hr after the 50-gram glucose challenge test. Circulating levels of GIP in patients who were homozygous for the ancestral G allele of rs7754840 (68.9±4.5 pg/ml, N = 54) were significantly lower than those of patients carrying a derived C allele (i.e., heterozygotes and derived allele homozygotes, 82.8±4.5 pg/ml, N = 65). Serum hormone levels are presented as mean ± SEM.
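The group comparisons in this caption can be roughly re-derived from the reported means, SEMs, and group sizes alone. The sketch below uses Welch's unequal-variance t statistic, which is a swapped-in technique: the Methods name the Student's t-test, and the raw data are not available here, so the values are only a plausibility check. The function name is illustrative.

```python
import math


def welch_from_sem(mean1, sem1, n1, mean2, sem2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom,
    computed from group means, standard errors of the mean, and sizes."""
    v1, v2 = sem1 ** 2, sem2 ** 2  # squared SEM = s^2 / n for each group
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Insulin (mU/L): carriers of the ancestral G allele vs derived T/T homozygotes.
t_ins, df_ins = welch_from_sem(42.8, 3.7, 89, 28.8, 3.1, 36)
# GIP (pg/ml): carriers of the derived C allele vs ancestral G/G homozygotes.
t_gip, df_gip = welch_from_sem(82.8, 4.5, 65, 68.9, 4.5, 54)

print(f"insulin: t = {t_ins:.2f}, df = {df_ins:.0f}")
print(f"GIP:     t = {t_gip:.2f}, df = {df_gip:.0f}")
```

Both statistics exceed the two-sided 5% critical value of roughly 1.98 for about 114 to 116 degrees of freedom, in line with the significance reported in the caption.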

Table 2. Association between selected SNPs and hormonal responses after glucose challenge tests in pregnant women.

On the other hand, circulating GIP level in patients who are homozygous for the ancestral G allele of rs7754840 (i.e., CDKAL1+12656G/G, N = 54) was significantly lower than that of those carrying a derived C allele (i.e., CDKAL1+126562C/C and CDKAL1+126562G/C, N = 65) (Fig. 3b; Table 2). Consistently, circulating GIP level in patients who are homozygous for the derived rs7754840 C allele (i.e., CDKAL1+12656C/C, N = 15) was significantly higher than that of those carrying an ancestral G allele (i.e., CDKAL1+126562G/G and CDKAL1+126562G/C, N = 104) (Table 2). In addition, linear regression analysis showed that the derived CDKAL1 rs7754840 allele is associated with an enhanced glucose-induced GIP response with an R² = 0.046 (p = 0.018).
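The group sizes reported for rs7754840 imply a full genotype table: 54 G/G and 15 C/C out of 119 genotyped patients leave 50 G/C heterozygotes. That heterozygote count is inferred from the reported group sizes, not stated in the text. With those counts, Hardy–Weinberg equilibrium can be checked with a one-degree-of-freedom χ² goodness-of-fit test, sketched below using only the standard library. Whether this is the exact procedure the authors used is an assumption.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# G/G, G/C (inferred), and C/C counts for rs7754840.
chi2, p_value = hwe_chi_square(54, 50, 15)
print(f"chi2 = {chi2:.3f}, p = {p_value:.2f}")
```

Under these inferred counts the test gives a small χ² and a p-value well above 0.05, which agrees with the statement that the candidate SNPs were in Hardy–Weinberg equilibrium in the study population.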

Furthermore, patients who are homozygous for the ancestral G allele of GAD2 rs2236418 (GAD2−243G/G, N = 10) had a significantly lower circulating level of C-peptide compared with those carrying a derived A allele (GAD2−243G/A and GAD2−243A/A, N = 118) (Table 2). By contrast, the CYB5R4 and PPARG variants did not have any significant relationship with serum insulin, GIP, or C-peptide levels.

The type 2 diabetes/GDM risk variant of CDKAL1 rs7754840 emerged ∼6,900 years ago in East Asians

To better understand putative events that led to the selection of CDKAL1 variants, we estimated the age of selection based on coalescent simulations [20], [21], [29], [30]. Scanning of the LD map and haplotype structures showed that informative SNPs surrounding rs9368197 are too sparse for reliable age determination. On the other hand, rs7754840 is located in a 151-kb-long LD block, bound between rs6927481 and rs7741604 in the ASN (Fig. S1 in File S1, upper left panel). Using the recombination map of the HapMap project [14], [22], we interpolated the values of accumulative recombination at the positions of the two boundary SNPs of the LD block and took the difference between the two values (0.0946 cM; i.e., 0.0946% chance of crossing over per generation) as the recombination fraction for the region. This was translated into a recombination rate r = 9.46×10⁻⁴. Using the equation, we obtained a generation estimate G = 276. An estimated generation time of 25 years indicated that the age of selection for rs7754840 is ∼6,900 years (Fig. 4).
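The age estimate can be reproduced numerically from the quantities given in the text, assuming the haplotype-decay relation has the exponential (Poisson) form P = exp(−G(r + μ)), so that G = −ln P / (r + μ). That functional form is an assumption here, but it recovers the reported G = 276 from P' = 0.77, r = 9.46×10⁻⁴, and the regional mutation rate of 1.66×10⁻⁶. The function name is illustrative.

```python
import math


def generations_since_selection(p_ancestral, r, mu):
    """Generations G such that exp(-G * (r + mu)) equals p_ancestral."""
    return -math.log(p_ancestral) / (r + mu)


p_prime = 0.77          # share of the ancestral haplotype among derived haplotypes
r = 9.46e-4             # recombination fraction across the 151-kb block
mu = 1.66e-6            # regional mutation rate per generation
generation_time = 25    # years per generation, as assumed in the text

g = generations_since_selection(p_prime, r, mu)
age_years = g * generation_time
print(f"G = {g:.0f} generations, age = {age_years:.0f} years")
```

Dropping μ from the denominator changes G by less than one generation, which shows that the estimate is governed by the recombination fraction and by the choice of P'.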

Figure 4. CDKAL1 and GIP variants were selected in recent human history.

Schematic representation of the selection of multiple CDKAL1 and GIP variants in ASN population [5], [6]. The selection of CDKAL1 and GIP variants appeared to occur independently within a 1,200-year time window in the last ten thousand years. The effects of these selections on enteroinsular hormonal responses are indicated by red arrows. The actions of hormones are indicated by black arrows.


Based on an analysis of the evolutionary history of candidate genes, we showed that common 5′ variants in CDKAL1, CYB5R4, GAD2, and PPARG as well as an intronic type 2 diabetes-associated CDKAL1 variant exhibit non-neutral evolutionary patterns. In addition, we found that CDKAL1 and GAD2 variants are associated with glucose-induced insulin, GIP, or C-peptide responses in pregnant women. Together with our earlier studies of adaptive GIP variants [5], [6], these data provided strong evidence that the GIP-insulin-glucose axis is a hotspot for energy-balance regulation-associated selection (Fig. 4) [31].

A wide spectrum of physiological differences among humans is believed to be the result of positive selection of pre-existing variants and recent mutations that accumulated in response to environmental or cultural changes [4], [32]. Recent advances have revealed adaptive SNPs that are associated with immune responses, changes in body forms (e.g., that affect the pigmentation of skin, hair, or eyes), and adaptations to extreme climates (e.g., the selection of the EPAS1 gene in Tibetans) after humans spread to various parts of the world in the last 50–60 thousand years [1], [2], [3], [4]. However, the identification of adaptive variants in energy-balance regulation appears to lag behind that of other traits [33], [34]. Earlier studies have shown that variants in TCF7L2 were positively selected, and were associated with BMI and levels of ghrelin and leptin; however, no general enrichment of adaptive variants in type 2 diabetes- and obesity-associated loci has been found [35]. On the other hand, a study of SNPs in 40 major human diseases indicated that a repertoire of type 2 diabetes risk alleles have a tendency for directional selection in Eurasians [36]. It was speculated that this phenomenon could be associated with cultural changes; however, the importance of this ensemble of “selection signals” in humans remains to be investigated. Consistent with these earlier studies, we found only a few SNPs that exhibit signals of selection. The paucity of adaptive SNPs in energy-balance regulation-associated genes could be due to heterogeneity in the intensity, form, or time of environmental selection. In addition, recent cultural changes that have subjected humans to varying degrees of population growth and cycles of feast and famine could have made such variants retain only a minimal “signature of selection” [37].

The finding that insulin levels are CDKAL1 variant-dependent is not new. Intronic CDKAL1 variants rs7754840, rs7756992, and rs10946398 have been associated with variations in insulin release, pancreatic cell functions, hemoglobin A1C level, and/or response to pancreatic KATP channel agonists [8], [38], [39] as well as type 2 diabetes, GDM, ulcerative colitis, Crohn's disease, obesity, and/or birth weight [9], [25], [26], [28]. CDKAL1 has been shown to function as a tRNA modification enzyme, and its activity is associated with ATP generation and first-phase insulin secretion [40], [41], [42]. The carriage of CDKAL1 risk variants may lead to lower insulin release and impaired conversion of proinsulin to insulin in pancreatic beta cells [43], [44], [45]. Although the effect of derived rs9368197 allele on glucose-induced insulin response is similar to that reported for the derived rs7754840 allele, the finding that the derived rs7754840 allele (i.e., C) is associated with an elevated GIP response is unique because the same allele has been found to be associated with type 2 diabetes and reduced insulin secretion in earlier studies [43], [44], [45], [46]. These results suggest that the selection of derived rs7754840 allele has opposite effects on insulin secretion and glucose-induced GIP response. Because GIP's normal function is to stimulate insulin secretion, the adverse effect of derived rs7754840 allele on insulin metabolism may be counteracted by that on glucose-induced GIP response. Therefore, the reported association between CDKAL1 variants and insulin metabolism could be partially affected by the rs7754840-associated GIP responses. In addition, the selection of the derived rs7754840 allele in East Asians may explain part of the observed differences in association strength between CDKAL1 and type 2 diabetes among human populations [46]. 
Consistent with this idea, recent studies have reported that circulating GIP levels are highly familial compared to circulating levels of glucose, insulin, or C-peptide [47]. Future study of the effect of CDKAL1 on GIP synthesis/secretion is needed to reveal whether CDKAL1 affects insulin secretion and GIP metabolism via a similar mechanism.

In a recent investigation, we documented the strong selection of a cluster of GIP variants in East Asians ∼8,100 (11,800 to 2,000) years ago [5], [6]. It is surprising to find that the selection of rs7754840 in CDKAL1 occurred in the same population at approximately the same time period (i.e., ∼6,900 years ago). Because the selection of CDKAL1 and GIP variants overlapped, and because GIP metabolism is highly regulated during pregnancy [48], we speculate that the regulation of enteroinsular axis in Eurasians was subject to strong environmental selection in recent human history (Fig. 4). Because no significant selection of SNPs was observed in other key enteroinsular regulatory genes—including insulin, insulin receptor, glucagon-like peptide-1 (GLP-1), GLP-1 receptor, glucagon, glucagon receptor, and GIP receptor—CDKAL1 and GIP appear to be unique targets for energy-balance regulation-associated selections.

Although it is impossible to discern which environmental forces are responsible for the selection of CDKAL1 and GIP variants, the selection of these variants during a period that heralded the agricultural revolution and animal domestication (i.e., 10,000 to 4,000 years ago following the Neolithic period) suggests that the selected CDKAL1 and GIP variants may have facilitated the survival of their carriers in the face of a changing subsistence culture. As culture again shifted in modern society, some of these previously advantageous adaptations (e.g., rs7754840) may now manifest as risk factors for type 2 diabetes and GDM [25], [27], [28], [44], [45]. In support of this view, recent studies have shown that minimal changes in subsistence patterns could have profound effects on the survival of individuals [49]. For example, it has been shown that feeding mice human-relevant concentrations of added sugar (i.e., 25% kcal from a mixture of fructose and glucose) is sufficient to increase mortality and reduce fertility even though such treatments have minimal effects on serum levels of cholesterol, fasting insulin, fasting glucose, or fasting triglycerides. Although our hypothesis cannot be falsified at the present time, future studies that consider Neanderthal and Denisovan genomes may provide further insight [50].

Studies of hormonal profiles have also uncovered an association between GAD2 rs2236418 and glucose-induced C-peptide response. Earlier studies have shown that polymorphisms of GAD2, which encodes an enzyme that catalyzes the production of gamma-aminobutyric acid (GABA), are associated with obesity, BMI, and postabsorptive resting energy expenditure in selected populations [10], [51], [52], [53], [54]. GABA originating within the islets can evoke tonic currents, and decrease both insulin and glucagon secretion [55], [56]. In addition, the GABA signaling system has been shown to be compromised in islets of type 2 diabetes patients [57]. Therefore, the adaptive GAD2 variant could indirectly contribute to variations in C-peptide, and perhaps insulin, metabolism by affecting GABA synthesis. Future study of the rs2236418-C-peptide relationship in other populations is needed to have a better understanding of the regulation of enteroinsular axis via neuronal signaling in humans.

Whereas CYB5R4 and PPARG variants were not associated with the limited number of hormones that were analyzed, the “signals of selection” surrounding these variants warrant future investigations under different conditions. Among them, rs2920502 and linked SNPs like rs2920500 could be particularly interesting because PPARG has been associated with the development of type 2 diabetes, atherosclerosis, and cancer [7], [58], and because PPAR-γ agonists (e.g., thiazolidinedione) have been used to treat type 2 diabetes [58]. Likewise, future studies in patients of African or European origin may further reveal the role of selected CYB5R4 and GAD2 variants in the regulation of hormonal metabolism because these variants were selected in CEU and/or YRI.

In conclusion, the present study showed that adaptive CDKAL1 variants could have spread in Eurasians by contributing to alternative insulin or GIP responses. Therefore, environmental or cultural changes in recent human history could have shaped the regulation of our enteroinsular axis to a greater extent than has previously been assumed.

Supporting Information

File S1.

Figure S1. Plots of the haplotype structure of SNPs in a 151-kb LD block surrounding rs7754840 in the HapMap II populations. The 70 SNPs between rs6927481 and rs7741604 (151-kb in length) were linked, and displayed a low haplotype diversity in the ASN population (left upper panel). By contrast, the same region in the YRI chromosomes (right panel) exhibited a high complexity compared to the CEU and ASN populations. The position of rs7754840 is indicated by a vertical rectangular box.

Figure S2. Variants in CDKAL1, GAD2, and PPARG are highly linked in select human populations. Plots of the degree of LD between each pair of genotyped SNPs in a 200-kb region surrounding the CDKAL1 (a), GAD2 (b), PPARG (c), and CYB5R4 (d) loci in YRI, CEU, and ASN populations. The color scheme was based on r² values. Red areas represent regions with a high degree of LD and a high likelihood of odds (LOD) (D' = 1, LOD scores >2). Blue areas represent regions with low LOD (D' = 1, LOD <2).

Figure S3. Haplotype structures of SNPs neighboring rs9368197 in the HapMap II populations. Plots of the haplotypes in a 200-kb genomic region (20,541–20,741 kb on chromosome 6) surrounding rs9368197 in CDKAL1 showed that ASN chromosomes (upper panel) are characterized by a low-complexity structure whereas YRI chromosomes exhibit extensive recombination (lower panel). The CEU chromosomes exhibited an intermediate pattern of complexity (middle panel). A 90-kb region with extensive haplotype homozygosity in ASN chromosomes is indicated by a red rectangular box in each of the three haplotype structure plots. The position of rs9368197 in each plot is indicated by a blue arrow.



Acknowledgments

We thank Shripa Patel (Stanford University Pan Facility) and Wei Yi (Dept. of OB/GYN, Stanford University) for their excellent technical assistance.

Author Contributions

Conceived and designed the experiments: CLC SYTH. Performed the experiments: CLC JJC PJC HYC SYH SYTH. Analyzed the data: CLC JJC SYTH. Contributed reagents/materials/analysis tools: CLC JJC PJC HYC SYH SYTH. Wrote the paper: CLC JJC SYTH.


  1. 1. Leslie M (2010) Genetics. Kidney disease is parasite-slaying protein's downside. Science 329: 263.
  2. 2. Simonson TS, Yang Y, Huff CD, Yun H, Qin G, et al.. (2010) Genetic Evidence for High-Altitude Adaptation in Tibet. Science May 13..
  3. 3. Tishkoff SA, Reed FA, Ranciaro A, Voight BF, Babbitt CC, et al. (2007) Convergent adaptation of human lactase persistence in Africa and Europe. Nat Genet 39: 31–40.
  4. 4. Voight BF, Kudaravalli S, Wen X, Pritchard JK (2006) A map of recent positive selection in the human genome. PLoS Biol 4: e72.
  5. 5. Chang CL, Cai JJ, Cheng PJ, Chueh HY, Hsu SY (2011) Identification of metabolic modifiers that underlie phenotypic variations in energy-balance regulation. Diabetes 60: 726–734.
  6. 6. Chang CL, Cai JJ, Lo C, Amigo J, Park JI, et al. (2011) Adaptive selection of an incretin gene in Eurasian populations. Genome Res 21: 21–32.
  7. 7. Hansen L, Ekstrom CT, Tabanera YPR, Anant M, Wassermann K, et al. (2006) The Pro12Ala variant of the PPARG gene is a risk factor for peroxisome proliferator-activated receptor-gamma/alpha agonist-induced edema in type 2 diabetic patients. J Clin Endocrinol Metab 91: 3446–3450.
  8. 8. Pascoe L, Tura A, Patel SK, Ibrahim IM, Ferrannini E, et al. (2007) Common variants of the novel type 2 diabetes genes CDKAL1 and HHEX/IDE are associated with decreased pancreatic beta-cell function. Diabetes 56: 3101–3104.
  9. 9. Steinthorsdottir V, Thorleifsson G, Reynisdottir I, Benediktsson R, Jonsdottir T, et al. (2007) A variant in CDKAL1 influences insulin response and risk of type 2 diabetes. Nat Genet 39: 770–775.
  10. 10. Witchel SF, White C, Libman I (2009) Association of the −243 A—>G polymorphism of the glutamate decarboxylase 2 gene with obesity in girls with premature pubarche. Fertil Steril 91: 1869–1876.
  11. 11. Xie J, Zhu H, Larade K, Ladoux A, Seguritan A, et al. (2004) Absence of a reductase, NCB5OR, causes insulin-deficient diabetes. Proc Natl Acad Sci U S A 101: 10750–10755.
  12. 12. Boutin P, Dina C, Vasseur F, Dubois S, Corset L, et al. (2003) GAD2 on chromosome 10p12 is a candidate gene for human obesity. PLoS Biol 1: E68.
  13. 13. Kjos SL, Buchanan TA (1999) Gestational diabetes mellitus. N Engl J Med 341: 1749–1756.
  14. 14. The-International-HapMap-Project, Frazer KA, Ballinger DG, Cox DR, Hinds DA, et al (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449: 851–861.
  15. 15. Cai JJ (2008) PGEToolbox: A Matlab toolbox for population genetics and evolution. J Hered 99: 438–440.
  16. 16. Amigo J, Salas A, Phillips C, Carracedo A (2008) SPSmart: adapting population based SNP genotype databases for fast and comprehensive web access. BMC Bioinformatics 9: 428.
  17. 17. Barrett JC, Fry B, Maller J, Daly MJ (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21: 263–265.
  18. 18. Sabeti PC, Reich DE, Higgins JM, Levine HZ, Richter DJ, et al. (2002) Detecting recent positive selection in the human genome from haplotype structure. Nature 419: 832–837.
  19. 19. Hudson RR (2002) Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18: 337–338.
  20. 20. Stephens JC, Reich DE, Goldstein DB, Shin HD, Smith MW, et al. (1998) Dating the origin of the CCR5-Delta32 AIDS-resistance allele by the coalescence of haplotypes. Am J Hum Genet 62: 1507–1515.
  21. 21. Reich DE (1998) Estimating the age of mutations using variation at linked markers. Microsatellites: evolution and applications. Oxford: Oxford University Press. pp. 129–138.
  22. 22. McVean GA, Myers SR, Hunt S, Deloukas P, Bentley DR, et al. (2004) The fine-scale structure of recombination rate variation in the human genome. Science 304: 581–584.
  23. 23. Roach JC, Glusman G, Smit AF, Huff CD, Hubley R, et al. (2010) Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328: 636–639.
  24. 24. Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD (2009) Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet 5: e1000695.
  25. 25. Cho YM, Kim TH, Lim S, Choi SH, Shin HD, et al. (2009) Type 2 diabetes-associated genetic variants discovered in the recent genome-wide association studies are related to gestational diabetes mellitus in the Korean population. Diabetologia 52: 253–261.
  26. 26. Lauenborg J, Grarup N, Damm P, Borch-Johnsen K, Jorgensen T, et al. (2009) Common type 2 diabetes risk gene variants associate with gestational diabetes. J Clin Endocrinol Metab 94: 145–150.
  27. 27. Wang Y, Nie M, Li W, Ping F, Hu Y, et al. (2011) Association of six single nucleotide polymorphisms with gestational diabetes mellitus in a Chinese population. PLoS ONE 6: e26953.
  28. 28. Kwak SH, Kim SH, Cho YM, Go MJ, Cho YS, et al. (2012) A genome-wide association study of gestational diabetes mellitus in Korean women. Diabetes 61: 531–541.
  29. 29. Nei M, Suzuki Y, Nozawa M (2010) The neutral theory of molecular evolution in the genomic era. Annu Rev Genomics Hum Genet 11: 265–289.
  30. 30. Kimura M, Ota T (1973) The age of a neutral mutant persisting in a finite population. Genetics 75: 199–212.
  31. 31. McIntosha C, Widenmaiera S, Kim S (2009) Glucose-Dependent Insulinotropic Polypeptide (Gastric Inhibitory Polypeptide; GIP) Vitamins & Hormones. 80: 409–471.
  32. 32. Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, et al. (2007) Genome-wide detection and characterization of positive selection in human populations. Nature 449: 913–918.
  33. 33. Luca F, H PG, di Rienzo A (2010) Evolutionary Adaptatons to Dietary Changes. Annu Rev Nutr 30: 291–314.
  34. 34. Pritchard JK, Pickrell JK, Coop G (2010) The genetics of human adaptation: hard sweeps, soft sweeps, and polygenic adaptation. Curr Biol 20: R208–215.
  35. 35. Helgason A, Palsson S, Thorleifsson G, Grant SF, Emilsson V, et al. (2007) Refining the impact of TCF7L2 gene variants on type 2 diabetes and adaptive evolution. Nat Genet 39: 218–225.
  36. 36. Chen R, Corona E, Sikora M, Dudley JT, Morgan AA, et al. (2012) Type 2 diabetes risk alleles demonstrate extreme directional differentiation among human populations, compared to other diseases. PLoS Genet 8: e1002621.
  37. 37. Hancock AM, Witonsky DB, Ehler E, Alkorta-Aranburu G, Beall C, et al. (2010) Colloquium paper: human adaptations to diet, subsistence, and ecoregion are due to subtle shifts in allele frequency. Proc Natl Acad Sci U S A 107 Suppl 28924–8930.
  38. 38. Miyaki K, Oo T, Song Y, Lwin H, Tomita Y, et al. (2010) Association of a cyclin-dependent kinase 5 regulatory subunit-associated protein 1-like 1 (CDKAL1) polymorphism with elevated hemoglobin A(1)(c) levels and the prevalence of metabolic syndrome in Japanese men: interaction with dietary energy intake. Am J Epidemiol 172: 985–991.
  39. 39. Ohara-Imaizumi M, Yoshida M, Aoyagi K, Saito T, Okamura T, et al. (2010) Deletion of CDKAL1 affects mitochondrial ATP generation and first-phase insulin exocytosis. PLoS ONE 5: e15553.
  40. 40. Xie P, Wei FY, Hirata S, Kaitsuka T, Suzuki T, et al. (2013) Quantitative PCR measurement of tRNA 2-methylthio modification for assessing type 2 diabetes risk. Clin Chem 59: 1604–1612.
  41. 41. Wei FY, Tomizawa K (2011) Functional loss of Cdkal1, a novel tRNA modification enzyme, causes the development of type 2 diabetes. Endocr J 58: 819–825.
  42. 42. Wei FY, Suzuki T, Watanabe S, Kimura S, Kaitsuka T, et al. (2011) Deficit of tRNA(Lys) modification by Cdkal1 causes the development of type 2 diabetes in mice. J Clin Invest 121: 3598–3608.
  43. 43. Chistiakov DA, Potapov VA, Smetanina SA, Bel'chikova LN, Suplotova LA, et al. (2011) The carriage of risk variants of CDKAL1 impairs beta-cell function in both diabetic and non-diabetic patients and reduces response to non-sulfonylurea and sulfonylurea agonists of the pancreatic KATP channel. Acta Diabetol 48: 227–235.
  44. 44. Stancakova A, Pihlajamaki J, Kuusisto J, Stefan N, Fritsche A, et al. (2008) Single-nucleotide polymorphism rs7754840 of CDKAL1 is associated with impaired insulin secretion in nondiabetic offspring of type 2 diabetic subjects and in a large sample of men with normal glucose tolerance. J Clin Endocrinol Metab 93: 1924–1930.
  45. 45. Kirchhoff K, Machicao F, Haupt A, Schafer SA, Tschritter O, et al. (2008) Polymorphisms in the TCF7L2, CDKAL1 and SLC30A8 genes are associated with impaired proinsulin conversion. Diabetologia 51: 597–601.
  46. 46. Dehwah MA, Wang M, Huang QY (2010) CDKAL1 and type 2 diabetes: a global meta-analysis. Genet Mol Res 9: 1109–1120.
  47. 47. Gjesing AP, Ekstrom CT, Eiberg H, Urhammer SA, Holst JJ, et al. (2012) Fasting and oral glucose-stimulated levels of glucose-dependent insulinotropic polypeptide (GIP) and glucagon-like peptide-1 (GLP-1) are highly familial traits. Diabetologia 55: 1338–1345.
  48. 48. Moffett RC, Irwin N, Francis JM, Flatt PR (2013) Alterations of Glucose-Dependent Insulinotropic Polypeptide and Expression of Genes Involved in Mammary Gland and Adipose Tissue Lipid Metabolism during Pregnancy and Lactation. PLoS ONE 8: e78560.
  49. 49. Ruff JS, Suchy AK, Hugentobler SA, Sosa MM, Schwartz BL, et al. (2013) Human-relevant levels of added sugar consumption increase female mortality and lower male fitness in mice. Nat Commun 4: 2245.
  50. 50. Crisci JL, Wong A, Good JM, Jensen JD (2011) On characterizing adaptive events unique to modern humans. Genome Biol Evol 3: 791–798.
  51. 51. Chen KC, Lin YC, Chao WC, Chung HK, Chi SS, et al. (2012) Association of genetic polymorphisms of glutamate decarboxylase 2 and the dopamine D2 receptor with obesity in Taiwanese subjects. Ann Saudi Med 32: 121–126.
  52. 52. Goossens GH, Petersen L, Blaak EE, Hul G, Arner P, et al. (2009) Several obesity- and nutrient-related gene polymorphisms but not FTO and UCP variants modulate postabsorptive resting energy expenditure and fat-induced thermogenesis in obese individuals: the NUGENOB study. Int J Obes (Lond) 33: 669–679.
  53. 53. Boesgaard TW, Castella SI, Andersen G, Albrechtsen A, Sparso T, et al. (2007) A -243A—>G polymorphism upstream of the gene encoding GAD65 associates with lower levels of body mass index and glycaemia in a population-based sample of 5857 middle-aged White subjects. Diabet Med 24: 702–706.
  54. 54. Bannai M, Ichikawa M, Nishihara M, Takahashi M (1998) Effect of injection of antisense oligodeoxynucleotides of GAD isozymes into rat ventromedial hypothalamus on food intake and locomotor activity. Brain Res 784: 305–315.
  55. 55. Bonaventura MM, Crivello M, Ferreira ML, Repetto M, Cymeryng C, et al. (2012) Effects of GABAB receptor agonists and antagonists on glycemia regulation in mice. Eur J Pharmacol 677: 188–196.
  56. 56. Braun M, Ramracheya R, Bengtsson M, Clark A, Walker JN, et al. (2010) Gamma-aminobutyric acid (GABA) is an autocrine excitatory transmitter in human pancreatic beta-cells. Diabetes 59: 1694–1701.
  57. 57. Taneera J, Jin Z, Jin Y, Muhammed SJ, Zhang E, et al. (2012) gamma-Aminobutyric acid (GABA) signalling in human pancreatic islets is altered in type 2 diabetes. Diabetologia 55: 1985–1994.
  58. 58. Tontonoz P, Spiegelman BM (2008) Fat and beyond: the diverse biology of PPARgamma. Annu Rev Biochem 77: 289–312.