17 Sep 2013: Mühlenbruch K, Jeppesen C, Joost HG, Boeing H, Schulze MB (2013) Correction: The Value of Genetic Information for Diabetes Risk Prediction – Differences According to Sex, Age, Family History and Obesity. PLOS ONE 8(9): 10.1371/annotation/65bd3a11-b821-4f10-88d2-29b69a730f21. https://doi.org/10.1371/annotation/65bd3a11-b821-4f10-88d2-29b69a730f21
Genome-wide association studies have identified numerous single nucleotide polymorphisms associated with type 2 diabetes over the past years. In previous studies, the usefulness of these genetic markers for prediction of diabetes was found to be limited. However, differences may exist between substrata of the population according to the presence of major diabetes risk factors. This study aimed to investigate the added predictive value of genetic information (42 single nucleotide polymorphisms) in subgroups of sex, age, family history of diabetes, and obesity.
A case-cohort study (random subcohort N = 1,968; incident cases: N = 578) within the European Prospective Investigation into Cancer and Nutrition Potsdam study was used. Prediction models without and with genetic information were evaluated in terms of the area under the receiver operating characteristic curve and the integrated discrimination improvement. Stratified analyses included subgroups of sex, age (<50 or ≥50 years), family history (positive if either father or mother or a sibling has/had diabetes), and obesity (BMI <30 or ≥30 kg/m2).
A genetic risk score did not improve prediction above classic and metabolic markers, but – compared to a non-invasive prediction model – genetic information slightly improved the area under the receiver operating characteristic curve (difference [95%-CI]: 0.007 [0.002–0.011]). Stratified analyses showed stronger improvement in the older age group (0.010 [0.002–0.018]), the group with a positive family history (0.012 [0.000–0.023]) and among obese participants (0.015 [−0.005–0.034]) compared to the younger participants (0.005 [−0.004–0.014]), participants with a negative family history (0.003 [−0.001–0.008]) and non-obese (0.007 [0.000–0.014]), respectively. No difference was found between men and women.
There was no incremental value of genetic information compared to standard non-invasive and metabolic markers. Our study suggests that inclusion of genetic variants in diabetes risk prediction might be useful for subgroups with already manifest risk factors such as older age, a positive family history and obesity.
Citation: Mühlenbruch K, Jeppesen C, Joost H-G, Boeing H, Schulze MB (2013) The Value of Genetic Information for Diabetes Risk Prediction – Differences According to Sex, Age, Family History and Obesity. PLoS ONE 8(5): e64307. https://doi.org/10.1371/journal.pone.0064307
Editor: Christian Herder, German Institute of Human Nutrition Potsdam-Rehbruecke, Germany
Received: January 14, 2013; Accepted: April 13, 2013; Published: May 20, 2013
Copyright: © 2013 Mühlenbruch et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported in part by a grant from the German Federal Ministry of Education and Research (BMBF) to the German Center for Diabetes Research (DZD e.V.). The recruitment phase of the EPIC-Potsdam Study was supported by the Federal Ministry of Science, Germany (01 EA 9401) and the European Union (SOC 95201408 05F02). The follow-up of the EPIC-Potsdam Study was supported by German Cancer Aid (70-2488-Ha I) and the European Community (SOC 98200769 05F02). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
According to the most recent meta-analysis of genome-wide association studies, 63 individual SNPs have now been linked with diabetes risk. However, these variants explain only ∼5.7% of variance in disease susceptibility. Genetic markers have also been frequently compared with established risk factors for type 2 diabetes in terms of their usefulness for predicting risk. For example, we have previously reported that information on 20 SNPs is not informative for predicting future diabetes in the EPIC-Potsdam study. Overall, prospective studies showed limited predictive value of genetic markers in general, and particularly if compared to classical non-genetic risk factors. However, a few studies indicate that prediction by genetic variants might be informative among specific subgroups, e.g. individuals who are younger (<50 years) or who are obese. A systematic comparison of genetic and non-genetic risk factors in subgroups of a prospective study that allows an accurate determination of the diabetes risk is still lacking. Our aim was therefore to evaluate if the predictive value of a large set of genetic variants differed between subgroups according to sex, age, family history, and BMI.
The ethics committee of the State of Brandenburg, Germany, gave approval for the study and written informed consent was obtained from all participants.
Study design and participants
The European Prospective Investigation into Cancer and Nutrition (EPIC)–Potsdam study is a population-based cohort study of 27,548 participants recruited from Potsdam, Germany, in the years 1994–1998. Participants were mainly aged 35–65 years at baseline. Follow-up assessment was performed every 2 to 3 years to identify incident type 2 diabetes cases. Over a mean follow-up time of 7 years, 849 incident cases were identified. The diagnosis of incident cases was based on self-reports in a questionnaire and verification by physicians.
We used a case-cohort study nested within the prospective EPIC-Potsdam cohort for evaluation of genetic risk factors. Out of 26,444 study participants with blood samples collected at baseline, a random subcohort of 2,500 participants was selected. This subcohort is representative of the full cohort, and baseline characteristics showed no significant differences. Participants with prevalent type 2 diabetes at baseline, abnormal baseline plasma glucose levels, or more than 9 missing SNP genotypes were excluded from analysis, leaving 1,968 participants in the subcohort. Of the 801 incident cases identified during follow-up in the cohort with blood samples, 578 cases remained after similar exclusions.
Baseline information was used to calculate the German Diabetes Risk Score (GDRS), a validated prediction equation developed in the EPIC-Potsdam cohort including the following non-invasive measures: age (years), height (cm), waist circumference (cm), prevalent hypertension (yes vs. no), physical activity (h/week), smoking (currently smoking ≥20 cig./d, ex-smoking vs. never smoker or currently smoking <20 cig./d), alcohol intake (moderate consumption [10–40 g/d] vs. low or high consumption), intake of red meat (150 g/d), intake of wholegrain bread (50 g/d) and coffee consumption (150 ml cup/d). Diet was assessed with a validated semiquantitative food frequency questionnaire (FFQ) including 148 food items. Frequencies were measured in 10 categories and portion sizes were estimated from photographs of standard portion sizes. Information on family history of diabetes was obtained from self-reports in a questionnaire, and body mass index (BMI) was calculated from measures of height and weight (kg/m2) in a physical examination. Measurements of metabolic markers were described in previous reports.
Genotyping of 20 previously analyzed SNPs associated with type 2 diabetes was performed with Taqman technology (Applied Biosystems, Foster City, CA). For the current analyses, 23 additional SNPs were genotyped by KBioscience (http://www.kbioscience.co.uk) using the KASP SNP genotyping system. This is a competitive allele-specific PCR incorporating a FRET quencher cassette. The overall set of SNPs was selected to reflect common diabetes-associated single nucleotide polymorphisms and to be largely identical to the set of SNPs evaluated in the subgroup analysis of the Framingham Offspring study for better comparability. The accuracy of genotyping was independently assessed to be between 0 and 0.2%, with reproducibility at 99.9% and success rate of 96.5%. For single SNPs, the frequency of missing genotype information was lower than 6.5%. All SNPs were in Hardy-Weinberg equilibrium (p>0.001), except for rs5945326 (near the DUSP9 gene), which was therefore excluded from analysis.
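The Hardy-Weinberg check described above can be illustrated with a short Python sketch. The text does not state which test was used, so the 1-df chi-square goodness-of-fit test here is an assumption, and the function names `hwe_chi_square` and `passes_hwe` are hypothetical. Only the threshold of p>0.001 is taken from the text.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium for one SNP.

    n_aa, n_ab, n_bb are observed genotype counts. Returns (chi2, p_value).
    Assumption: a goodness-of-fit chi-square test; the source does not name the test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def passes_hwe(n_aa, n_ab, n_bb, alpha=0.001):
    """True if the SNP is retained under the p > alpha criterion."""
    return hwe_chi_square(n_aa, n_ab, n_bb)[1] > alpha
```

With this criterion, a SNP is dropped only when the deviation from the expected genotype proportions is very strong, which is why a threshold as small as 0.001 removes few markers.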
We assumed an additive model for each SNP with values of 0, 1 and 2 for the number of risk alleles and analyzed the predictive value of the SNPs using a count genetic score. For participants with missing genotypes, the genetic score was standardized to score values for participants with complete genotype information.
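A count genetic score of this kind might be computed as in the following Python sketch. The exact standardization formula is not given in the text. The rescaling used here (observed risk-allele count divided by the number of genotyped SNPs, multiplied by the total number of SNPs) is a common approach and is an assumption. The function name `count_genetic_score` is hypothetical. The exclusion of participants with more than 9 missing genotypes follows the study description.

```python
def count_genetic_score(genotypes, max_missing=9):
    """Additive count score over SNPs coded 0, 1 or 2 risk alleles.

    genotypes: list with one entry per SNP; None marks a missing genotype.
    Returns the score rescaled to the full SNP set, or None if the
    participant has more than max_missing missing genotypes.
    Assumption: missing genotypes are handled by proportional rescaling.
    """
    called = [g for g in genotypes if g is not None]
    n_missing = len(genotypes) - len(called)
    if n_missing > max_missing or not called:
        return None  # excluded from analysis
    # Scale the observed count up to the number of SNPs in the full set
    return sum(called) * len(genotypes) / len(called)


# Example: 42 SNPs, 9 of them missing, one risk allele at every called SNP
example = [None] * 9 + [1] * 33
print(count_genetic_score(example))
```

Under this assumption a participant with missing genotypes receives the score they would have had if the missing SNPs carried their own average number of risk alleles.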
The incremental value of metabolic markers and the count genetic score was investigated with several different prediction models. The discriminatory ability of each model was determined with the areas under receiver operating characteristic curves (ROC-AUCs) using logistic regression analysis. Model comparisons of a sparser with an extended model were used to assess the improvement in prediction with difference in ROC-AUCs and 95% confidence intervals (95% CI) calculated with the DeLong test. The integrated discrimination improvement (IDI) was calculated with predicted risks from logistic regression. Model comparisons were repeated in subgroups of sex, age (<50, ≥50 years of age), family history of diabetes (positive: father, mother or at least one sibling has type 2 diabetes) and obesity (positive: BMI≥30 kg/m2).
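The two discrimination measures named above can be illustrated with a minimal Python sketch that works on predicted risks from a sparser and an extended model. This is an illustration under simplifying assumptions, not the analysis code of the study, which was run in SAS. It ignores case-cohort sampling and does not compute DeLong confidence intervals, and all function names are hypothetical. The relative IDI is taken here as the IDI divided by the discrimination slope of the sparser model.

```python
def roc_auc(risks, outcomes):
    """ROC-AUC as the share of case/non-case pairs ranked correctly (ties count 0.5)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def discrimination_slope(risks, outcomes):
    """Mean predicted risk in cases minus mean predicted risk in non-cases."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def idi(old_risks, new_risks, outcomes):
    """Integrated discrimination improvement: gain in discrimination slope."""
    return (discrimination_slope(new_risks, outcomes)
            - discrimination_slope(old_risks, outcomes))


def relative_idi(old_risks, new_risks, outcomes):
    """IDI relative to the discrimination slope of the sparser model."""
    return idi(old_risks, new_risks, outcomes) / discrimination_slope(old_risks, outcomes)


# Toy data: two cases and two non-cases, risks from a sparser and an extended model
outcomes = [1, 1, 0, 0]
old = [0.6, 0.4, 0.5, 0.3]
new = [0.7, 0.5, 0.4, 0.2]
delta_auc = roc_auc(new, outcomes) - roc_auc(old, outcomes)
print(delta_auc, idi(old, new, outcomes), relative_idi(old, new, outcomes))
```

The difference in ROC-AUC reflects only changes in the ranking of cases against non-cases, whereas the IDI also responds to how far apart the predicted risks of the two groups move, so the two measures can disagree in magnitude.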
All statistical analyses were performed with SAS (Version 9.2, Enterprise Guide 4.3, SAS Institute Inc., Cary, NC, USA). The significance level was defined with a two-tailed p-value of <0.05.
Baseline characteristics of the random subcohort of EPIC-Potsdam and incident cases are presented in Table 1. Incident cases were, compared to the subcohort, more likely to be males, were on average older, had a higher BMI, and had a wider waist circumference. Proportions of hypertensive participants and former or current smokers were larger among incident cases. Diabetes risk quantified with the GDRS was considerably higher among incident cases compared to the subcohort participants. Also, concentrations of biochemical markers reflected a higher baseline risk for incident cases compared to the subcohort. Regarding the genetic score, incident cases had a slightly higher number of risk alleles than the members of the subcohort.
The genetic loci and risk-allele frequencies are listed in Table S1. Risk-allele frequencies ranged from 0.09 to 0.94 in the random EPIC-subcohort and were comparable with prior reports on allele frequencies or HapMap-CEU and 1000 Genomes data.
Table 2 shows comparisons of models without or with inclusion of the count genetic score. Discrimination for a model including only the 42-SNP genetic score was weak (ROC-AUC: 0.579; 95% CI: 0.552–0.605). However, when adding the genetic score to the GDRS, the ROC-AUC increased from 0.846 to 0.853 (delta: 0.007 [95% CI: 0.002–0.011]). Additionally including genetic markers in a model containing the GDRS, glucose, A1C, triglycerides, HDL cholesterol, γ-glutamyltransferase and alanine aminotransferase produced only a small, non-significant difference in ROC-AUC (0.002 [−0.001–0.004]).
Stratified analyses for prediction models including the count genetic score, GDRS or count genetic score along with the GDRS are presented in Table 3. The discriminative ability of the genetic score alone was weak in both men and women. Also, the predictive value of the genetic score added to the GDRS was similar in men and women (differences in ROC-AUC: 0.006 and 0.008, IDI: 6.20 and 6.24%, respectively). Analyses stratified by age showed lower ROC-AUCs for the GDRS and the genetic score in the upper age group. When the genetic score was included along with the GDRS, improvement was more apparent in the older age group (difference of ROC-AUC: 0.010, IDI: 7.50%) compared to the younger group (0.005, 5.25%). We observed a slightly higher ROC-AUC for the genetic risk score among participants with a positive family history, while the GDRS discriminated slightly better in the group with a negative history. Improvement by genetic information was larger among participants with a positive family history (delta ROC-AUC: 0.012, relative IDI: 8.71%) compared to participants with a negative family history. With regard to BMI subgroups, both the GDRS and genetic score showed a better discrimination in the group without obesity. Although non-significant, improvement in discrimination was larger in the obese group.
We observed that prediction based on a large number of single nucleotide polymorphisms is not accurate, regardless of subgroups with different risk according to sex, age, family history, or obesity. Prediction based on a model with non-invasive risk factors was slightly improved by genetic information, but not if established biochemical risk markers were also considered. We also observed that improvement in prediction by genetic information beyond classical risk factors was slightly larger among older or obese participants and participants with a family history of diabetes.
Genetic markers alone showed a discriminatory ability (ROC-AUC) between 0.54 and 0.68 in previous studies. Our results, based on a genetic risk score including 42 SNPs, are comparable with these findings. Although we used a larger set of SNPs compared to most previous studies, acceptable discrimination by genetic information alone will require identification of many more common variants (usually with small effects) or rare variants with stronger effects.
We have previously reported that metabolic markers improved discrimination of the German Diabetes Risk Score substantially, whereas a genetic risk score including 20 SNPs did not. Our current analyses showed comparable results with no added value of 42 SNPs beyond metabolic markers. These results are in accordance with previous studies suggesting that prediction models involving lifestyle and biochemical predictors and showing very good discrimination are not improved by the known genetic markers. Still, we observed an improvement in discrimination with the genetic risk score if only a non-invasive model served as the reference. Thus, genetic profiling could be an alternative to the determination of biochemical markers. However, the improvement by genetic information is so far much smaller than that observed with conventional biochemical risk markers, such as plasma glucose, HbA1c, or plasma lipids.
Previous reports suggest that genetic risk prediction might be more useful in younger populations. In the Framingham Offspring study, Meigs et al. found a substantially better model improvement by a genotype score in participants <50 years of age compared to older participants. The same trend was observed by de Miguel-Yanes et al. when evaluating an extended set of SNPs (40) in the same study population. Data from the Malmö and Botnia studies support this notion: the ability of genetic risk factors to predict future type 2 diabetes improved with an increasing duration of follow-up, in contrast to lifestyle-related risk factors. However, a recent study by Vassy et al. among young adults did not see an improvement in prediction by genetic information over routine clinical measurements. Similarly, our results did not support the hypothesis that prediction by genetic information is more accurate in younger individuals. To the contrary, we found that 42 SNPs, an almost identical set compared to de Miguel-Yanes et al., improved prediction more among older participants (≥50 years) than among younger participants. We cannot rule out that the larger improvement among older participants in our study is due to the relatively lower discrimination achieved by the lifestyle-related prediction model in that group. In the Framingham Offspring study, however, larger improvements were observed among younger participants even though the baseline model without genetic risk factors already showed slightly better discrimination in younger than in older participants. While these different results might be explained by differences in the study populations, in the study design and in the identification of diabetes cases, or in the baseline risk factor models considered, our results suggest that genetic risk is more likely to affect people with an adverse risk profile (e.g. older age) than those with a healthy risk profile. This is further supported by our observation that improvement in discrimination by genetic information was larger among participants who had a family history of diabetes or who were obese.
To our knowledge, no previous study reported stratified analysis by family history of diabetes. Some studies suggest that the strength of the association between genetic risk scores and diabetes depends on family history, but this is only indirect support for our observation. Talmud et al. hypothesized that the inclusion of family history in a reference model could weaken the added predictive value of genetic risk markers, if it was part of the family history complex. However, recent results from the InterAct consortium suggest that the currently known diabetes gene variants only explain a very minor proportion of excess risk associated with family history.
We found that the discriminatory ability of the 42 SNPs alone was slightly better in the non-obese group compared to the obese group; however, discrimination was generally poor irrespective of obesity status. Van Hoek et al. showed similar results for low compared to high BMI groups (cut-off 26 kg/m2). However, genetic information resulted in a stronger improvement in ROC-AUC among obese compared to non-obese participants in our study. This difference could be due to the relatively lower discrimination among obese participants achieved by the lifestyle-related prediction model. The improvement in ROC-AUC did not reach statistical significance in the obese group, but this might mainly reflect the smaller sample size, particularly for non-cases. No other study investigated improvement of prediction models by genetic markers in subgroups of BMI. While two studies observed stronger associations of genetic risk scores with type 2 diabetes risk in obese people compared to non-obese people after adjustment for age and sex, the extent to which genetic information improves prediction has not been investigated.
Several limitations of our study need to be considered. While a strength of our study is its prospective design, we included in our analysis only clinical cases of diabetes identified by self-reports and did not screen our study population for unknown diabetes during follow-up. Thus, our results may not be generalizable to patients who are identifiable only by screening. Further, we cannot rule out that diabetes cases include subtypes such as latent autoimmune diabetes of adults (LADA), which might have affected our results. Also, the prospective design rendered it necessary to exclude all prevalent diabetes cases at baseline. Thus, our results reflect genetic prediction in middle-aged individuals but not prediction at birth. However, a prospective design is more meaningful than a case-control design if prediction by genetic variants is compared with prediction by anthropometric and lifestyle-related risk factors. We based our analysis on a large number of established diabetes SNPs; however, we cannot rule out that a more comprehensive list of SNPs would be more useful for prediction purposes, although this appears to be unlikely. We have evaluated the discriminatory predictive power of genetic markers using ROC analyses and reclassification (IDI). It has been suggested that evaluation of different risk prediction models should also include the net reclassification improvement. However, we have recently reported that the absence of established risk classes for diabetes introduces large subjectivity to such analyses. Similar to previous studies, the comparison of delta ROC-AUCs between subgroups did not rely on a statistical test, which might introduce subjectivity to the interpretation of the results. Our stratified analyses were based on the original GDRS with published points, but the predictive value of age or waist circumference might be different in strata of age or BMI, respectively. However, refitting the prediction models in different strata only slightly affected the ROC-AUCs and improvement stayed almost the same, so that the overall conclusion would not be changed.
In conclusion, genetic risk prediction with 42 SNPs alone was not accurate enough to be used for identification of individuals at high risk. In addition to conventional non-invasive risk factors, genetic risk prediction might be used to achieve slightly higher accuracy; however, it failed to significantly improve risk prediction with established biochemical risk factors. Although differences were not substantial, our data suggest that genetic variants might be more useful for prediction within subgroups with already manifest risk factors, such as higher age, obesity, and a positive family history of diabetes.
Conceived and designed the experiments: HB MBS. Analyzed the data: KM. Wrote the paper: KM. Discussed and interpreted the data: KM CJ HGJ HB MBS. Reviewed the manuscript: CJ HGJ HB MBS.
- 1. Morris AP, Voight BF, Teslovich TM, Ferreira T, Segre AV, et al. (2012) Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet 44: 981–990.
- 2. Buijsse B, Simmons RK, Griffin SJ, Schulze MB (2011) Risk assessment tools for identifying individuals at risk of developing type 2 diabetes. Epidemiol Rev 33: 46–62.
- 3. Schulze MB, Weikert C, Pischon T, Bergmann MM, Al-Hasani H, et al. (2009) Use of multiple metabolic and genetic markers to improve the prediction of type 2 diabetes: the EPIC-Potsdam Study. Diabetes Care 32: 2116–2119.
- 4. Willems SM, Mihaescu R, Sijbrands EJ, van Duijn CM, Janssens AC (2011) A methodological perspective on genetic risk prediction studies in type 2 diabetes: recommendations for future research. Curr Diab Rep 11: 511–518.
- 5. de Miguel-Yanes JM, Shrader P, Pencina MJ, Fox CS, Manning AK, et al. (2011) Genetic risk reclassification for type 2 diabetes by age below or above 50 years using 40 type 2 diabetes risk single nucleotide polymorphisms. Diabetes Care 34: 121–125.
- 6. van Hoek M, Dehghan A, Witteman JC, van Duijn CM, Uitterlinden AG, et al. (2008) Predicting type 2 diabetes based on polymorphisms from genome-wide association studies: a population-based study. Diabetes 57: 3122–3128.
- 7. Boeing H, Korfmann A, Bergmann MM (1999) Recruitment procedures of EPIC-Germany. European Investigation into Cancer and Nutrition. Ann Nutr Metab 43: 205–215.
- 8. Stefan N, Fritsche A, Weikert C, Boeing H, Joost HG, et al. (2008) Plasma fetuin-A levels and the risk of type 2 diabetes. Diabetes 57: 2762–2767.
- 9. Schulze MB, Hoffmann K, Boeing H, Linseisen J, Rohrmann S, et al. (2007) An accurate risk score based on anthropometric, dietary, and lifestyle factors to predict the development of type 2 diabetes. Diabetes Care 30: 510–515.
- 10. Bohlscheid-Thomas S, Hoting I, Boeing H, Wahrendorf J (1997) Reproducibility and relative validity of energy and macronutrient intake of a food frequency questionnaire developed for the German part of the EPIC project. European Prospective Investigation into Cancer and Nutrition. Int J Epidemiol 26 Suppl 1: S71–81.
- 11. Bohlscheid-Thomas S, Hoting I, Boeing H, Wahrendorf J (1997) Reproducibility and relative validity of food group intake in a food frequency questionnaire developed for the German part of the EPIC project. European Prospective Investigation into Cancer and Nutrition. Int J Epidemiol 26 Suppl 1: S59–70.
- 12. Kroke A, Klipstein-Grobusch K, Voss S, Moseneder J, Thielecke F, et al. (1999) Validation of a self-administered food-frequency questionnaire administered in the European Prospective Investigation into Cancer and Nutrition (EPIC) Study: comparison of energy, protein, and macronutrient intakes estimated with the doubly labeled water, urinary nitrogen, and repeated 24-h dietary recall methods. Am J Clin Nutr 70: 439–447.
- 13. Cornelis MC, Qi L, Zhang C, Kraft P, Manson J, et al. (2009) Joint effects of common genetic variants on the risk for type 2 diabetes in U.S. men and women of European ancestry. Ann Intern Med 150: 541–550.
- 14. DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44: 837–845.
- 15. Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS (2008) Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 27: 157–172; discussion 207–212.
- 16. Saxena R, Voight BF, Lyssenko V, Burtt NP, de Bakker PI, et al. (2007) Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science 316: 1331–1336.
- 17. Voight BF, Scott LJ, Steinthorsdottir V, Morris AP, Dina C, et al. (2010) Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat Genet 42: 579–589.
- 18. dbSNP - Database of single nucleotide polymorphisms (SNPs) and multiple small-scale variations that include insertions/deletions, microsatellites, and non-polymorphic variants. Available: http://www.ncbi.nlm.nih.gov/snp/?SITE=NcbiHome&submit=Go. Accessed 2012 Nov.
- 19. Talmud PJ, Hingorani AD, Cooper JA, Marmot MG, Brunner EJ, et al. (2010) Utility of genetic and non-genetic risk factors in prediction of type 2 diabetes: Whitehall II prospective cohort study. BMJ 340: b4838.
- 20. Lyssenko V, Almgren P, Anevski D, Orho-Melander M, Sjogren M, et al. (2005) Genetic prediction of future type 2 diabetes. PLoS Med 2: e345.
- 21. Janssens AC, Aulchenko YS, Elefante S, Borsboom GJ, Steyerberg EW, et al. (2006) Predictive testing for complex diseases using multiple genes: fact or fiction? Genet Med 8: 395–400.
- 22. Balkau B, Lange C, Fezeu L, Tichet J, de Lauzon-Guillain B, et al. (2008) Predicting diabetes: clinical, biological, and genetic approaches: data from the Epidemiological Study on the Insulin Resistance Syndrome (DESIR). Diabetes Care 31: 2056–2061.
- 23. Mihaescu R, van Zitteren M, van Hoek M, Sijbrands EJ, Uitterlinden AG, et al. (2010) Improvement of risk prediction by genomic profiling: reclassification measures versus the area under the receiver operating characteristic curve. Am J Epidemiol 172: 353–361.
- 24. Lyssenko V, Jonsson A, Almgren P, Pulizzi N, Isomaa B, et al. (2008) Clinical risk factors, DNA variants, and the development of type 2 diabetes. N Engl J Med 359: 2220–2232.
- 25. Meigs JB, Shrader P, Sullivan LM, McAteer JB, Fox CS, et al. (2008) Genotype score in addition to common risk factors for prediction of type 2 diabetes. N Engl J Med 359: 2208–2219.
- 26. Vassy JL, Durant NH, Kabagambe EK, Carnethon MR, Rasmussen-Torvik LJ, et al. (2012) A genotype risk score predicts type 2 diabetes from young adulthood: the CARDIA study. Diabetologia 55: 2604–2612.
- 27. Franks PW (2012) Genetic risk scores ascertained in early adulthood and the prediction of type 2 diabetes later in life. Diabetologia 55: 2555–2558.
- 28. Scott RA, Langenberg C, Sharp SJ, Franks PW, Rolandsson O, et al. (2012) The link between family history and risk of type 2 diabetes is not explained by anthropometric, lifestyle or genetic risk factors: the EPIC-InterAct study. Diabetologia 56: 60–69.
- 29. Pencina MJ, D'Agostino RB Sr, Steyerberg EW (2011) Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat Med 30: 11–21.
- 30. Mühlenbruch K, Heraclides A, Steyerberg EW, Joost HG, Boeing H, et al. (2013) Assessing improvement in disease prediction using net reclassification improvement: impact of risk cut-offs and number of risk categories. Eur J Epidemiol 28: 25–33.