Do Variants Associated with Susceptibility to Pancreatic Cancer and Type 2 Diabetes Reciprocally Affect Risk?

Objectives: Although type 2 diabetes mellitus is a known risk factor for pancreatic cancer, the existence of shared genetic susceptibility is largely unknown. We evaluated whether any genetic risk variants reported for either disease by genome-wide association studies reciprocally confer susceptibility.

Methods: Data generated in previous genome-wide association studies (GENEVA Type 2 Diabetes; PanScan) were obtained through the National Institutes of Health database of Genotypes and Phenotypes (dbGaP). Using the PanScan datasets, we tested for association of 38 variants within 37 genomic regions known to be susceptibility factors for type 2 diabetes. We further examined whether type 2 diabetes variants predispose to pancreatic cancer risk when stratified by diabetes status. Correspondingly, we examined the association of fourteen pancreatic cancer susceptibility variants within eight genomic regions in the GENEVA Type 2 Diabetes dataset.

Results: Four plausible associations of diabetes variants with pancreatic cancer risk were detected at a significance threshold of p = 0.05, and one pancreatic cancer susceptibility variant was associated with diabetes risk at a threshold of p = 0.05, but none remained significant after correction for multiple comparisons.

Conclusion: Currently identified GWAS susceptibility variants are unlikely to explain the potential shared genetic etiology between type 2 diabetes and pancreatic cancer.


Introduction
PanScan 1 consisted of 12 prospective cohort studies and a Mayo Clinic case-control study. The Illumina HumanHap550v3.0 genotyping platform genotyped 558,542 SNPs. Genotyping in PanScan 2 augmented PanScan 1 with 8 case-control studies from the PanC4. The Illumina Human610_Quadv1_B genotyping platform genotyped 581,188 SNPs. Available covariate data comprised age, sex, study site, and race. The Mayo Clinic subset drawn from PanScan 1 and 2 comprised 654 cases and 618 controls, with additional covariate information available: smoking status, first-degree family history of cancer, body mass index (BMI), and diabetes status [14].
The GENEVA Genes and Environment Initiatives in Type 2 Diabetes study consisted of GWAS data generated using the Nurses' Health Study (NHS) and Health Professionals Follow-up Study (HPFS) cohorts. This dataset (http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000091.v2.p1) included two subsets: NHS (1581 cases and 1810 controls) and HPFS (1164 T2DM cases, 1338 controls, and 68 cases of uncertain diabetes type). Both studies used the Affymetrix Affy_6.0 genotyping platform, on which 934,940 SNPs were genotyped. Available covariates included family history of diabetes, high blood pressure, high blood cholesterol, smoking status, physical activity, age, BMI, alcohol intake, fat intake measures, magnesium intake, cereal fiber intake, heme iron intake, glycemic load, and gender.

Quality Control filters
PanScan genotype data had been filtered previously for call rate, relatedness, controls that became cases, and sex chromosome abnormalities. If duplicates were genotyped, the individual with the higher call rate was used. The final analysis included 3360 cases and 3468 controls in the combined PanScan data. The Mayo Clinic subset analysis consisted of 63 cases and 23 controls in the long-standing diabetes group (a diabetes diagnosis of 2 or more years earlier than diagnosis of pancreatic cancer for cases, or 2 or more years earlier than date of enrollment for controls), and 448 cases and 557 controls without diabetes.

The GENEVA datasets had been pre-filtered for sex chromosome abnormalities, sample identity, and call rate. We further filtered for sample relatedness and, for the variants studied, for Hardy-Weinberg equilibrium (HWE) in both the NHS and HPFS sets. For the HPFS, we additionally excluded subjects with uncertain diabetes status. The final sample for our analysis consisted of 1579 cases and 1801 controls from the NHS set (all females) and 1162 cases and 1336 controls from the HPFS set (all males), for a total of 2741 cases and 3137 controls.
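The Hardy-Weinberg filter mentioned above can be illustrated with a minimal sketch. It assumes a 1-degree-of-freedom chi-square goodness-of-fit test on genotype counts and an exclusion cutoff of 1e-4. The source states neither the test nor the threshold, so both, along with the function names `hwe_chi_square` and `passes_hwe`, are assumptions made for illustration only.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi-square statistic, p-value) for a 1-df HWE goodness-of-fit test.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def passes_hwe(n_aa, n_ab, n_bb, alpha=1e-4):
    """Keep a SNP when its HWE p-value is at least alpha (hypothetical cutoff)."""
    return hwe_chi_square(n_aa, n_ab, n_bb)[1] >= alpha
```

Counts that match Hardy-Weinberg proportions exactly (for example 100, 200, 100) give a statistic of zero and pass the filter, whereas a sample made up only of heterozygotes fails it.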

Genetic Variant Selection
Thirty-eight T2DM susceptibility SNPs [8,15,16,17,18,19,20,21,22,23,24,25] and 14 PaC susceptibility SNPs [9,10,11] from published GWAS were included in our analyses. Thirty-seven of the 38 T2DM predisposition variants and five of the 14 PaC predisposition variants had been genotyped and were available in the PanScan datasets and GENEVA T2DM datasets, respectively. To represent variants that were not captured in the GWAS data, we identified proxy SNPs in high linkage disequilibrium (LD; r² > 0.5) with them, as determined by Haploview [26]. If multiple proxy SNPs were identified for a variant of interest that was not in the GWAS data, the SNP with the smallest p-value was chosen to represent the association.
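The proxy-selection rule described above (keep SNPs with r² > 0.5, then take the one with the smallest association p-value) can be sketched as follows. The `ProxyCandidate` type, the `pick_proxy` function, and the SNP identifiers and values in the example are all hypothetical; in the study the LD values came from Haploview.

```python
from typing import NamedTuple


class ProxyCandidate(NamedTuple):
    snp_id: str      # identifier of the candidate proxy SNP
    r2: float        # LD (r squared) with the ungenotyped target variant
    p_value: float   # association p-value of the candidate in the GWAS data


def pick_proxy(candidates, r2_min=0.5):
    """Return the eligible candidate with the smallest p-value, or None.

    A candidate is eligible when its r2 strictly exceeds r2_min.
    """
    eligible = [c for c in candidates if c.r2 > r2_min]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.p_value)


# Hypothetical candidates for one ungenotyped target variant.
example = [
    ProxyCandidate("proxy_a", 0.45, 0.001),  # below the LD cutoff, ignored
    ProxyCandidate("proxy_b", 0.80, 0.20),
    ProxyCandidate("proxy_c", 0.62, 0.03),
]
chosen = pick_proxy(example)
print(chosen.snp_id if chosen else "variant not captured")
```

Choosing the smallest p-value among several proxies favors detecting an association, which adds to the multiple-comparison burden for that variant.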

Statistical Analysis
Association analyses used unconditional multivariable logistic regression under an additive genetic model, adjusted for the available covariates. For associations reaching the p = 0.05 threshold, a Bonferroni correction was applied to adjust for multiple comparisons. Specifically, for the PaC dataset analysis, a threshold of 0.05/37 = 0.0014 was used because the 38 variants tested came from 37 genomic regions; for the T2DM dataset analysis, a threshold of 0.05/6 = 0.008 was used because the 11 variants tested came from 6 genomic loci. In addition, a false discovery rate (FDR) correction was applied to evaluate significance [27].
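The two multiple-comparison thresholds above follow directly from dividing 0.05 by the number of genomic regions tested. The sketch below reproduces that arithmetic and adds a Benjamini-Hochberg adjustment as one common way to control the FDR. The source cites reference [27] without naming the procedure, so the use of Benjamini-Hochberg here is an assumption, and the function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Thresholds quoted in the text: 37 regions (PaC data), 6 loci (T2DM data).
print(round(bonferroni_threshold(0.05, 37), 4))  # 0.0014
print(round(bonferroni_threshold(0.05, 6), 3))   # 0.008
```

A variant would be called significant under the FDR approach when its adjusted p-value falls below the chosen FDR level, for example 0.05.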
For the PanScan data, covariates in the model included age, sex, study site, genotypic race (Eigenstrat principal components 1 and 2), and other significant principal components. We further conducted a stratified analysis in the Mayo Clinic subset according to diabetes status to investigate whether the T2DM susceptibility variants have a specific effect in either (1) PaC cases with long-standing diabetes (diagnosed 2 or more years before the diagnosis of pancreatic cancer) versus controls with long-standing diabetes, or (2) PaC cases and controls without diabetes.
In the GENEVA Type 2 Diabetes datasets, we conducted a similar adjusted analysis. Covariates in the model included age, principal components (the first two components for NHS and 20 components for HPFS), family history of diabetes, high blood pressure, high blood cholesterol, smoking status, physical activity, BMI, alcohol intake, fat intake measures, magnesium intake, cereal fiber intake, heme iron intake, and glycemic load. Because the NHS and HPFS sets each contained a single gender, we did not include gender as a covariate in analyses of the separate sets but did include it in the combined set.
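The additive-model logistic regression used in these analyses can be illustrated with a minimal, unadjusted sketch. It codes each genotype as a count of risk alleles (0, 1, or 2) and fits logit P(case) = b0 + b1 × dosage by Newton-Raphson, so exp(b1) is the per-allele odds ratio. Covariate adjustment is deliberately omitted, the function names and the genotype counts are hypothetical, and an actual analysis would rely on an established statistics package rather than this hand-rolled fit.

```python
import math


def _score_and_information(b0, b1, dosages, statuses):
    """Gradient and information matrix entries of the logistic log-likelihood."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(dosages, statuses):
        mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))  # fitted P(case)
        w = mu * (1.0 - mu)
        g0 += y - mu
        g1 += (y - mu) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)


def fit_additive_logistic(dosages, statuses, max_iter=50, tol=1e-10):
    """Return (b0, b1, se_b1) for logit P(case) = b0 + b1 * dosage."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (h00, h01, h11) = _score_and_information(b0, b1, dosages, statuses)
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det  # Newton step: inverse(H) times g
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    _, (h00, h01, h11) = _score_and_information(b0, b1, dosages, statuses)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_b1


# Hypothetical counts: {risk-allele dosage: (cases, controls)}.
counts = {0: (120, 160), 1: (140, 130), 2: (40, 30)}
dosages, statuses = [], []
for dose, (n_case, n_ctrl) in counts.items():
    dosages += [dose] * (n_case + n_ctrl)
    statuses += [1] * n_case + [0] * n_ctrl

b0, b1, se = fit_additive_logistic(dosages, statuses)
low, high = math.exp(b1 - 1.96 * se), math.exp(b1 + 1.96 * se)
print(f"per-allele OR = {math.exp(b1):.2f} (95% CI {low:.2f} to {high:.2f})")
```

Coding the genotype as a single numeric dosage is what makes the model additive on the log-odds scale: each extra copy of the risk allele multiplies the odds by the same factor.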
Overall, four associations in the PanScan sets reached statistical significance at a threshold of p = 0.05. However, none remained significant after Bonferroni or FDR correction for multiple comparisons.

Association with type 2 diabetes
Among the 14 reported PaC susceptibility variants, 11 were genotyped in the GENEVA diabetes GWAS (n = 5) or were represented by SNPs in high LD (n = 6). Variants ABO rs505922, PDX1 rs9581943, and TERT rs2736098 were not captured in these datasets. Only the association between LINC-PINT rs6971499 and T2DM risk appeared significant at the p = 0.05 level in the combined dataset of NHS and HPFS; however, it was no longer significant after correction for multiple comparisons (Table 3).

Discussion
Based on previous results that demonstrated a relationship between T2DM and PaC, we hypothesized a shared genetic etiology. Using published GWAS data, we tested the association between reported T2DM susceptibility variants and PaC risk, as well as the association between reported PaC susceptibility variants and T2DM risk. The analyses showed that only one PaC susceptibility variant was associated with T2DM risk at a weak significance level; similarly, there were only weak associations between T2DM susceptibility variants and PaC risk. These associations, found at a significance threshold of p = 0.05, were no longer significant after adjustment for multiple comparisons. We also did not replicate any of the three T2DM SNPs from the PanScan 1 dataset analysis in the PanScan 2 dataset.
One design strength of our study is that we could perform initial analyses stratified by the diabetes status of PaC cases and controls to evaluate whether associations between T2DM susceptibility variants and PaC risk differ by diabetes history. We found no significant associations, despite one potential association showing a relatively large effect size (rs13266634, OR = 4.92). We discounted the large effect size for rs13266634 because it was likely an artifact of the small subgroup sample sizes (63 cases and 23 controls), and because the frequency of the minor allele (T) in the control subjects was not congruent with reports in the general population (0.46 vs 0.29). The analysis of PaC cases and controls without diabetes (448 cases and 557 controls) was likely underpowered as well, but our findings on DGKB-TMEM195 rs2191348, KCNQ1 rs231362, ADCY5 rs2877716, and BCL11A rs243021 may warrant further evaluation in larger studies.
Our study has other limitations. We did not evaluate susceptibility variants from candidate gene association studies. Many biologically plausible candidate genes have been reported to be associated with T2DM and PaC in different studies [29,30]. However, it is well known that the replication rate for SNPs derived using the candidate gene association design is relatively low [6]. Thus, rather than test variants that have not been consistently shown to be associated with disease risk, we focused on GWAS-derived susceptibility SNPs that were consistently replicated. One possible explanation for our results is that the variants we analyzed from the GWAS account for only a modest share of genetic susceptibility to disease risk. There may still be genetic loci beyond these GWAS-derived SNPs that play a role in linking these two diseases. It is also likely that the majority of the shared etiology of PaC and T2DM involves factors beyond the genetic level, such as obesity and epigenetic or environmental factors. Shared family environment could potentially explain the observed association between family history of diabetes and pancreatic cancer risk. Complementary genetic methods, such as family-based linkage studies or high-throughput sequencing studies, may offer alternative ways to characterize the potential shared genetic etiology of the two conditions, and may reveal novel associations or interactions.
In conclusion, we found that GWAS-derived susceptibility variants do not explain the potential shared genetic etiology of PaC and T2DM. We do report interesting associations that may warrant further study using independent datasets with larger sample sizes.