Pancreatic Cancer Susceptibility Loci and Their Role in Survival

Pancreatic cancer has one of the worst mortality rates of all cancers. Little is known about its etiology, particularly regarding inherited risk. The PanScan project, a genome-wide association study, identified several common polymorphisms affecting pancreatic cancer susceptibility. Single nucleotide polymorphisms (SNPs) in ABO, sonic hedgehog (SHH), telomerase reverse transcriptase (TERT), and nuclear receptor subfamily 5, group A, member 2 (NR5A2) were found to be associated with pancreatic cancer risk. Moreover, the scan identified loci on chromosomes 13q22.1 and 15q14, to which no known genes or other functional elements are mapped. We sought to replicate these observations in two additional, independent populations (from Germany and the UK), and to evaluate the possible impact of these SNPs on patient survival. We genotyped 15 SNPs in 690 cases of pancreatic ductal adenocarcinoma (PDAC) and in 1277 healthy controls. We replicated several associations between SNPs and PDAC risk. Furthermore, we found that SNP rs8028529 was weakly associated with better overall survival (OS) in both populations. We also found that the NR5A2 rs12029406_T allele was associated with shorter survival in the German population. In conclusion, rs8028529 could be, if these results are replicated, a promising marker for both risk and prognosis of this lethal disease.


Introduction
Pancreatic cancer is the fifth leading cause of cancer deaths in Europe and the eighth worldwide, with a five-year relative survival of less than 5% [1]. No effective screening test for this malignancy exists, and metastatic disease is commonly present at initial diagnosis. Established risk factors include cigarette smoking, overweight or obesity, a medical history of type II diabetes, and a family history of pancreatic cancer [2].
The PanScan project, a genome-wide association study (GWAS), recently identified various pancreatic cancer susceptibility loci. Several single nucleotide polymorphisms (SNPs) in the gene regions of ABO, sonic hedgehog (SHH), telomerase reverse transcriptase (TERT), and nuclear receptor subfamily 5, group A, member 2 (NR5A2) were found to be associated with pancreatic cancer risk [3]. Statistically significant associations were also found with SNPs mapping to a region on chromosome 13q22.1 and a region on chromosome 15q14, where no known genes are mapped [3].
The genes to which several of the GWAS loci were mapped are biologically plausible candidates for involvement in pancreatic cancer. Several early studies reported an association between ABO blood type and gastrointestinal cancers, strongest for gastric cancer but also notable for pancreatic cancer [4,5]. SHH plays a key role as a morphogenic factor and is related to the formation of various malignancies, including pancreatic cancer [6]. NR5A2 encodes a nuclear receptor of the fushi tarazu (Ftz-F1) subfamily that interacts with β-catenin and is predominantly expressed in exocrine pancreas, liver, intestine and ovaries in adults. The TERT gene encodes the catalytic subunit of telomerase, essential for maintaining telomere ends. While telomerase activity cannot be detected in most normal tissues, it is seen in approximately 90% of human cancers [7]. The region of chromosome 5p15.33 where TERT maps has been identified in genome-wide association studies of a number of different cancers, including brain tumors, lung cancer, basal cell carcinoma, and melanoma. Although the region on 13q22.1 does not contain any known gene, it is frequently deleted in a spectrum of cancers, including pancreatic cancer [8,9,10].
The GWAS convincingly showed associations between several of these loci and pancreatic cancer risk. For others (SNPs in the SHH region and in a gene desert on chromosome 15q14), it showed promising associations supported by weaker statistical evidence [3]. Therefore, we sought to replicate these observations in an additional, independent population of 690 cases of pancreatic ductal adenocarcinoma (PDAC), and we evaluated the possible impact of these SNPs on patient survival.

Ethics Statement
All participants gave written informed consent. The study was approved by the ethical review boards of the institutions responsible for subject recruitment in each of the recruitment centres. The ethical committees were the following: South West Research Ethics Committee (Liverpool subjects), Ethikkommission der Medizinischen Fakultät, Heidelberg (German subjects), Ethics committee of the University of Oxford (Oxford subjects), Ethics committee of the University of Cambridge (Cambridge subjects).

Study population
Samples were collected from pancreatic cancer patients during surgery between December 1996 and September 2009, snap-frozen in liquid nitrogen directly after resection and subsequently stored at -80°C.
Detailed information on the control population is given elsewhere. Briefly, a total of 1141 healthy blood donors of German origin were recruited in 2004 at the Institute of Transfusion Medicine, Mannheim, Germany [11].
A total of 136 British controls were selected from people recruited in two cohorts in EPIC, an ongoing prospective cohort study being carried out in ten European countries. The EPIC-Norfolk cohort (http://www.srl.cam.ac.uk/epic/) comprises over 30,000 individuals, aged 45 to 75 at recruitment, resident in Norfolk, East Anglia, and recruited from general practice registers between 1993 and 1997 [12]. The EPIC-Oxford cohort comprises 65,429 people aged 20 years and above, living in the UK and recruited between 1993 and 1999 [13]. Characteristics of patients and controls are described in Table 1.

Selection of genes and polymorphisms
We selected 15 polymorphisms that were found to be associated with the risk of developing pancreatic cancer by a recent GWAS [3]. In each of the six regions identified by the GWAS, we selected the SNPs showing the strongest associations: we selected rs12029406, rs10919791, rs3790844 in the NR5A2 gene; rs4635969 and rs401681 for the TERT/CLPTM1L region; rs172310, rs167020 for SHH; rs657152, rs505922, rs630014, rs495828 for ABO; rs9543325, rs9543325 at 13q22.1; rs8028529 at 15q14. More detailed information on the selected SNPs is given in Table 2.

DNA extraction and genotyping
DNA was extracted from frozen or paraffin-embedded pancreatic tissues of 690 patients with resected tumors (576 from Heidelberg, 114 from Liverpool) using the AllPrep Isolation Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. Genotyping was performed using an allele-specific PCR-based KASPar SNP genotyping system (KBiosciences, Hoddesdon, UK) as recommended by the manufacturer. The order of DNAs from cases and controls was randomized on PCR plates to ensure that an equal number of cases and controls could be analyzed simultaneously. Thermocycling was performed according to the manufacturer's instructions. Detection was performed using an ABI PRISM 7900 HT sequence detection system with SDS 2.2 software (Applied Biosystems, Foster City, CA, USA). Genotyping of British controls was performed in the context of a genome-wide association study using the Human 660W-Quad BeadChip array according to the manufacturer's instructions (Illumina, San Diego, CA, USA) at Imperial College.

Statistical analysis
Hardy-Weinberg equilibrium was tested in the controls with the chi-square test. Risk analysis was performed in a total of 690 PDAC cases and 1277 healthy controls. We used multivariate logistic regression to assess the main effects of the genetic polymorphisms on pancreatic cancer risk using a codominant inheritance model. The most common allele in the controls was assigned as the reference category. All analyses were adjusted for age and gender.
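The Hardy-Weinberg check described above can be illustrated with a minimal Python sketch. The genotype counts used here are hypothetical and are not taken from this study, and the function name `hwe_chi_square` is ours. The sketch assumes a biallelic SNP with both alleles present, so that the test has one degree of freedom.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a biallelic SNP.

    Assumes both alleles are observed; a monomorphic SNP would give
    zero expected counts and cannot be tested this way.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control genotype counts (illustration only)
chi2, p_value = hwe_chi_square(360, 480, 160)
in_equilibrium = p_value >= 0.05
```

A SNP would be considered consistent with Hardy-Weinberg equilibrium when the resulting p-value is not below the chosen threshold (0.05 in this study).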
For survival analysis, the median follow-up time was computed with censored observations only (20%), whereas the median survival time was calculated using data from all patients. Overall survival (OS) was defined as the time interval between diagnosis and death (uncensored observation) or the last date when the patient was still alive (censored observation, median follow-up time 1249 days). OS was evaluated using methods for censored survival time. In particular, risk of dying was estimated by hazard ratios (HR) and 95% confidence intervals (CI) in Cox proportional hazard models. All analyses were performed with STATA software (StataCorp, College Station, TX, USA). For the survival analysis, in order to take into account the number of tests performed in this project, we calculated for each gene/region the number of effective independent variables, M_eff, by use of the SNP Spectral Decomposition approach [14]. We obtained a gene-wide M_eff value for each gene and also a study-wide M_eff value, by adding up the gene M_eff values. For the replication analysis, since the associations had already been convincingly shown in a GWAS, a correction for multiple testing was not necessary; we therefore used a threshold of 0.05 to confirm our findings.
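To illustrate how an effective number of independent tests can be derived from linkage disequilibrium, the sketch below applies one commonly used spectral-decomposition formula, M_eff = 1 + (M - 1)(1 - Var(λ)/M), where λ are the eigenvalues of the SNP correlation matrix and M is the number of SNPs. Whether this exact variant is the one implemented in [14] is an assumption on our part, and the function names and the two-SNP example are hypothetical.

```python
from statistics import variance

def meff_from_eigenvalues(eigenvalues):
    """Effective number of independent SNPs from correlation-matrix eigenvalues.

    Uses M_eff = 1 + (M - 1) * (1 - Var(lambda) / M), with the sample variance.
    """
    m = len(eigenvalues)
    if m == 1:
        return 1.0
    return 1.0 + (m - 1) * (1.0 - variance(eigenvalues) / m)

def two_snp_eigenvalues(r):
    """Eigenvalues of the 2x2 correlation matrix [[1, r], [r, 1]]."""
    return [1.0 + abs(r), 1.0 - abs(r)]

# Hypothetical pairwise correlations between two SNPs in one region
meff_independent = meff_from_eigenvalues(two_snp_eigenvalues(0.0))  # no LD
meff_partial = meff_from_eigenvalues(two_snp_eigenvalues(0.5))      # moderate LD
meff_complete = meff_from_eigenvalues(two_snp_eigenvalues(1.0))     # complete LD

# A study-wide value is obtained by summing the gene-wide values
study_wide = sum([meff_independent, meff_partial, meff_complete])
```

Under this formula two uncorrelated SNPs count as two tests, two perfectly correlated SNPs count as one, and intermediate correlation gives a value in between.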

Results
In this study we sought to investigate two different endpoints: replication of the associations between 15 GWAS SNPs and the risk of developing pancreatic cancer, and an evaluation of the possible associations between the same SNPs and patient survival.
We genotyped 15 SNPs in 690 cases of PDAC and 1277 healthy controls. The average call rate was 97.20% (range 94.69%-98.79%). For 27 cases, both normal and tumor tissues were available and used for genotyping. No differences were observed (398 informative genotype comparisons). Approximately 10% of the samples were analyzed in duplicate, and the concordance rate of the genotypes was higher than 99%. The genotype distributions at all loci were in Hardy-Weinberg equilibrium in controls, with non-significant chi-square values (using a threshold of p<0.05, data not shown).
The frequencies and distribution of the genotypes and the odds ratios for the association of each polymorphism with PDAC are described in Table S1. We were able to replicate several significant associations between the SNPs and PDAC risk. Table 3 shows the SNPs associated with pancreatic cancer risk in this study. The strongest association with an increased risk of PDAC was observed for the C allele of the 13q22.1 rs9543325 SNP (OR_het 1.23, 95% CI 0.98-1.55, OR_hom 1.60, 95% CI 1.14-2.25, P_trend = 0.0023). For two SNPs in NR5A2 (rs12029406, rs10919791), one in SHH (rs167020) and the 15q14 region SNP (rs8028529), no statistically significant association was detected. Figure S1 shows a summary of the replications in the study. Table S1 shows the distribution of each SNP genotyped in the study and the relative ORs in the two populations separately and together.
We investigated a possible association between the selected SNPs and patient survival. Median survival of cases was different in patients from Liverpool (305 days) and Heidelberg (387 days; Cox regression test, p = 10^-5); therefore, we conducted this analysis separately for the two populations. The results of this analysis are reported in Table 4 and Table S2.
We found that SNP rs8028529 (located on chromosome 15q14) was weakly associated with better OS in both populations. In the German population, heterozygous carriers had better survival (HR = 0.73, 95% CI 0.59-0.91, P = 0.01, median survival of heterozygotes = 440 days, median survival of homozygotes for the common allele = 346 days), while in the British population better survival was observed in homozygous carriers of the variant allele (HR = 0.40, 95% CI 0.16-1.01, P = 0.05, median survival of homozygotes for the variant allele = 421 days, median survival of homozygotes for the common allele = 287 days). Analysing all samples together, adjusting for age, gender and recruitment center, we observed that the combined genotype (C/T + C/C) had a statistically significant association with better survival of PDAC: HR = 0.76 (95% CI 0.64-0.92), p = 0.004.
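As a rough consistency check on a reported hazard ratio, the standard error of log(HR) can be recovered from the 95% confidence interval and used to compute an approximate Wald p-value. The sketch below does this for the pooled estimate given above (HR = 0.76, 95% CI 0.64-0.92). The function name `wald_p_from_hr_ci` is ours, and the result is only approximate because the published HR and CI are rounded and the original analysis may have used a different test.

```python
import math

def wald_p_from_hr_ci(hr, lower, upper, z_crit=1.96):
    """Approximate two-sided Wald p-value from a hazard ratio and its 95% CI."""
    # The CI is symmetric on the log scale: log(HR) +/- z_crit * SE
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

# Pooled estimate for the combined genotype (C/T + C/C) of rs8028529
se, z, p = wald_p_from_hr_ci(0.76, 0.64, 0.92)
```

The p-value obtained this way is of the same order as the reported p = 0.004, which is what one would expect if the interval and the p-value come from the same Cox model.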
In the German population, carriers of at least one T allele of the rs12029406 SNP, located in the NR5A2 gene, showed a shorter survival time (HR = 1.23, 95% CI 1.01-1.49, P = 0.04, median survival of allele carriers = 359 days). Table 4 shows the results for these two SNPs, while Table S2 shows the results for all the SNPs.
We calculated M_eff values for each candidate gene/region separately and for the whole study (by adding the individual gene M_eff values). The study-wide M_eff was 9.8. We therefore used a study-wide significance threshold of p = 0.05/9.8 = 0.0051. Using this threshold, no significant associations were observed between any of the genotyped polymorphisms and patient survival, with the exception of the association between the C allele of rs8028529 and better patient survival in the pooled population.
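The corrected threshold above is a Bonferroni-type correction in which the number of tests is replaced by the study-wide M_eff. A minimal Python sketch follows, using the p-values reported in this section; the function name and dictionary keys are hypothetical labels chosen for illustration.

```python
def corrected_threshold(alpha, m_eff):
    """Bonferroni-type threshold using the effective number of independent tests."""
    return alpha / m_eff

threshold = corrected_threshold(0.05, 9.8)

# Survival p-values as reported in the text
reported_p = {
    "rs8028529 heterozygotes, German": 0.01,
    "rs8028529 variant homozygotes, British": 0.05,
    "rs8028529 C/T + C/C, pooled": 0.004,
    "rs12029406 T carriers, German": 0.04,
}

# Keep only the associations that pass the study-wide threshold
passing = [name for name, p in reported_p.items() if p < threshold]
```

With these inputs only the pooled rs8028529 result falls below the study-wide threshold, in agreement with the statement above.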

Discussion
Pancreatic cancer is among the deadliest of cancers, with mortality rates approaching incidence rates [1,15]. There is no effective curative treatment yet for pancreatic cancer. Surgery offers the only treatment option that significantly improves survival. Therefore, finding genetic variants associated with disease risk, progression and survival is of the utmost importance. Given that there are few known risk factors, improved diagnostics and a better understanding of the molecular pathogenesis of this disease are urgently needed.
We report the re-evaluation of 15 SNPs found to be associated with pancreatic cancer risk in PanScan [3] and their possible involvement in patient survival. In this study, we were able to replicate six SNPs at a P value below 0.05 (range 0.0449-0.0023). We could not replicate the other reported associations, although the allelic frequencies in our study subjects were comparable to those obtained in PanScan, and trends of risk went in the same direction as reported by PanScan. A possible explanation for our failure to replicate these associations is insufficient statistical power.
The most important and novel finding of this manuscript is that the C allele of SNP rs8028529, located in a gene desert on chromosome 15q14, is associated with better survival. This association reached statistical significance, at the conventional 0.05 threshold, in both populations we studied, although for the German cases significance was only observed for the heterozygous (C/T) carriers, while for the British cases it was observed for the homozygous carriers of the variant allele (C/C) only. An analysis of all samples combined, adjusting for age, gender and recruitment center, revealed that the combined genotype (C/T + C/C) had a statistically significant association with better survival of PDAC: HR = 0.76 (95% CI 0.64-0.92), p = 0.004. In PanScan, the C allele was associated with an increased risk of pancreatic cancer. It is very difficult to understand the biological mechanism that could explain these associations, since the SNP is located in a gene desert. The nearest gene (located at a distance of about 500 kb) is myeloid ecotropic insertion site homeobox 2 (MEIS2), which is known to be expressed at high levels in the pancreas and in pancreatic cancer (data from the In Silico Transcriptomics database) [16]. This gene encodes a homeobox protein belonging to the TALE ('three amino acid loop extension') family of homeodomain-containing proteins. TALE homeobox proteins are highly conserved transcription regulators, and several members have been shown to be essential contributors to developmental programs. Recent studies have shown that MEIS proteins may also be involved in tumorigenesis, although the underlying mechanism is not yet clear [17,18]. We speculate that the SNP could be situated in a regulatory region of the MEIS2 gene that may modify its expression and in this way alter cancer risk and prognosis.
A similar mechanism seems to be in place at the locus on chromosome 8q24, where SNPs associated with risk of several cancer types are located in a gene desert. Recent data suggest that one of those SNPs can affect the binding of a transcription factor that regulates the WNT pathway and possibly the MYC oncogene, which maps about 1 Mb downstream [19].
In this report we have also found that the NR5A2 rs12029406_T allele is associated with shorter survival, although only in the German population. In a recent review, Li and Abruzzese [20] point out why this receptor may play a role in pancreatic cancer. The authors report that it has been speculated that NR5A2 contributes to diseases linked to pancreatic dysfunction, such as diabetes. For example, NR5A2 plays an important role in transcriptional activation of the gene encoding adiponectin [21], an adipocyte-secreted hormone that has been proposed to be a biological link between obesity and increased risk of pancreatic cancer [22]. It is interesting to note that an NR5A2 gene variant has been associated with excess BMI in a genome-wide association study [23]. SNPs in NR5A2, such as rs12029406, might modulate the receptor activity, which in turn could modify disease risk and survival.
Applying the correction for multiple testing, the only association below the study-wide threshold of 0.0051 was that between the C allele of rs8028529 and better survival in the combined analysis of the two populations. The results in the two populations are not identical; however, all the HRs go in the same direction in both populations, i.e. a protective effect of the variant allele, although they reach statistical significance only in the heterozygotes in the population from Heidelberg and only in the homozygotes in the population from Liverpool. Indeed, when the two populations are pooled, the effect of the allele remains similar in terms of HR, but the statistical significance increases, as expected when the number of subjects increases. This discrepancy is not straightforward to explain: residual confounding factors may mask the associations for the heterozygotes in the British cases and for the homozygotes in the German cases, or the difference in survival between the two populations may contribute to masking the true association. Finally, these associations may be due to chance.
These results have to be taken with caution, and further replications and functional studies are warranted. The fact that the polymorphism is located in a largely unexplored gene desert region makes it difficult to assess any immediate clinical impact of the finding reported in this study. These results, if confirmed, may prompt research on the underlying molecular mechanism, which could ultimately further our understanding of the disease. A good example is provided by the 8q24 hits, where, following an original epidemiological observation, many studies have contributed to uncovering the relationship between the SNPs of the region and the activation of the MYC gene [19]. From this point of view, our study may be considered a first preliminary step that could contribute to a better understanding of the disease and, in the long term, to the establishment of diagnostic and prognostic tools.
In conclusion, we present here the replication of six SNPs previously associated with pancreatic cancer risk and the first evidence of a possible involvement of rs8028529 in PDAC prognosis.