Replication and Predictive Value of SNPs Associated with Melanoma and Pigmentation Traits in a Southern European Case-Control Study

Background Genetic association studies have revealed numerous polymorphisms conferring susceptibility to melanoma. We aimed to replicate previously discovered melanoma-associated single-nucleotide polymorphisms (SNPs) in a Greek case-control population, and examine their predictive value. Methods Based on a field synopsis of genetic variants of melanoma (MelGene), we genotyped 284 patients and 284 controls at 34 melanoma-associated SNPs, of which 19 derived from GWAS. We tested each of the 33 SNPs passing quality control for association with melanoma both with and without accounting for the presence of well-established phenotypic risk factors. We compared the risk allele frequencies between the Greek population and the HapMap CEU sample. Finally, we evaluated the predictive ability of the replicated SNPs. Results Risk allele frequencies were significantly lower compared to the HapMap CEU for eight SNPs (rs16891982 – SLC45A2, rs12203592 – IRF4, rs258322 – CDK10, rs1805007 – MC1R, rs1805008 – MC1R, rs910873 – PIGU, rs17305573 – PIGU, and rs1885120 – MYH7B) and higher for one SNP (rs6001027 – PLA2G6), indicating a different profile of genetic susceptibility in the studied population. Previously identified effect estimates modestly correlated with those found in our population (r = 0.72, P<0.0001). The strongest associations were observed for rs401681-T in CLPTM1L (odds ratio [OR] 1.60, 95% CI 1.22–2.10; P = 0.001), rs16891982-C in SLC45A2 (OR 0.51, 95% CI 0.34–0.76; P = 0.001), and rs1805007-T in MC1R (OR 4.38, 95% CI 2.03–9.43; P = 2×10−5). Nominally statistically significant associations were also seen for another 5 variants (rs258322-T in CDK10, rs1805005-T in MC1R, rs1885120-C in MYH7B, rs2218220-T in MTAP and rs4911442-G in the ASIP region). The addition of all SNPs with nominal significance to a clinical non-genetic model did not substantially improve melanoma risk prediction (AUC for clinical model 83.3% versus 83.9%, p = 0.66).
Conclusion Overall, our study has validated genetic variants that are likely to contribute to melanoma susceptibility in the Greek population.


Introduction
A plethora of studies have shown that ultra-violet (UV) light exposure and certain phenotypic traits, i.e. red or blonde hair, light-colored eyes, fair skin complexion, and prominent mole pattern, are major risk factors for the development of cutaneous melanoma (CM) [1][2][3][4][5][6]. A strong genetic background has been supported by twin studies showing a 55% contribution of genetic effects to the variation in melanoma liability [7].
High-penetrance germline mutations in the CDKN2A and CDK4 genes are rare (0.2-1.2%) in sporadic CM, but they are encountered in approximately 5% of families with only two members with CM, and in 30-40% of families with 3 or more affected members [8][9][10]. The advent of high-throughput genotyping technologies and their utilization in population-based studies has led to the discovery of a considerable number of rare and common genetic variants at different genetic loci associated with melanoma. The most prevalent low-penetrance locus is the melanocortin 1 receptor gene (MC1R), whose variants have been associated both with melanoma [11][12][13][14][15][16][17][18] and with related traits [11][19][20][21]. Apart from MC1R, a significant number of low-penetrance genes involved in various cellular pathways, such as pigmentation, cell cycle control, DNA repair, oxidative stress, apoptosis, senescence and melanocyte differentiation and migration, have been implicated in melanoma susceptibility [22]. A detailed synopsis and meta-analysis of reported melanoma-associated variants is available in MelGene, an on-line database (http://www.melgene.org) [23]. In addition to common variants, a rare germline variant in MITF (rs149617956, E318K) that alters MITF transcriptional activity was recently found to be associated with melanoma and renal cell cancer [24][25].
Most genetic association studies on CM have been performed in populations with fair skin and, hence, the effect of melanoma-associated variants in relatively darker skin populations residing in areas of higher ambient UV exposure is less well known. Being a southern European country, Greece is characterized by a high degree of sun exposure year-round, a population of relatively darker skin complexion compared to northern European countries, and the lowest incidence of melanoma (4-5 per 100,000 person-years) among European countries [26][27][28]. Mutational analyses performed by our group in Greek patients with sporadic and genetically enriched melanoma found a higher prevalence of CDKN2A/CDK4 mutations than previously reported, suggesting a more prominent role of genetic susceptibility to melanoma in regions with a relatively low incidence of melanoma [28][29]. In the present study, we sought to replicate the most prominent results of MelGene and other findings from genome-wide association studies (GWAS) in a Greek case-control study. Our research replicates a number of variants that are associated with melanoma risk in the Greek population and points to the relevant pathogenetic pathways that are involved; it also highlights differences in risk allele frequencies between the Greek population and the HapMap European sample, mainly concerning pigmentation-related risk loci. Finally, it provides insights into the predictive value of identified genetic risk factors compared to well-established clinical ones.

Study Population
The study population consisted of Greek melanoma cases and control subjects, above 18 years of age. The case sample consisted of patients diagnosed with non-familial, histologically confirmed invasive melanoma at A. Sygros Hospital, a large referral center of melanoma in Athens, and participating melanoma centers, from 2003 to 2009. The control sample included blood donors from a blood donation center and individuals with minor skin diseases attending A. Sygros Hospital. Controls were matched 1:1 on age (±2 years) and gender to the cases. Individuals with a history of melanoma, other types of skin cancer, or any non-dermatological malignancy were excluded from the control arm of the study.
Each subject was interviewed and examined by a dermatologist or trained physician and information was retrieved on demographic variables (age, sex), pigmentation traits (eye, hair, and skin color), phototype, and sun exposure variables (sunburns, tanning). The Declaration of Helsinki protocols were followed and the Scientific and Ethics Committee of Andreas Sygros Hospital has reviewed and approved the research protocol; all participating individuals gave written informed consent.

SNP selection and Genotyping
All variants included in this study were selected from the last update of the MelGene field synopsis (October 2011), a large online database that was created with the purpose of comprehensively collecting and meta-analyzing all published genetic associations of melanoma (http://www.melgene.org) [23]. More specifically, the 34 selected variants from MelGene were divided into two groups: 1) all variants associated with melanoma at a level of p<0.05 following meta-analysis of relevant data from at least 3 independent case-control datasets (28 variants) and 2) additional biologically plausible variants representing potential causal pathways and selected from GWAS (3 variants) and candidate gene studies (3 variants) with genome-wide (p<10−7) or nominally significant (p<0.05) associations. These variants were also included in MelGene but not necessarily meta-analyzed due to an insufficient number of available datasets. In all, of the 34 variants, 19 had reached genome-wide significance in a previous GWAS or in MelGene, and the other 15 had not.

DNA isolation, Genotyping and Quality control
Genomic DNA was isolated from peripheral blood using the QIAamp DNA blood mini kit (Qiagen). DNA concentration was quantified in samples prior to genotyping by using Quant-iT dsDNA HS Assay kit (Invitrogen). The concentration of the DNA was adjusted to 5 ng/ml.
A total of 50 ng from each DNA sample was used to genotype the selected 34 SNPs using the Sequenom iPLEX assay (Sequenom, Hamburg, Germany). Allele detection in this assay was performed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry [30].
For quality control, we retained SNPs with a genotype call rate of 95% or higher that showed no deviation from Hardy-Weinberg equilibrium (HWE) in the controls using a chi-squared test (P>0.05).
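The two quality-control filters described above can be illustrated with a minimal Python sketch. It assumes genotype counts are already tabulated per SNP; the function names (hwe_chi2, passes_qc) are hypothetical and are not part of the study's actual pipeline, which is not described in code. The HWE test shown is the standard one-degree-of-freedom chi-squared goodness-of-fit test.

```python
import math

def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Takes the observed genotype counts (AA, AB, BB) and returns
    (chi-squared statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1 - p                            # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic SNPs)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-squared variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def passes_qc(n_called, n_total, control_counts, min_call_rate=0.95, alpha=0.05):
    """Apply both filters: call rate >= 95% and HWE P > 0.05 in controls."""
    call_rate = n_called / n_total
    _, p_hwe = hwe_chi2(*control_counts)
    return call_rate >= min_call_rate and p_hwe > alpha
```

For example, passes_qc(280, 284, (25, 50, 25)) evaluates a SNP called in 280 of 284 samples whose control genotypes are in exact Hardy-Weinberg proportions.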

Statistical Analysis
We examined the association of each SNP with CM by performing conditional logistic regression analyses assuming a multiplicative model of inheritance considering the minor allele as the reference allele. To control for the effect of the other covariates/risk factors on CM in the Greek population, each SNP was subsequently incorporated into multivariable logistic regression models using a stepwise variable selection approach. The covariates considered were eye color (light: blue, green/gray and light brown or dark: dark brown and black), hair color (light: blond/red and light brown or dark: dark brown and black), skin color (light: fair/pale and light brown or dark: dark brown), phototype (type I, II, III or IV, according to the Fitzpatrick scale), tanning ability (burn, minimal tan, burn then tan or deep tan), and sunburn (presence or absence). We estimated odds ratios and 95% confidence intervals (95% CI) for all models. Missing values for any of the non-genetic risk factors were imputed using multiple imputation methods. Variables where all the required information was available were used for the construction of the models for the estimation of the imputed missing values.
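As a rough illustration of a per-allele effect estimate, the sketch below computes an odds ratio and a Woolf (log-based) 95% confidence interval from a 2×2 table of allele counts. This is a simplification and not the conditional logistic regression used in the study: it ignores the 1:1 matching and any covariates. The function name and the counts in the usage note are hypothetical.

```python
import math

def allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Unmatched allelic odds ratio with a Woolf 95% confidence interval.

    Arguments are allele counts (not genotype counts): copies of the tested
    allele and of the other allele in cases and in controls.
    All four counts must be positive.
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR)
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

Calling allelic_odds_ratio(200, 100, 100, 200) on these made-up counts returns an odds ratio of 4 with an interval that excludes 1. A matched analysis would be expected to give somewhat different estimates.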
Additionally, we estimated the correlation of risk allele frequencies between the HapMap CEU sample and the Greek population across all the evaluated SNPs. Moreover, we estimated the correlation of the effect sizes found in the Greek population with those found previously in the original publications or MelGene, depending on the source of SNP selection. We examined whether the effect estimates were in the same or in opposite directions. For the sample size of our study, we estimated the power to detect each of the previously described effects at the α = 0.05 level, given the observed minor allele frequency in the Greek studied population and assuming a multiplicative (per-allele) genetic model. We used the QUANTO software (http://hydra.usc.edu/gxe, accessed 30 September 2012). The sum of the power estimates corresponds to the number of variants that would be expected to replicate. Subsequently, we calculated the binomial test for the expected vs. the replicated variants across all evaluated SNPs.
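The expected-replication reasoning can be sketched as follows. The expected count is the sum of per-SNP power values, as stated above. The text does not specify how the binomial test was parameterized, so using the mean power as a common success probability is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def expected_replications(powers):
    """Expected number of replicated SNPs: the sum of per-SNP power values."""
    return sum(powers)

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

def replication_excess_p(powers, observed):
    """One-sided probability of at least `observed` replications.

    Assumption: every SNP is treated as having the same success
    probability, equal to the mean power across SNPs.
    """
    n = len(powers)
    mean_power = sum(powers) / n
    return binomial_tail(observed, n, mean_power)
```

With per-SNP power values from a tool such as QUANTO, expected_replications gives the count to compare against the number of nominally significant SNPs actually observed.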
Finally, we created receiver operating characteristic (ROC) curves to assess the predictive ability of the CM-associated SNPs. We considered 3 models including, respectively, the phenotypic traits alone (model 1); the phenotypic traits along with the SNPs that remain statistically significant after Bonferroni correction (model 2); and the phenotypic traits along with all nominally statistically significant SNPs (model 3). In order to assess the validity of our models, we used k-fold cross-validation with k = 2 splits and 1,000 replications.
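A minimal sketch of the two ingredients of this evaluation is given below: the AUC computed as the probability that a randomly chosen case has a higher predicted risk than a randomly chosen control, and a generator of repeated random 2-fold splits. Model fitting is omitted, and both function names are hypothetical; the study itself used STATA.

```python
import random

def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def two_fold_splits(n, replications, seed=0):
    """Yield `replications` random splits of indices 0..n-1 into two halves."""
    rng = random.Random(seed)
    indices = list(range(n))
    for _ in range(replications):
        rng.shuffle(indices)
        half = n // 2
        # Slices are copies, so later shuffles do not alter earlier splits
        yield indices[:half], indices[half:]
```

In each replication a model would be fitted on one half and its AUC evaluated on the other, then the roles swapped; averaging over 1,000 replications gives a cross-validated AUC for each of the three models.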
All statistical analyses were performed in STATA version 11.2 (College Station, TX, USA). All P-values are two-tailed.

Results
Our sample included 284 patients with CM matched on age and sex to 284 controls; of those, 270 (48%) were men. Median age was 44 years (range 18-85) for patients and 42 years (range 18-81) for controls. Demographics and phenotypic traits are shown in Table S1. Missing values in phenotypic characteristics were due to the fact that blood samples and questionnaires in one participating center were collected in the early phase of this study, and the corresponding individuals could not be found in order to retrieve these data. A total of 34 variants were selected for genotype analysis (Table 1). All of them were successfully genotyped with call rates of 95% or above. Deviation from HWE in the control population was noticed for one single-nucleotide polymorphism (SNP) (rs4636294), which was subsequently excluded from further statistical analyses.
From the selected variants, four SNPs are found in the 3′-UTRs and one in the 5′-UTR of the respective gene loci; 13 are located in introns; and 10 are within exons. The remaining 6 variants are found in intergenic positions. We found evidence for strong pairwise LD (r²>0.85) between rs2218220 and rs4636294 (r² = 0.95), the latter of which deviated from HWE; rs10757257 and rs1335510 (r² = 0.96); rs1393350 and rs1126809 (r² = 0.94); and rs1885120, rs910873 and rs17305573 (r² = 0.90). For the remaining SNPs, moderate LD was observed (r²<0.60). Table 1 shows the 33 analyzed SNPs, their effect sizes, minor and major alleles and the corresponding frequencies in the Greek population. All alleles identified as minor in the Greek population were also minor alleles in the CEU HapMap sample with one exception (rs6001027, whose minor allele was T in the Greek population but C in HapMap CEU). Figure 1 shows the correlation between the ORs identified for the 33 eligible SNPs in the Greek population and in the original source from which these were selected. We noticed an overall moderately high correlation of the respective effect estimates (r = 0.72, P<0.0001). No differences in ORs between the Greek population and the original source were beyond chance (i.e. the 95% CIs of the two populations overlapped for each SNP). Overall, no nominally significant difference in ORs was noticed across all SNPs in the two populations (P = 0.411 for Mann-Whitney U).
Reference source = MelGene: nominal association with melanoma after meta-analysis of data for this variant derived from at least 3 datasets (MelGene is an online database of all reported genetic associations of melanoma which includes a systematic meta-analysis of melanoma-associated variants from published datasets and grading of these associations for strength of epidemiological evidence) [23], or data derived from the original study (variants not meta-analyzed in MelGene). 2 Showed deviation from HWE, and was therefore not included in the analyses: N/A for MAF & OR in the Greek sample. 3 All individuals were homozygous for the major allele. Abbreviations: MAF, minor allele frequency; CI, confidence interval; OR, odds ratio.
When limited to SNPs that had previously reached genome-wide significance in either MelGene or a previous GWAS, the correlation of effect sizes was r = 0.83 (P<0.0001) and the correlation of risk allele frequencies was r = 0.98 (P<0.0001).

Association of variants with CM risk
Conversely, for the 14 SNPs that had not previously reached genome-wide significance, the respective correlation coefficients were r = 0.24 (P = 0.43) and r = 0.72 (P = 0.003).
MelGene status = Data from MelGene, an online database of reported genetic associations of melanoma including a systematic meta-analysis of melanoma-associated variants from published datasets and grading of these associations for strength of epidemiological evidence [23]. OR (95% CI) and p value correspond to nominal association with melanoma after meta-analysis of data for each variant.
Five SNPs were significantly associated with CM in the multivariable analyses after controlling also for hair color, skin color, eye color, phototype, sunburn and tanning (Table 2).

Power Considerations
The power of our study to detect ORs similar to those previously found, given the allele frequencies observed in the Greek population, ranges from 5.2% for rs12203592 to 100% for rs16891982 at α = 0.05. By summing the power estimates for all SNPs to detect the respective ORs seen previously, we estimated that if ORs were identical in the Greek population, our study would be expected to have found 8 nominally statistically significant associations among the 33 tested. Among the 18 variants that had been previously identified with genome-wide significance and did not show deviation from HWE, our study would be expected to have found 6 nominally statistically significant associations and 7 were indeed nominally significant.

Comparison of risk allele frequencies between Greek sample and HapMap CEU
For 20 SNPs, the respective minor alleles were the risk alleles for melanoma. Table 3 shows risk alleles in the Greek sample and their frequency in both the Greek sample and HapMap CEU. The risk alleles in the Greek population had a median frequency of 20% (IQR, 4-60%), while their median frequency in HapMap CEU was 32% (IQR, 12-62%) (P = 0.243 for Mann-Whitney U). The correlation between the two populations was very high (r = 0.95, P<0.0001) (Fig. 2).
The risk allele frequencies of nine SNPs (rs6001027-C, rs16891982-G, rs12203592-T, rs258322-T, rs1805007-T, rs1805008-T, rs910873-A, rs17305573-C, and rs1885120-C) were different beyond chance between the Greek sample and HapMap CEU (i.e. the 95% CIs of risk allele frequencies in the Greek population and the HapMap sample did not overlap). All these variants (except for rs6001027, a nevi-related SNP in PLA2G6) had significantly lower frequencies of risk alleles in the Greek population compared to HapMap CEU, while six of those are variants of genes with a well-established role in the genetic control of pigmentation (rs16891982 in SLC45A2, rs12203592 in IRF4, rs258322 in CDK10, rs1885120 in MYH7B, rs1805007 and rs1805008 both in MC1R).
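The non-overlap criterion for allele frequencies can be sketched as below. The text does not state how the confidence intervals were computed, so the Wald (normal approximation) interval is an assumption, and the function names and the numbers in the usage note are hypothetical.

```python
import math

def wald_ci(freq, n_alleles, z=1.96):
    """Wald 95% CI for an allele frequency estimated from n_alleles chromosomes."""
    se = math.sqrt(freq * (1 - freq) / n_alleles)
    return max(0.0, freq - z * se), min(1.0, freq + z * se)

def intervals_overlap(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]
```

For instance, intervals_overlap(wald_ci(0.05, 1000), wald_ci(0.30, 200)) is False, so two frequencies of that kind would be flagged as different beyond chance. Note that non-overlap of two 95% intervals is a conservative criterion compared with a direct test of the difference.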
Predictive value of predisposing SNPs in melanoma-associated risk factor models
Figure 3 shows the areas under the curve (AUC) for 3 models considering different levels of genetic information. Compared to the phenotypic traits alone, models including the CM-associated SNPs only slightly improved the AUC. The AUC for the model that included only the nominally significant phenotypic traits (i.e. eye color, skin color, sunburn, phototype and tanning) (model 1) was 83.3%, whereas for the model that included these traits along with the 3 SNPs that remained significant after Bonferroni correction in the univariable analysis (model 2) was 83.7%, and

Discussion
We have replicated SNP-melanoma associations, with MAFs ranging from 2% to 41%. Eight associations were nominally statistically significant in the Greek population, the majority of which (87%) had previously reached genome-wide significance. The replication of variants deriving from GWAS-discovered loci in our cohort, such as 20q11.2 (ASIP region), 9p21 (MTAP region), 16q24 (MC1R region) and 5p13 (CLPTM1L region), underscores the important contribution of the agnostic approach of GWAS in revealing genuine associations of genetic factors in complex diseases. For 8 SNPs the risk alleles had significantly lower frequencies in the Greek population compared to the HapMap CEU sample, while for 1 SNP the risk allele frequency was higher in the Greek population than in HapMap CEU. The genetic models containing the SNPs that confer risk for melanoma improved the AUC compared with the model including only the phenotypic risk factors, but the improvement was of small magnitude.
The aim of our study was to validate a selected panel of SNPs in a case-control cohort of Greek descent, given our recent findings of a higher than expected genetic contribution of CDKN2A/CDK4 mutations in a sizable cohort of sporadic and familial cases of our population [28]. Recent GWAS employing higher density SNP tagging in large patient datasets have revealed a number of variants in genes involved in cell cycle regulation, telomere maintenance and DNA damage response, such as MITF, ATM, PARP-1, TERT, CASP8, CCND1, as well as polymorphisms in MX2, SETDB1 and the ARNT/LASS2/ANXA9 region [31][32][33][34]. Although this study was based on earlier GWAS findings and certain candidate gene studies, our findings underscore the role of genes controlling pigmentary traits and DNA damage response in melanoma susceptibility in our population. This may reflect the importance of these pathways in melanoma development in a darker-skin population residing in an area of high year-round UV influx. Most of the SNPs with significantly lower risk allele frequencies compared to HapMap CEU are found in loci implicated in pigmentation (SLC45A2, IRF4, CDK10, MYH7B, MC1R), and all but 2 of these were replicated in the Greek population according to univariable analysis (rs16891982, rs258322, rs1805007, and rs1885120). These findings imply that there might be some differences in the genetic background underlying the phenotypical differences between the Greek and other European populations, and could partially explain the lower melanoma incidence in a population of darker skin complexion residing in a country with intense year-round UV exposure.
Table 3. List of genotyped SNPs, risk alleles in the Greek sample, and risk allele frequency in the Greek sample and HapMap CEU.
In addition, our results may underscore the role of natural selection, which tends to reduce the prevalence of predisposing alleles in a population with high sun exposure and increase the frequency of protective alleles that also act through the protective pathways of pigmentation. However, Greeks harboring certain pigmentation-related risk alleles are at risk of developing melanoma.
In the case of melanocytic nevi, the comparison of allele frequencies between nevi-related variants in our cohort and the HapMap was less conclusive, with one variant (rs6001027) showing a higher allele frequency in our population. Only one (rs2218220 in the MTAP region, chrom. 9p21) of the previously nevi-associated SNPs was found to be positively associated with melanoma in our analysis. Given that nevi have been shown to be a strong risk factor for melanoma in the Greek population [35], it is likely that our study was not sufficiently powered to detect the smaller effect sizes conferred by these variants. In addition, other nevus-associated variants, as yet undiscovered, may play a role in melanoma risk.
Among the three top variants of our analysis, the most prominent locus was located within the cleft lip and palate transmembrane 1-like (CLPTM1L) gene and the telomerase reverse transcriptase (TERT) gene. The major C allele of rs401681 has been repeatedly reported to confer risk for BCC and protection against melanoma [36][37][38][39], and was recently replicated in a GWAS of 2,981 melanoma patients and 1,982 controls [31]. In addition, a meta-analysis including data from an Australian case-control study showed that TERT-CLPTM1L variants do influence melanoma risk, albeit with a relatively small effect size [32]. The ''red hair'' variant rs1805007 of the MC1R gene has been consistently linked to melanoma risk in relevant studies. In meta-analyses, rs1805007 showed the highest attributable risk for melanoma among MC1R variants [13,40], with effect estimates similar to those found in this study and a previous Greek case-control study [16]. The SLC45A2 variant rs16891982 influences skin pigmentation and exhibits substantially different frequencies among populations, and has thus been designated an ancestry informative marker. The ancestral Leu allele (rs16891982-C) has been associated with dark skin, eye, and hair color in whites [41], while exhibiting a protective effect against melanoma [39][42][43][44].
The variants selected for this study were based on the results of a large field synopsis and on-line database that scrutinized all published data on the genetic association of melanoma and subjected them to systematic meta-analyses. All but one (rs1805005) of the nominally significant associations in our selected set of SNPs came from a subgroup of variants which had p values of 10−7 and are likely to represent genuine associations [45]. We were also able to assess the predictive value of genetic factors in models incorporating various phenotypical and genetic risk factors. In the examined models, the AUC did not substantially improve with the addition of genetic variants, compared with the model that involved only the clinical risk factors. Although these genetic models do not seem to contribute substantially to melanoma risk prediction, they are nevertheless suggestive of the contribution of low-penetrance gene variants to melanoma risk. Failure of models relying on common gene variants to substantially improve the predictive discrimination of traditional risk factors is a common problem encountered in complex diseases. Much larger effect sizes and a very large number of genetic variants are needed to perceptibly improve the predictive value of genetic models [46]. Moreover, our findings show that statistical significance of a risk model does not guarantee clinical utility, highlighting the distinction between the statistical and clinical perspectives of genetic risk models [47].
The current study has some limitations. First, the sample size is modest, probably resulting in limited power to detect small or even moderate effects for additional SNPs. Second, no data were recorded on the number of nevi, a well-known risk factor for melanoma. Nevertheless, only one (rs2218220 in MTAP) of the eight SNPs associated with melanoma has been reported to be also associated with nevus count [48]. It is possible that rs2218220 would lose its significance as a melanoma-associated variant if the number of nevi were included in the multivariate analyses. Third, failure to replicate candidate loci in pigmentation-associated genes other than MC1R, SLC45A2, CDK10, MYH7B and the ASIP region could derive from a lack of sufficient statistical power. Fourth, we selected our SNPs from the last update (October 2011) of the MelGene database. However, new SNPs might have been discovered in new GWAS in the interval between database updates and are therefore unlikely to have been included in the accumulated evidence reported in MelGene. The impact of this limitation may be small, since some of the newest GWAS that are not included in this paper, e.g., Barrett et al 2011 [31], provide estimates for established genetic risk factors on expanded datasets of previous GWAS, e.g., Bishop et al 2009 [49], the results of which are included in this study.
In conclusion, our research validated a number of variants that contribute to melanoma susceptibility in the Greek population. The assessment of genetic input in a population with one of the lowest incidences of the disease could highlight the variation of genetic risk factors that are in play in environmental and population settings different from those used in the majority of previous studies. Further validation of newly described variants and a better understanding of gene-environment interactions may provide valuable insight into the variation of melanoma risk among white populations of different ancestry.

Supporting Information
Table S1 Demographic characteristics and pigmentary phenotype of melanoma cases and control subjects.