No prognostic value added by vitamin D pathway SNPs to current prognostic system for melanoma survival

The prognostic improvement attributable to genetic markers over the current prognostic system has not been well studied for melanoma. The goal of this study was to evaluate the prognostic value added by vitamin D pathway (VitD) SNPs to currently known clinical and demographic factors (CDF) such as age, sex, Breslow thickness, mitosis and ulceration. We utilized two large, independent, well-characterized melanoma studies, the Genes, Environment, and Melanoma (GEM) study and the MD Anderson study, and performed variable selection of VitD pathway SNPs and CDF using the Random Survival Forest (RSF) method in addition to Cox proportional hazards models. Harrell’s C-index was used to compare the predictive performance of the models. The population-based GEM study enrolled 3,578 incident cases of cutaneous melanoma (CM), and the hospital-based MD Anderson study consisted of 1,804 CM patients. In the GEM study, including both VitD SNPs and CDF yielded a C-index of 0.85, a slight but not significant improvement over CDF alone (C-index = 0.83). Similar results were observed in the independent MD Anderson study (C-index = 0.84 and 0.83, respectively). The Cox model identified no significant associations after adjusting for multiplicity. Our results do not support clinically significant prognostic improvements attributable to VitD pathway SNPs over the current prognostic system for melanoma survival.


Introduction
Cutaneous melanoma is a potentially fatal form of skin cancer. More than 10,000 individuals in the US are expected to die from this disease in 2016 [1]. The American Joint Committee on Cancer (AJCC) developed the I-IV staging system [2] for melanoma based on tumor characteristics, including Breslow thickness, ulceration and mitoses, which is the major prognostic

Materials and methods
The study design, sample and data collection methods for the GEM study were described elsewhere [8]. In brief, the GEM Study is an international population-based study of melanoma development and progression consisting of 3,578 incident cases of cutaneous melanoma, where controls (n = 2,372) were newly diagnosed with invasive single primary melanomas and cases (n = 1,206) were newly diagnosed with multiple primary melanomas. The sample collection, germline buccal DNA extraction, and genotyping pipeline, along with the standard quality control procedures for the 38 VDR SNPs genotyped in GEM using the Sequenom MassARRAY iPLEX platform, pyrosequencing and melting temperature assays, were previously described [3,9]. A subset of SNPs was genotyped using a custom Illumina GoldenGate assay, and standard quality control procedures were performed to ensure the quality of the Illumina SNP data. Specifically, we visually evaluated the genotype clustering images and excluded samples and SNPs with genotyping call rates of <90%. A total of 2,993 samples genotyped using Illumina passed quality control. Assays were considered optimal according to the degree of clustering of repeats, the absence of signal in controls, and reproducibility. Data on major clinical prognostic factors for melanoma and on demographic, sun exposure and histopathological variables were collected for all study participants and were described previously [10]. Seventy SNPs from genes in the VitD pathway, comprising 38 VDR SNPs and 32 additional SNPs genotyped on the Sequenom or Illumina platform, were included in the analyses of 3,566 white subjects. To illustrate the genes evaluated in this study along with their biological roles, we have included a vitamin D pathway diagram (S1 Fig) adapted from a previous report [11], which also incorporates the anti-cancer effects of vitamin D [12,13].
The list of SNPs along with their chromosome locations, minor/major alleles in the population, gene names, minor allele frequency (MAF) in the GEM study participants, and genotyping platform is shown in S1 Table. The SNPs were selected as functionally relevant SNPs or SNPs previously reported in other genetic association studies that included the vitamin D pathway genes [9,11]. The linkage disequilibrium pattern for the 38 VDR SNPs was described elsewhere [9]. The MD Anderson study is a hospital-based study of cutaneous melanoma, consisting of 1,804 melanoma patients presenting to clinics at MD Anderson [7]. A total of 1,788 of these samples were successfully genotyped using the Illumina Human Omni1-Quad_v1-0_B array and passed quality control. The sample collection, DNA extraction methods, genotyping platform, and standard quality control procedures were described previously [7]. Genome-wide genotype imputation was conducted using MACH [14] and the HapMap2 CEU population reference panel [7]. Among the 70 VitD pathway SNPs genotyped in GEM, 65 were either genotyped or imputed in the MD Anderson study and were evaluated as an independent validation. The genotype data from the MD Anderson study [7]  Utilizing the genotype and phenotype data from the GEM study and the MD Anderson study, we investigated the prognostic improvement of VitD pathway SNPs over the major known prognostic factors: age, sex, Breslow thickness, mitoses and ulceration. In the GEM study we also performed a secondary analysis that included additional prognostic factors (histology, site, sun exposure, and phenotypic index) to evaluate the change in results and model performance.
Summary statistics were used to describe the patient demographics and the characteristics of the common clinical and demographic factors in the two studies. Two-sided t-tests and Wilcoxon rank-sum tests were used to compare continuous variables. Chi-squared tests were performed to compare categorical variables. We applied the Random Survival Forest (RSF) method [15] as well as Cox proportional hazards models to data from the two melanoma studies. RSF is an ensemble tree method for the analysis of right-censored survival data [15]. Each tree is built using a recursive partitioning method that splits the feature space, spanned by all predictor variables, into groups of subjects with similar association patterns between the predictor variables and the survival outcome. Specifically, each tree is grown using a randomly drawn bootstrap sample of the data. At each node, a randomly selected subset of the variables is considered, and a survival criterion involving survival time and censoring status (the log-rank test) is used to split the node. Prediction is made by averaging over the ensemble of trees. Important variables were selected based on two measures of the predictiveness of variables in a tree: 1) variable importance (VIMP) [15], and 2) minimal depth of the maximal subtree [16]. VIMP measures how important a variable is by estimating the change in prediction error if that variable is eliminated from the analysis. Minimal depth assesses the predictiveness of a variable by how close to the root node the variable first splits a tree. A larger VIMP value or a smaller minimal depth corresponds to better predictiveness of a variable.
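The log-rank splitting criterion mentioned above can be illustrated with a short sketch. This is not the RSF implementation used in the study; the function name `log_rank_statistic`, its arguments, and the toy data are assumptions introduced here for illustration only. For a candidate split that sends each subject to a left or right daughter node, the sketch computes the standard two-group log-rank chi-squared statistic; a tree-growing procedure would favor the split with the largest value.

```python
def log_rank_statistic(times, events, in_left):
    """Two-group log-rank chi-squared statistic for one candidate split.

    times   : follow-up time for each subject
    events  : 1 (or True) if the event was observed, 0 if censored
    in_left : True if the candidate split sends the subject to the left node
    (All names are hypothetical and used only for this illustration.)
    """
    # Distinct times at which at least one event was observed
    event_times = sorted({t for t, e in zip(times, events) if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t
        at_risk = sum(1 for u in times if u >= t)
        at_risk_left = sum(1 for u, g in zip(times, in_left) if u >= t and g)
        # Events occurring exactly at time t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        deaths_left = sum(
            1 for u, e, g in zip(times, events, in_left) if u == t and e and g
        )

        frac_left = at_risk_left / at_risk
        observed_minus_expected += deaths_left - deaths * frac_left
        if at_risk > 1:
            # Hypergeometric variance of the number of events in the left node
            variance += (
                deaths * frac_left * (1 - frac_left)
                * (at_risk - deaths) / (at_risk - 1)
            )

    if variance == 0:
        return 0.0  # the split carries no information
    return observed_minus_expected ** 2 / variance


# Toy example: a split that separates early events from late events
example = log_rank_statistic(
    times=[1, 2, 3, 4],
    events=[1, 1, 1, 1],
    in_left=[True, True, False, False],
)
print(example)
```

In a full forest, this statistic would be evaluated for every candidate variable and cut point in the randomly selected subset at each node, on a bootstrap sample of the data.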
For the Cox proportional hazards regression models, Harrell's C-index, an extension of the area under the ROC curve (AUC) to censored survival data, was calculated to measure the concordance between the prognostic factors and the survival outcome, and thus to compare the predictive performance of the selected variables.
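As an illustration of what the C-index measures, the following minimal sketch computes Harrell's C-index directly from its definition. It is not the software used in the study; the function name `harrell_c_index` and the toy data are assumptions made for this example. A pair of subjects is comparable when the subject with the shorter follow-up time had an observed event, and the pair is concordant when that subject also has the higher predicted risk (for example, a larger Cox linear predictor).

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       : follow-up time for each subject
    events      : 1 (or True) if the event was observed, 0 if censored
    risk_scores : predicted risk; higher means shorter expected survival
    (All names are hypothetical and used only for this illustration.)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Comparable pair: subject i had the event before subject j's time
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; the C-index is undefined")
    return concordant / comparable


# Toy example: predicted risk perfectly ordered against survival time
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.5, 0.1]
print(harrell_c_index(times, events, risk))
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which is the scale on which the reported C-indices of 0.83 to 0.85 should be read.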
To further evaluate the prognostic effects of VitD SNPs on melanoma survival, we examined the relationship between VitD pathway SNPs and Breslow thickness as a marker of melanoma prognosis. Linear regression models were fitted to assess the association between SNPs and log-transformed Breslow thickness. We also estimated the average causal mediation effect using a quasi-Bayesian Monte Carlo causal mediation analysis method [17,18] to investigate the impact of each SNP on melanoma prognosis mediated by Breslow thickness. Multiple comparisons were adjusted for using the Benjamini and Hochberg (BH) false discovery rate procedure [19].
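The BH adjustment can be sketched as follows. This is a minimal illustration of the standard step-up procedure, not the code used in the study; the function name `benjamini_hochberg` and the example p-values are assumptions and are not taken from the study data. Each p-value is multiplied by the number of tests divided by its rank, and monotonicity is enforced from the largest p-value downward.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order.

    p_values : list of raw p-values, one per test (for example, one per SNP)
    (The name and interface are hypothetical, for illustration only.)
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p-values only
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

A test is declared significant at an FDR of 0.05 when its adjusted p-value is at most 0.05, which shows how p-values that are nominally below 0.05 can fail to pass after correction when many SNPs are tested.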

Results
The patient demographics and characteristics are displayed in Table 1. Compared to the subjects in GEM, patients in the MD Anderson study were younger, had thicker tumors, more ulceration, more mitoses, and more melanoma deaths. The gender distributions were similar in the two studies. The differences between the two studies suggested more aggressive tumors in the hospital-based MD Anderson study compared to the population-based GEM study.
The variables with positive VIMP values selected by the RSF method are described in Table 2, in which the major prognostic variables (age, sex, Breslow thickness, mitosis, and ulceration) are displayed in italic font and the "rs" (Reference SNP) numbers denote VitD pathway SNPs. In both the GEM study and the MD Anderson study, the existing clinical and demographic prognostic factors had larger VIMP values than the SNPs, indicating higher predictability for melanoma survival. Table 3 gives the results of the secondary analyses that included additional prognostic variables (histology, site, sun exposure, and phenotypic index) in the RSF model in the GEM study. We again observed higher prognostic effects of the clinical and demographic factors than of the SNPs on melanoma-specific survival.
In the GEM study, including both the VitD SNPs and the major known prognostic factors selected by the RSF method in a Cox regression model yielded a C-index (a concordance measure) of 0.85, a slight but not significant improvement over using the known prognostic factors (age, sex, Breslow thickness, mitoses and ulceration) alone (C-index = 0.83). Similar results were observed in the MD Anderson study; the C-index was 0.85 for combined SNPs and clinical factors, and 0.84 for clinical factors (age, sex, Breslow thickness, mitoses and ulceration) alone. When additional prognostic factors (i.e. histology, site, sun exposure, and phenotypic index) were included in the analysis of the GEM study, we did not observe significant prognostic improvements from incorporating the VitD SNPs (C-index = 0.84) over using clinical and demographic factors (age, Breslow thickness, mitoses, ulceration, histology, site, sunburn, status, phenotypic index, freckle, education) alone (C-index = 0.83). RSF analyses using minimal depth as the measure of predictiveness yielded similar results (data not shown).
Using the Cox proportional hazards model, we identified nine SNPs nominally significantly associated with melanoma survival in the GEM study (P<0.05), and nine such SNPs in the MD Anderson study (Table 4). Among them, the commonly studied rs1544410 (BsmI) and rs731236 (TaqI) polymorphisms were significant in both studies, suggesting a potential biological role in melanoma survival. However, after correcting for multiple testing, none of the SNPs reached the FDR cutoff of 0.05. Similar results were observed in the analyses of Breslow thickness as a marker for melanoma prognosis in both the GEM and MD Anderson studies. No

Discussion
Utilizing the population-based GEM study, we did not observe an enhanced prognostic classification for melanoma by incorporating VitD pathway SNPs into the known major prognostic measures (i.e. age, sex, Breslow thickness, mitoses and ulceration). Using the MD Anderson study as an independent validation, we observed similar results: Breslow thickness, ulceration, mitoses and age were consistently selected as the top prognostic variables. When additional prognostic factors (i.e. histology, site, sun exposure, and phenotypic index) were included in the analysis of the GEM study, the tumor factors (histology and site) were again selected as variables with higher prognostic effects compared to VitD pathway SNPs. The random survival forest (RSF) approach and the Cox proportional hazards model, applied to two independent melanoma studies, yielded similar findings. While both RSF and Cox models aim to identify variables that best predict the survival outcome, the mechanisms implemented in the two algorithms are different. The Cox regression model is a widely used method for investigating the relationship between covariates of interest and survival outcome; it estimates the log-linear relationship between the covariates and the underlying hazard function and provides clinically interpretable results. The RSF method is a nonparametric machine learning method that assesses the prediction accuracy of ensemble trees and introduces two sources of randomness: a randomly drawn bootstrap sample is used to grow each tree, and at each split a randomly selected subset of variables is considered as candidates [15,20]. The RSF method was demonstrated to outperform the Cox model when non-linear relationships may exist [21]. Because of these different mechanisms, the SNPs with the highest VIMP values were not necessarily those identified in the Cox regression model. However, the overall conclusion that VitD pathway SNPs do not provide significant prognostic improvement for melanoma over the current prognostic system is consistent in both analyses.
Although the two independent studies have different designs, population-based vs. hospital-based, and thus have distinct patient characteristics, we consistently observed higher prognostic effects for the clinical factors known to be associated with melanoma outcome than for the SNPs in predicting melanoma-specific survival. This lack of predictability suggests limited potential for adding VitD pathway SNPs to the prognostic system for melanoma. Studies investigating the improvement gained by including genetic factors in predicting melanoma risk have reported effects of varying magnitude. It was reported that the improvement in melanoma risk prediction from adding MC1R to age, sex, and cutaneous melanin phenotypes is modest and too small to be valuable in a clinical setting [22]. Other studies have reported statistically significant but small [23] to modest [24] improvement in the prediction of melanoma risk by adding MC1R genotype to traditional demographic and pigmentation characteristics.
It is important to note from our power analysis that the GEM study has a sufficiently large sample size to detect an increase in AUC. In the GEM study sample, we have 92% power to detect an increase of 0.04 between a diagnostic test with an AUC of 0.80 and another diagnostic test with an AUC of 0.84 using a one-sided z-test at a significance level of 0.05. The modest improvement in prognostic effects from including the VitD pathway SNPs is therefore unlikely to be due to insufficient sample size.
Large-scale genotyping technologies and genetic epidemiology studies of melanoma hold promise for unraveling patients' genetic makeup and developing genetic prognostic markers for melanoma progression. VitD pathway SNPs were reported in previous publications to be significantly associated with melanoma-specific survival [3,4], suggesting a key role for the pathway in understanding the biological mechanisms of melanoma progression. To date there is limited evidence for SNPs that significantly predict melanoma survival, and the reported effect sizes of SNP predictability in survival are typically small [25,26]. The findings from this study are similar. We did not find an improvement in melanoma prognosis beyond that attributed to known prognostic variables by including VitD pathway SNPs. The genetic variations contributing to melanoma survival and progression are likely to be multi-dimensional, may involve complicated biological pathway functionality and gene-environment interactions in addition to single SNPs, and warrant further investigation.
Alongside the strengths of this study, we note some limitations. The major limitation is that, instead of a genome-wide association study investigating all human genes, we conducted a confirmatory study focused on a subset of SNPs in the vitamin D pathway that were previously reported to be associated with melanoma or that are biologically functionally relevant. Second, beyond the classical vitamin D pathway, we have not investigated the recently reported novel alternative pathways of vitamin D activation, which may have complicated the findings. In contrast to classical vitamin D activation, novel pathways of vitamin D3 and 7-dehydrocholesterol metabolism initiated by CYP11A1 were recently reported [27] and were demonstrated to have a significant physiological role [28]. Novel vitamin D3 hydroxyderivatives resulting from CYP11A1 action were detectable in human epidermis and serum and in pig adrenal glands [29]. It was demonstrated that the endogenously produced novel D3 hydroxyderivatives can act as biased agonists of VDR and inverse agonists of the retinoic acid-related orphan receptors (RORα and RORγ) [30,31]. It was also shown that RORα and RORγ are expressed in normal and pathological human skin [30], and that decreased expression of RORα and RORγ is associated with the development and progression of melanoma [32]. Third, we had limited ability to explore some factors previously demonstrated to contribute to melanoma prognosis. Reduced expression of the vitamin D receptor (VDR) was reported to be related to shorter overall survival [33]. Low expression of the vitamin D activating enzyme 1α-hydroxylase (CYP27B1) was found to be associated with shorter overall survival and disease-free survival in melanomas [34]. Fourth, we acknowledge that we have not collected data on patients' actual vitamin D status, such as blood levels of 25-hydroxyvitamin D and 1,25-dihydroxyvitamin D. It would be important to evaluate the role of actual vitamin D status in melanoma prognosis, which potentially confounds the SNP effects; doing so may provide more information and shed light on clinical decision making, such as vitamin D supplementation for melanoma patients based on their genetic background.
Larger genome-wide or sequencing studies are suggested to explore melanoma prognostic effects more comprehensively by including SNPs from alternative pathways of vitamin D activation as well as from other biological pathways. Evaluating the association between actual vitamin D status and melanoma survival that is independent of SNPs would also contribute to our understanding of melanoma prognosis. Finally, in terms of the biology, we may find more meaningful results when accounting for sun exposure, and therefore our future work will include investigating the prognostic effects of gene-environment interactions. We will also explore combining the prognostic effects of SNPs into those of biologically meaningful genes or pathways, and evaluate the improvement in melanoma prognosis gained by adding genes or pathways.

Supporting information
S1 Fig. A diagram of genes involved in the Vitamin D pathway adapted from a previous report [11]. (PNG)
S1 Table. The list of SNPs along with their chromosome locations, minor/major alleles in the population, gene names, minor allele frequency (MAF) in the GEM study participants, and genotyping platform. (DOCX)