Comprehensive Evaluation of One-Carbon Metabolism Pathway Gene Variants and Renal Cell Cancer Risk

Introduction Folate and one-carbon metabolism are linked to cancer risk through their integral role in DNA synthesis and methylation. Variation in one-carbon metabolism genes, particularly MTHFR, has been associated with risk of a number of cancers in epidemiologic studies, but little is known regarding renal cancer. Methods Tag single nucleotide polymorphisms (SNPs) selected to produce high genomic coverage of 13 gene regions of one-carbon metabolism (ALDH1L1, BHMT, CBS, FOLR1, MTHFR, MTR, MTRR, SHMT1, SLC19A1, TYMS) and the closely associated glutathione synthesis pathway (CTH, GGH, GSS) were genotyped for 777 renal cell carcinoma (RCC) cases and 1,035 controls in the Central and Eastern European Renal Cancer case-control study. Associations of individual SNPs (n = 163) with RCC risk were calculated using unconditional logistic regression adjusted for age, sex and study center. Minimum p-value permutation (Min-P) tests were used to identify gene regions associated with risk, and haplotypes were evaluated within these genes. Results The strongest associations with RCC risk were observed for SLC19A1 (Pmin-P = 0.03) and MTHFR (Pmin-P = 0.13). A haplotype consisting of four SNPs in SLC19A1 (rs12483553, rs2838950, rs2838951, and rs17004785) was associated with a 37% increased risk (p = 0.02), and exploratory stratified analysis suggested the association was only significant among those in the lowest tertile of vegetable intake. Conclusions To our knowledge, this is the first study to comprehensively examine variation in one-carbon metabolism genes in relation to RCC risk. We identified a novel association with SLC19A1, which is important for transport of folate into cells. Replication in other populations is required to confirm these findings.


Introduction
One-carbon metabolism refers to a system of interdependent metabolic pathways that facilitate the transfer of one-carbon units and ultimately provide the precursors needed for DNA synthesis and repair as well as DNA methylation. Interconversion of various forms of folate is at the foundation of one-carbon metabolism, and folate deficiency has been shown to cause DNA strand breaks, diminished DNA repair capacity, and aberrant methylation patterns [1]. Folate intake has been associated with risk of a number of cancers, including those of the colorectum, esophagus, stomach, and pancreas [2][3][4]. Little is known about the associations between one-carbon metabolism and renal cancer, though one study suggested folate deficiency is associated with increased risk [5].
A candidate gene analysis by Moore et al. in a large renal cell carcinoma (RCC) case-control study conducted in Central and Eastern Europe examined 10 SNPs in 5 one-carbon metabolism genes and found an association between the MTHFR 677C>T polymorphism and RCC risk [6]. This polymorphism, which results in reduced enzyme activity, has previously been associated with various cancers, with the most consistent evidence pointing to a reduced risk of colorectal cancer [7][8][9][10]. Other genes involved in one-carbon metabolism reactions include TYMS, MTR, MTRR, BHMT, SHMT, and ALDH1L1 (Table 1, Figure S1). The FOLR1, SLC19A1 and GGH genes may also be important since they are involved in folate transport and availability. CBS, CTH and GSS are part of a related pathway in which homocysteine is converted to glutathione, which has been shown to modify cancer risk due to its role in antioxidant reactions and Phase II xenobiotic metabolism [11]. The association of these genes with renal cancer risk is unknown.
We have expanded on the previous findings of Moore et al. to more comprehensively investigate genetic variation across MTHFR and 12 additional candidate genes involved in folate transport, one-carbon metabolism and glutathione synthesis. We examined the relationship between RCC risk and 163 tag SNPs that provided high genomic coverage of 13 target gene regions among RCC cases and controls from Central Europe, an area with some of the highest rates of RCC incidence and mortality worldwide [12].

Study Population
The Central and Eastern European Renal Cell Cancer (CEERCC) Study, details of which have been described previously [13], is a multi-center, hospital-based case-control study conducted from August 1999 to January 2003. Cases consisted of 1,097 patients newly diagnosed with histologically-confirmed RCC, ages 20-79, recruited from centers in Russia (Moscow), Romania (Bucharest), Poland (Lodz) and the Czech Republic (Prague, Olomouc, Ceske Budejovice, Brno). Diagnostic information was extracted from hospital records by trained medical staff. Eligible controls included 1,476 patients admitted to the same hospital as cases, but for conditions not related to smoking or genitourinary disorders. Controls were frequency matched to cases based on age (±3 years), sex and study center (including hospital of recruitment), and both cases and controls were required to have been residents of the study center area for at least one year at the time of recruitment. Response rates ranged from 90.0-98.6% for cases and from 90.3-96.1% for controls, and all study participants were Caucasian. Blood samples were collected and genomic DNA, extracted from whole blood buffy coat using a standard phenol-chloroform extraction method, was obtained for 987 cases (90%) and 1,298 controls (88%). Written informed consent was obtained from all patients and their physicians, and the study was approved by the institutional review boards of the National Cancer Institute (NCI), the International Agency for Research on Cancer (IARC), and each study center.

Data Collection
Trained personnel conducted in-person interviews and used standardized questionnaires to collect information on demographics, education, tobacco smoking, diet, anthropometric measures, and medical, family and occupation histories. All cases were interviewed within three months of RCC diagnosis. Assessment of dietary intake has been described previously [14]. Briefly, the food frequency questionnaire (FFQ) consisted of 23 food items/categories, selected with local investigators to capture consumption patterns in these specific regions. Frequency of consumption of each food item was assessed using six categories, ranging from never to daily. Food group categories were formed on the basis of intake of similar foods (e.g., cruciferous vegetables included Brussels sprouts, broccoli and cabbage). The assessment of vegetable intake, designed to comprehensively cover intake in this region through 1999 [14], was based on the eight specific vegetables commonly eaten in the region (carrots, cabbage, spinach, broccoli, Brussels sprouts, onion, pumpkin, tomatoes) plus an additional category for "other fresh and preserved vegetables". Tertiles of vegetable intake were calculated based on the consumption frequencies reported among controls.
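The tertile assignment described above can be illustrated with a minimal sketch. The intake scores, the function names, and the choice of the "inclusive" quantile method are assumptions made for illustration; the study's actual scoring of the FFQ categories is not specified here.

```python
from statistics import quantiles


def tertile_cutpoints(control_scores):
    """Return the two cut points that split the control scores into thirds."""
    lower, upper = quantiles(control_scores, n=3, method="inclusive")
    return lower, upper


def assign_tertile(score, cutpoints):
    """Map a score to tertile 1 (lowest), 2 (middle) or 3 (highest)."""
    lower, upper = cutpoints
    if score <= lower:
        return 1
    if score <= upper:
        return 2
    return 3


# Hypothetical vegetable-intake frequency scores (not study data).
control_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
case_scores = [2, 5, 9, 12]

# Cut points come from controls only and are then applied to cases as well.
cuts = tertile_cutpoints(control_scores)
case_tertiles = [assign_tertile(s, cuts) for s in case_scores]
```

Deriving the cut points from controls only, and then applying them to everyone, keeps the reference distribution independent of case status.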

Genotyping
Genotyping was performed at the NCI's Core Genotyping Facility (CGF) using an Illumina GoldenGate® Oligo Pool All (OPA) assay. CGF staff was blinded to case-control status and samples were blinded and randomized on PCR plates. Duplicate genotyping was performed for a randomly selected 5% of the samples for quality control. Genotyping was performed on a subset of 777 cases (70.8%) and 1,035 controls (70.1%) who provided sufficient quality and quantity of DNA required for use of the Illumina® OPA assay. The genotyping completion rate was greater than 98% for all SNPs except rs6519519 (97%). The quality control concordance rate was 98% or greater for all SNPs except rs1476413 (94%), rs11121832 (96%), rs234712 (95%), rs4646768 (97%), and rs13069815 (96%). Nine candidate SNPs (Table S1) were genotyped previously at CGF using a validated TaqMan assay, details of which can be found at http://snp500cancer.nci.nih.gov/home.cfm. Methods and results for these nine SNPs, previously reported by Moore et al. [5], were included in our analyses for gene-based tests as well as the identification and testing of haplotypes in multivariable and stratified analyses.
Genes were selected a priori based on their role in one-carbon metabolism or the closely associated glutathione synthesis pathway (CBS, CTH and GSS). Genotypes were obtained for 163 tag SNPs selected to provide high genomic coverage (80-90%) for 13 candidate genes involved in one-carbon metabolism and glutathione synthesis (see Table 1, Figure S1). Tag SNPs were selected from among common variants (minor allele frequencies ≥5%) found in Caucasians using a tag SNP method [15] with an r² > 0.80. Using HapMap CEU data (http://www.hapmap.org), the regions 20 kb upstream of the transcription start site and 10 kb downstream of the last exon were included in tag SNP selection to ensure thorough coverage of each target gene. Nonsynonymous SNPs and those with putative functional significance were also included in our analysis. All SNPs included in this study are reported in the National Cancer Institute's SNP500Cancer database (http://snp500cancer.nci.nih.gov) [16]. No deviations from expected Hardy-Weinberg proportions (chi-square tests, p < 0.05 for deviation) were observed for genotype frequencies among the controls.
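As an illustration of the Hardy-Weinberg check mentioned above, the following is a minimal sketch of a one-degree-of-freedom chi-square test computed from genotype counts. The function name and the example counts are hypothetical, and the sketch assumes a polymorphic SNP so that no expected count is zero.

```python
import math


def hwe_chi_square(n_common_hom, n_het, n_rare_hom):
    """Chi-square test (1 df) for departure from Hardy-Weinberg proportions.

    Arguments are observed genotype counts among controls.
    Returns (statistic, p_value). Assumes both alleles are observed.
    """
    n = n_common_hom + n_het + n_rare_hom
    p = (2 * n_common_hom + n_het) / (2 * n)  # common allele frequency
    q = 1.0 - p                               # rare allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_common_hom, n_het, n_rare_hom)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts that match Hardy-Weinberg proportions exactly (p = 0.6).
stat_ok, p_ok = hwe_chi_square(360, 480, 160)

# Hypothetical counts with a marked deficit of heterozygotes.
stat_bad, p_bad = hwe_chi_square(500, 200, 300)
```

Under the criterion quoted in the text, the second SNP would be flagged (p < 0.05) and the first would not.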

Statistical Analyses
A sequence of analyses was used to comprehensively examine the associations of single SNPs, genes, gene regions, and haplotypes with RCC risk while accounting for multiple testing. Odds ratios (ORs) and 95% confidence intervals (CIs) for the association of each individual SNP with RCC risk were determined using unconditional logistic regression assuming a codominant model of inheritance for SNP genotypes. For linear trend tests the homozygous common, heterozygous and homozygous rare groups were coded as 0, 1 or 2, respectively, corresponding to the number of rare alleles. Logistic regression models were adjusted for age (continuous), sex, and country. Further adjustment for smoking, self-reported hypertension or body mass index did not appreciably change the risk estimates (<5%). Multiplicative interactions between SNPs and these RCC risk factors were evaluated by likelihood ratio tests comparing models with and without interaction terms, but we found no evidence of effect modification. Global significance of associations between individual genes and RCC risk was assessed using the minimum p-value permutation (min-P) test [17,18], which combines information on the set of SNP-RCC associations within a gene and accounts for the correlation between SNPs. Inferences are based on the permutation distribution of the minimum p-value among the set of SNPs in a gene. Min-P tests were adjusted for multiple comparisons among genes using Benjamini and Hochberg's False Discovery Rate (FDR) procedure [19,20]. For genes identified by the single SNP or min-P tests, a haplowalk sliding-window approach was employed in which consecutive 3-SNP windows were examined to identify chromosome regions of interest for further analysis of haplotypes.
For each 3-SNP window, the Haplostats package (version 1.3.1) in R (version 2.4.1) was used to reconstruct haplotype frequencies in cases and controls using an expectation-maximization (EM) algorithm and to test for association with case-control status using generalized linear models (haplo.glm) and a Wald test. SNP windows with an FDR-adjusted p-value below 0.1 were considered to denote a region of interest.
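The gene-level min-P test and the FDR adjustment described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the per-SNP statistic is an unadjusted trend statistic (n times the squared correlation between rare-allele count and case status) standing in for the covariate-adjusted logistic regression trend test, all SNPs in a gene are assumed to be complete for the same subjects, and the function names and toy genotypes are hypothetical.

```python
import random


def trend_statistic(genotypes, status):
    """n * r^2 between rare-allele counts (0/1/2) and case status (0/1).

    Larger values correspond to smaller trend-test p-values, so the
    maximum statistic over SNPs identifies the minimum p-value.
    """
    n = len(status)
    mean_g = sum(genotypes) / n
    mean_s = sum(status) / n
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    sss = sum((s - mean_s) ** 2 for s in status)
    if sgg == 0 or sss == 0:
        return 0.0
    sgs = sum((g - mean_g) * (s - mean_s) for g, s in zip(genotypes, status))
    return n * sgs * sgs / (sgg * sss)


def min_p_test(snps, status, n_perm=999, seed=0):
    """Permutation p-value for the best SNP in a gene.

    Case labels are shuffled while genotypes stay fixed, which preserves
    the correlation structure between SNPs within the gene.
    """
    rng = random.Random(seed)
    observed = max(trend_statistic(g, status) for g in snps)
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        best = max(trend_statistic(g, shuffled) for g in snps)
        if best >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data (hypothetical): 20 cases followed by 20 controls.
status = [1] * 20 + [0] * 20
gene_a = [[1] * 20 + [0] * 20, [0, 1] * 20]  # first SNP tracks case status
gene_b = [[0, 1] * 20]                       # unrelated to case status

gene_p = [min_p_test(g, status, n_perm=199) for g in (gene_a, gene_b)]
gene_q = benjamini_hochberg(gene_p)
```

Because only the best SNP per permutation is compared with the best observed SNP, a gene with many tag SNPs is not favored simply for having more tests.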
Linkage disequilibrium (LD) between SNPs among control subjects was determined using the Haploview program [21] to calculate r² values. Based on the results of the haplowalk procedure and the observed LD structure, haplotype-blocks were selected for further analysis. For these blocks, haplotypes were reconstructed and associations with case-control status (ORs and 95% CIs) were evaluated in Haplostats in R, with adjustment for age (<50, 50-<60, 60-<70 and 70+ years), sex and study center. The most common haplotype served as the reference, and haplotypes with an estimated frequency of less than 5% were combined into a separate "rare" haplotype group. Further analyses examined the haplotype associations within strata of vegetable or alcohol intake (based on tertiles of consumption among controls). A likelihood ratio test comparing the fit of models with and without interaction terms was used to evaluate heterogeneity of genotype frequencies among countries, but we did not observe any evidence of heterogeneity. Statistical analyses were conducted using SAS version 9.1 (SAS Institute, Cary, NC), except where otherwise noted.
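For readers unfamiliar with the r² measure of LD used above, a minimal sketch follows. It assumes phased haplotype counts are already available, whereas programs such as Haploview estimate haplotype frequencies from unphased genotypes; the function name and the counts are hypothetical.

```python
def ld_r_squared(n11, n12, n21, n22):
    """r^2 between two biallelic SNPs from phased haplotype counts.

    n11: haplotypes with allele 1 at SNP A and allele 1 at SNP B
    n12: allele 1 at SNP A, allele 2 at SNP B
    n21: allele 2 at SNP A, allele 1 at SNP B
    n22: allele 2 at SNP A, allele 2 at SNP B
    """
    total = n11 + n12 + n21 + n22
    p11 = n11 / total
    p_a = (n11 + n12) / total   # frequency of allele 1 at SNP A
    p_b = (n11 + n21) / total   # frequency of allele 1 at SNP B
    d = p11 - p_a * p_b         # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical haplotype counts.
r2_perfect = ld_r_squared(50, 0, 0, 50)     # alleles always travel together
r2_none = ld_r_squared(25, 25, 25, 25)      # alleles are independent
r2_partial = ld_r_squared(40, 10, 10, 40)   # intermediate correlation
```

An r² near 1 means one SNP is a near-perfect proxy for the other, which is the property that tag SNP selection relies on.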

Results
Demographic and lifestyle characteristics of all participants and genotyped participants in the CEERCC study are shown in Table 2. The distribution of RCC risk factors in successfully genotyped participants was similar to that of all participants. The majority of participants were from the Czech Republic, and cases were more likely to be from that country than controls. Cases were more likely to be obese (BMI > 30 kg/m²) and less likely to be current smokers, although the association with smoking was no longer present after adjustment for age, sex and study center.
Results from the individual SNP and gene-based minimum p-value permutation (min-P) tests are shown in Table 1. The strongest associations with RCC risk were observed for SLC19A1 (P min-P = 0.03) and MTHFR (P min-P = 0.13). Two individual SNPs tagging the SLC19A1 region, rs12483553 and rs17004785, were significantly associated with RCC after adjustment for multiple testing (FDR-adjusted P trend = 0.03 for each). The association for each SNP was slightly attenuated when both were included together in the logistic regression model. The haplowalk analysis also identified SLC19A1 as a region of interest, with an FDR-adjusted p-value of 0.02 for the most significant 3-SNP window. Based on the haplowalk analysis and the LD structure among control subjects, we identified a block of four tag SNPs to examine in haplotype analysis (rs12483553, rs2838950, rs2838951, and rs17004785). The A-C-C-C haplotype was associated with a significant 37% increased RCC risk compared to the most frequent haplotype, G-C-G-G (Table 3). The two SNPs identified in the single SNP analysis appear to be driving the haplotype result, as other haplotypes that did not include the minor allele for rs12483553 or rs17004785 did not have different RCC risks relative to the reference haplotype. After stratification, we observed that both the individual SNPs and the A-C-C-C haplotype were significantly associated with RCC risk only among those in the lowest tertile of vegetable intake (Table 4). The overall distribution of SLC19A1 haplotypes differed significantly (P global = 0.05) between cases and controls within the lowest tertile of vegetable intake, but not within the middle or highest tertile. Power was limited, and tests for interaction between individual haplotypes and vegetable intake did not yield any significant interactions (Table 4). No differences in haplotype distributions or individual SNP effects were observed in analyses stratified by alcohol intake for any of the genes examined.
We utilized data from a recent genome-wide association study (GWAS) of renal cancer [22] with over 3700 Caucasian cases and 8400 controls to look specifically at the association between the two significant SLC19A1 SNPs identified in our study and RCC risk in the GWAS population. Both SNPs were associated with RCC risk (p = 0.01 for rs12483553; p = 0.04 for rs17004785), though not at a level sufficient for genome-wide significance in an agnostic GWAS examination.
A previous analysis by Moore et al. in the CEERCC study population found a significant association between the known functional SNP in MTHFR (rs1801133, 677C>T) and RCC risk [5]. We expanded on the previous result by comprehensively examining the gene region to evaluate whether additional variants in the gene were associated with RCC risk, but none were significant in the single SNP analysis. We identified a region of interest based on the haplowalk analysis and tested haplotypes to examine whether they provided additional information beyond the 677C>T SNP (Table 3). Two haplotypes were found to be significantly associated with RCC risk, but the results are consistent with variation at 677C>T driving the association. Stratification by vegetable intake showed that two MTHFR haplotypes were associated with RCC risk in the low or middle tertiles (Table 4), while no haplotypes were significant among those in the highest tertile. However, evidence that vegetable intake modified the genotype associations was relatively weak: the global p-values for the haplotypes were not significant in any vegetable tertile, and the tests of interaction, which had low power, were not significant (Table 4). No significant associations were observed for individual SNPs or the gene-based min-P test for the other one-carbon genes genotyped in this study (Table 1). Therefore we did not examine specific haplotypes for these genes. Results for the individual analyses of all SNPs can be found in Table S1.

Discussion
We conducted an analysis of 163 tag SNPs in 13 genes related to one-carbon metabolism, folate transport, and glutathione synthesis in relation to risk of renal cell carcinoma. We identified one novel gene region that was significantly associated with RCC risk. Specifically, a region proximal to the coding region of SLC19A1 was found to contain two SNPs that were associated with risk of RCC, and a haplotype containing the minor allele for both these SNPs was associated with a significant 37% increased risk. The association was particularly evident among participants who reported low vegetable intake, which may be a proxy for folate intake/status. Supporting our findings, the two significant SLC19A1 SNPs in our study were also associated with RCC in a recent GWAS of renal cancer [22], though not at a level of genome-wide significance. Our results are novel and require replication in other studies that can examine vegetable or folate intake.
There is a biologic rationale for an involvement of the SLC19A1 gene in cancer risk. SLC19A1 codes for the reduced folate carrier (RFC), which transports reduced folates and antifolate drugs into mammalian cells [23]. Humans are unable to synthesize folates, which are at the foundation of the reactions of one-carbon metabolism, so adequate folate must be consumed, effectively absorbed, and transported to cells and tissues. RFC is ubiquitously expressed, including at high levels in the kidney, and is the major transport system for folates in most tissues [23]. In addition to uptake of folate from the blood, RFC is also involved in absorption of folate in the intestine as well as transport across the basolateral membrane of the renal tubules [23]. Genetic variations in SLC19A1 could result in differing RFC activity and therefore differing folate availability. Folate deficiency has been associated with a number of cancers, including RCC, through mechanisms thought to involve its role in the provision of substrates for DNA synthesis and methylation [2][3][4][5].
Existing evidence for an association between SLC19A1 and cancer risk is sparse. Mouse models suggest that levels of RFC can dramatically impact the neoplastic processes leading to colorectal cancer [23]. Variation in SLC19A1 has not been studied in relation to renal cancer, but epidemiologic studies have examined the association of this gene with other cancers. Most studies examined one particular nonsynonymous SNP (rs1051266, 80G>A), which has been associated with changes in blood concentrations of folate in some studies [24,25] but not others [26,27]. We did not have genotype information for this SNP, though it was moderately correlated with one of our tag SNPs (rs2838951; r² = 0.59). Other studies have found possible associations for rs1051266 with risk of bladder, esophageal, and lung cancers [28][29][30], but not with colon, prostate, and breast cancers [31][32][33][34]. A recent study examined tag SNPs of one-carbon genes and found five SNPs in SLC19A1 that were significantly associated with risk of colorectal adenomas, including rs1051266 [35]. One of the specific SLC19A1 tag SNPs found to be associated with RCC risk in our study, rs17004785, was not associated with risk in the studies of colorectal cancer and colorectal adenomas [31,35].
Table 3. Haplotypes in selected one-carbon metabolism genes and risk of renal cell carcinoma.
Haplotypes in both SLC19A1 and MTHFR were found to be significantly associated with RCC risk among participants in the lowest tertile of vegetable intake frequency, but no effect of genotype was observed among those in the highest tertile. Vegetables are good sources of folate, particularly in regions such as Eastern and Central Europe where grain products are not fortified with folic acid. Thus vegetable intake may be a rough proxy for folate intake, in which case our results suggest that the effect of genotype for these folate-associated genes may be important only when folate intake is low. A number of studies of MTHFR and cancer risk have demonstrated an increased role of genetic variation under low-folate conditions [10,36], and numerous examples support the theory that diet can modify the impact of genetic variation [37].
To our knowledge, this is the first study to comprehensively examine variation in one-carbon metabolism and folate transport genes in relation to renal cancer risk, including examination of different dietary intakes of folate sources. We used HapMap data with a tag SNP approach to achieve high (80-90%) genomic coverage of the genes of interest, and examined regions upstream and downstream of the coding regions. In addition, the study was conducted in a population without mandatory fortification of grain products with folic acid; similar studies in fortified populations with uniformly high folate intake may not be able to fully examine the interaction between folate intake and genetic effects related to folate transport and metabolism. Other strengths of our study were inclusion of only newly diagnosed and histologically confirmed renal cancers, a high participation rate, and collection of biological samples from a high proportion of participants. The multi-stage analysis and correction for multiple testing reduced the possibility of Type I error. Population stratification is a possible concern, but we found no evidence of heterogeneity by country, and bias from population stratification is unlikely to be substantial in studies of cancer in non-Hispanic European populations [38]. A limitation of our study is that the dietary assessment instrument in the CEERCC study was designed specifically to assess intake of the foods most frequently consumed in the study region, but it did not capture the quantity of foods consumed. Diet was assessed after case diagnosis, so recall bias is possible if case status influenced how participants recalled their dietary habits. Given the limitations in the assessment of diet and the need for replication, our analyses should be considered exploratory.
In conclusion, we comprehensively examined genetic variation in 13 genes associated with one-carbon metabolism using a tag SNP approach and identified a novel association between SLC19A1 and renal cell cancer. Additional studies are needed to replicate these findings and better understand the impact of folate intake.
Table 4. Haplotypes in selected one-carbon metabolism genes and risk of renal cell carcinoma, stratified by frequency of vegetable intake.

Supporting Information
Figure S1 Reactions of one-carbon metabolism. A diagram showing the reactions of one-carbon metabolism (simplified), including the role of gene products from the 12 candidate genes in the analysis (GGH not shown) and associated vitamin cofactors. Numbers in brackets [#] correspond to genes listed in Table 1.
(DOC)