DNA Glycosylases Involved in Base Excision Repair May Be Associated with Cancer Risk in BRCA1 and BRCA2 Mutation Carriers

Single Nucleotide Polymorphisms (SNPs) in genes involved in the DNA Base Excision Repair (BER) pathway could be associated with cancer risk in carriers of mutations in the high-penetrance susceptibility genes BRCA1 and BRCA2, given the relation of synthetic lethality that exists between one of the components of the BER pathway, PARP1 (poly ADP ribose polymerase), and both BRCA1 and BRCA2. In the present study, we have performed a comprehensive analysis of 18 genes involved in BER using a tagging SNP approach in a large series of BRCA1 and BRCA2 mutation carriers. 144 SNPs were analyzed in a two-stage study involving 23,463 carriers from the CIMBA consortium (the Consortium of Investigators of Modifiers of BRCA1 and BRCA2). Eleven SNPs showed evidence of association with breast and/or ovarian cancer at p<0.05 in the combined analysis. Four of the five genes for which strongest evidence of association was observed were DNA glycosylases. The strongest evidence was for rs1466785 in the NEIL2 (endonuclease VIII-like 2) gene (HR: 1.09, 95% CI: 1.03–1.16, p = 2.7×10−3) for association with breast cancer risk in BRCA2 mutation carriers, and rs2304277 in the OGG1 (8-oxoguanine DNA glycosylase) gene, with ovarian cancer risk in BRCA1 mutation carriers (HR: 1.12, 95% CI: 1.03–1.21, p = 4.8×10−3). DNA glycosylases involved in the first steps of the BER pathway may be associated with cancer risk in BRCA1/2 mutation carriers and should be more comprehensively studied.


Introduction
Carrying an inherited mutation in the BRCA1 or BRCA2 gene increases a woman's lifetime risk of developing breast, ovarian and other cancers. The estimated cumulative risk of developing breast cancer by the age of 70 in BRCA1 and BRCA2 mutation carriers ranges from 43% to 88%; similarly, from 11% to 59% of mutation carriers will develop ovarian cancer by the age of 70 [1][2][3]. These considerable differences in disease manifestation suggest the existence of other genetic or environmental factors that modify the risk of cancer development. The Consortium of Investigators of Modifiers of BRCA1 and BRCA2 (CIMBA) was established in 2006 [4] and, with more than 40,000 mutation carriers, currently provides the largest sample size for reliable evaluation of even modest associations between single-nucleotide polymorphisms (SNPs) and cancer risk. CIMBA studies have so far demonstrated that more than 25 SNPs are associated with the risk of developing breast or ovarian cancer for BRCA1 or BRCA2 carriers. These were identified through genome-wide association studies (GWAS) of breast or ovarian cancer in the general population or through BRCA1- and BRCA2-specific GWAS [5][6][7][8]. Cells harboring mutations in BRCA1 or BRCA2 show impaired homologous recombination (HR) [9][10][11] and are thus critically dependent on other members of the DNA repair machinery, such as poly ADP ribose polymerase (PARP1), which is involved in the Base Excision Repair (BER) pathway. The BER pathway is crucial for the replacement of aberrant bases generated by different causes [12]. A deficiency in BER can give rise to a further accumulation of double-strand DNA breaks which, in the presence of a defective BRCA1 or BRCA2 background, could persist and lead to cell cycle arrest or cell death; this makes BRCA-deficient cells extremely sensitive to PARP inhibitors, as previously demonstrated [13].
We hypothesize that SNPs in PARP1 and other members of BER may be associated with cancer risk in BRCA1 and BRCA2 mutation carriers. SNPs in XRCC1, one of the main components of BER, have recently been evaluated within the CIMBA consortium [14]; however, a comprehensive study has not yet been performed of either XRCC1 or the other genes participating in BER.
In the present study, we used a tagging SNP approach and a two-stage design to evaluate whether common genetic variation in the genes involved in the BER pathway could be associated with cancer risk in a large series of BRCA1/2 mutation carriers. The first stage involved an analysis of 144 tag SNPs in 1,787 Spanish and Italian BRCA1/2 mutation carriers. In stage II, the 36 SNPs showing the strongest evidence of association in stage I were evaluated in a further 23,463 CIMBA mutation carriers included in the Collaborative Oncological Gene-environment Study (COGS) and genotyped using the iCOGS custom genotyping array.

Breast cancer association
In stage I, 144 selected tag SNPs covering the 18 selected BER genes were genotyped in 968 BRCA1 and 819 BRCA2 mutation carriers from five CIMBA centres (Spanish National Cancer Research Centre (CNIO), Hospital Clínico San Carlos (HCSC), Catalan Institute of Oncology (ICO), Demokritos and Milan Breast Cancer Study Group (MBCSG)). Of those, 50 were excluded because of low call-rates, minor allele frequency (MAF) <0.05, evidence of deviation from Hardy-Weinberg Equilibrium (p-value <10−3) or monomorphism. Associations with breast cancer risk were assessed for 94 SNPs, as summarized in Table S1. The 36 SNPs that showed evidence of association at p≤0.05 were selected for analysis in stage II. Of the 36 SNPs successfully genotyped in the whole CIMBA series comprising 15,252 BRCA1 and 8,211 BRCA2 mutation carriers, consistent evidence of association with breast cancer risk (p-trend <0.05) was observed for six SNPs (Table 1). The strongest evidence of association was observed for rs1466785 in the NEIL2 gene (HR: 1.09, 95% CI: 1.03-1.16, p = 2.7×10−3) for association with breast cancer risk in BRCA2 mutation carriers. We had observed a consistent association in stage I in BRCA2 mutation carriers (HR: 1.25, p = 0.06). The SNP was primarily associated with ER-negative breast cancer (HR: 1.20, 95% CI: 1.06-1.37, p = 4×10−3), although the difference in HRs for ER-positive and ER-negative disease was not statistically significant. The evidence of association in stage II was somewhat stronger when considering the genotype-specific models, with the dominant being the best fitting (HR: 1.20, 95% CI: 1.09-1.37, p = 1×10−4). The associations remained significant and the estimated effect sizes remained consistent with the overall analysis when the data were reanalyzed excluding samples used in stage I of the study (data not shown).
Imputation using the 1000 Genomes data showed that there were several SNPs in strong linkage disequilibrium (LD) with rs1466785 showing more significant associations (p<10−3) (Figure 1).

Ovarian cancer association
Due to lack of power we did not perform analysis of associations with ovarian cancer in stage I. However, we performed this analysis for the 36 SNPs tested in stage II. Although they had been selected based on their evidence of association with breast cancer risk, under the initial hypothesis they are also plausible modifiers of ovarian cancer risk for BRCA1 and BRCA2 mutation carriers. We found four SNPs associated with ovarian cancer risk with a p-trend <0.01 in BRCA1 or BRCA2 mutation carriers (Table 1). The strongest association was found for rs2304277 in OGG1 in BRCA1 mutation carriers (HR: 1.12, 95% CI: 1.03-1.21, p = 4.8×10−3).
The association was somewhat stronger under the dominant model (HR: 1.19, 95% CI: 1.08-1.3, p = 6×10−4). Although three other SNPs were found to be associated with ovarian cancer risk in BRCA2 mutation carriers (p-trend <10−3), these results were based on a relatively small number of ovarian cancer cases. Imputed data did not show any SNPs with substantially more significant associations with ovarian cancer risk, except for rs3093926 in PARP2, associated with ovarian cancer risk in BRCA2 mutation carriers, for which there was a SNP, rs61995542, with a stronger association (HR: 0.67, p = 4.6×10−4) (Figure S1).

Discussion
Based on the interaction of synthetic lethality that has been described between PARP1 and both BRCA1 and BRCA2, we hypothesized that this and other genes involved in the BER pathway could potentially be associated with cancer risk in BRCA1/2 mutation carriers. Several studies have recently investigated the association of some of the BER genes with breast cancer; however, no definitive conclusions can be drawn, given that some publications suggest that SNPs in these genes can be associated with breast cancer risk with marginal p-values while others rule out a major role of these genes in the disease [15][16][17][18][19][20][21]. Only one study from the CIMBA consortium has evaluated the role of three of the most studied SNPs in the XRCC1 gene, c.-77C>T (rs3213245), p.Arg280His (rs25489) and p.Gln399Arg (rs25487), ruling out associations of these variants with cancer risk in BRCA1 and BRCA2 mutation carriers [14]. However, no comprehensive analysis of XRCC1 or of the other genes involved in the pathway has been performed in the context of BRCA mutation carriers. In the present study we have assessed the common genetic variation of 18 genes participating in BER by using a two-stage strategy.
Eleven SNPs showed evidence of association with breast and/or ovarian cancer at p<0.05 in stage II of the experiment (Table 1). Of those, six showed a p-trend value <0.01 and were therefore considered the best candidates for further evaluation. Only one of those six, rs1466785 in the NEIL2 gene (endonuclease VIII-like 2), showed an association with breast cancer risk, while the other five, rs2304277 in OGG1 (8-oxoguanine DNA glycosylase), rs167715 and rs4135087 in TDG (thymine-DNA glycosylase), rs3093926 in PARP2 (poly(ADP-ribose) polymerase 2) and rs34259 in UNG (uracil-DNA glycosylase), were associated with ovarian cancer risk.
The minor allele of NEIL2-rs1466785 was associated with increased breast cancer risk in BRCA2 mutation carriers; moreover, when considering the genotype-specific risks, we observed that the best-fitting model was the dominant one. NEIL2 is one of the oxidized base-specific DNA glycosylases that participate in the initial steps of BER and specifically removes oxidized bases from transcribing genes [22]. By imputing using the 1000 Genomes data we found six correlated SNPs in strong LD with rs1466785 (r2>0.8), located closer to or inside the gene and showing slightly stronger and more significant associations with the disease, and therefore being better candidate causal variants. Of those, we considered rs804276 and rs804271 to be the best candidates, given that they showed the most significant associations (p = 6×10−4 and p = 8×10−4, respectively) and that epidemiological or functional data were available supporting their putative role in cancer. SNP rs804276 has been associated with disease recurrence in patients with bladder cancer treated with Bacillus Calmette-Guérin (BCG) (HR: 2.71, 95% CI: 1.75-4.20, p = 9×10−6) [23]. SNP rs804271 is located in a positive regulatory region in the promoter of the gene, between two potential cis-binding sites for reactive oxygen species responsive transcription factors in which sequence variation has been proven to alter the transcriptional response to oxidative stress [24]. Moreover, this SNP has been proposed to partly explain the inter-individual variability observed in NEIL2 expression levels in the general population and has been proposed as a potential risk modifier of disease susceptibility [25]. Several studies have been published showing associations between SNPs in NEIL2 and lung or oropharyngeal cancer risk [26,27] but, to our knowledge, no association with breast cancer risk has been reported.

Author Summary
Women harboring a germ-line mutation in the BRCA1 or BRCA2 genes have a high lifetime risk of developing breast and/or ovarian cancer. However, not all carriers develop cancer, and high variability exists regarding age of onset of the disease and type of tumor. One of the causes of this variability lies in other genetic factors that modulate the phenotype, the so-called modifier genes. Identification of these genes might have important implications for risk assessment and decision making regarding prevention of the disease. Given that BRCA1 and BRCA2 participate in the repair of DNA double strand breaks, here we have investigated whether variations, Single Nucleotide Polymorphisms (SNPs), in genes participating in other DNA repair pathways may be associated with cancer risk in BRCA carriers. We have selected the Base Excision Repair pathway because BRCA defective cells are extremely sensitive to the inhibition of one of its components, PARP1. Thanks to a large international collaborative effort, we have been able to identify at least two SNPs that are associated with increased cancer risk in BRCA1 and BRCA2 mutation carriers respectively. These findings could have implications not only for risk assessment, but also for treatment of BRCA1/2 mutation carriers with PARP inhibitors.
We hypothesize that the potential association observed in the present study could be explained by the interaction between NEIL2 and BRCA2, each of them causing a deficiency in the BER and HR DNA repair pathways, respectively. This would explain why the breast cancer risk modification due to rs1466785 would only be detected in the context of BRCA2 mutation carriers and not in the general population.
The strongest evidence of association found in BRCA1 carriers was between rs2304277 in the OGG1 gene and ovarian cancer risk. The association was more significant when considering the dominant model. OGG1 removes 8-oxodeoxyguanosine, which is generated by oxidative stress and is highly mutagenic, and it has been suggested that SNPs in the gene could be associated with cancer risk [28][29][30][31]. This is an interesting result, given that to date only one SNP, rs4691139 in the 4q35.3 region, also identified through the iCOGS effort, has been found to modify ovarian cancer risk specifically in BRCA1 carriers [32]. SNP rs2304277 is located in the 3′UTR (untranslated region) of the gene and is probably not the causal variant; however, in this case imputation using the 1000 Genomes data did not show better results for a more plausible causal SNP.
We have identified four SNPs associated with ovarian cancer risk in BRCA2 mutation carriers: rs167715 and rs4135087 in the TDG gene, rs34259 in the UNG gene and rs3093926 in PARP2. However, these last results should be interpreted with caution, given that the number of BRCA2 carriers affected with ovarian cancer is four-fold lower than for BRCA1 carriers and the statistical power was therefore more limited, increasing the possibility of false positives. In the case of PARP2, imputed data showed a lower p-value of association (4×10−4) for another SNP, rs61995542, that had a slightly higher MAF than rs3093926 (0.074 vs. 0.067) (Figure S1). However, it must still be interpreted with caution due to the small number of ovarian cancer cases in the BRCA2 group.
It is worth noting that four of the five genes for which the strongest evidence of association was observed are DNA glycosylases that participate in the initiation of BER by removing damaged or mismatched bases. Apart from the already mentioned NEIL2 and OGG1, TDG initiates repair of G/T and G/U mismatches commonly associated with CpG islands, while UNG removes uracil in DNA resulting from deamination of cytosine or replicative incorporation of dUMP. We have not found strong associations with SNPs in genes involved in any other parts of the pathway, such as strand incision, trimming of ends, gap filling or ligation. It has been suggested that, at least in the case of uracil repair, base removal is the major rate-limiting step of BER [33]. This is consistent with our findings, suggesting that SNPs causing impairment in the function of these specific DNA glycosylases could give rise to accumulation of single strand breaks and subsequently DNA double strand breaks that, in the HR-defective context of BRCA1/2 mutation carriers, would increase breast and ovarian cancer risk.
The fact that the SNPs tested are located in genes participating in the same DNA repair pathway as PARP1 makes them especially interesting, not only as risk modifiers but also because they could have an impact on patients' response to treatment with PARP inhibitors (PARPi). BRCA1/2 mutation carriers harboring a potential modifier SNP in DNA glycosylases could be even more sensitive to PARPi due to a slight constitutional impairment of BER activity. This is a hypothesis that should be confirmed in further studies. The two-stage design of this study, the hypothesis-based approach adopted to select genes, and the use of the largest series of BRCA1 and BRCA2 carriers currently available mean that the results obtained are quite solid. However, the study still has some limitations, such as the possible existence of residual confounding due to environmental risk factors for which we did not have information.
In summary, we have identified at least two SNPs, rs1466785 and rs2304277, in the DNA glycosylases NEIL2 and OGG1, potentially associated with increased breast and ovarian cancer risks in BRCA2 and BRCA1 mutation carriers, respectively. Our results suggest that glycosylases involved in the first steps of the BER pathway may be cancer risk modifiers in BRCA1/2 mutation carriers and should be more comprehensively studied. If confirmed, these findings could have implications not only for risk assessment, but also for treatment of BRCA1/2 mutation carriers with PARP inhibitors.

Subjects
Eligible subjects were female carriers of deleterious mutations in BRCA1 or BRCA2 aged 18 years or older [6]. A total of 55 collaborating CIMBA studies contributed genotypes for the study. Numbers of samples included from each are provided in Table S2.
Statistical analysis. To test for departure from Hardy-Weinberg equilibrium (HWE), a single individual was randomly selected from each family and Pearson's χ2 test (1 d.f.) was applied to genotypes from this set of individuals. The association of the SNPs with breast cancer risk was assessed by estimating hazard ratios (HR) and their corresponding 95% confidence intervals (CI) using weighted multivariable Cox proportional hazards regression with robust estimates of variance [34]. For each mutation carrier, we modeled the time to diagnosis of breast cancer from birth, censoring at the first of the following events: bilateral prophylactic mastectomy, breast cancer diagnosis, ovarian cancer diagnosis, death or date last known to be alive. Subjects were considered affected if their age at censoring corresponded to their age at diagnosis of breast cancer and unaffected otherwise. Weights were assigned separately for carriers of mutations in BRCA1 and BRCA2, by age and affection status, so that the weighted observed incidences in the sample agreed with established estimates for mutation carriers [1,34].
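The Hardy-Weinberg check described above can be illustrated with a minimal sketch. It assumes a biallelic SNP summarized as three genotype counts taken from one randomly selected individual per family; the function names `hwe_pearson_test` and `chi2_sf_1df` are hypothetical and are not taken from the study's analysis code.

```python
import math
from typing import Tuple


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_pearson_test(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 d.f.) for departure from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are genotype counts (illustrative inputs, one unrelated
    individual per family). Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic SNP: HWE test is undefined")
    # Genotype counts expected under HWE proportions p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_sf_1df(chi2)
```

A SNP would then be flagged when the returned p-value falls below the chosen threshold, for example the 10−3 cut-off used for exclusion in stage I.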
We considered log-additive and co-dominant genetic models and tested for departure from HR = 1 by applying a Wald test based on the log-HR estimate and its standard error. Additional independent variables included in all analyses were year of study, centre and country. All statistical analyses were carried out using Stata: Release 10 (StataCorp. 2007. Stata Statistical Software: Release 10.0. College Station, TX: Stata Corporation LP). Robust estimates of variance were calculated using the cluster subcommand, applied to an identifier variable unique to each family.
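As a rough illustration of the Wald test on the log-HR scale, the sketch below recovers an approximate standard error from a reported HR and its 95% CI and computes a two-sided p-value. It assumes a symmetric normal-theory interval on the log scale; the function name `wald_from_ci` is hypothetical. Because published intervals are rounded, and because the stage II p-values come from a score test rather than a Wald test, values obtained this way will only approximate the reported ones.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_ci(hr: float, lower: float, upper: float) -> Tuple[float, float, float]:
    """Approximate Wald test of HR = 1 from a reported HR and its 95% CI.

    Returns (standard error of log-HR, z statistic, two-sided p-value).
    Assumes the CI was built as exp(log(HR) +/- Z_95 * SE).
    """
    if not (0 < lower < upper):
        raise ValueError("confidence limits must satisfy 0 < lower < upper")
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return se, z, p_value


# Example with the per-allele estimate reported for rs1466785 (HR 1.09, 1.03-1.16).
se, z, p_value = wald_from_ci(1.09, 1.03, 1.16)
```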

Methods stage II
iCOGS SNP array. Stage II of the experiment was performed as part of the iCOGS genotyping experiment. The iCOGS custom array was designed in collaboration between the Breast Cancer Association Consortium (BCAC), the Ovarian Cancer Association Consortium (OCAC), the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) and CIMBA. The final design comprised 211,155 successfully manufactured SNPs, of which approximately 17.5% had been proposed by CIMBA. A total of 43 SNPs were nominated for inclusion on iCOGS based on statistical evidence of association in stage I of the present study (p≤0.05). Of these, 36 were successfully manufactured and genotyped in CIMBA mutation carriers.
iCOGS genotyping and quality control. Genotyping was performed at Mayo Clinic and the McGill University and Génome Québec Innovation Centre (Montreal, Canada). Genotypes were called using Illumina's GenCall algorithm. The sample and quality control processes have been described in detail elsewhere [32,35]. After the quality control process, a total of 23,463 carriers were genotyped for the 36 selected SNPs.
Statistical analysis. Both breast and ovarian cancer associations were evaluated in stage II. Censoring for breast cancer followed the same approach as in stage I. Censoring for ovarian cancer risk occurred at risk-reducing salpingo-oophorectomy or last follow-up.
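The breast cancer censoring rule carried over from stage I (follow-up from birth, ending at the first of bilateral prophylactic mastectomy, breast cancer diagnosis, ovarian cancer diagnosis, death or date last known alive) can be sketched as follows. This is only an illustration: the dictionary keys and the function name `breast_cancer_outcome` are hypothetical, and ages are assumed to be in years.

```python
from typing import Dict, Optional, Tuple

# Hypothetical field names for the events that end follow-up in the breast cancer analysis.
BREAST_CENSORING_EVENTS = (
    "bilateral_prophylactic_mastectomy",
    "breast_cancer",
    "ovarian_cancer",
    "death",
    "last_known_alive",
)


def breast_cancer_outcome(event_ages: Dict[str, Optional[float]]) -> Tuple[float, bool]:
    """Return (age at end of follow-up, affected status) for one mutation carrier.

    event_ages maps event names to the age at which they occurred (None or
    missing if the event did not occur). A carrier counts as affected only if
    follow-up ends at the age of breast cancer diagnosis.
    """
    ages = [
        event_ages[name]
        for name in BREAST_CENSORING_EVENTS
        if event_ages.get(name) is not None
    ]
    if not ages:
        raise ValueError("at least one follow-up event age is required")
    end_age = min(ages)
    affected = event_ages.get("breast_cancer") == end_age
    return end_age, affected
```

For instance, a carrier diagnosed with ovarian cancer at 50 and breast cancer at 55 would be censored at 50 and treated as unaffected in the breast cancer analysis.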
The genotype-disease associations were evaluated within a survival analysis framework, by modelling the retrospective likelihood of the observed genotypes conditional on the disease phenotypes [9,34,36,37]. The associations between genotype and breast or ovarian cancer risk were assessed using the 1 d.f. score test statistic based on this retrospective likelihood. To allow for the non-independence among related individuals, we accounted for the correlation between the genotypes by estimating the kinship coefficient for each pair of individuals using the available genomic data [34,38,39]. These analyses were performed in R using the GenABEL libraries and custom-written functions in FORTRAN and Python.
To estimate the magnitude of the associations (HRs), the effect of each SNP was modeled either as a per-allele HR (multiplicative model) or as genotype-specific HRs, and was estimated on the log scale by maximizing the retrospective likelihood. The retrospective likelihood was fitted using the pedigree-analysis software MENDEL. The variances of the parameter estimates were obtained by robust variance estimation based on reported family membership. All analyses were stratified by country of residence and based on calendar-year and cohort-specific breast cancer incidence rates for mutation carriers. Countries with small numbers of mutation carriers were combined with neighbouring countries to ensure sufficiently large numbers within each stratum. USA and Canada were further stratified by reported Ashkenazi Jewish (AJ) ancestry.
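To make the difference between the two parameterizations concrete, the short sketch below shows the genotype-specific HRs that a multiplicative (per-allele) model implies: each copy of the minor allele multiplies the hazard by the same factor, whereas a genotype-specific model estimates the heterozygote and homozygote HRs freely. The function name is hypothetical and the snippet is illustrative only, not part of the fitting procedure used in the study.

```python
from typing import Dict


def implied_genotype_hrs(per_allele_hr: float) -> Dict[int, float]:
    """Genotype HRs implied by a multiplicative model.

    Keys are the number of minor alleles carried (0, 1 or 2); the common
    homozygote is the reference with HR = 1.
    """
    if per_allele_hr <= 0:
        raise ValueError("hazard ratio must be positive")
    return {copies: per_allele_hr ** copies for copies in (0, 1, 2)}


# With the per-allele HR of 1.09 reported for rs1466785, the multiplicative
# model constrains the minor-allele homozygote HR to 1.09 squared.
hrs = implied_genotype_hrs(1.09)
```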
Imputation. Genotypes were imputed separately for BRCA1 and BRCA2 mutation carriers using the 1000 Genomes Project v3 April 2012 release (1000 Genomes Project Consortium, 2012) as the reference panel. To improve computational efficiency we used a two-step procedure which involved pre-phasing in the first step and imputation of the phased data in the second. Pre-phasing was carried out using the SHAPEIT software [40]. The IMPUTE version 2 software was used for the subsequent imputation [41]. SNPs were excluded from the association analysis if their imputation accuracy was r2<0.3 or MAF<0.005 in any of the data sets. For the final analysis we only took into account those SNPs with an imputation accuracy r2>0.7 and MAF>0.01 that were located within 15 kilobases (kb) upstream or downstream of the gene containing the genotyped SNP showing an association (Table 1). Associations between imputed genotypes and breast cancer risk were evaluated using a version of the score test as described above but with the posterior genotype probabilities replacing the genotypes.

Figure S1. p-values of association (−log10 scale) with breast and ovarian cancer risk in BRCA1 and BRCA2 carriers for genotyped and imputed SNPs, considering 15 kb upstream and downstream of the genes in which the SNPs described in Table 1 were located. rs numbers of SNPs from Table 1 are indicated at the top of each panel and in the graph with a purple arrow. For the PARP2 gene, the imputed SNP with the strongest association, rs61995542, is indicated with a red arrow. Colors represent the pairwise r2. (PPT)
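The final selection of imputed SNPs described in the Imputation paragraph (imputation accuracy r2>0.7, MAF>0.01, and location within 15 kb of the gene) can be sketched as a simple filter. The `ImputedSnp` record, its field names and the gene coordinates passed in are hypothetical; the thresholds are those stated in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ImputedSnp:
    """Illustrative record for one imputed SNP (field names are assumptions)."""
    snp_id: str
    position: int  # base-pair position on the chromosome
    r2: float      # imputation accuracy
    maf: float     # minor allele frequency


def select_for_final_analysis(
    snps: List[ImputedSnp],
    gene_start: int,
    gene_end: int,
    min_r2: float = 0.7,
    min_maf: float = 0.01,
    flank_bp: int = 15_000,
) -> List[ImputedSnp]:
    """Keep SNPs with r2 > min_r2, MAF > min_maf, lying within flank_bp of the gene."""
    window_start = gene_start - flank_bp
    window_end = gene_end + flank_bp
    return [
        snp
        for snp in snps
        if snp.r2 > min_r2
        and snp.maf > min_maf
        and window_start <= snp.position <= window_end
    ]
```

The earlier, more permissive exclusion step (r2<0.3 or MAF<0.005 in any data set) would be applied before this filter and is not shown here.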