RAD51B in Familial Breast Cancer

Common variation on 14q24.1, close to RAD51B, has been associated with breast cancer: rs999737 and rs2588809 with the risk of female breast cancer and rs1314913 with the risk of male breast cancer. The aim of this study was to investigate the role of RAD51B variants in breast cancer predisposition, particularly in the context of familial breast cancer in Finland. We sequenced the coding region of RAD51B in 168 Finnish breast cancer patients from the Helsinki region for identification of possible recurrent founder mutations. In addition, we studied the known rs999737, rs2588809, and rs1314913 SNPs and RAD51B haplotypes in 44,791 breast cancer cases and 43,583 controls from 40 studies participating in the Breast Cancer Association Consortium (BCAC) that were genotyped on a custom chip (iCOGS). We identified one putatively pathogenic missense mutation c.541C>T among the Finnish cancer patients and subsequently genotyped the mutation in additional breast cancer cases (n = 5259) and population controls (n = 3586) from Finland and Belarus. No significant association with breast cancer risk was seen in the meta-analysis of the Finnish datasets or in the large BCAC dataset. The association with previously identified risk variants rs999737, rs2588809, and rs1314913 was replicated among all breast cancer cases and also among familial cases in the BCAC dataset. The most significant association was observed for the haplotype carrying the risk-alleles of all the three SNPs both among all cases (odds ratio (OR): 1.15, 95% confidence interval (CI): 1.11–1.19, P = 8.88 × 10⁻¹⁶) and among familial cases (OR: 1.24, 95% CI: 1.16–1.32, P = 6.19 × 10⁻¹¹), compared to the haplotype with the respective protective alleles. Our results suggest that loss-of-function mutations in RAD51B are rare, but common variation at the RAD51B region is significantly associated with familial breast cancer risk.

The ABCS study was supported by the Dutch Cancer Society [grants NKI 2007-3839;2009 4363]

Introduction
Breast cancer is the most frequent cancer among women worldwide and also the leading cause of cancer-related death [1]. Several susceptibility loci for breast cancer have been identified and most of the currently known high- and moderate-penetrance predisposition genes have a role in DNA repair. The major high-penetrance breast and ovarian cancer susceptibility genes BRCA1 and BRCA2 are important for DNA double-strand break (DSB) repair through homologous recombination (HR) [2]. Proteins encoded by the moderate-penetrance genes ATM, BRIP1, and CHEK2 interact with BRCA1 in DNA damage repair whereas PALB2 associates with both BRCA1 and BRCA2 [2,3]. In the HR repair of DSBs, the RAD51 recombinase has a key role. Binding of RAD51 to single-stranded DNA at the break site initiates the repair of a DSB [4]. In humans, there are five RAD51 paralogs, RAD51B, RAD51C, RAD51D, XRCC2, and XRCC3, and they promote the binding of RAD51 to the DNA. Rare pathogenic mutations in the RAD51 paralogs RAD51C and RAD51D have been identified in breast and ovarian cancer families and confer a high risk specifically for ovarian cancer [5][6][7] whereas a homozygous missense mutation in RAD51C (FANCO) was found in a Fanconi anemia patient [8]. A homozygous XRCC2 mutation has also been detected in a Fanconi anemia patient [9] and rare mutations in the gene in breast cancer families were identified in an exome sequencing study [10]; however, the association of XRCC2 with breast cancer risk could not be confirmed in a large follow-up study [11]. Variants in the RAD51B region, also known as RAD51L1, have been associated with breast cancer risk in genome-wide association studies (GWAS) [12][13][14]. The major-allele of the common polymorphism rs999737 in intron 10 of RAD51B and the minor-allele of rs2588809 in intron 7 have been associated with an increased risk of female breast cancer [12,14].
The rs999737 SNP has also been associated with breast cancer risk among BRCA1 mutation carriers whereas no association was found for rs999737 or rs2588809 with breast cancer subtypes among BRCA1 or BRCA2 mutation carriers [15,16]. Another common polymorphism located in intron 7 of the gene, rs1314913, has been associated with the risk of male breast cancer [13]. RAD51B is located at 14q24.1 and is expressed widely, with the highest levels in tissues that are active in recombination [17]. The RAD51B and RAD51C proteins form a stable complex which interacts weakly with RAD51 and promotes the assembly of RAD51 nuclear foci [18,19]. Furthermore, RAD51B is part of a larger BCDX2 complex that is formed with RAD51C, RAD51D, and XRCC2 [4]. The BCDX2 complex acts upstream of RAD51 recruitment to DNA damage foci [20]. Haploinsufficiency of RAD51B leads to aberrant HR repair of DNA and causes centrosome fragmentation and aneuploidy in human cells, which suggests that loss of the proper biallelic expression of RAD51B may lead to chromosome instability in tumor cells [21].
In the Finnish population, recurrent founder mutations have been observed in many of the known breast and ovarian cancer susceptibility genes, including the RAD51 paralogs RAD51C and RAD51D [22,23]. Thus, the Finnish population is a valuable resource for the identification of new susceptibility alleles.
To identify putative recurrent founder mutations and to study the role of RAD51B in female and male breast cancer predisposition and especially in familial breast cancer, we comprehensively screened the RAD51B gene in 172 Finnish cancer patients. One identified missense mutation was in silico predicted to be pathogenic and was subsequently screened in a larger set of breast cancer cases and population controls from four datasets. In addition, the known RAD51B risk SNPs rs999737, rs2588809, and rs1314913 and RAD51B haplotypes were studied in the previously published 40 studies participating in the Breast Cancer Association Consortium (BCAC) including 44,791 breast cancer cases and 43,583 controls.

Screening of the RAD51B gene
The coding region and the exon-intron boundaries of the RAD51B gene (RefSeq NG_023267.1, NM_133509.3) were screened in 172 cancer patients from Southern Finland: in 87 female breast cancer patients with a family history of female breast cancer or breast and ovarian cancer, in 4 familial ovarian cancer patients, and in 4 female breast cancer patients with a family history of male and female breast cancer from the Helsinki region of Finland as well as in 77 male breast cancer patients (33 from the Helsinki region and 44 from the Tampere region) (S1 Text). All the female breast or ovarian cancer families and 60 of the male breast cancer families and patients were previously screened negative for BRCA1/2 mutations whereas 21 patients had not been tested for BRCA1/2 mutations. Genomic DNA isolated from blood was amplified by PCR and subsequently sequenced using ABI BigDyeTerminator 3.1 Cycle Sequencing kit (Life Technologies) (S1 Table). The sequencing results were analyzed with Variant Reporter Software v1.0 (Life Technologies).

Genotyping of c.541C>T
The identified c.541C>T, p.(Arg181Trp) missense change was genotyped in additional familial and unselected breast cancer patients and population controls from the Helsinki (cases = 2203, [...]), Tampere, and Oulu regions of Finland and in 1900 cases and 1235 controls from Belarus (S1 Text). The Helsinki and Tampere datasets were genotyped by sequencing exon 6 and the Oulu and Belarus datasets by high-resolution-melting (HRM) analysis (S1 Table). Written informed consent was obtained from all the participants and the study was approved by the Ethics Committees of Helsinki University Hospital, Tampere University Hospital, Oulu University Hospital, and by the institutional Ethics Commissions at the Minsk Mother and Child Hospital and at Hannover Medical School.

iCOGS genotyping
The common RAD51B polymorphisms rs2588809, rs1314913, and rs999737 were studied in 44,791 invasive breast cancer cases and 43,583 controls from 40 studies (partially including the Helsinki, Oulu, and Belarus studies) participating in the Breast Cancer Association Consortium (BCAC) (S2 Table) [14]. The SNPs were genotyped on the iCOGS array as part of the Collaborative Oncological Gene-environment Study (COGS) as previously described [14] and genotypes for the c.541C>T missense variant were imputed with SHAPEIT and IMPUTEv2 by using the 1000Genomes project as the reference panel [24]. The analyses were restricted to cases with European ancestry. All participants gave written informed consent and all the studies were approved by the respective Institutional Review Boards or Ethics Committees (S2 Table).

Bioinformatics and statistical methods
The pathogenicity of the variants identified in the sequencing of RAD51B was predicted using the MutationTaster software as it considers both exonic and intronic variants [25]. In addition, PON-P, which utilizes results from SIFT, PhD-SNP, PolyPhen-2, SNAP, and I-Mutant 3, was used to predict the pathogenicity of the missense variants [26]. Secondary structure prediction was done with RaptorX [27] and protein-protein interaction prediction with PredictProtein [28]. Statistical analyses were performed in R version 3.0.2 (http://www.r-project.org/). To study the association of the c.541C>T mutation with breast cancer risk, two-sided P-values were calculated using Pearson's chi-squared test or, if the expected count in any cell was less than five, Fisher's exact test. For meta-analysis, the estimates were combined using a fixed-effects meta-analysis with the inverse variance-weighted method. Student's t-test was used to compare the age at diagnosis between the mutation carriers and non-carriers. Per-allele odds ratios (OR) and confidence intervals (CI) for the common RAD51B polymorphisms were estimated with logistic regression. Multivariate regression models including any two of the three SNPs at a time were used to study the independence of the association signals. Haplotype-specific ORs and CIs were estimated with the HaploStats package in R. A haplotype carrying the major-alleles of rs2588809 and rs1314913 and the minor-allele of rs999737 was used as a reference. Study and principal components were used as covariates in all analyses with the BCAC dataset to correct for potential population stratification. The analyses for the genotyped SNPs were adjusted for seven and the analysis for the imputed SNP for nine principal components as previously described [14,24]. Separate analyses were performed for all breast cancer cases and for subsets of cases with first-degree family history of breast cancer, and cases with estrogen receptor (ER) positive and negative tumors.
Studies where family history information was predominantly missing were excluded from the familial analyses. The imputed c.541C>T missense variant was analyzed with SNPTEST. First, an association test stratified with study was performed to obtain imputation information scores for the individual studies. The final analysis, restricted to BCAC studies with information score ≥ 0.5, was performed with study and the nine principal components as covariates. The online tool LocusZoom was used to generate a regional association plot for visualization of the SNP associations and the extent of linkage disequilibrium (LD) [29].
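The fixed-effects, inverse variance-weighted pooling described above can be illustrated with a minimal Python sketch. This is not the authors' R code: the function name `fixed_effects_meta` and the three per-study estimates are hypothetical and chosen only for illustration, and the P-value uses a normal approximation on the log odds ratio scale.

```python
import math

def fixed_effects_meta(log_ors, ses):
    """Pool per-study log odds ratios with inverse variance weights.

    Returns (pooled OR, (lower 95% CI, upper 95% CI), two-sided P-value).
    """
    weights = [1.0 / se ** 2 for se in ses]            # w_i = 1 / SE_i^2
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))         # two-sided normal P-value
    ci = (math.exp(pooled - 1.96 * pooled_se),
          math.exp(pooled + 1.96 * pooled_se))
    return math.exp(pooled), ci, p_value

# Hypothetical per-study (log OR, standard error) pairs, NOT data from this study.
studies = [(math.log(2.0), 0.40), (math.log(1.1), 0.50), (math.log(0.9), 0.60)]
pooled_or, ci, p_value = fixed_effects_meta(
    [b for b, _ in studies], [se for _, se in studies]
)
print(f"pooled OR = {pooled_or:.2f}, 95% CI = {ci[0]:.2f}-{ci[1]:.2f}, P = {p_value:.3f}")
```

Studies with smaller standard errors receive larger weights, so the pooled estimate is pulled toward the most precise datasets. A fixed-effects model of this kind assumes one common underlying effect across studies.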

Results
We sequenced the coding region and the exon-intron boundaries of the RAD51B gene in 168 breast (female and male) and 4 familial ovarian cancer patients from Southern Finland. Nine intronic and six missense variants were identified (Table 1). The c.541C>T, p.(Arg181Trp), missense change was the only variant that was predicted to be pathogenic by both MutationTaster and PON-P software and was selected for further genotyping. Based on Finnish subjects in the ExAC dataset, the minor-allele frequency (MAF) for the c.541C>T variant is estimated to be 1.2% whereas in the non-Finnish Europeans the MAF is 0.05% (Exome Aggregation Consortium (ExAC), Cambridge, MA (URL: http://exac.broadinstitute.org) [May, 2015]). According to the RaptorX secondary structure prediction software, the arginine in position 181 is located in a beta-sheet with a likelihood of 83.4% but, as determined by PredictProtein, the amino acid is not predicted to directly participate in protein-protein interactions. The c.541C>T missense variant was genotyped in an additional 3359 female and male breast cancer patients and families, and in 2351 population controls from Southern (Helsinki and Tampere datasets) and Northern Finland (Oulu dataset). Altogether, when combining the two stages of mutation testing, c.541C>T was screened in 2331 patients from Helsinki, 748 patients from Tampere, and 452 patients from Oulu. There was no evidence of association in any of the population-based case-control studies (Table 2). There was a suggestive association among breast cancer families with at least three first or second-degree relatives affected with breast or ovarian cancer, compared to population controls, in the Helsinki dataset (OR: 2.31, 95% CI: 1.20-4.48, P = 0.010) (Table 2).
The mean age at breast cancer diagnosis was similar between carriers and non-carriers among the familial patients (54.2 for carriers and 53.1 for non-carriers, P = 0.725) and among all cases (55.9 for carriers and 56.4 for non-carriers, P = 0.744). However, no association was observed for familial breast cancer in the Oulu dataset, and there was no evidence of association when all datasets were combined (OR: 1.35, 95% CI: 0.66-2.73, P = 0.410). The variant was also genotyped in 1900 breast cancer cases and 1235 population controls from Belarus but only three carriers were identified among cases and none among controls (Table 2). We further studied the missense variant c.541C>T and the previously reported common risk SNPs rs2588809, rs1314913, and rs999737 in a large breast cancer dataset from BCAC comprising 40 studies with 44,791 invasive breast cancer cases and 43,583 controls of predominantly European ancestry. The subjects were genotyped on an Illumina Infinium custom chip (iCOGS) for over 200,000 SNPs [14], including the rs2588809, rs1314913, and rs999737 SNPs. Genotypes for over 11 million SNPs, including the c.541C>T missense variant, were imputed by using the 1000Genomes project as a reference panel [24]. For the missense variant, we first performed an association test stratified by study including all the 40 BCAC studies. The overall information score for c.541C>T was 0.673. To increase the imputation accuracy of this rare variant, studies with an information score less than 0.5 in the stratified analysis were excluded from the final analysis (S3 Table). Altogether 26,969 cases and 27,092 controls from 23 studies were included and the information score for the variant in the final analysis was 0.755. The breast cancer associations for two of the genotyped SNPs, rs2588809 and rs999737, in the BCAC dataset have been published before [14,24,30] and the results were replicated here.
The minor-alleles of rs2588809 and rs1314913 and the major-allele of rs999737 were associated with an increased risk of breast cancer with ORs between 1.07-1.08 among all breast cancer cases and also among familial cases with ORs between 1.10-1.15 (Table 3). The associations were also significant among the subset of cases with ER-positive tumors but only rs999737 showed association among the ER-negative subset (Table 3). Rs2588809 and rs1314913 are strongly correlated (r² = 0.816), whereas rs999737 is not correlated with them (r² = 0.003 with rs2588809 and r² = 0.070 with rs1314913) (S1 Fig). In multivariate logistic regression models, both rs2588809 and rs1314913 remained significant after adjustment with rs999737 (OR: 1.07, 95% CI: 1.04-1.09, P = 1.74 × 10⁻⁶ and OR: 1.06, 95% CI: 1.03-1.09, P = 9.24 × 10⁻⁶, respectively). Likewise, rs999737 remained an independent predictor for risk after adjustment with rs2588809 or rs1314913 (OR: 1.08, 95% CI: 1.06-1.11, P = 2.91 × 10⁻¹¹ and P = 2.54 × 10⁻¹¹, respectively). Neither rs2588809 nor rs1314913 remained significant when adjusted with each other (OR: 1.07, 95% CI: 1.00-1.16, P = 0.066 and OR: 1.00, 95% CI: 0.93-1.08, P = 0.977, respectively).
To evaluate the allele combination associated with the highest risk, we performed haplotype analysis for rs2588809, rs1314913, and rs999737. The reference haplotype CCT, carrying the major-alleles of rs2588809 and rs1314913 and the minor-allele of rs999737, had an estimated frequency of 21.0% among controls and 19.4% among cases. The strongest association was observed for the TTC haplotype that carries the risk-alleles of all the three SNPs with an odds ratio of 1.15 for all cases (95% CI: 1.11-1.19, P = 8.88 × 10⁻¹⁶) and 1.24 for familial cases (95% CI: 1.16-1.32, P = 6.19 × 10⁻¹¹) (Table 4). This haplotype had an estimated frequency of 12.6% among controls, 13.5% among all cases, and 14.1% among familial cases. The most common haplotype, with over 60% frequency among cases and controls, was the haplotype CCC carrying the risk-allele of rs999737. This haplotype was also associated with an increased risk of breast cancer among all the cases (OR: 1.09, 95% CI: 1.06-1.12, P = 1.86 × 10⁻¹⁰) and the familial cases (OR: 1.14, 95% CI: 1.09-1.20, P = 1.30 × 10⁻⁷) when compared to the CCT haplotype.
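As a rough consistency check on the haplotype result, a crude odds ratio can be computed directly from the estimated haplotype frequencies quoted above. The sketch below is only illustrative and the helper name `crude_haplotype_or` is hypothetical. Unlike the reported estimate, it is not adjusted for study or principal components, so it is expected to be close to, but not identical with, the adjusted OR of 1.15.

```python
def crude_haplotype_or(freq_cases, freq_controls, ref_cases, ref_controls):
    """Unadjusted odds ratio of a haplotype relative to a reference haplotype.

    Odds are taken as (haplotype frequency / reference haplotype frequency)
    within cases and within controls, and the OR is their ratio.
    """
    odds_cases = freq_cases / ref_cases
    odds_controls = freq_controls / ref_controls
    return odds_cases / odds_controls

# Frequencies reported in the text: TTC 13.5% (all cases) and 12.6% (controls);
# reference CCT 19.4% (cases) and 21.0% (controls).
ttc_vs_cct = crude_haplotype_or(0.135, 0.126, 0.194, 0.210)
print(round(ttc_vs_cct, 2))  # -> 1.16
```

The same calculation cannot be repeated for the familial subset from the text alone, because the frequency of the reference haplotype among familial cases is not stated here.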

Discussion
To study the role of RAD51B in breast cancer predisposition, we screened the coding sequence in 172 Finnish breast or ovarian cancer patients. We also studied the common susceptibility variants in the genomic region of the gene in a large dataset from BCAC including 44,791 breast cancer cases and 43,583 controls. Among the Finnish patients, we identified one putatively pathogenic missense variant, c.541C>T, p.(Arg181Trp), that was further screened in a larger set of unselected and familial breast cancer patients and population controls.
In the Helsinki dataset, the c.541C>T mutation had a suggestive association with familial breast cancer but this association was not confirmed in the other Finnish sample sets. In Belarus, the mutation was rare and was identified only in three cases but not among controls. For the large BCAC dataset that had been genotyped on the iCOGS chip, c.541C>T genotypes were imputed. The missense variant was very rare with a 0.2% MAF and no association with breast cancer was seen among all breast cancer cases or in the subset of familial patients.
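The test-selection rule from the statistical methods (Pearson's chi-squared test unless an expected cell count is below five, in which case Fisher's exact test) can be illustrated with the Belarusian carrier counts given in the text: 3 carriers among 1900 cases and 0 among 1235 controls. The following is a minimal standard-library sketch rather than the authors' R code. The function names are hypothetical, and the two-sided P-value follows the common convention of summing all tables that are no more probable than the observed one.

```python
from math import comb

def expected_counts(a, b, c, d):
    """Expected cell counts of a 2x2 table [[a, b], [c, d]] under independence."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [r * k / n for r in rows for k in cols]

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for a 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k first-column counts in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as extreme (no more probable) than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Rows: cases, controls. Columns: carriers, non-carriers (counts from the text).
table = (3, 1897, 0, 1235)
use_fisher = min(expected_counts(*table)) < 5   # True: too few expected carriers
p_value = fisher_exact_two_sided(*table)
print(use_fisher, round(p_value, 3))
```

With only three carriers in total, the expected carrier count among controls is about 1.2, so the rule selects Fisher's exact test, and the resulting P-value is far from significance. This is in line with the statement that the Belarusian data alone carry little information about the variant.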
Out of the common variants at the RAD51B region, rs999737 and rs2588809 have been associated with female breast cancer and rs1314913 with male breast cancer [12][13][14]30]. For rs999737 and rs2588809 the results from the previous BCAC studies for all cases and by ER status were replicated here [14,30]. The rs1314913 SNP has been previously associated with male breast cancer with an OR of 1.57 (P = 3.02 × 10⁻¹³) [13]. It has not been previously reported in female breast cancer, and here it showed associations similar to those of rs2588809, with an OR of 1.07 among all cases and 1.09 among ER-positive cases. The rs1314913 SNP is correlated with rs2588809 (r² = 0.816) but not with rs999737 (r² = 0.070). Our results suggest two independent associations at the region: rs999737 remained significantly associated after adjustment with rs1314913 or rs2588809, whereas neither rs1314913 nor rs2588809 remained significant after adjustment with each other, suggesting that both may be tagging another, causative allele.
All three variants were also associated with familial breast cancer. The strongest association was observed for a haplotype carrying the risk-alleles of all the three SNPs both among all cases and familial cases with ORs of 1.15 and 1.24, respectively. This haplotype is observed at a frequency of 12.6% among controls and 13.5% among cases.
To date, only three studies have reported truncating germline mutations in RAD51B: two deleterious mutations were detected among ovarian cancer cases, one splicing mutation in a breast and ovarian cancer family, and one nonsense mutation in a melanoma family [31][32][33]. In an Australian study, no pathogenic mutations were identified among 188 breast cancer families [34]. These reports together with our results suggest that loss-of-function mutations in RAD51B are very rare, yet we cannot exclude such mutations or a possibly higher breast cancer risk associated with them. However, the common variants in the RAD51B region are associated with an increased breast cancer risk. Further fine-mapping studies are needed to identify the causative variants underlying the associations, and functional studies to determine whether RAD51B or another gene in the region is the target of these associations. RAD51B is a plausible candidate gene for the association as it functions in HR repair of DNA damage like most of the known breast cancer genes.
In conclusion, no pathogenic RAD51B mutations were identified among 172 Finnish breast or ovarian cancer patients. However, we cannot rule out rare risk-variants in the Finnish or other populations. The minor-alleles of the common polymorphisms rs2588809 and rs1314913 and the major-allele of rs999737 were associated with breast cancer risk in the large BCAC dataset. The strongest association was observed for a risk haplotype carrying the risk-alleles of all the three SNPs with an OR of 1.15 among all cases and 1.24 among familial cases compared to the haplotype with the respective protective alleles.

Supporting Information
S1 Fig. Regional association plot for the RAD51B SNPs rs2588809, rs1314913, and rs999737 and the c.541C>T missense variant. Each variant is represented with a dot and the color of the dot represents the extent of LD (r²) with rs2588809. Association among all breast cancer cases in the BCAC dataset is represented on the −log10 scale. The x axis shows the genomic positions of the variants based on the hg19 build. The right y axis shows the estimated recombination rate at the region (centiMorgans/megabase, cM/Mb). LD and recombination rate were estimated using the 1000Genomes Nov 2014 EUR as the reference population. (PDF)
S1