Cancer mutations in RAD51 and its paralogues

  • Anna L. Valentine,

    Contributed equally to this work with: Anna L. Valentine, Isabella L. Huth

    Roles Data curation, Formal analysis, Investigation, Software, Validation, Visualization

    Affiliation Biology Program, The Ohio State University, Marion, Ohio, United States of America

  • Isabella L. Huth,

    Contributed equally to this work with: Anna L. Valentine, Isabella L. Huth

    Roles Data curation, Formal analysis, Investigation, Software, Validation, Visualization

    Affiliation Biochemistry Program, The Ohio State University, Marion, Ohio, United States of America

  • Nika M. Duff,

    Roles Investigation

    Affiliation Biology Program, The Ohio State University, Marion, Ohio, United States of America

  • Aisha Z. Rabbani,

    Roles Investigation

    Affiliation Neuroscience Program, The Ohio State University, Marion, Ohio, United States of America

  • Kateri N. Donahue,

    Roles Data curation, Formal analysis, Writing – review & editing

    Affiliation Biology Program, The Ohio State University, Marion, Ohio, United States of America

  • Wesley A. Bush,

    Roles Formal analysis

    Affiliations Biology Program, The Ohio State University, Marion, Ohio, United States of America, Cancer Biology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States of America

  • Renee A. Bouley,

    Roles Formal analysis, Funding acquisition, Investigation, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    petreaca.1@osu.edu (RCP); bouley.8@osu.edu (RAB)

    Affiliation Department of Chemistry and Biochemistry, The Ohio State University, Marion, Ohio, United States of America

  • Ruben C. Petreaca

    Roles Conceptualization, Data curation, Funding acquisition, Project administration, Resources, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    petreaca.1@osu.edu (RCP); bouley.8@osu.edu (RAB)

    Affiliations Cancer Biology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States of America, Department of Molecular Genetics, The Ohio State University, Marion, Ohio, United States of America

Abstract

The RAD51 recombinase is central to repair of DNA damage arising from stalled or collapsed replication forks and DNA double strand breaks. Its essential role is revealed by the fact that this function evolved in bacteria but was retained in eukaryotes. In humans some of the RAD51 functions have been relegated to several paralogues which evolved by gene duplication. In addition to mutations, most cancers are also characterized by increased chromosomal instability manifesting as translocations, deletions, insertions, and other more complex forms of chromosomal re-arrangements. Given the central role of RAD51 in protecting against chromosomal instability it stands to reason that RAD51 mutations that alter its function should register in cancer cells. However, pan-cancer analyses of cancer genomes show a marked absence of RAD51 loss of function mutations leading to a so-called “RAD51 paradox”: increased chromosomal instability despite normal RAD51 function. One hypothesis is that mutations in the RAD51 paralogues may contribute to the genomic instability, meaning that a lack of mutations in RAD51 may be compensated by an increase of mutations in the paralogues. We queried analyzed cancer genomes from COSMIC and mapped all mutations in RAD51 and its paralogues. This revealed an increase in RAD51B, RAD51C and RAD51D paralogue mutations in human cancers. We used established algorithms to determine the probability that any mutation may affect enzyme function. Although we did not find many “driver” mutations, numerous paralogue mutations were pathogenic or likely to destabilize enzyme function. In silico 3D structure analysis was then used to analyze the potential effect of some of these mutations on protein structure. Gene expression analysis did not reveal any changes in paralogue expression levels. Further, an evolutionary analysis did not uncover any selective pressure for mutations in RAD51 and its paralogues. A comparison of mutations reported on COSMIC with those reported on ClinVar revealed that many mutations, primarily in RAD51C and RAD51D, are also hereditary. Thus, it appears that an apparent low level of RAD51 mutations in cancer cells is compensated by an increase in paralogue mutations.

Introduction

Homologous recombination (HR) evolved to facilitate replication of long genomes [1]. The replication machinery can stall or even produce chromosome breaks which increase genomic instability [2–4]. The HR machinery both rescues stalled replication forks and repairs DNA double strand breaks (DSBs). One clue to the essential function of HR in DNA replication is the fact that it evolved in bacteria and was retained in eukaryotes [1,5]. Another clue is that certain recombination and replication genes may have evolved from the same ancestor through gene duplication [6,7].

Central to recombination is RAD51 (recA in bacteria, also known as RAD51A in humans) which can nucleate single stranded DNA produced by resection of DSBs or stalled replication forks [8–11]. ATP binding induces a conformational change in the enzyme that increases DNA affinity and promotes filament formation, homologous pairing, and strand exchange [12–15]. In yeast, RAD51 is loaded onto the ssDNA by RAD52 [16] but in humans and other higher eukaryotes this function has been replaced by BRCA2 assisted by BRCA1 and PALB2 [17,18]. RAD52 also exists in humans but its function has been relegated to an accessory recombination pathway (single strand annealing) [19–21].

Eukaryotes have several RAD51 paralogues that assist with its function [22–24]. In humans these paralogues form two distinct complexes: RAD51B-RAD51C-RAD51D-XRCC2 (BCDX2) and RAD51C-XRCC3 (CX3) [22,25–29]. BCDX2 functions to restrain fork progression under stress such as during low nucleotide pools while CX3 facilitates fork restart [30–32].

Mutations in the DNA damage response pathway (DDR) including components of the recombination machinery are common in cancer cells [33,34]. Remarkably, although RAD51 mutations register in cancer cells, none appear to be inactivating or promote tumor formation leading to the so-called “RAD51 paradox” [35]. A hypothesis explaining this paradox is that fast replicating cancer cells require RAD51 to deal with the ensuing replication stress [17]. A clue to this premise is that cancer cells appear to be characterized by over-expression of RAD51 [36,37] and inhibiting RAD51 is a therapeutic approach [38]. Too much genomic instability can kill even cancer cells and RAD51 function is required to “re-stabilize” tumor cells.

Inactivating RAD51 paralogue mutations increase chromosomal instability [39] and promote tumorigenesis [22,40–42]. Even certain paralogue polymorphisms appear to be associated with an increase in predisposition to some cancers, primarily breast and ovarian [43,44]. Other RAD51 paralogue mutations can also increase survival of cancer cells [45,46]. Thus, it appears that a lack of inactivating RAD51 mutations in cancer cells is compensated by mutations in its paralogues.

In this report, we used the Catalogue of Somatic Mutations in Cancers (COSMIC) database [47] to characterize all mutations appearing in RAD51 and its paralogues. The goal was to map all mutations and identify the degree to which they affect gene function. Our analysis shows that an apparent lack of inactivating mutations in RAD51 may be compensated by an increase in mutations in its paralogues, primarily RAD51C and RAD51D. Thus, in cancer cells, the HR repair machinery may be destabilized by mutating RAD51 accessory factors.

Materials and methods

2.1. Data accession

Mutation data from COSMIC (https://cancer.sanger.ac.uk/cosmic/) were downloaded as .csv files and analyzed in Excel on June 9, 2023. These data are publicly available and are anonymized so that the authors do not have access to information that could identify individual participants during or after data collection. These files have information on mutation coordinates as well as the cancer type where mutations occur. Data from NCBI ClinVar were downloaded on February 16, 2026 as .csv files and analyzed in Excel. Mutations in the COSMIC and ClinVar files were compared to see how many somatic mutations are also reported as germline.

2.2. High frequency, driver, and destabilizing mutations

We defined high frequency mutations as those that occur in 5 or more patients. Several publications analyzing pan-cancer mutations suggest this interpretation [48,49]. Driver mutations were analyzed using the OpenCravat pan cancer CHASMPlus algorithm [50,51], which is available as a web interface at https://www.opencravat.org/. Mutations with a probability below 0.05 were analyzed. Three other algorithms were also used: Mutation Assessor [52], SIFT [53], and VEST4 [54]. These analyses are shown in S1 Table. Mutations were considered significant for SIFT if the probability was below 0.05 and for Mutation Assessor if they had a high rank score. These mutations were graphed in S2 Fig. For VEST4, the mutations are interpreted as mildly pathogenic if the score is between 0.764 and 0.861, moderately pathogenic if the score is between 0.861 and 0.965, and strongly pathogenic if the score is above 0.965. Because VEST4 produced so many more potentially pathogenic mutations, we graphed only those with a score above 0.9 in S2 Fig. Please note that for RAD51, all mutations are shifted by one coordinate (e.g., R178 on COSMIC is R177 on the OpenCravat output). The interpretation of the mutation is not changed, only the position. This is due to a different isoform that OpenCravat uses to output variants. For all the other genes the positions correspond with COSMIC coordinates.
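The cutoffs described in this section can be written down as a few small helper functions. The following is only a minimal sketch, not the code used in the study (the analysis was done in Excel and the OpenCravat web interface). All function names are hypothetical, and the handling of scores that fall exactly on a VEST4 boundary is an assumption, since the text does not say which tier a boundary value belongs to.

```python
import re

HIGH_FREQUENCY_MIN_SAMPLES = 5  # "5 or more patients"
P_VALUE_CUTOFF = 0.05           # cutoff quoted for CHASMPlus and SIFT


def vest4_tier(score: float) -> str:
    """Map a VEST4 score to the tiers quoted in section 2.2.

    Assumption: a score exactly on a boundary goes to the lower tier.
    """
    if score > 0.965:
        return "strong"
    if score > 0.861:
        return "moderate"
    if score >= 0.764:
        return "mild"
    return "below threshold"


def is_high_frequency(sample_count: int) -> bool:
    """True when a mutation is seen in five or more samples."""
    return sample_count >= HIGH_FREQUENCY_MIN_SAMPLES


def is_significant(p_value: float) -> bool:
    """True when a CHASMPlus or SIFT probability is below 0.05."""
    return p_value < P_VALUE_CUTOFF


def shift_position(residue: str, offset: int) -> str:
    """Shift the residue number of a label such as 'R178' or 'R178Q'.

    Useful for reconciling COSMIC and OpenCravat RAD51 coordinates,
    which differ by one position.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z*]?)", residue)
    if match is None:
        raise ValueError(f"unrecognized residue label: {residue!r}")
    ref, position, alt = match.groups()
    return f"{ref}{int(position) + offset}{alt}"
```

For example, `shift_position("R178", -1)` converts the COSMIC RAD51 coordinate mentioned above into the OpenCravat one.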

2.3. Lollipop graphs

Lollipop graphs shown in S2 Fig were made with cBioPortal MutationMapper (https://www.cbioportal.org/visualize) [55–57].

2.4. Sequence alignment

For the data in Fig 2B, protein sequences were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/) and aligned with ClustalOmega (https://www.ebi.ac.uk/jdispatcher/msa/clustalo).

Fig 1. Mutation demographics in RAD51 and its paralogues.

A. Total samples reported on COSMIC. For this graph only unique samples are reported. Multiple mutations in the same sample were counted as 1. B. Ratio of coding/non-coding mutations. C. Frequency of different types of coding mutations. Coding mutations were partitioned into missense, nonsense, silent and other (InDels, frameshift and complex). D. High frequency mutations appearing in RAD51 and its paralogues. These are mutations that appear in five or more samples. E. Cancer distribution of coding mutations in RAD51 and its paralogues.

https://doi.org/10.1371/journal.pone.0349105.g001

2.5. Mutation co-occurrence

To identify co-occurrence of mutations between RAD51 and its paralogues, the data from COSMIC was parsed by hand. We identified those samples that had mutations in RAD51 and at least one other paralogue. The probability of mutation co-occurrence was also calculated using the cBioPortal mutual exclusivity calculator [49,55] (https://www.cbioportal.org/). This analysis was done for the pan-cancer “Curated set of non-redundant studies”.
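The manual parsing step amounts to grouping mutated genes by sample and then counting the RAD51-mutated samples that also carry a paralogue mutation. A minimal sketch follows; the input layout (one `(sample_id, gene)` pair per reported coding mutation) and the function name are assumptions for illustration, not the structure of the COSMIC download.

```python
from collections import defaultdict

PARALOGUES = ("RAD51B", "RAD51C", "RAD51D", "XRCC2", "XRCC3")


def co_mutated_samples(records, anchor="RAD51", partners=PARALOGUES):
    """Count anchor-mutated samples that also carry a partner mutation.

    records: iterable of (sample_id, gene) pairs, one per mutation.
    Returns (number of anchor-mutated samples, {partner: sample count}).
    """
    genes_by_sample = defaultdict(set)
    for sample_id, gene in records:
        # A set collapses repeated mutations of one gene in one sample.
        genes_by_sample[sample_id].add(gene)

    anchor_samples = [s for s, genes in genes_by_sample.items() if anchor in genes]
    counts = {
        partner: sum(1 for s in anchor_samples if partner in genes_by_sample[s])
        for partner in partners
    }
    return len(anchor_samples), counts


# Toy data (hypothetical sample identifiers, not COSMIC records).
toy_records = [
    ("S1", "RAD51"), ("S1", "RAD51D"), ("S1", "RAD51D"),
    ("S2", "RAD51"),
    ("S3", "RAD51C"),
]
total, counts = co_mutated_samples(toy_records)
```

Counting samples rather than mutations mirrors the convention used for Fig 1A, where multiple mutations in the same sample count once.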

2.6. Protein structure modeling of high-frequency mutations

PyMOL (version 3.1, Schrödinger) was used to generate structural models of mutations using the Mutagenesis function, identify polar interactions, make measurements, and generate figure images. The APBS software suite was used to generate electrostatic surface potentials of the wild-type and mutant structures [58]. The CUPSAT webserver was used to calculate the ΔΔG values and predict protein structure stability [59]. Experimentally determined protein structures were used for mutation modeling when possible. If experimental structures were not available, then AlphaFold models were used [60]. AlphaFold models were used in all cases for CUPSAT calculations.

2.7. Gene expression data

Expression data for The Cancer Genome Atlas (TCGA) samples were also downloaded from COSMIC. Gene expression data are normalized to non-cancerous tissues and expressed as Z-values. These values are interpreted as normal expression (−2 ≤ Z ≤ 2), under-expressed (Z < −2) and over-expressed (Z > 2) [61].
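As a small illustration of this interpretation, the sketch below labels a Z-value using the thresholds given in section 3.6 (under-expression below −2, over-expression above 2, normal in between). The function name is hypothetical, and treating values of exactly −2 or 2 as normal is an assumption.

```python
def classify_expression(z_value: float) -> str:
    """Label a normalized expression Z-value.

    Assumption: values of exactly -2 or 2 are treated as normal.
    """
    if z_value < -2:
        return "under-expressed"
    if z_value > 2:
        return "over-expressed"
    return "normal"
```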

2.8. Calculation of evolutionary selective pressure

We used the method described by Zhou et al. [62], following the same procedure described in our previous publication [63]. The chi-square score was calculated by hand and probability values were extracted with an online calculator (http://courses.atlas.illinois.edu/spring2016/STAT/STAT200/pchisq.html) with one degree of freedom.
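The probability lookup can also be reproduced offline. For one degree of freedom, the upper-tail probability of a chi-square score x has the closed form erfc(√(x/2)), so no statistics library is needed. The sketch below is a generic Pearson chi-square calculation offered under that assumption. It does not reproduce the specific observed and expected categories of the Zhou et al. method, which are not detailed here, and the function names are hypothetical.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi-square score: sum of (O - E)^2 / E over categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_one_df(chi_square: float) -> float:
    """Upper-tail probability for a chi-square score with 1 degree of freedom."""
    if chi_square < 0:
        raise ValueError("chi-square score cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2.0))
```

A score of about 3.84 corresponds to the conventional 0.05 significance cutoff at one degree of freedom, which gives a quick sanity check on any hand calculation.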

Results and discussion

3.1. Distribution of mutations in cancer cells

When we queried COSMIC mutations in RAD51 and its paralogues, we identified that about five times more samples are reported for RAD51B than for the other genes (Fig 1A and S1 Table). Mutations can occur in both coding regions (i.e., translated into protein) and non-coding regions (e.g., introns and 5’ and 3’ UTRs). When we queried mutations appearing in RAD51 and its paralogues we found an increase in both coding and non-coding mutations in RAD51C, RAD51D and XRCC2 compared to RAD51 (Fig 1B). RAD51B had a much higher non-coding to coding ratio while XRCC3 was similar to RAD51. Thus, most RAD51B samples do not have coding mutations (S1 Table).

We next partitioned mutations into missense, nonsense, silent or other. Although missense mutations could destabilize the enzyme function, nonsense mutations are more severe because they all result in a truncation. The “other” category includes frameshift mutations which often also introduce a stop codon and result in a truncation. This analysis shows an increase in missense mutations in RAD51B, RAD51C and RAD51D but not XRCC2 or XRCC3 when compared to RAD51 (Fig 1C). Longer genes have a tendency to register more mutations than shorter genes and this has sometimes led to false positives: a longer gene appears to be ultra-mutated because it has more residues [64]. However, this does not appear to be the case with RAD51 and its paralogues because all genes are about the same size (Fig 2A). Thus, we interpret these data to mean that the RAD51 paralogues are more likely to be mutated. Remarkably, truncating mutations, which are almost always inactivating, are not increased in the paralogues. This suggests that even in the paralogues, most mutations that may affect enzyme function are point mutations.

We next looked for mutations that occur in 5 or more samples, a threshold that some metagenomic studies have suggested should be interpreted as frequent [48,49,65]. We are cognizant that this interpretation is somewhat subjective, but we nevertheless chose this analysis hoping to uncover some hotspots. We identified several in RAD51 and its paralogues (Fig 1D). The RAD51D R185Q/W constitutes a hotspot as it occurs in 100 prostate and skin samples. Another mutation occurring at higher frequency is the RAD51B L172S/W (nine patients: large intestine, prostate and soft tissue). We also checked the distribution of mutations in cancer tissues and found that most RAD51 paralogue mutations are found in prostate cancers (Fig 1E). This was not unexpected as certain paralogue polymorphisms and mutations were shown to predispose to prostate cancers [66–69]. However, when we partitioned the COSMIC samples by primary tissue, we discovered that more samples were reported for prostate cancers than for any other type of cancer (S1 Fig). Thus, the increase in mutation numbers in prostate cancers may be a consequence of more reported samples.

3.2. Driver mutations

Not all point mutations have an equal impact on cancer initiation and evolution. Various algorithms have been developed to classify mutations into driver and passenger [70]. Driver mutations have a high probability of causing cancer while passenger mutations accumulate as background noise. We used the CHASMPlus machine learning algorithm [51] to characterize all coding point mutations in RAD51 and its paralogues (S1 Table). The algorithm produces a probability value that can aid in the interpretation: mutations with p-values below 0.05 (significant) are considered driver while those with p-values above 0.05 (not significant) are not. Not unexpectedly, no RAD51 mutation had a significant p-value, further suggesting that RAD51 mutations do not significantly destabilize the enzyme in human cancers. Driver mutations were also not found in RAD51B, RAD51D, XRCC2, or XRCC3. Only two driver mutations were identified in RAD51C (R24L p-value = 0.0299; P21A p-value = 0.0446). Thus, although mutations occur in the RAD51 paralogues, most are not considered significant to drive cancer, at least under the analysis undertaken here.

To visualize where these mutations map on the protein sequences, we generated cartoon diagrams of RAD51 and its paralogues (Fig 2A). RAD51 is characterized primarily by a recA domain which spans over 70% of each sequence. The recA domain contains Walker A and B domains as well as the ATP binding pocket and two DNA binding loops (L1 and L2) [71]. An N-terminal domain (NTD) was shown in yeast to be required for enhancing the activity of RAD51 and increasing stability through phosphorylation by Mec1 (human ATR) and Tel1 (human ATM) [72]. The RAD51 paralogues also have these domains except for XRCC2, which has a recA domain but no NTD. Structural analysis of the BCDX2 and CX3 complexes has also revealed that the NTDs enhance interaction specificity between paralogues and protofilament formation [28]. Maps of truncating and high frequency mutations of RAD51 and its paralogues do not reveal any clustering of mutations in one region of any of the proteins (Fig 2A). The hotspot RAD51B L172 and RAD51D R185 mutations occur in the recA domain. An alignment of the sequences of all proteins also shows that although the two mutations occur in the same general region, they do not correspond to the same residue (i.e., the RAD51B residue does not align with the RAD51D residue) (Fig 2B).

3.3. Mutation classification with VEST4, Mutation Assessor and SIFT

Although CHASM can calculate the probability of a mutation causing cancer, other algorithms can also calculate the functional impact of mutations. Thus, we assessed mutations with three other algorithms hosted by OpenCravat: VEST4, Mutation Assessor, and SIFT. VEST4 predicts whether a mutation is pathogenic regardless of disease type [73], Mutation Assessor interprets the mutation based on evolutionary conservation [52], while SIFT predicts the effect of a mutation on enzyme function based on amino acid physical characteristics and sequence homology. This analysis revealed many more mutations that could affect enzyme function (S1 Table). To understand where these mutations map onto protein sequence, we generated lollipops (S2 Fig and see Materials and Methods).

VEST4 showed that many more mutations than predicted by CHASMPlus are likely to be pathogenic in all genes studied. Importantly, RAD51 had the most VEST4 mutations with a score above 0.9 (26). This shows that altering the RAD51 protein sequence is likely to cause disease. Similarly, VEST4 revealed many more pathogenic mutations in the other paralogues. The SIFT algorithm output more potentially destabilizing mutations for all genes than either of the other two algorithms (S2 Fig). This was not unexpected because destabilizing the primary structure of any protein will have some effect on its function. Finally, Mutation Assessor also predicted several more mutations for each gene that are likely to have a functional impact. We did not observe any general trends in the positions of the mutations predicted with these three algorithms (e.g., clustering to one region of the protein), suggesting that enzyme function could be destabilized by non-active site mutations. Taken together, these analyses show that many more mutations in RAD51 and its paralogues are likely to destabilize the function of the genes even though they may not be classified as “driver”. Because cancer genomes are characterized by mutations in multiple genes (over 1000), it stands to reason that combinations of less deleterious mutations in RAD51 and its paralogues with mutations in other cell cycle or genome regulating genes are likely to produce combinatorial effects that cause cellular transformation.

3.4. Co-occurring mutations between RAD51 and its paralogues

Because mutations in RAD51 do not appear to be inactivating, we next checked to see if the repair machinery is destabilized by co-occurring mutations between RAD51 and its paralogues. We used the cBioPortal mutual exclusivity calculator to see if there is a tendency for mutations to co-occur [55]. This analysis shows that mutations in any two of the genes studied here have a significant probability to co-occur (S2A Table). A caveat of this analysis is that it uses the cBioPortal data which is similar to COSMIC, but not identical (e.g., cBioPortal and COSMIC largely deposit the same samples but they also have some unique ones). When we counted all COSMIC samples with mutations in RAD51 that also had mutations in any of the paralogues, we found very few instances of co-mutations: out of 158 samples with RAD51 mutations 6 had a RAD51B mutation (3.8%), 6 had a RAD51C mutation (3.8%), 11 had a RAD51D mutation (7%), 5 had a XRCC2 mutation (3.2%) and 1 had a XRCC3 mutation (0.6%) (Fig 2C and S2B Table). These data suggest that most cancer samples on COSMIC do not have mutations in multiple genes within the RAD51 and paralogues family.
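The percentages quoted above follow directly from the sample counts, as the short check below shows. The counts are the ones reported in this section. Assigning the final count of 1 to XRCC3 is an assumption based on the order in which the genes are listed, and the variable and function names are hypothetical.

```python
RAD51_MUTATED_SAMPLES = 158

# Samples with a RAD51 mutation that also carry a paralogue mutation.
# Assumption: the last reported count (1 sample) refers to XRCC3.
CO_MUTATED = {"RAD51B": 6, "RAD51C": 6, "RAD51D": 11, "XRCC2": 5, "XRCC3": 1}


def co_mutation_percentages(counts, total):
    """Express each co-mutation count as a percentage of the total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return {gene: 100.0 * n / total for gene, n in counts.items()}


percentages = co_mutation_percentages(CO_MUTATED, RAD51_MUTATED_SAMPLES)
for gene, pct in percentages.items():
    print(f"{gene}: {pct:.1f}%")
```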

3.5. Protein structure modeling of high-frequency mutations

The high-frequency missense mutations identified in Figs 1D and 2A were modeled onto 3D protein structures to determine their impact on protein structure. Changes in polar tertiary or quaternary structure interactions, electrostatic surface potential, and protein structural stability were analyzed (S3–S12 Figs and S3 Table). The RAD51 mutations were modeled onto an X-ray structure of the RAD51-BRCA2 BRC repeat complex (PDB ID: 1N0W) (S3 and S8 Figs) [74]. The RAD51 E177K mutation was found to drastically change the local electrostatic surface potential (Fig 3A and 3B) and disrupt a polar tertiary structure interaction (Fig 3C and 3D). This residue is also located near the interface with BRCA2 and this mutation could disrupt this interaction. The RAD51 E259K mutation also resulted in large changes to the local electrostatic surface potential of the protein (Fig 3E and 3F) and is predicted by CUPSAT to destabilize the protein structure by 1.1 kcal/mol. All of the high-frequency RAD51 missense mutations were predicted to be destabilizing and the E259D had the largest ΔΔG of −2.0 kcal/mol.

Fig 2. Frameshift, nonsense and high frequency point mutations in RAD51 and its paralogues.

A. Position of frameshift, nonsense (black) and high frequency (blue) mutations on cartoon diagrams of RAD51 and its paralogues. Diagrams not drawn to scale. B. Protein sequence alignment of RAD51 and its paralogues. This alignment was performed with protein isoforms on which mutations were mapped in A. RAD51 (NP_001157741); RAD51B (NP_001308750); RAD51C (NP_478123.1); RAD51D (NP_001136043); XRCC2 (NP_005422.0); XRCC3 (NP_001093588.1). Please note that coordinates are given for the RAD51 gene only (top sequence). C. Percent chance of a mutation in RAD51 co-occurring with at least one other of its paralogues. For example, there is a 3.79% chance that a mutation in RAD51 co-occurs with another mutation in RAD51B. Only coding mutations are considered here.

https://doi.org/10.1371/journal.pone.0349105.g002

Fig 3. Modeling of key high-frequency mutations that affected protein structure.

A. Electrostatic surface potential of RAD51 WT E177 (circled) from a published X-ray structure (PDB ID: 1N0W) [74]. B. Electrostatic surface potential of RAD51 E177K mutant (circled). An acidic surface potential is shown in red, basic in blue, and neutral in white. C. A cartoon representation of RAD51 is shown in pink in complex with BRCA2 in green [74]. The RAD51 residue E259 is shown in light blue sticks and R254 is shown in pink sticks, and BRCA2 residue L1545 is shown in green sticks. Polar interactions are shown with yellow dashed lines between E259 and R254, which are labeled with their respective measurements. D. A model of the RAD51 E259K mutant is shown with the same coloring as panel C. E. An electrostatic surface potential of RAD51 WT E259 (circled) in comparison to F. RAD51 E259K mutant. G. A cartoon representation of RAD51C is shown in light yellow from a published cryo-EM structure (PDB ID: 8OUZ) [75]. The RAD51C residue R249 is shown in cyan sticks and D251 is shown in yellow sticks. Polar interactions are shown with gray dashed lines between R249 and D251, which are labeled with their respective measurements. H. A model of the RAD51C R249C mutant is shown with the same coloring as panel G.

https://doi.org/10.1371/journal.pone.0349105.g003

To model the RAD51B high-frequency mutations at residue 172, an AlphaFold model was used [60] since this residue was not resolved in published X-ray or cryo-EM structures (S4 and S9 Figs). Neither of these mutations was observed to have a significant impact on tertiary structure interactions, electrostatics, or protein stability.

To model the RAD51C high-frequency missense mutations, a cryo-EM structure of the BCDX2 complex (RAD51B-RAD51C-RAD51D-XRCC2) (PDB ID: 8OUZ) was used to model the P21 and R249 mutations (S5 and S10 Figs) [28]. Residue 368 was not resolved in this cryo-EM structure, so an AlphaFold model of RAD51C was used to model the mutations at this position [60]. The P21S/T mutations did not impact tertiary structure interactions or electrostatics. Of the mutations modeled, R249C (Fig 3G and 3H) and R249H were both found to disrupt a polar tertiary structure interaction. The R249H mutation also changed the local electrostatic surface potential of the protein. The two R368 mutations did not affect tertiary structure interactions but did change the local electrostatics. CUPSAT predicted five out of six of the RAD51C mutations to destabilize the protein structure, with R368W being the most destabilizing with a ΔΔG of −5.9 kcal/mol. The R368Q mutation was the only stabilizing mutation with a ΔΔG of +1.03 kcal/mol.

Another cryo-EM structure of the BCDX2 complex (PDB ID: 8FAZ) was used to model RAD51D mutations (S6 and S11 Figs) [28]. Both R185 mutations decreased tertiary structure interactions and altered the electrostatic surface potential of the protein. Both were found to be destabilizing to the protein structure by CUPSAT but the R185W mutation had a larger impact than R185Q.

No experimentally determined structures of human XRCC3 have been published, so an AlphaFold model was used to model the high-frequency missense mutations (S7 and S12 Figs). The two mutations observed at H183 did not impact tertiary structure interactions or local electrostatics of the protein. However, both were predicted to destabilize the protein structure by CUPSAT with ΔΔG values of −1.55 and −2.61 kcal/mol for H183N and H183P, respectively.

Overall, the structural analyses demonstrate that the majority of high-frequency mutations in RAD51, RAD51C, and RAD51D strongly impacted the protein structure. All of the RAD51 mutations modeled were predicted to be destabilizing to the overall protein structure by CUPSAT. The E177K RAD51 mutation impacted protein structure in all three metrics analyzed and could impact the interaction with BRCA2. In RAD51C, the R368W mutation was the most destabilizing mutation of all analyzed in all proteins. In addition, it was observed that R249C disrupted tertiary structure interactions and the electrostatic surface potential. Both RAD51D high-frequency mutations affected protein structure in all three metrics. In contrast, the RAD51B mutations did not impact protein structure and the XRCC3 mutations only had a mild impact.
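The stability readout used throughout this section follows the sign convention applied in the text: a negative ΔΔG is destabilizing and a positive ΔΔG is stabilizing. The sketch below applies that convention to the ΔΔG values quoted in this section only (the full set is in S3 Table). The function name and dictionary layout are hypothetical.

```python
def classify_ddg(ddg_kcal_per_mol: float) -> str:
    """Interpret a predicted ΔΔG using the sign convention in the text."""
    if ddg_kcal_per_mol < 0:
        return "destabilizing"
    if ddg_kcal_per_mol > 0:
        return "stabilizing"
    return "neutral"


# ΔΔG values (kcal/mol) quoted in section 3.5; not the complete data set.
QUOTED_DDG = {
    "RAD51 E259D": -2.0,
    "RAD51C R368W": -5.9,
    "RAD51C R368Q": 1.03,
    "XRCC3 H183N": -1.55,
    "XRCC3 H183P": -2.61,
}

labels = {mutation: classify_ddg(value) for mutation, value in QUOTED_DDG.items()}
# The most destabilizing mutation is the one with the most negative value.
most_destabilizing = min(QUOTED_DDG, key=QUOTED_DDG.get)
```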

3.6. Expression and selection pressure

Another hallmark of cancer is changes in gene expression. COSMIC reports gene expression values for TCGA samples, which represent a subset of the data queried here because mutations are reported for both TCGA and non-TCGA samples. Gene expression is given in Z-values which are interpreted as normal between −2 and 2. Values below −2 are indicative of under-expression and those above 2 of over-expression [61]. A pan-cancer analysis of gene expression for RAD51 and its paralogues shows that they are within the normal range (Fig 4A and S4 Table). Thus, changes in expression of the paralogues are unlikely to be a mechanism for compensating for mutations in RAD51.

Fig 4. Gene expression and selection pressure.

A. Gene expression for The Cancer Genome Atlas (TCGA) samples. The data show pan-cancer Z-values distribution for each gene. The Y-axis is logarithmic for better visualization. B. Selection pressure for RAD51 and its paralogues. Selection pressure was calculated as described in [62].

https://doi.org/10.1371/journal.pone.0349105.g004

If a mutation provides an advantage to a cancer cell, it will undergo selective pressure and be statistically significantly more represented in cancer patients. To calculate selective pressure in the genes studied here, we employed a common statistical analysis that can reveal positive, neutral, or negative selection [62]. This analysis shows that all genes are under neutral selection as the probability values are not significant (Fig 4B and S5 Table). Therefore, mutations in RAD51 and its paralogues are not biased by selective pressure.

3.7. Comparisons of mutations reported on COSMIC with those reported on ClinVar

NCBI ClinVar reports multiple germline mutations in RAD51 and its paralogues (S6 Table). ClinVar mutations have been categorized as benign, pathogenic, likely pathogenic, uncertain significance and conflicting classification of pathogenicity. We extracted all the pathogenic, likely pathogenic, and conflicting classification of pathogenicity for all genes and compared them with those reported on COSMIC (Tables 1 and S6). This analysis identified several germline mutations that are also reported on COSMIC including some of the high frequency mutations (bold in Table 1). Thus, some of the mutations reported on COSMIC may have been inherited and are not somatic.

Table 1. Mutations reported on both COSMIC and ClinVar.

https://doi.org/10.1371/journal.pone.0349105.t001

RAD51C had the most reported germline mutations (Table 1). A recent report investigated the effect on HR of some of these mutations [46] and found that they range from no effect (e.g., WT levels) to almost complete abrogation of HR. The authors also found that most RAD51C mutations that disrupt HR also disrupt interaction with other HR paralogues and that a cluster of mutations affects DNA binding and ATP hydrolysis.

In this report, we did not follow our analysis with experimental validation which limits our interpretation of these findings. However, given the previous analysis of RAD51C mutations, we suspect that some of these mutations will affect HR. Therefore, further validation of these data is necessary to understand which mutations are likely to destabilize the repair machinery and produce chromosomal instability phenotypes.

Conclusion

Here, we used in silico protein structure and machine learning algorithms to investigate the potential effects of cancer mutations in RAD51 and its paralogues. We identified many more predicted deleterious mutations in the RAD51B and RAD51C paralogues than RAD51, suggesting that destabilizing the paralogues may be sufficient to affect HR. Indeed, as mentioned above, certain RAD51C mutations can abrogate HR without mutations in the other RAD51 paralogues [46]. The structural modeling showed that high-frequency mutations in RAD51, RAD51C, and RAD51D strongly affected protein structure whereas the high-frequency mutations in RAD51B and XRCC3 did not. Thus, an apparent lack of mutation in RAD51 in cancer cells may not necessarily mean that the repair machinery is not destabilized. This may also explain the “RAD51 paradox”: mutations in paralogues are sufficient to affect HR. Although our study only provides an in silico analysis and would need to be validated by experiments, these findings nevertheless highlight how mutations in multiple components of the HR machinery work together to destabilize repair.

Supporting information

S1 Fig. Samples reported on COSMIC partitioned by primary cancer.

https://doi.org/10.1371/journal.pone.0349105.s001

(PDF)

S2 Fig. Lollipops showing mutations predicted pathogenic or having a functional impact using SIFT, MutationAssessor, and VEST4.

https://doi.org/10.1371/journal.pone.0349105.s002

(PDF)

S3 Fig. Polar tertiary structure interactions for high-frequency mutations in RAD51A.

https://doi.org/10.1371/journal.pone.0349105.s003

(PDF)

S4 Fig. Polar tertiary structure interactions for high-frequency mutations in RAD51B.

https://doi.org/10.1371/journal.pone.0349105.s004

(PDF)

S5 Fig. Polar tertiary structure interactions for high-frequency mutations in RAD51C.

https://doi.org/10.1371/journal.pone.0349105.s005

(PDF)

S6 Fig. Polar tertiary structure interactions for high-frequency mutations in RAD51D.

https://doi.org/10.1371/journal.pone.0349105.s006

(PDF)

S7 Fig. Polar tertiary structure interactions for high-frequency mutations in XRCC3.

https://doi.org/10.1371/journal.pone.0349105.s007

(PDF)

S8 Fig. Electrostatic surface potential calculations in RAD51A.

https://doi.org/10.1371/journal.pone.0349105.s008

(PDF)

S9 Fig. Electrostatic surface potential calculations in RAD51B.

https://doi.org/10.1371/journal.pone.0349105.s009

(PDF)

S10 Fig. Electrostatic surface potential calculations in RAD51C.

https://doi.org/10.1371/journal.pone.0349105.s010

(PDF)

S11 Fig. Electrostatic surface potential calculations in RAD51D.

https://doi.org/10.1371/journal.pone.0349105.s011

(PDF)

S12 Fig. Electrostatic surface potential calculations in XRCC3.

https://doi.org/10.1371/journal.pone.0349105.s012

(PDF)

S1 Table. COSMIC mutation files and OpenCravat algorithms outputs.

https://doi.org/10.1371/journal.pone.0349105.s013

(XLSX)

S2 Table. Mutation co-occurrence between the genes studied here.

https://doi.org/10.1371/journal.pone.0349105.s014

(XLSX)

S3 Table. Summary of structural analysis results of co-occurring and high-frequency mutations.

https://doi.org/10.1371/journal.pone.0349105.s015

(DOCX)

S4 Table. Pan-cancer expression profiles for RAD51 and its paralogues shown as Z-values.

https://doi.org/10.1371/journal.pone.0349105.s016

(XLSX)

S5 Table. Calculation of selection pressure for RAD51 and its paralogues.

https://doi.org/10.1371/journal.pone.0349105.s017

(XLSX)

S6 Table. Comparisons of COSMIC mutations with those reported on ClinVar.

https://doi.org/10.1371/journal.pone.0349105.s018

(XLSX)

References

  1. Cavalier-Smith T. Origins of the machinery of recombination and sex. Heredity (Edinb). 2002;88(2):125–41. pmid:11932771
  2. Burssed B, Zamariolli M, Bellucco FT, Melaragno MI. Mechanisms of structural chromosomal rearrangement formation. Mol Cytogenet. 2022;15(1):23. pmid:35701783
  3. Hosea R, Hillary S, Naqvi S, Wu S, Kasim V. The two sides of chromosomal instability: drivers and brakes in cancer. Signal Transduct Target Ther. 2024;9(1):75. pmid:38553459
  4. Branzei D, Foiani M. Maintaining genome stability at the replication fork. Nat Rev Mol Cell Biol. 2010;11(3):208–19. pmid:20177396
  5. Rocha EPC, Cornet E, Michel B. Comparative and evolutionary analysis of the bacterial homologous recombination systems. PLoS Genet. 2005;1(2):e15. pmid:16132081
  6. Leipe DD, Aravind L, Grishin NV, Koonin EV. The bacterial replicative helicase DnaB evolved from a RecA duplication. Genome Res. 2000;10(1):5–16. pmid:10645945
  7. Lin Z, Kong H, Nei M, Ma H. Origins and evolution of the recA/RAD51 gene family: evidence for ancient gene duplication and endosymbiotic gene transfer. Proc Natl Acad Sci U S A. 2006;103(27):10328–33. pmid:16798872
  8. Lusetti SL, Cox MM. The bacterial RecA protein and the recombinational DNA repair of stalled replication forks. Annu Rev Biochem. 2002;71:71–100. pmid:12045091
  9. Cejka P, Symington LS. DNA end resection: mechanism and control. Annu Rev Genet. 2021;55:285–307.
  10. Morati F, Modesti M. Insights into the control of RAD51 nucleoprotein filament dynamics from single-molecule studies. Curr Opin Genet Dev. 2021;71:182–7. pmid:34571340
  11. Howard-Flanders P, West SC, Stasiak A. Role of RecA protein spiral filaments in genetic recombination. Nature. 1984;309(5965):215–9.
  12. Shin Y, Kim SY, Greene EC. ATP hydrolysis-driven structural transitions within the Saccharomyces cerevisiae Rad51 and Dmc1 nucleoprotein filaments. J Biol Chem. 2025;301(9):110528. pmid:40721016
  13. Brouwer I, Moschetti T, Candelli A, Garcin EB, Modesti M, Pellegrini L, et al. Two distinct conformational states define the interaction of human RAD51-ATP with single-stranded DNA. EMBO J. 2018;37(7):e98162. pmid:29507080
  14. Petassi M, et al. Lineage-specific amino acids define functional attributes of the protomer-protomer interfaces for the Rad51 and Dmc1 recombinases. bioRxiv. 2024.
  15. Story RM, Steitz TA. Structure of the recA protein-ADP complex. Nature. 1992;355(6358):374–6. pmid:1731253
  16. Krogh BO, Symington LS. Recombination proteins in yeast. Annu Rev Genet. 2004;38:233–71. pmid:15568977
  17. Lopez BS. RAD51-mediated homologous recombination is a pro-tumour driver pathway. Oncogene. 2025;44(42):4006–16. pmid:40993217
  18. Amunugama R, Fishel R. Homologous recombination in eukaryotes. Prog Mol Biol Transl Sci. 2012;110:155–206. pmid:22749146
  19. Blasiak J. Single-strand annealing in cancer. Int J Mol Sci. 2021;22(4).
  20. Mortensen UH, Lisby M, Rothstein R. Rad52. Curr Biol. 2009;19(16):R676–7.
  21. Bhargava R, Onyango DO, Stark JM. Regulation of single-strand annealing and its role in genome maintenance. Trends Genet. 2016;32(9):566–75. pmid:27450436
  22. Bhattacharya D, Sahoo S, Nagraj T, Dixit S, Dwivedi HK, Nagaraju G. RAD51 paralogs: expanding roles in replication stress responses and repair. Curr Opin Pharmacol. 2022;67:102313. pmid:36343481
  23. Flores-Vega JJ, Puente-Rivera J, Sosa-Mondragón SI, Camacho-Nuez M, Alvarez-Sánchez ME. RAD51 recombinase and its paralogs: Orchestrating homologous recombination and unforeseen functions in protozoan parasites. Exp Parasitol. 2024;267:108847. pmid:39414114
  24. Bonilla B, et al. RAD51 gene family structure and function. Annu Rev Genet. 2020;54:25–46.
  25. Rein HL, Bernstein KA, Baldock RA. RAD51 paralog function in replicative DNA damage and tolerance. Curr Opin Genet Dev. 2021;71:86–91. pmid:34311385
  26. Simo Cheyou E, Boni J, Boulais J, Pinedo-Carpio E, Malina A, Sherill-Rofe D, et al. Systematic proximal mapping of the classical RAD51 paralogs unravel functionally and clinically relevant interactors for genome stability. PLoS Genet. 2022;18(11):e1010495. pmid:36374936
  27. Masson JY, Tarsounas MC, Stasiak AZ, Stasiak A, Shah R, McIlwraith MJ, et al. Identification and purification of two distinct complexes containing the five RAD51 paralogs. Genes Dev. 2001;15(24):3296–307. pmid:11751635
  28. Greenhough LA, Liang C-C, Belan O, Kunzelmann S, Maslen S, Rodrigo-Brenni MC, et al. Structure and function of the RAD51B-RAD51C-RAD51D-XRCC2 tumour suppressor. Nature. 2023;619(7970):650–7. pmid:37344587
  29. Greenhough LA, Galanti L, Liang C-C, Boulton SJ, West SC. Cryo-electron microscopic visualization of RAD51 filament assembly and end-capping by XRCC3-RAD51C-RAD51D-XRCC2. Science. 2026;391(6788):eaea1546. pmid:41196948
  30. Berti M, Teloni F, Mijic S, Ursich S, Fuchs J, Palumbieri MD, et al. Sequential role of RAD51 paralog complexes in replication fork remodeling and restart. Nat Commun. 2020;11(1):3531. pmid:32669601
  31. Saxena S, Dixit S, Somyajit K, Nagaraju G. ATR signaling uncouples the role of RAD51 paralogs in homologous recombination and replication stress response. Cell Rep. 2019;29(3):551-559.e4. pmid:31618626
  32. Somyajit K, Saxena S, Babu S, Mishra A, Nagaraju G. Mammalian RAD51 paralogs protect nascent DNA at stalled forks and mediate replication restart. Nucleic Acids Res. 2015;43(20):9835–55. pmid:26354865
  33. Jiang M, Jia K, Wang L, Li W, Chen B, Liu Y, et al. Alterations of DNA damage repair in cancer: from mechanisms to applications. Ann Transl Med. 2020;8(24):1685. pmid:33490197
  34. Chae YK, Anker JF, Carneiro BA, Chandra S, Kaplan J, Kalyan A, et al. Genomic landscape of DNA repair genes in cancer. Oncotarget. 2016;7(17):23312–21. pmid:27004405
  35. Matos-Rodrigues G, Guirouilh-Barbat J, Martini E, Lopez BS. Homologous recombination, cancer and the “RAD51 paradox”. NAR Cancer. 2021;3(2):zcab016. pmid:34316706
  36. Klein HL. The consequences of Rad51 overexpression for normal and tumor cells. DNA Repair (Amst). 2008;7(5):686–93. pmid:18243065
  37. Schild D, Wiese C. Overexpression of RAD51 suppresses recombination defects: a possible mechanism to reverse genomic instability. Nucleic Acids Res. 2010;38(4):1061–70. pmid:19942681
  38. Demeyer A, Benhelli-Mokrani H, Chénais B, Weigel P, Fleury F. Inhibiting homologous recombination by targeting RAD51 protein. Biochim Biophys Acta Rev Cancer. 2021;1876(2):188597. pmid:34332021
  39. Takata M, Sasaki MS, Tachiiri S, Fukushima T, Sonoda E, Schild D, et al. Chromosome instability and defective recombinational repair in knockout mutants of the five Rad51 paralogs. Mol Cell Biol. 2001;21(8):2858–66. pmid:11283264
  40. Suwaki N, Klare K, Tarsounas M. RAD51 paralogs: roles in DNA damage signalling, recombinational repair and tumorigenesis. Semin Cell Dev Biol. 2011;22(8):898–905. pmid:21821141
  41. Rodrigue A, Coulombe Y, Jacquet K, Gagné J-P, Roques C, Gobeil S, et al. The RAD51 paralogs ensure cellular protection against mitotic defects and aneuploidy. J Cell Sci. 2013;126(Pt 1):348–59. pmid:23108668
  42. Garcin EB, Gon S, Sullivan MR, Brunette GJ, Cian AD, Concordet J-P, et al. Differential requirements for the RAD51 paralogs in genome repair and maintenance in human cells. PLoS Genet. 2019;15(10):e1008355. pmid:31584931
  43. Grešner P, Jabłońska E, Gromadzińska J. Rad51 paralogs and the risk of unselected breast cancer: a case-control study. PLoS One. 2020;15(1):e0226976. pmid:31905201
  44. Golmard L, Castéra L, Krieger S, Moncoutier V, Abidallah K, Tenreiro H, et al. Contribution of germline deleterious variants in the RAD51 paralogs to breast and ovarian cancers. Eur J Hum Genet. 2017;25(12):1345–53. pmid:29255180
  45. Sullivan MR, Prakash R, Rawal Y, Wang W, Sung P, Radke MR, et al. Long-term survival of an ovarian cancer patient harboring a RAD51C missense mutation. Cold Spring Harb Mol Case Stud. 2021;7(2):a006083. pmid:33832919
  46. Prakash R, Rawal Y, Sullivan MR, Grundy MK, Bret H, Mihalevic MJ, et al. Homologous recombination-deficient mutation cluster in tumor suppressor RAD51C identified by comprehensive analysis of cancer variants. Proc Natl Acad Sci U S A. 2022;119(38):e2202727119. pmid:36099300
  47. Tate JG, et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 2019;47(D1):D941–7.
  48. Juul RI, Nielsen MM, Juul M, Feuerbach L, Pedersen JS. The landscape and driver potential of site-specific hotspots across cancer genomes. NPJ Genom Med. 2021;6(1):33. pmid:33986299
  49. Gao J, Chang MT, Johnsen HC, Gao SP, Sylvester BE, Sumer SO, et al. 3D clusters of somatic mutations in cancer reveal numerous rare mutations as functional targets. Genome Med. 2017;9(1):4. pmid:28115009
  50. Pagel KA, Kim R, Moad K, Busby B, Zheng L, Tokheim C, et al. Integrated informatics analysis of cancer-related variants. JCO Clin Cancer Inform. 2020;4:310–7. pmid:32228266
  51. Tokheim C, Karchin R. CHASMplus reveals the scope of somatic missense mutations driving human cancers. Cell Syst. 2019;9(1):9-23.e8. pmid:31202631
  52. Reva B, Antipin Y, Sander C. Determinants of protein function revealed by combinatorial entropy optimization. Genome Biol. 2007;8(11):R232. pmid:17976239
  53. Vaser R, Adusumalli S, Leng SN, Sikic M, Ng PC. SIFT missense predictions for genomes. Nat Protoc. 2016;11(1):1–9. pmid:26633127
  54. Carter H, Douville C, Stenson PD, Cooper DN, Karchin R. Identifying Mendelian disease genes with the variant effect scoring tool. BMC Genomics. 2013;14 Suppl 3(Suppl 3):S3. pmid:23819870
  55. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2(5):401–4. pmid:22588877
  56. Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6(269):pl1. pmid:23550210
  57. de Bruijn I, Kundra R, Mastrogiacomo B, Tran TN, Sikina L, Mazor T, et al. Analysis and visualization of longitudinal genomic and clinical data from the AACR project GENIE biopharma collaborative in cBioPortal. Cancer Res. 2023;83(23):3861–7. pmid:37668528
  58. Jurrus E, Engel D, Star K, Monson K, Brandi J, Felberg LE, et al. Improvements to the APBS biomolecular solvation software suite. Protein Sci. 2018;27(1):112–28. pmid:28836357
  59. Parthiban V, Gromiha MM, Schomburg D. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 2006;34(Web Server issue):W239–42.
  60. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. pmid:34265844
  61. Cheadle C, Vawter MP, Freed WJ, Becker KG. Analysis of microarray data using Z score transformation. J Mol Diagn. 2003;5(2):73–81. pmid:12707371
  62. Zhou Z, Zou Y, Liu G, Zhou J, Wu J, Zhao S, et al. Mutation-profile-based methods for understanding selection forces in cancer somatic mutations: a comparative analysis. Oncotarget. 2017;8(35):58835–46. pmid:28938601
  63. Bliss HJ, Tron J, Bush W, Bouley RA, Petreaca RC. PRMT5 genetic interactions with DNA double strand break repair genes. PLoS One. 2025;20(10):e0331499. pmid:41066315
  64. Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature. 2013;499(7457):214–8. pmid:23770567
  65. Ghandi M, Huang FW, Jané-Valbuena J, Kryukov GV, Lo CC, McDonald ER 3rd, et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature. 2019;569(7757):503–8. pmid:31068700
  66. Nowacka-Zawisza M, Wiśnik E, Wasilewski A, Skowrońska M, Forma E, Bryś M, et al. Polymorphisms of homologous recombination RAD51, RAD51B, XRCC2, and XRCC3 genes and the risk of prostate cancer. Anal Cell Pathol (Amst). 2015;2015:828646. pmid:26339569
  67. Nowacka-Zawisza M, Raszkiewicz A, Kwasiborski T, Forma E, Bryś M, Różański W, et al. RAD51 and XRCC3 polymorphisms are associated with increased risk of prostate cancer. J Oncol. 2019;2019:2976373. pmid:31186630
  68. Piombino C, et al. Homologous recombination repair deficiency in metastatic prostate cancer: new therapeutic opportunities. Int J Mol Sci. 2024;25(9).
  69. De Sarkar N, Dasgupta S, Chatterjee P, Coleman I, Ha G, Ang LS, et al. Genomic attributes of homology-directed DNA repair deficiency in metastatic prostate cancer. JCI Insight. 2021;6(23):e152789. pmid:34877933
  70. Pon JR, Marra MA. Driver and passenger mutations in cancer. Annu Rev Pathol. 2015;10:25–50. pmid:25340638
  71. Amunugama R, He Y, Willcox S, Forties RA, Shim K-S, Bundschuh R, et al. RAD51 protein ATP cap regulates nucleoprotein filament stability. J Biol Chem. 2012;287(12):8724–36. pmid:22275364
  72. Woo T-T, Chuang C-N, Higashide M, Shinohara A, Wang T-F. Dual roles of yeast Rad51 N-terminal domain in repairing DNA double-strand breaks. Nucleic Acids Res. 2020;48(15):8474–89. pmid:32652040
  73. Douville C, Masica DL, Stenson PD, Cooper DN, Gygax DM, Kim R, et al. Assessing the pathogenicity of insertion and deletion variants with the variant effect scoring tool (VEST-Indel). Hum Mutat. 2016;37(1):28–35. pmid:26442818
  74. Pellegrini L, Yu DS, Lo T, Anand S, Lee M, Blundell TL, et al. Insights into DNA recombination from the structure of a RAD51-BRCA2 complex. Nature. 2002;420(6913):287–93. pmid:12442171
  75. Rawal Y, Jia L, Meir A, Zhou S, Kaur H, Ruben EA, et al. Structural insights into BCDX2 complex function in homologous recombination. Nature. 2023;619(7970):640–9. pmid:37344589