Complex Breakpoints and Template Switching Associated with Non-canonical Termination of Homologous Recombination in Mammalian Cells

A proportion of homologous recombination (HR) events in mammalian cells resolve by “long tract” gene conversion, reflecting copying of several kilobases from the donor sister chromatid prior to termination. Cells lacking the major hereditary breast/ovarian cancer predisposition genes, BRCA1 or BRCA2, or certain other HR-defective cells, reveal a bias in favor of long tract gene conversion, suggesting that this aberrant HR outcome might be connected with genomic instability. If termination of gene conversion occurs in regions lacking homology with the second end of the break, the normal mechanism of HR termination by annealing (i.e., homologous pairing) is not available and termination must occur by as yet poorly defined non-canonical mechanisms. Here we use a previously described HR reporter to analyze mechanisms of non-canonical termination of long tract gene conversion in mammalian cells. We find that non-canonical HR termination can occur in the absence of the classical non-homologous end joining gene XRCC4. We observe obligatory use of microhomology (MH)-mediated end joining and/or nucleotide addition during rejoining with the second end of the break. Notably, non-canonical HR termination is associated with complex breakpoints. We identify roles for homology-mediated template switching and, potentially, MH-mediated template switching/microhomology-mediated break-induced replication, in the formation of complex breakpoints at sites of non-canonical HR termination. This work identifies non-canonical HR termination as a potential contributor to genomic instability and to the formation of complex breakpoints in cancer.


Introduction
Double strand breaks (DSBs) are dangerous lesions, the misrepair of which can contribute to genomic instability and cancer predisposition, premature aging and immunological deficiency in mammals [1][2][3]. A major trigger of chromosome breakage is attempted replication across a damaged DNA template [4][5][6][7][8]. Such replication-associated DSBs may be repaired by sister chromatid recombination (SCR), a potentially error-free pathway of homologous recombination (HR) in which the broken chromosome uses the neighboring sister chromatid as a template for repair [9][10][11][12]. Germ line mutation of HR genes contributes to hereditary breast/ovarian cancer susceptibility, Fanconi anemia and other cancer-prone or developmental disorders [1][13][14][15]. Other recognized DSB repair pathways include classical non-homologous end joining (C-NHEJ), alternative end-joining (A-EJ, i.e., end-joining in the absence of one or more C-NHEJ genes) and single strand annealing (SSA) [2]. A-EJ is characterized by the dominant use of microhomology (MH)-mediated end joining (MMEJ), i.e., rejoining events in which the two DNA ends share short stretches of homology at the breakpoint [16,17].
Cancer genomes commonly reveal complex patterns of chromosomal rearrangement. This complexity may take the form of multiple breakpoints at the site of a chromosome rearrangement with insertion of short stretches of DNA sequence derived from ectopic loci [18][19][20]. The breakpoints of cancer rearrangements frequently reveal MH, but homeologous breakpoints (i.e., breakpoints with extensive but imperfect homology) and breakpoints with untemplated nucleotide addition (N-addition) are also observed [18]. Such complex rearrangements could entail rejoining of simultaneously arising chromosome breaks, break-induced copying from ectopic templates, or both [21].
A major pathway of HR repair in somatic cells is "Synthesis-dependent strand annealing" (SDSA) [22]. SDSA entails DNA end resection, loading of the Rad51 recombinase onto single stranded (ss)DNA and Rad51-mediated homologous invasion of the donor DNA molecule, such as the neighboring sister chromatid, by one of the two DNA ends. Extension of the invading/nascent strand by repair synthesis is followed by its release ("displacement") and termination of SDSA normally occurs by annealing (i.e., homologous pairing) of the displaced nascent strand with complementary ssDNA sequences on the resected second end of the DSB. The majority of HR events triggered by a DSB resolve by "short tract" gene conversion (STGC), which typically entails repair synthesis of <100 base pairs from the donor [23][24][25]. A proportion of HR events resolve as "long tract" gene conversions (LTGC), in which several kilobases (up to ~10 kb) of the neighboring, undamaged sister chromatid are copied into the break site of the damaged chromosome [26,27]. LTGC and crossing over can produce similar rearrangements in the context of an HR reporter. Where studied, these outcomes have proven to be mediated by LTGC and not by crossing over [26][28][29][30]. Genetic inactivation of the major hereditary breast/ovarian cancer predisposition HR genes BRCA1 or BRCA2, or of other HR genes such as the Rad51 paralogs Rad51C, XRCC2 or XRCC3, biases HR in favor of LTGC [28][29][30][31][32][33][34]. Thus, understanding the mechanisms underlying LTGC in mammalian cells may yield insight into mechanisms of genomic instability in HR-defective hereditary breast/ovarian cancer-predisposition syndromes.
Very long gene conversions in Saccharomyces cerevisiae are mediated by break-induced replication (BIR), which can copy >100 kilobases from the donor molecule [35][36][37]. The BIR copying mechanism in S. cerevisiae is conservative, rather than the semi-conservative mechanism of a conventional replication fork [38,39]. BIR in S. cerevisiae is dependent on the Pif1 helicase and entails a migrating bubble mechanism [39,40]. Gene conversions in S. cerevisiae that ultimately resolve as BIR may reveal homologous template switches during the early stages of the process, suggesting that the initial steps of BIR can be mediated by less robust copying mechanisms [41]. Further, spontaneous somatic gene conversions in S. cerevisiae reveal a bimodal distribution of tract lengths, with median peaks at 6 kb and >50 kb [42]. Taken together, these studies suggest that classical BIR and LTGC, although topologically similar processes, retain some mechanistic differences.
If the site of HR termination lacks homology with the second (non-invading) end of the DSB, the classical SDSA mechanism of termination by annealing with the resected second end of the DSB is not available. Under these circumstances, HR termination may be mediated by end joining mechanisms [26,27,43,44]. Breakpoints of non-canonical HR termination often reveal MH, suggesting a role for A-EJ in this process [43,44]. However, the genetic regulation of non-canonical HR termination in mammalian cells is currently undefined. In Drosophila melanogaster, non-homologous termination of HR repair of a transposase-induced break is independent of the C-NHEJ gene LIG4 and is mediated by the error-prone DNA polymerase PolΘ, encoded by the POLQ gene [45,46]. Here, we use a previously described mammalian reporter of LTGC between sister chromatids [27] to analyze mechanisms of non-canonical LTGC termination in XRCC4 conditional and isogenic XRCC4 null mouse embryonic stem (ES) cells [47,48]. Our work reveals that non-canonical termination of HR in mammalian cells is independent of XRCC4 and can lead to the formation of complex breakpoints, mediated by template switching. This suggests that non-canonical termination of HR may contribute to the formation of complex breakpoints in the cancer genome.

Results
Non-canonical termination of mammalian HR does not require XRCC4
We previously described a HR reporter that enables positive selection of both short tract (STGC) and long tract gene conversions (LTGC) between sister chromatids in response to a site-specific DSB induced by the rare-cutting homing endonuclease I-SceI (Fig 1) [27]. Briefly, we positioned two artificial exons of the gene encoding blasticidin S deaminase (here termed "BsdR") in a non-productive orientation between the two GFP copies of an HR reporter. Parental cells, or products of STGC, remain blasticidin sensitive (BsdR-; Fig 1A). In contrast, LTGC duplicates the BsdR cassette, thereby allowing expression of wild type (wt) BsdR by splicing (Fig 1A). LTGC is experimentally defined here as a gene conversion of >1.03kb, which is sufficient to duplicate exon B of the blasticidin cassette.
The most abundant I-SceI-induced HR product is STGC, in which the broken copy of GFP is converted to wild type GFP, leaving the reporter structure otherwise unchanged (Fig 1A). In wild type cells, approximately 5% of all I-SceI-induced GFP+ products resolve by LTGC [28,29,47,49]. LTGC frequently results in triplication of the GFP copies within the repaired sister chromatid (Fig 1A). However, a small proportion of I-SceI-induced LTGCs are terminated in regions lacking homology with the second end of the DSB [26,27,29]. These LTGCs must be terminated by non-canonical mechanisms (Fig 1A).
To study the contribution of C-NHEJ to non-canonical HR termination, we introduced the above-noted "long tract" HR/SCR reporter into mouse embryonic stem (ES) cells carrying biallelic conditional ("floxed") alleles of XRCC4 (XRCC4 fl/fl ES cells) [48,50]. We identified individual clones in which a single, intact copy of the reporter had been integrated into the ROSA26 locus, as described previously and in Materials and Methods [49]. We transduced two distinct XRCC4 fl/fl HR/SCR reporter ES cell clones with adenovirus encoding the Cre recombinase and screened Cre-treated cells for derivative clones that either had or had not undergone biallelic Cre-mediated deletion of XRCC4. Southern and western blotting identified XRCC4 Δ/Δ and XRCC4 fl/fl derivatives of these cells (examples in Fig 1B). We transfected XRCC4 fl/fl and, in parallel, XRCC4 Δ/Δ HR/SCR reporter ES cells with I-SceI (with appropriate controls as described in Materials and Methods), and scored HR products as the frequency of I-SceI-induced GFP+ and BsdR+ events (LTGCs). The ratio LTGC:Total HR (BsdR+ GFP+ : Total GFP+) is a measure of the probability that a given HR event will resolve as LTGC. This value was ~3% in each cell type, suggesting that XRCC4 does not directly influence the probability of engaging LTGC during I-SceI-induced HR.
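The LTGC:Total HR ratio defined above is a simple quotient of two measured frequencies. The following minimal sketch shows the calculation; the function name and the example frequencies are hypothetical and were chosen only to reproduce a ratio of about 3%.

```python
def ltgc_fraction(bsdr_gfp_pos_freq: float, total_gfp_pos_freq: float) -> float:
    """Return LTGC:Total HR, i.e. (BsdR+ GFP+ frequency) / (total GFP+ frequency).

    Both arguments are frequencies expressed as fractions of cells.
    """
    if total_gfp_pos_freq <= 0:
        raise ValueError("total GFP+ frequency must be positive")
    if not 0 <= bsdr_gfp_pos_freq <= total_gfp_pos_freq:
        raise ValueError("BsdR+ GFP+ frequency must lie between 0 and total GFP+")
    return bsdr_gfp_pos_freq / total_gfp_pos_freq


# Hypothetical inputs (not measured values): 1% of cells GFP+, 0.03% BsdR+ GFP+.
example_ratio = ltgc_fraction(3e-4, 1e-2)
```

With these illustrative inputs the ratio is 0.03, matching the order of magnitude reported for both cell types.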

Microhomology-mediated end joining mediates non-canonical LTGC termination
The unrearranged parental reporter and the major "GFP triplication" LTGC product produce predictable patterns of hybridization following gDNA digestion with a panel of restriction endonucleases (Fig 2). We made the assumption that non-canonical termination of LTGC normally entails rejoining with the second end of the DSB and used the specific pattern of Southern blot hybridizations to deduce the likely site of non-canonical LTGC termination in XRCC4 fl/fl or XRCC4 Δ/Δ LTGC clones. Two such examples are shown in Fig 3. We were able to clone the breakpoints of six XRCC4 fl/fl and three XRCC4 Δ/Δ non-canonical LTGC termination products (see Materials and Methods). The cloned breakpoints did indeed reflect rejoining to the second end of the DSB, which had undergone varying degrees of resection (Fig 4). Each breakpoint revealed use of MMEJ or untemplated nucleotide addition (N-addition) at the breakpoint. It has been suggested that N-addition breakpoints of the type observed here might also be products of MMEJ-type rejoining [45]. There were no blunt-ended non-homologous breakpoints in this limited sample and no breakpoints were suggestive of dual homologous invasions by both ends of the original I-SceI-induced DSB. Thus, non-canonical termination of HR can occur in the absence of the C-NHEJ gene XRCC4 and entails use of MMEJ/N-addition rejoining mechanisms, implicating A-EJ as a contributing mechanism.
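The mapping logic described above can be sketched in a few lines: a restriction site that was copied before LTGC terminated yields the 3.2kb "GFP triplication" band, whereas a site beyond the termination point does not. The site order used below (SacI, HindIII, EcoRI, SpeI) is an assumption inferred from the clones described in this article, and the function name is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed order of sites along the direction of LTGC copying (an inference,
# not a map taken from the reporter sequence).
SITE_ORDER: List[str] = ["SacI", "HindIII", "EcoRI", "SpeI"]


def localize_termination(
    band_present: Dict[str, bool], site_order: List[str] = SITE_ORDER
) -> Tuple[Optional[str], Optional[str]]:
    """Return (last site copied, first site not copied).

    band_present maps each enzyme to True if its digest shows the 3.2kb band.
    LTGC termination is inferred to lie between the two returned sites.
    """
    last_copied: Optional[str] = None
    first_missing: Optional[str] = None
    for site in site_order:
        if band_present[site]:
            if first_missing is not None:
                # A copied site distal to an uncopied one cannot be explained
                # by a single termination point.
                raise ValueError("pattern inconsistent with one termination site")
            last_copied = site
        elif first_missing is None:
            first_missing = site
    return last_copied, first_missing


# Pattern in which EcoRI and SpeI digests lack the 3.2kb band.
interval = localize_termination(
    {"SacI": True, "HindIII": True, "EcoRI": False, "SpeI": False}
)
```

For the pattern shown, the inferred interval is between the HindIII and EcoRI sites. Off-size bands, as seen in aberrant clones, are outside the scope of this sketch.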

Analysis of an aberrant LTGC product of XRCC4 Δ/Δ HR/SCR reporter cells
We used a similar restriction mapping approach to analyze one aberrant LTGC product identified in XRCC4 Δ/Δ HR/SCR reporter cells. As discussed above, aberrant LTGC products characteristically reveal off-size or additional GFP-hybridizing bands by Southern blotting. One such aberrant clone is shown in Fig 5. Southern analysis appeared to show two groups of GFP-hybridizing bands with distinct intensities. Importantly, these groups were not separated by recloning of the cells, indicating that all the GFP fragments visualized by Southern blotting reside within one nucleus. We interpret the Southern blot pattern as a case of non-canonical LTGC termination (blue arrowheads, Fig 5) in which LTGC termination occurred between the SacI and HindIII sites within the reporter. However, all restriction fragments involving enzymes beyond the SacI site (i.e., HindIII, EcoRI and SpeI) reveal off-size GFP-hybridizing bands (Fig 5B). These fragments do not match restriction fragment patterns of ROSA26 sequence up to 50 kb beyond the second end of the DSB. This suggests that LTGC termination in this case entailed incorporation of ectopic chromosomal sequences. We interpret the fainter GFP-hybridizing bands in this Southern blot (orange arrowheads) as possible products of the second end of the break (Fig 5C). If so, the rearrangement underlying this aberrant LTGC product could entail a gross chromosomal rearrangement (GCR) initiated by non-canonical LTGC termination. Alternatively, the ectopic sequences (grey bars) depicted in Fig 5A and 5B might be part of one single insertion of several kilobases between the site of LTGC termination and the second end of the break. In this regard, the solitary ~9 kb SpeI fragment in Fig 5A, which appears to have a higher intensity than all other bands, could potentially span this insertion, while retaining GFP sequences from both sides of the termination breakpoint. However, our attempts to amplify such a putative insertion product between the two ends of the break have not yet been successful. The notion that non-canonical LTGC termination might lead to GCR is consistent with the expected greater availability of free DNA ends in XRCC4 Δ/Δ cells, where efficient C-NHEJ mechanisms are compromised. This clone is an example of non-canonical LTGC termination that presents with an aberrant LTGC pattern by Southern blotting. However, until this and other aberrant LTGC products are mapped and sequenced, it would not be valid to conclude that all aberrant LTGC outcomes arise from non-canonical LTGC termination.
Fig 2 (caption fragment). Note that each of these restriction endonucleases, which cut target sites between the two GFP copies within the parental reporter, generate an additional 3.2kb GFP-hybridizing band in the context of the "GFP triplication" outcome. (B) Genomic DNA from parental and "GFP triplication" LTGC clones, as shown, was digested with the restriction enzymes shown (code as described above) and analyzed by Southern blotting (GFP probe). The 3.2kb amplification product characteristic of the "GFP triplication" LTGC outcome is marked with an arrowhead.
Fig 3 (caption fragment). The presence or absence of the 3.2kb amplification product in each restriction digest helps to localize the site of LTGC termination within the reporter. (A) XRCC4 fl/fl clone in which termination of LTGC occurred between HindIII and EcoRI sites within the HR reporter. EcoRI and SpeI digests lack the 3.2kb amplification product. (B) XRCC4 Δ/Δ clone in which termination of LTGC occurred between SacI and HindIII sites within the HR reporter. HindIII, EcoRI and SpeI digests lack the 3.2kb amplification product. In this clone, the right hand arms of the SpeI and HindIII digests are much smaller (SpeI) or larger (HindIII) than would be predicted. This is explained by the deletion of ~3.5kb from the second end of the DSB, as revealed by sequencing (see Fig 6B).

Complex breakpoints associated with non-canonical termination of LTGC
In one XRCC4 fl/fl clone in which LTGC had been terminated by non-canonical mechanisms, sequencing revealed two distinct breakpoints: one homologous and one N-addition breakpoint. The homologous breakpoint reflected incorporation of sequences from the episomal I-SceI expression vector within the repaired sister chromatid (Fig 6A). The vector sequence had been incorporated at a site of perfect and extensive homology between the chromosomally integrated HR/SCR reporter and the episomal plasmid, based upon shared rabbit β-globin intron sequences [27,53]. Following LTGC using the sister chromatid as template, a template switching mechanism allowed the displaced nascent strand to invade homologous sequences on the episomal plasmid. After further nascent strand synthesis of ≥342 bp (the exact point of homologous invasion of the episomal plasmid is not definable), the newly extended nascent strand was displaced from the plasmid template and was joined to the second end of the I-SceI-induced chromosomal break, with insertion of one nucleotide at this second (non-homologous) breakpoint (Fig 6A). Thus, non-canonical termination of LTGC can entail homologous template switching, a phenomenon known to be associated with LTGC and BIR in S. cerevisiae [41,54].
A second complex breakpoint of non-canonical LTGC termination was present in one XRCC4 Δ/Δ clone. Sequencing of the breakpoint revealed an inversion/duplication rearrangement of the second end of the DSB (Fig 6B; Southern blot analysis of this clone is shown in Fig 3B), involving at least two breakpoints in close proximity to one another. The first breakpoint entailed a 21bp insertion at the site of non-canonical LTGC termination, showing 16bp identity with several heterologous loci in the mouse genome (if templated, this 21bp insertion could represent two independent breakpoints). The second was a 4bp MH breakpoint generated during ligation to the second end of the DSB, with an accompanying complex deletion/inversion/duplication rearrangement of the second end of the DSB. Although the mechanisms underlying this complex rearrangement are a matter of speculation, the rearrangement suggests that the nascent strand, having been displaced from the donor sister chromatid during LTGC termination, underwent further rounds of MH-mediated template switches and short nascent strand extension, a process termed "microhomology-mediated BIR" (MMBIR) [55]. Fig 7 depicts how this MMBIR rearrangement could have arisen through a fork stalling and template switching (FoSTeS) mechanism [56]. Notably, the 146 bp inversion fragment (Fig 6B) is of a size consistent with FoSTeS-type copying from a lagging strand donor.
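The breakpoint classes discussed in this section (MH breakpoints, insertional breakpoints and blunt joins) can be distinguished computationally once the junction sequence is aligned to its two parental sequences. The sketch below is illustrative only: it assumes that MH is scored as the longest common suffix between the retained nascent strand and the sequence immediately upstream of the retained portion of the second end, and the function names, the labels and the toy sequences are hypothetical rather than taken from the cloned breakpoints.

```python
def microhomology_length(nascent_end: str, second_end_upstream: str) -> int:
    """Length of the longest common suffix of the two sequences.

    nascent_end: retained sequence of the nascent strand, ending at the junction.
    second_end_upstream: reference sequence of the second end lying immediately
        upstream of the portion retained after the junction.
    Bases shared at this position could have come from either partner, which
    is the operational definition of microhomology used here.
    """
    length = 0
    for a, b in zip(reversed(nascent_end.upper()), reversed(second_end_upstream.upper())):
        if a != b:
            break
        length += 1
    return length


def classify_junction(mh_len: int, inserted: str = "") -> str:
    """Assign a coarse label to a junction (labels are illustrative)."""
    if inserted:
        # Untemplated insertions correspond to N-addition; templated ones
        # would need a separate search for the donor sequence.
        return "insertion"
    if mh_len > 0:
        return "MMEJ"
    return "blunt"


# Toy sequences sharing the 4-base suffix CCGT.
toy_mh = microhomology_length("TTGACCGT", "AAGGCCGT")
toy_label = classify_junction(toy_mh)
```

A single shared base can arise by chance, so a real analysis would probably apply a minimum MH length; that threshold is left out of this sketch.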

Discussion
We used the positive selective power of a HR/SCR reporter to capture rare LTGCs in which HR had been terminated by non-canonical mechanisms in XRCC4 fl/fl and XRCC4 Δ/Δ mouse ES cells. Rejoining with the second end of the chromosomal break entails use of XRCC4-independent MMEJ (i.e., A-EJ), in agreement with previous studies in D. melanogaster [45,46]. A notable finding of the current study is that non-canonical HR termination in mammalian cells may entail homologous template switching or MH-mediated template switching (i.e., MMBIR) prior to rejoining with the second DNA end, leading to the formation of complex breakpoints at the site of HR termination. Long gene conversions during gap repair in D. melanogaster have been proposed to entail cycles of invasion and displacement of the nascent strand, with an implied potential for template switching [57]. Both homologous template switches and MMBIR have been described in S. cerevisiae during LTGC/BIR, suggesting that these error-prone mechanisms of HR termination are evolutionarily conserved [41,54,58]. Our findings provide direct evidence of homologous template switching during mammalian HR, highlighting the extreme reactivity of the displaced nascent strand and its potential significance as an instigator of genomic instability. Given the likely importance of template switching mechanisms in the formation of complex breakpoints in cancer cells, our findings suggest that aberrant HR termination may underlie some of the complex breakpoints observed in cancer genomes [18][19][20][21].
Fig 6 (caption fragment). [...] correctly oriented 36bp repeat, contiguous with unrearranged ROSA26 sequence. Bold underlined blue: 4bp MH breakpoint. Hypothetical model of this complex breakpoint is presented in Fig 7. doi:10.1371/journal.pgen.1006410.g006
A striking feature of the breakpoints associated with non-canonical LTGC termination is the frequent use of MMEJ/insertional rejoining mechanisms. The channeling of repair into an MMEJ mechanism is likely best explained by the DNA structures that are presented for rejoining. Both the displaced nascent strand and the resected second end of the break possess extended 3' ssDNA tails. These are poor substrates for Ku binding and, hence, for C-NHEJ-mediated rejoining, leading to a preference for A-EJ [59]. Completion of non-canonical LTGC by MMEJ-mediated rejoining to the second end of the DSB may suppress more deleterious outcomes, such as template switching, BIR and chromosome translocation, at sites of non-canonical HR termination. Direct testing of this hypothesis must await the development of more readily quantifiable systems for studying non-canonical HR termination in mammalian cells. However, this idea is strongly corroborated by work on the A-EJ mediator PolΘ, which suppresses genomic instability in mammalian cells and prevents large deletions at sites of replication arrest or at transposase-induced gaps in model organisms [46][60][61][62][63][64]. Conversely, unrestrained LTGC in BRCA mutant and other HR-defective cells might channel HR towards these deleterious outcomes as a mechanism of genomic instability in tumorigenesis [28][29][30]. In the cell lines studied here, non-canonical LTGC termination accounts for ~3% of all LTGCs in XRCC4 fl/fl cells, corresponding to ~0.1% of all measured GFP+ I-SceI-induced HR events. These low frequencies may nonetheless be highly significant for genomic instability and cancer predisposition, since cancer initiation and progression result from stochastic events on a "per cell" basis.
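The ~0.1% figure follows from multiplying two proportions reported in this article: the ~3% of LTGCs that terminate non-canonically and the ~3% LTGC:Total HR ratio given in Results. A minimal sketch of that arithmetic, with a hypothetical function name and rounded inputs:

```python
def fraction_of_all_hr(noncanonical_per_ltgc: float, ltgc_per_hr: float) -> float:
    """Fraction of all GFP+ HR events that are non-canonically terminated LTGCs."""
    for value in (noncanonical_per_ltgc, ltgc_per_hr):
        if not 0.0 <= value <= 1.0:
            raise ValueError("proportions must lie between 0 and 1")
    return noncanonical_per_ltgc * ltgc_per_hr


# Rounded values from the text: ~3% of LTGCs, and LTGC:Total HR of ~3%.
estimate = fraction_of_all_hr(0.03, 0.03)
```

The product is 0.0009, i.e. about 0.1% of all measured HR events. Because both inputs are approximate, the result is an order-of-magnitude estimate only.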
The significance of non-canonical termination of LTGC may be greater than is suggested by the above calculations, since the repetitive structure of the HR reporter used here presents two opportunities for HR termination by annealing: during STGC and in the termination of LTGC by "GFP triplication" (Fig 1). In contrast, when gene conversion occurs within non-repetitive sequences, STGC alone provides an opportunity for HR to be terminated by annealing. In this more natural setting, presumably all LTGCs must resolve either by non-canonical termination mechanisms or by BIR. In this regard, it is relevant that mammalian cells lacking the major hereditary breast/ovarian cancer predisposition genes BRCA1 or BRCA2 or other cancer predisposition HR genes reveal a bias towards LTGC [28][31][32][33][34]. This bias is even more marked at stalled replication forks, where >80% of HR events may resolve as LTGCs in BRCA/HR-defective cells [30]. In this setting, the arrival of a converging replication fork and the activity of stalled fork endonucleases may be additional determinants of genomic instability [65]. The work described here identifies mechanisms by which dysregulated LTGC may contribute to genomic instability in BRCA/HR-defective cells and in general tumorigenesis.

Materials and Methods
Plasmids-The sister chromatid recombination reporter was previously characterized. Expression plasmids for I-SceI and GFP were described previously [27,49]. New constructs described here were generated by standard cloning procedures.
Cell Lines and Cell Culture-XRCC4 fl/fl mouse embryonic stem (ES) cells were obtained from Catherine Yan and Frederick Alt and have been described previously [48]. ES cells were maintained in ES medium on either irradiated MEF feeder cells or gelatinized plates. To generate SCR reporter stable lines, 20μg of KpnI-linearized SCR reporter plasmid was electroporated into 2x10^7 XRCC4 fl/fl ES cells. Cells were seeded into 60mm dishes with neomycin-resistant feeder mouse embryonic fibroblasts, and 400μg/mL G418 (Sigma-Aldrich) was added to the medium 1 day after electroporation. Beginning after 1 week of continuous selection, G418-resistant colonies were isolated and screened by Southern blotting for single-copy SCR reporter integration. To generate isogenic XRCC4 fl/fl, XRCC4 fl/Δ and XRCC4 Δ/Δ SCR cell lines, adeno-Cre infection was performed as described previously [49], followed by screening of derivative cell lines by Southern blotting.
Recombination Assays-1.6x10^5 trypsinized ES cells were transfected with 0.5μg plasmid DNA using Lipofectamine 2000 (Invitrogen) in a 24-well plate. Transfection efficiency was measured by parallel transfection of wtGFP expression vector (at 1:10 dilution in empty vector). GFP+ frequencies were measured 72 hr post-treatment by flow cytometry using an FC500 (Beckman Coulter) as described previously [27]. To assay LTGC events, cells were counted and replated at 1-3x10^5 cells per gelatinized 100mm dish in triplicate into media containing 5μg/mL blasticidin (Invitrogen). Approximately 2 weeks later, blasticidin resistant colonies were stained and counted or expanded for molecular analysis. Plating efficiency was determined by plating 3-5x10^2 cells per gelatinized 100mm dish in triplicate into media lacking selection. HR measurements were corrected for background levels of HR events, transfection efficiency and plating efficiency, as described previously [49].
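The corrections mentioned above are specified in [49] and are not reproduced in this article. The sketch below therefore shows only one plausible form of such a correction, under the assumption that background is subtracted before dividing by transfection efficiency and that colony counts are normalized to the number of viable cells plated. The function names and the example numbers are hypothetical.

```python
def corrected_gfp_frequency(
    induced: float, background: float, transfection_efficiency: float
) -> float:
    """Background-subtracted GFP+ frequency per transfected cell (assumed formula)."""
    if not 0.0 < transfection_efficiency <= 1.0:
        raise ValueError("transfection efficiency must be in (0, 1]")
    # Clamp at zero so that noise in the background control cannot yield a
    # negative frequency.
    return max(induced - background, 0.0) / transfection_efficiency


def colony_frequency(
    colonies: int, cells_plated: float, plating_efficiency: float
) -> float:
    """Blasticidin-resistant colonies per viable cell plated (assumed formula)."""
    if cells_plated <= 0 or not 0.0 < plating_efficiency <= 1.0:
        raise ValueError("need cells_plated > 0 and plating efficiency in (0, 1]")
    return colonies / (cells_plated * plating_efficiency)


# Hypothetical numbers, for illustration only.
gfp_corrected = corrected_gfp_frequency(0.012, 0.002, 0.5)
bsdr_per_cell = colony_frequency(30, 2e5, 0.5)
```

Plating efficiency itself would be the colony count divided by the number of cells plated without selection. Readers reproducing the assay should follow the procedure in [49] rather than this approximation.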
PCR and Sequencing-Breakpoints were amplified using AccuPrime Taq DNA Polymerase High Fidelity (Invitrogen) according to the manufacturer's instructions. The PCR products were excised from the gel, purified using the QIAquick Gel Extraction Kit (QIAGEN) and subsequently cloned into the pGEM-T Easy vector (Promega). Sequencing was performed at the Dana-Farber/Harvard Cancer Center DNA Resource Core.