Conditional knockout of RAD51-related genes in Leishmania major reveals a critical role for homologous recombination during genome replication

Homologous recombination (HR) has an intimate relationship with genome replication, both during repair of DNA lesions that might prevent DNA synthesis and in tackling stalls to the replication fork. Recent studies led us to ask if HR might have a more central role in replicating the genome of Leishmania, a eukaryotic parasite. Conflicting evidence has emerged regarding whether or not HR genes are essential, and genome-wide mapping has provided evidence for an unorthodox organisation of DNA replication initiation sites, termed origins. To answer this question, we have employed a combined CRISPR/Cas9 and DiCre approach to rapidly generate and assess the effect of conditional ablation of RAD51 and three RAD51-related proteins in Leishmania major. Using this approach, we demonstrate that loss of any of these HR factors is not immediately lethal but in each case growth slows with time and leads to DNA damage and accumulation of cells with aberrant DNA content. Despite these similarities, we show that only loss of RAD51 or RAD51-3 impairs DNA synthesis and causes elevated levels of genome-wide mutation. Furthermore, we show that these two HR factors act in distinct ways, since ablation of RAD51, but not RAD51-3, has a profound effect on DNA replication, causing loss of initiation at the major origins and increased DNA synthesis at subtelomeres. Our work clarifies questions regarding the importance of HR to survival of Leishmania and reveals an unanticipated, central role for RAD51 in the programme of genome replication in a microbial eukaryote.


Introduction
Homologous recombination (HR) has critical roles in the genome maintenance of all organisms, mainly through repair of double-stranded DNA breaks (1). HR is a multistep repair process initiated by resection of the ends of double-stranded DNA breaks to generate single-stranded DNA overhangs. This processing provides access to a key player in HR: the Rad51 recombinase (RecA in bacteria, RadA in archaea) (2), which catalyses invasion of the single-stranded DNA into intact homologous duplex DNA, allowing template-directed repair of the broken DNA site. During evolution, duplications of the Rad51 gene have resulted in so-called Rad51 paralogues (3), a set of factors that are found in variable numbers in different organisms and whose spectrum of roles remains somewhat undefined, at least in part because they can belong to a number of protein complexes. Nonetheless, Rad51 paralogues have been implicated in directly modulating HR, acting on Rad51 HR intermediates (4)(5)(6), and in wider repair activities for cell cycle progression (7,8).
HR reactions mediated by Rad51 (9)(10)(11)(12) and modulated by the Rad51 paralogues (13) are also required for resolving DNA replication impediments, by promoting protection and restart of stalled replication forks during replication stress. An even more intimate association between HR and DNA replication has been described in bacteria and archaea, where RecA and RadA (18) can mediate DNA replication when origins (the genome sites where DNA synthesis begins during replication) have been removed.
In addition to its roles in promoting genome stability, HR can drive genome variation, which can cause diseases (19), as well as being a means for targeted sequence change during growth, such as during mating type switching in yeast (20). Genome variation due to HR is found widely in trypanosomatid parasites, which are single-celled microbes that cause human and animal diseases worldwide. In Trypanosoma brucei, HR factors have been clearly implicated in the directed recombination of Variant Surface Glycoprotein (VSG) genes during host immune evasion by antigenic variation (21), as well as in maintenance of the massive subtelomeric VSG gene archive (22,23). In T. cruzi, HR has been suggested to be a driver of variability in multigene families (24,25) and in cell hybridisation (26). Finally, in Leishmania, HR-related factors have been implicated in mediating the formation or maintenance of episomes, which appear to form stochastically, arise genome-wide and have been implicated in acquisition of drug resistance (27)(28)(29)(30)(31)(32). Whether the same roles for HR extend to widespread, stochastic formation of aneuploidy is unknown (33), but this other form of genome-wide variation has also been implicated in adaptation of the parasite, such as during life cycle transitions and in response to drug pressure (34-38).
Despite emerging evidence for HR roles in Leishmania genome change, is it possible that the reaction has wider and deeper functions in genome maintenance and transmission in the parasite? One reason for asking this question is recent observations suggesting that origin number and distribution in Leishmania are unusual, since one study detected only a single site of DNA replication initiation per chromosome (39), while a later study suggested >5000 sites (40). These data indicate either a pronounced paucity or huge overabundance of origins relative to all other eukaryotes. Alternatively, the disparity in the datasets may be due to a widespread, unconventional route for initiation of DNA synthesis, acting alongside a small number of conventional origins, perhaps indicating novel strategies for DNA replication that may link with genome plasticity (41). A second reason for asking about the importance of HR for genome transmission in Leishmania is other work that has led to uncertainty about the importance of HR factors for survival of the parasite. Leishmania encodes a highly conserved, canonical Rad51 recombinase (30,42,43), as well as three Rad51 paralogues, referred to as RAD51-3, RAD51-4 and RAD51-6 (29), a slightly smaller repertoire of nonmeiotic Rad51-related proteins than is found in T. brucei (44)(45)(46)(47). In L. donovani (48), unlike in L. infantum (30), it has proved impossible to make RAD51 null mutants. Furthermore, while null mutants of RAD51-4 are viable in L. infantum, RAD51-3 has been described as being essential, and RAD51-6 nulls were not recovered in the same experiments (29). Mutation in Rad51 or its relatives has never been shown to be lethal in any single-celled eukaryote, notably including T. brucei (49), or in prokaryotes, making these observations in Leishmania particularly striking.
In this work we sought to resolve the question marks over essentiality of HR factors in Leishmania and to test for roles in DNA replication by using conditional gene knockout (KO), comparing the short- and long-term effects of ablating RAD51 and each of its three RAD51 paralogues. Our data show that loss of each gene is, over time, increasingly detrimental to Leishmania fecundity, demonstrating that black and white definitions of essential or non-essential are too limiting for HR genes in the parasite. In addition, we show that the functions provided by the RAD51 paralogues are non-overlapping in Leishmania, and we reveal that RAD51 plays an unexpected, central role in genome replication, where it is required for proper activation of DNA replication origins.

A combined CRISPR/Cas9 and DiCre approach for assessment of gene function in L. major
In order to compare the effects of ablating RAD51 and each of the three known L. major RAD51 paralogues, we adopted a rapid approach to generate cell lines for conditional induction of a gene KO. For this, we used a cell line constitutively expressing Cas9 and DiCre (Fig.1A). In this strategy, we first used CRISPR/Cas9-mediated HR to replace the endogenous copy of each gene with a copy flanked by loxP sites. In addition, each construct translationally fused copies of the HA epitope to the C-terminus of the gene's ORF. PCR showed this approach to be very efficient for RAD51 and the three RAD51 paralogues, since selection using only puromycin resulted in all wild type (WT) copies of each gene being replaced by floxed and tagged versions after a single transformation (Fig.S1). Because RAD51 and the RAD51 paralogue mutants may generate similar phenotypes (44,46), since each contributes to HR (29), we used the same approach to modify the L. major gene encoding the orthologue of T. brucei PIF6. This factor has not been functionally examined in Leishmania, but in T. brucei PIF6 is the sole known nuclear Pif1 helicase homologue (50). Since Pif1 helicases have been implicated in modulating DNA replication passage through barriers and during termination (51,52), and thus operate in distinct aspects of nuclear genome maintenance compared with Rad51 paralogues, we considered that this gene could provide a valuable control for the effects of conditional gene KO of the Rad51-related proteins. A number of attempts failed to replace all WT endogenous L. major PIF6 gene copies with HA-tagged versions, whereas each copy could be floxed with untagged gene variants (Fig.S1). Growth curves showed that addition of loxP sites or the HA tag did not lead to any significant growth impairment for any of the five genes (Fig.S2).
Next, KO induction of each gene was attempted by rapamycin-mediated DiCre activation in logarithmically growing cultures of each cell line (Fig.1A, Fig.S3). PCR analysis using DNA from cells after a number of induction rounds ('passages'), in which cells were grown from low to high density and then diluted to repeat the cycle, showed that complete gene excision was achieved after passage 2 for all genes (Fig.S3). Controls without addition of rapamycin showed no gene excision, and unexcised gene copies were undetectable even after >15 passages in the presence of rapamycin (Fig.S3). The rapidity of DiCre-mediated loss of the gene products was confirmed by western blotting (RAD51 and paralogues) and RT-PCR (PIF6): signal for all four HA-tagged proteins was no longer detectable after 48 h of the second round of KO induction (Fig.1B), and PIF6 cDNA could not be PCR-amplified (Fig.1C). KO induction of RAD51 and of each of the RAD51 paralogues, but not of PIF6, resulted in increased levels of γH2A (53) in western blotting analysis (Fig.1D), suggesting accumulation of nuclear DNA damage after loss of any L. major RAD51-like protein, but not after ablation of PIF6 (at least during unperturbed growth; see below). To attempt to answer the question of whether or not RAD51 and the RAD51 paralogues are essential in Leishmania (29,48), we measured growth of the parasites for a prolonged period after DiCre-induced gene excision (Fig.1E). At least until passage 4, no growth defect was seen due to loss of RAD51, any of the RAD51 paralogues or PIF6. However, when kept in culture for longer periods (Fig.1E), the RAD51 KO cells and each of the RAD51 paralogue KO cells, but not the PIF6 KO cells, showed marked growth defects, suggesting that HR factors that contribute to the catalysis of homology-directed strand exchange might be critical for long-term Leishmania viability in culture.
Accordingly, flow cytometry analysis showed that prolonged cultivation after KO induction of RAD51 and the RAD51 paralogues, but not PIF6, resulted in an increased proportion of cells with less than 1C DNA (Fig.S4), suggesting increased genomic instability, perhaps reflecting the increased levels of γH2A.
Taken together, the phenotypes seen after induced loss of the five genes suggest some overlap in functions of RAD51 and its relatives, but a distinct role for PIF6. In addition, the PIF6 data demonstrate that prolonged exposure to rapamycin, or effects of DiCre and Cas9 expression, are negligible in these conditions.

Loss of RAD51 or RAD51-3, but not RAD51-4 or RAD51-6, impairs DNA synthesis in L. major
To ask if the impaired growth seen in four of the five induced KO cells is due to a common defect, we tested the extent of DNA synthesis after each gene deletion. To do this, rapamycin-induced and uninduced cells were subjected to a short pulse of IdU labelling followed by immunostaining under denaturing conditions and flow cytometry detection, allowing us to track the level and pattern of DNA synthesis in each cell cycle stage (Fig.2A). Loss of RAD51 or RAD51-3, but not loss of RAD51-4, RAD51-6 or PIF6, resulted in a reduced percentage of IdU-positive cells in the population. Quantification of IdU signal in individual S-phase cells at 48 and 72 h of the second round of KO induction confirmed these effects (Fig.2B): a significant reduction in IdU fluorescence was found at both time points in the rapamycin-induced RAD51 and RAD51-3 KO cells compared with their cognate uninduced controls, whereas no such reduction was seen after KO of RAD51-4, RAD51-6 or PIF6. These data suggest that only loss of RAD51 or RAD51-3 affects DNA synthesis, meaning the growth impairment seen after loss of RAD51 and its relatives, though similar in extent, might not have a common basis, or loss of DNA synthesis is not the main reason for growth reduction after the induced KO of RAD51 or RAD51-3.
Next, we asked if the observed increases in DNA damage after induced KO of the RAD51-like genes all had the same basis by examining γH2A levels across the cell cycle. For this, rapamycin-induced and uninduced cells were arrested in G1 using 5 mM hydroxyurea (HU) and then released from arrest by removing HU, sampling at the point of arrest and at various times after release for western blotting (Fig.2C). The patterns of γH2A accumulation revealed notable differences in the cell cycle functions of the five genes (Fig.2C). KO induction of RAD51 or RAD51-3 resulted in a pronounced increase of γH2A levels in cells navigating through S-phase up to G2/M, suggesting roles related to the resolution, before cell division, of genome injuries that arise during DNA replication. In contrast, loss of RAD51-4 or RAD51-6 did not show any clear evidence for increased γH2A signal compared with uninduced controls after HU release, suggesting a more limited contribution to tackling replication-associated damage, which seems consistent with the absence of changes in IdU uptake after KO. PIF6 KO displayed a yet further difference, with increased levels of γH2A only ~6 h after HU release (Fig.2C), when much of the population had passed through S-phase (Fig.S5).
These data indicate that loss of PIF6 does in fact result in nuclear damage, but this is more limited than is seen after loss of the RAD51-like genes, and suggests that if the putative helicase has a role in resolving replication problems, this is concentrated in the final steps of DNA replication or even in post-replication steps of the cell cycle.
DNA content in all the samples analysed for γH2A levels was next analysed by flow cytometry (Fig.S5). Intriguingly, no pronounced changes in cell cycle progression after release from HU arrest were observed upon KO induction of any of the HR genes or PIF6 (Fig.S5). Thus, L. major cell cycle checkpoints do not seem to be enacted by the gene KOs, despite clear DNA damage accumulation after loss of RAD51 or its paralogues. Altogether, these data suggest that Leishmania RAD51 and RAD51-3, specifically amongst the genes examined, have roles in promoting effective DNA synthesis and their absence results in increased nuclear genome damage during S-phase.

Ablation of RAD51, RAD51-3 or RAD51-4 results in genome-wide mutagenesis
We next sought to determine if loss of the HR genes results in genome instability by using short-read Illumina sequencing of DNA from RAD51, RAD51-3 and RAD51-4 KO cells after growth for two and six passages in the presence of rapamycin, as well as in the same cells grown without rapamycin. In each case, mapping of reads to the genome showed specific loss of sequence around the loxP-flanked gene, confirming KO induction (Fig.3A). By measuring the number of single nucleotide polymorphisms (SNPs) in the induced and uninduced cells after passage two and six, we then calculated the number of new SNPs that arose during the intervening growth period (Figs.3B,C). For each KO, loss of the HR gene resulted in an increased number of new SNPs over four passages relative to the uninduced control cells (Fig.3B).
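The calculation of new SNPs described above can be pictured as a set difference between variant calls at the two passages. The following is a minimal sketch under stated assumptions, not the pipeline used in this work: the representation of a SNP as a (chromosome, position, alternate base) tuple, the function names and the toy data are all hypothetical.

```python
# Illustrative sketch only: SNP calls are represented as hypothetical
# (chromosome, position, alternate_base) tuples.

def new_snps(early_calls, late_calls):
    """Return SNPs present at the later passage but absent at the earlier one."""
    return set(late_calls) - set(early_calls)


def count_new_snps(induced_early, induced_late, uninduced_early, uninduced_late):
    """Count new SNPs in rapamycin-induced and uninduced cells over the same interval."""
    induced = len(new_snps(induced_early, induced_late))
    uninduced = len(new_snps(uninduced_early, uninduced_late))
    return induced, uninduced


# Toy data (invented for illustration): calls at passage 2 and passage 6.
induced_p2 = {("chr1", 100, "A")}
induced_p6 = {("chr1", 100, "A"), ("chr1", 2500, "G"), ("chr2", 900, "C")}
uninduced_p2 = {("chr1", 100, "A")}
uninduced_p6 = {("chr1", 100, "A"), ("chr2", 40, "T")}

induced_count, uninduced_count = count_new_snps(
    induced_p2, induced_p6, uninduced_p2, uninduced_p6
)
print(induced_count, uninduced_count)
```

Comparing the two counts mirrors the induced versus uninduced comparison in Fig.3B; a real analysis would also need to account for read depth and variant-calling filters, which this sketch ignores.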
Strikingly, when new SNP density was plotted individually for each of the 36 chromosomes, it became apparent that the increase in SNPs after gene KO was not random across the genome, as the smaller chromosomes accumulated a higher density of new SNPs than the larger chromosomes (Fig.3C). In addition, there was a notable peak in new SNP density proximal to the 'strand switch regions' (SSRs) within each chromosome, where multigene transcription initiation and/or termination occurs (Fig.3D); a subset of these SSRs are sites where MFA-seq has mapped DNA replication initiation (i.e. are predicted replication origins) (39). This SSR proximity mapping also revealed a difference between the three gene KOs: following loss of RAD51 a decrease in SNP accumulation was seen at both origin-active and -inactive SSRs relative to uninduced controls, whereas the opposite effect was observed after RAD51-3 KO, and no change in SSR-proximal SNP density was found after RAD51-4 KO. Thus, the two HR factors whose loss was found to affect global DNA synthesis were also seen to affect SNP accumulation around SSRs, unlike RAD51-4 KO, which did not show an effect on DNA synthesis.
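The two summaries used here, new SNP density per chromosome and SNP accumulation relative to SSRs, can be sketched as follows. This is an illustration under assumptions rather than the analysis code behind Fig.3: the function names, the per-kb normalisation, the bin size and the toy coordinates are all hypothetical.

```python
from collections import Counter

# Illustrative sketch only: SNPs are hypothetical (chromosome, position) tuples.

def density_per_chromosome(snps, chrom_lengths):
    """New SNPs per kb for each chromosome."""
    counts = Counter(chrom for chrom, _pos in snps)
    return {
        chrom: 1000 * counts[chrom] / length
        for chrom, length in chrom_lengths.items()
    }


def ssr_distance_profile(snps, ssr_positions, bin_size, max_dist):
    """Count SNPs in bins of distance to the nearest SSR on the same chromosome."""
    n_bins = max_dist // bin_size
    profile = [0] * n_bins
    for chrom, pos in snps:
        ssrs = ssr_positions.get(chrom, [])
        if not ssrs:
            continue  # no annotated SSR on this chromosome
        dist = min(abs(pos - ssr) for ssr in ssrs)
        if dist < n_bins * bin_size:
            profile[dist // bin_size] += 1
    return profile


# Toy data (invented for illustration).
snps = [("chr1", 1010), ("chr1", 1090), ("chr1", 1450), ("chr2", 300), ("chr2", 5000)]
lengths = {"chr1": 2000, "chr2": 10000}
ssrs = {"chr1": [1000], "chr2": [250, 4000]}

print(density_per_chromosome(snps, lengths))
print(ssr_distance_profile(snps, ssrs, bin_size=100, max_dist=500))
```

In this toy example the smaller chromosome has the higher density and most SNPs fall in the bin closest to an SSR, which is the shape of the patterns reported in Fig.3C and Fig.3D.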
To ask if the effects described above are limited to SNPs, we also measured insertions and deletions (InDels) in the same cells (Fig.S6). In this case, only KO of RAD51-3 was found to result in an increase in new InDels across the genome (Fig.S6A), though even here the effect was modest. Nonetheless, the same pattern of increased new mutation density in the small chromosomes relative to the large was detectable in all cells (Fig.S6B). Around the SSRs, a clear peak of new InDels in the uninduced cells was less apparent than was seen for new SNPs (compare Fig.S6C with Fig.3D), though InDels appeared to accumulate around both origin-active and -inactive SSRs upon RAD51 KO (Fig.S6C). In contrast, such a trend was seen only at origin-active SSRs upon RAD51-3 KO, and no difference was seen in RAD51-4 KO cells compared with controls (Fig.S6C). In total, therefore, InDel accumulation upon HR KO was more modest than SNPs, but any distinct accumulation around SSRs was most clearly detected after loss of RAD51 or RAD51-3, the two factors our data implicate in global DNA synthesis.
To examine the effects of loss of either RAD51 or RAD51-3 further, we examined survival of the KO cells relative to uninduced controls in the presence of increasing concentrations of phleomycin and camptothecin, both of which cause DNA double-strand breaks, and HU, which inhibits ribonucleotide reductase and impairs DNA synthesis (Fig.S7). As expected for predicted DNA double-strand break repair factors, KO induction of either RAD51 or RAD51-3 led to increased sensitivity to phleomycin and camptothecin. However, only RAD51-3 KO led to increased sensitivity to HU, further suggesting distinct roles for RAD51 and RAD51-3 during DNA synthesis. Consistent with this interpretation, differences were also observed in the patterns of SNP and InDel accumulation after exposure to and release from HU treatment (Fig.S8). Both genome-wide (Figs.S8A,B) and in each chromosome (Figs.S8C,D), fewer HU-induced new SNPs or InDels were detected in RAD51 KO cells compared with uninduced, whereas KO of RAD51-3 resulted in higher levels of HU-induced SNPs or InDels relative to uninduced cells. Despite the global decrease in HU-induced SNPs after loss of RAD51, SNPs were found to increase around SSRs in RAD51 KO cells after HU treatment compared with uninduced cells (Fig.S8E), whereas no such effect was seen for InDels (Fig.S8F). Again, this phenotype differed from that seen after RAD51-3 loss: no accumulation of SSR-proximal HU-induced SNPs was seen in the RAD51-3 KO cells compared with uninduced (Fig.S8E), whereas a very modest enrichment of InDels was observed (Fig.S8F). Taken together, these HU data reinforce the view that RAD51 and RAD51-3 play distinct roles in maintenance of the L. major genome, despite loss of either causing impaired DNA synthesis.

Generation of conditional double gene mutants by CRISPR/Cas9 and DiCre in Leishmania
Thus far, the analysis of single gene conditional KOs has implicated only RAD51 and RAD51-3 amongst the four HR factors in DNA synthesis. One explanation may be that RAD51-4 and RAD51-6 act redundantly in Leishmania. To test this, we used the combined CRISPR/Cas9 and DiCre approach to attempt to make conditional double gene KOs of the RAD51 paralogues in all possible combinations (Fig.4A). First, CRISPR/Cas9 was used to generate floxed copies of either RAD51-3 or RAD51-4 using gene-specific puromycin-resistance constructs (described above). Next, the resulting cell lines were subjected to a second round of CRISPR/Cas9 engineering to delete the endogenous copies of RAD51-4 or RAD51-6 in the Western blotting (Fig.4B) showed that HA-tagged RAD51-3 or RAD51-4 protein in the null mutants was undetectable after rapamycin induction of gene KO. The RAD51-4 and RAD51-6 null mutants, prior to induction of conditional excision of the second gene, showed increased levels of γH2A compared with the background cell line (Fig.4C). DiCre-mediated gene excision, leading to the three different double gene KOs, did not result in further increases in the levels of the phosphorylated histone. The lack of further impairment when two genes are lost relative to one was broadly consistent with growth curves (Fig.S11A) and with flow cytometry analysis of DNA content (Fig.S11B): conditional excision of the second gene did not lead to worsening of growth or to increased numbers of aberrant cells with less than 1C content compared with uninduced single gene null cells.
Intriguingly, the RAD51-4 and RAD51-6 null mutants, even without induction of the second gene KO and having been selected as clones, appeared to recapitulate the long-term growth impairment seen upon prolonged cultivation after DiCre excision of a single HR gene (compare IdU labelling followed by FACS analysis of each of the three cell lines (Fig.4D), before and after rapamycin induction, did not show a change in the proportion of cells that incorporated the nucleotide analogue.
Quantification of IdU fluorescence levels of individual S-phase cells (Fig.4E) did not reveal a consistent pattern of fluorescence decrease upon simultaneous KO of two RAD51 paralogues. However, a signal reduction was detected 72 h after RAD51-3 KO induction in both the RAD51-4 and RAD51-6 null backgrounds, consistent with loss of RAD51-3 alone impeding DNA replication (Fig.2B). In contrast, though a mild loss of fluorescence was seen at 48 h when KO of RAD51-4 was induced in the RAD51-6 null background, this was not seen at 72 h (Fig.4E). These data reinforce the suggestion that RAD51-3 alone among the three RAD51 paralogues plays a role in L. major DNA replication, and indicate that RAD51-4 and RAD51-6 do not obviously act redundantly in such a function.
To ask if the induction of a double RAD51 paralogue gene KO had similar effects on genome instability to those seen after induction of single gene KOs (Fig.3), we again performed Illumina sequencing of DNA from the RAD51-4 null RAD51-3-HA flox cells after two and six passages of growth with and without rapamycin. Read mapping to the genome showed the RAD51-4 gene was absent and that DiCre induction led to removal of RAD51-3 (Fig.5A). SNPs accumulated to a greater extent across the genome ( . Thus, these data, allied to previous DNA synthesis and cell cycle-dependent damage analysis, suggest a complex mixture of common and separate functions for these two RAD51 paralogues, probably related to epistatic interactions among them.

Loss of RAD51 impairs initiation of DNA replication at the main origins in Leishmania
Our analyses until this point indicated impairment of DNA synthesis after loss of either RAD51 or RAD51-3, including evidence for distinct effects of the gene KOs. However, if and how the two factors might contribute to the programme of DNA replication in L. major was not known. To address this question, we performed MFA-seq analysis, comparing read depth across the chromosomes in DNA extracted from replicating and non-replicating cells, thereby identifying sites of replication initiation and patterns of fork progression (39,54,55). To do this, we used a modified version of MFA-seq compared to what we have described previously: rather than using FACS to isolate S- and G2-phase cells, we isolated DNA from L. major cells in logarithmic (log) or in stationary phase growth, sequenced each using Illumina technology, and mapped the ratio of reads in the two populations. Fig.6A shows these data as Z-scores across two chromosomes, where positive signal represents regions where the mean read depth in the log phase cells is greater than the mean of the stationary cells (Fig.S13 shows further chromosomes). This modified MFA-seq approach will be described in detail elsewhere (BioRXiv 10.1101/799429), but two primary features of Leishmania replication are revealed relative to MFA-seq based on S/G2 phase read depth ratios using cell cycle-sorted cells (39). First, we confirm the predominant use of single origins in each chromosome, most of which are centrally disposed in the molecules and each coinciding with SSRs. Second, we now detect weaker sites of replication initiation that were not seen previously and are proximal to one or both telomeres in the chromosomes (see -RAP data in Fig.6A, which are highly comparable with WT cells; BioRXiv 10.1101/799429).
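The read depth ratio and Z-score representation described above can be illustrated with a small sketch. The exact procedure is the subject of the cited preprint and is not reproduced here; the version below assumes pre-computed read counts in equal-sized bins, library-size normalisation and a population standard deviation, and the function name and toy counts are hypothetical.

```python
from statistics import mean, pstdev

# Illustrative sketch only: inputs are hypothetical per-bin read counts
# for one chromosome from log-phase and stationary-phase cells.

def mfa_z_scores(log_counts, stat_counts):
    """Z-scores of the per-bin log/stationary read depth ratio.

    Bins with no stationary-phase reads are returned as None.
    """
    log_total = sum(log_counts)
    stat_total = sum(stat_counts)
    # Normalise by library size so that sequencing depth cancels out.
    ratios = [
        (log_n / log_total) / (stat_n / stat_total) if stat_n > 0 else None
        for log_n, stat_n in zip(log_counts, stat_counts)
    ]
    valid = [r for r in ratios if r is not None]
    mu, sigma = mean(valid), pstdev(valid)
    if sigma == 0:
        return [0.0 if r is not None else None for r in ratios]
    return [(r - mu) / sigma if r is not None else None for r in ratios]


# Toy data (invented): a central bin enriched in replicating cells.
log_phase = [100, 120, 200, 120, 100]
stationary = [100, 100, 100, 100, 100]
z = mfa_z_scores(log_phase, stationary)
print([round(value, 2) for value in z])
```

Positive values mark bins over-represented in the log phase sample, so in this toy case the central bin behaves like a replication origin and the flanks take negative values.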
To ask about the effect of RAD51 or RAD51-3 loss on DNA replication, we performed MFA-seq in cells induced for KO by growth in rapamycin (48 h, second round of induction), as well as in control cells without rapamycin grown for the same time. Comparing induced RAD51 KO cells with uninduced cells revealed a striking variation in MFA-seq pattern: loss of MFA-seq signal was seen at the main origin that had previously been mapped within each chromosome (39), while an increase in MFA-seq signal was seen at the extremities of the chromosomes (Fig.6A, Fig.S12). In the induced RAD51-3 KO cells, the same differential effect was not obvious: though there was potentially some loss of MFA-seq signal at the main origin, increase in subtelomeric signal was not apparent. To examine this genome-wide, we generated metaplots of the MFA-seq signal in the different cells (Fig.6B,C). Profiling of MFA-seq signal around the main origin (Fig.6B) revealed considerable consistency in amplitude and width of the peaks, both for the 36 origins within one cell and between the two uninduced cells, confirming there is little variation in timing or efficiency of DNA synthesis initiation at all main origins in the WT L. major population (39). The same profiling confirmed that loss of MFA-seq signal around the main origins was a genome-wide effect upon RAD51 KO, with lowered amplitude and width, and was less pronounced upon KO of RAD51-3. These data indicate that loss of RAD51 leads to a more pronounced decrease in replication initiation activity at the main origins compared with RAD51-3 loss, which appears consistent with decreased SNP accumulation at SSRs after RAD51 KO but not after RAD51-3 KO (Fig.3D). Profiling of the MFA-seq mapping at the chromosome ends (Fig.6C) showed that loss of RAD51, but not RAD51-3, resulted in a gain of signal in the subtelomeres, which again was strikingly consistent in amplitude and width across all chromosomes.
These data indicate that loss of DNA synthesis at the main origins due to ablation of RAD51 is accompanied by increased DNA synthesis in the subtelomeres, a shift in the replication programme that is not clearly seen after loss of RAD51-3. In addition, these observations further confirm that RAD51 and RAD51-3 play distinct roles in Leishmania DNA replication.
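The metaplots referred to in this section summarise MFA-seq signal across many loci by averaging equal-sized windows centred on each feature, such as the main origin of each chromosome. The sketch below shows that averaging step only, under assumed inputs: binned signal per chromosome and feature centres given as bin indices. The function name and the toy values are hypothetical, and this is not the code used to produce Fig.6B,C.

```python
# Illustrative sketch only: signal is a hypothetical list of per-bin values
# for each chromosome; feature centres are given as bin indices.

def metaplot(signal_by_chrom, centres, flank):
    """Average the signal over windows of +/- flank bins around each centre."""
    windows = []
    for chrom, centre in centres:
        signal = signal_by_chrom[chrom]
        start, end = centre - flank, centre + flank + 1
        if start < 0 or end > len(signal):
            continue  # skip windows that run off the end of the chromosome
        windows.append(signal[start:end])
    if not windows:
        return []
    # zip(*windows) groups values at the same offset from each centre.
    return [sum(column) / len(windows) for column in zip(*windows)]


# Toy data (invented): one peak per chromosome.
signal = {"chrA": [0, 1, 3, 1, 0, 0], "chrB": [0, 0, 1, 5, 1]}
origins = [("chrA", 2), ("chrB", 3)]
print(metaplot(signal, origins, flank=1))
```

Running the same function on induced and uninduced samples, with origins or chromosome ends as the centres, gives profiles whose amplitude and width can be compared in the way described for Fig.6B and Fig.6C.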

Discussion
Homologous recombination is known to be important for episome formation in Leishmania (29-31), but the depth and breadth of how the process contributes to genome plasticity and transmission has not been fully explored. Here, we sought to answer two main questions. First, are RAD51 and RAD51 paralogues essential in Leishmania, given the conflicting data on the ability to generate and propagate null mutants in different species? Second, given the potentially novel distribution of mapped origins in the Leishmania genome, does HR play a central role in Leishmania genome duplication? Using a rapid, conditional knockout approach that combines CRISPR/Cas9 gene modification and DiCre gene excision, we show that loss of any RAD51-like gene is not immediately lethal to promastigote L. major but is increasingly detrimental over time. Thus, we suggest that binary definitions of essential or non-essential for HR-related genes inadequately describe their contribution to parasite biology. In addition, our data reveal that loss of RAD51 and RAD51-3, uniquely among the four HR proteins examined, impairs DNA replication, though their roles are distinct, with RAD51 playing an unexpected and central role in DNA replication initiation.
Null mutants of RAD51 and RAD51-4 have been described in L. infantum (29,30), whereas RAD51 mutants could not be recovered by CRISPR/Cas9 gene targeting in L. donovani (48). In addition, RAD51-3 and RAD51-6 null mutants were not recovered by two-step homology-directed gene deletion in L. infantum (29). The conditional KO approach used here aids understanding of this complex biology, since we show that excision of any of these HR factors has no immediate impact on parasite fitness, but rather causes a progressive slowing of growth, presumably due to accrual of problems. Nonetheless, it is clear that Leishmania null mutants of RAD51 (in L. infantum) (30), and the RAD51 paralogues RAD51-4 and RAD51-6 (this work, in L. major), can be generated. Similarly, null mutants of each RAD51-related gene have been described in T. brucei (44,46). It is possible such variation in importance reflects species-specific aspects of HR function. However, it is also conceivable that conventional, two-step transformation approaches to generate null mutant clones allow for selection of cells possessing compensatory changes that lead to survival, an adaptation that may not always be recovered when using CRISPR/Cas9 to simultaneously ablate both alleles (48), or that emerged in the timeframe we have used during conditional gene excision (this work). What such adaptations might be is unclear, but the abundant mutations we describe after loss of HR genes may provide a genetic basis for their generation. In addition, RAD51-directed HR is not the sole route for repair of DSBs in any trypanosomatid (56-62), though whether such pathways can increase in activity in the absence of RAD51-directed HR has not been tested. What aspects of Leishmania genome function degrade with time after loss of RAD51 and its relatives remain to be fully characterised, though our data suggest these effects may differ for the different RAD51 paralogues, and for RAD51.
Conditional KO of RAD51, RAD51-3 and RAD51-4 led to genome-wide increases in SNPs, as well as increased numbers of InDels after RAD51-3 loss. Altogether, these data demonstrate a widespread role for these related proteins in Leishmania genome stability. Though we did not test for similar effects after RAD51-6 KO, similar outcomes might be expected. In addition, we did not attempt to test for larger genome changes, but translocations have been described in MRE11 mutants (27), and MRE11 can guide RAD51 function.
Our mapping of SNPs also revealed two unusual features of the patterns of mutagenesis in L. major. First, there was a notable chromosome size-dependence of SNP accumulation, with smaller chromosomes tending to have higher SNP densities than larger chromosomes. This was common to each conditional gene KO, indicating it is a general feature of L. major chromosome biology. However, the basis of this effect is unclear. Might it relate to differences in gene expression or nucleosome occupancy? No data that we are aware of support such a suggestion, and the commonality of multigene transcription in all chromosomes appears to argue against it (63). If not gene expression, then the chromosome size-dependence of SNP density might reflect the limitations of predominant DNA replication initiation at a single origin (39). The second feature was pronounced accumulation of SNPs around SSRs.
Here, loss of the HR factors had somewhat different effects: for RAD51 KO, a decrease in SSR-proximal SNP accumulation was seen, whereas an increase was seen upon RAD51-3 KO and no change was found after KO of RAD51-4. These data suggest that the SNPs reflect the effects of differing roles of the proteins in damage repair. Given that loss of either RAD51 or RAD51-3 affects DNA synthesis, while loss of RAD51-4 does not, an explanation could be that the SNPs are generated due to clashes between the transcription and replication machineries at SSRs, providing a source for such pronounced mutation rates at these sites. In T. brucei it is known that the Origin Recognition Complex (ORC) binds to potentially all SSRs and its loss affects levels of RNA at these loci (54). In addition, RNA-DNA hybrids, which have been mapped to sites of replication-transcription clashes in other eukaryotes (64), form prominently at transcription start sites in T. brucei (65). Thus, it seems conceivable that SSRs are also sites of such interaction in Leishmania, though no equivalent mapping of ORC or RNA-DNA hybrids has been reported.
Loss of IdU uptake after induced KO of RAD51 or RAD51-3 indicates a role for both HR factors in DNA synthesis and DNA replication, but several lines of evidence suggest these roles are not the same: distinct cell cycle timing of H2A accumulation, distinct patterns of SNP accumulation around SSRs, and differing changes in DNA replication dynamics after their loss. Impaired nucleotide uptake after RAD51 loss appears to be explained by a shift in the programme of L. major DNA replication, with loss of efficient DNA replication initiation at the single primary origin in each chromosome and increased subtelomeric DNA replication initiation. The finding that RAD51 KO cells are impaired in growth and nucleotide uptake argues that increased replication from the subtelomeres is insufficient to compensate for loss of the primary initiation events. In the case of RAD51-3, the more modest loss of replication at the main origin, and no clear increase in subtelomeric replication, seems most readily explained by a widespread impairment in genome replication, perhaps because the RAD51 paralogue is needed to guide processes involved in promoting replication in the face of genome-wide impediments. In this regard, the replication phenotypes after loss of RAD51-3 may be comparable with effects seen after mutation of T. brucei MCM-BP (66), a poorly understood factor that modulates activity of the replicative MCM helicase (67, 68). In addition, such a role for L. major RAD51-3 may be akin to roles for mammalian Rad51 paralogues on stalled DNA replication forks, with the differing effects of loss of L. major RAD51-4 or RAD51-6 suggesting similar functional compartmentalisation amongst the paralogues (13).
How might RAD51 provide a central role in Leishmania DNA replication? One possibility is that DNA replication initiation at the single SSR-localised origin in each chromosome is directly driven by RAD51, perhaps even by catalysing HR. Such an origin-specific function for RAD51 or RAD51-directed HR has no precedent and, indeed, would be distinct from origin activity in T. brucei, where these sites are conventionally defined by binding ORC (54, 69). Nonetheless, ORC binding has not been mapped in any Leishmania genome and precedents for recombination-directed replication initiation exist in viruses (70-72), bacteria (17), polyploid archaea (18) and Tetrahymena (73), albeit normally without a focus on defined genome sites. In addition, human Rad51 acts with MCM8-9, an alternate MCM helicase complex, to initiate DNA replication (74); though this DNA synthesis pathway is origin-independent in humans, no work has tested MCM8 or MCM9 function in any trypanosomatid. Alternatively, and perhaps more likely, RAD51 may play a more indirect role in Leishmania genome replication. The earliest acting origin-active SSRs in T. brucei co-localize with centromeres (54, 75), and recent work has mapped one putative component of the centromere-binding kinetochore, KKT1, to each MFAseq-mapped origin SSR in L. major (76). These loci might then be vulnerable to breakage, such as during chromosome segregation, and RAD51-directed HR may be required to repair any breaks, such as after mitosis, in order to allow proper licensing and firing of origins at these loci. Such a scenario could be compatible with data from other eukaryotes of important roles for RAD51-directed HR in maintaining centromere function (77-79). In T. brucei, the presence of further ORC-defined origins in each chromosome may compensate for loss of RAD51 causing impaired centromere-focused origin function, but such a mutation in Leishmania might be more detrimental, given the presence of only a single major centromere-focused origin in each chromosome (39). A distinct suggestion for an indirect role of RAD51 is that the recombinase does not play an origin- or centromere-focused role, but instead is needed to support DNA replication genome-wide, given previous suggestions that replication from a single major origin in each Leishmania chromosome would be insufficient to replicate all chromosomes in S-phase (39, 41). Such a function could be compatible with origin-independent roles of HR-directed replication (discussed above): in the absence of RAD51, complete chromosome replication would be lost during S-phase, leading to impaired mitosis and reduced numbers of cells that license the main origins. This scenario may also explain the increased levels of MFA-seq reads in the subtelomeres. As we did not detect this replication signature when previously sorting S-phase cells (39), it is likely that the subtelomeric DNA replication reaction occurs either late in S-phase, or even post-S-phase. Thus, the increased subtelomeric MFA-seq signal in RAD51 KO cells may reflect greater numbers of cells stalling in late- or post-S phase. Alternatively, subtelomeric DNA synthesis may be a back-up strategy for the primary SSR/origin-focused DNA replication reaction and, when the primary reaction is compromised by loss of RAD51, it assumes greater importance. As it stands, we do not know the nature of the subtelomeric DNA synthesis, but these data indicate it is distinct from DNA replication emanating from the main SSR-focused origins and is RAD51-independent. Whether or not this subtelomeric DNA replication relates to a recently proposed form of telomere maintenance (34) is worthy of further investigation.

DNA constructs and cell line generation.
A background cell line was established in which DiCre is expressed from the ribosomal locus, while both Cas9 and T7 polymerase are expressed from the tubulin array. For this, WT cells were transfected with plasmid pGL2339 (80), previously digested with PacI and PmeI, to generate DiCre-expressing cells. Correct integration into the ribosomal locus was confirmed by PCR. Then, the DiCre-expressing cells were transfected with plasmid pTB007 (81), previously digested with PacI, to generate the DiCre/Cas9/T7-expressing cell line. Correct integration of the Cas9/T7-encoding cassette was confirmed by PCR. The Cas9/T7 system, as previously described (81), was then used to flank all copies of a GOI with LoxP sites in a single round of transfection, generating the GOI Flox cell lines used here.
Donor fragments for Cas9-mediated genome editing were generated by PCR (Table S1). For this, the ORFs encoding each GOI were PCR-amplified using genomic DNA as template. PCR products of RAD51 (LmjF.28.0550), RAD51-4 (LmjF.11.0230), RAD51-6 (LmjF.29.0450) and PIF6 (LmjF.21.1190) were cloned between NdeI and SpeI restriction sites in the vector pGL2314 (83). The PCR product of RAD51-3 (LmjF.33.2490) was cloned into the SpeI restriction site of vector pGL2314. The resulting constructs contained the GOI flanked by LoxP sites (pGL2314GOI Flox) and were used as templates in PCR reactions to generate the donor fragments flanked by sequences homologous (30 nucleotides) to the target integration sites. PCR products were ethanol precipitated and transfected into the DiCre/Cas9/T7-expressing cell line. Correct integration into the expected locus was confirmed by PCR analysis. The strategy used to generate sgRNAs was essentially as previously described (81). Briefly, sgRNAs were generated in vivo upon transfection with the appropriate DNA fragments generated by PCR. These fragments contained the sequence for the T7 polymerase promoter, followed by the 20 nucleotides of the sgRNA targeting site and 60 nucleotides of sgRNA scaffold sequence. The Eukaryotic Pathogen CRISPR guide RNA/DNA Design Tool (http://grna.ctegd.uga.edu) was used to generate the 20 nucleotide sequences for sgRNA targeting sites. Default parameters were used and the highest-scoring 20 nucleotide sgRNA sequences were chosen.
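The layout of the sgRNA template described above (T7 promoter, then a 20 nucleotide targeting site, then a 60 nucleotide scaffold) can be sketched as a simple check-and-concatenate step. This is an illustrative sketch only: the function name and the placeholder sequences are hypothetical, and none of the sequences below are those used in this study.

```python
def build_sgrna_template(t7_promoter: str, target: str, scaffold: str) -> str:
    """Assemble a 5'->3' sgRNA template: T7 promoter + 20 nt target + 60 nt scaffold.

    Hypothetical helper for illustration; it only enforces the lengths
    stated in the text and does not design or score guide sequences.
    """
    target = target.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError("targeting site must be exactly 20 nt of A/C/G/T")
    if len(scaffold) != 60:
        raise ValueError("scaffold sequence must be exactly 60 nt")
    return t7_promoter + target + scaffold


# Placeholder inputs (NOT real sequences from the study).
PROMOTER_PLACEHOLDER = "P" * 19   # stands in for the T7 promoter sequence
TARGET_PLACEHOLDER = "ACGT" * 5   # stands in for a 20 nt targeting site
SCAFFOLD_PLACEHOLDER = "S" * 60   # stands in for the 60 nt sgRNA scaffold

template = build_sgrna_template(
    PROMOTER_PLACEHOLDER, TARGET_PLACEHOLDER, SCAFFOLD_PLACEHOLDER
)
```

In practice the three arguments would be replaced by the real promoter, the guide chosen with the design tool, and the scaffold sequence from the published protocol (81).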
Western blotting. Whole cell extracts were prepared by collecting cells and boiling them in NuPAGE™ LDS Sample Buffer (ThermoFisher). Extracts were resolved on 4-12% gradient Bis-Tris Protein Gels (ThermoFisher) and then transferred to polyvinylidene difluoride (PVDF) membranes (GE Life Sciences). Before probing for specific proteins, membranes were blocked with 10% (w/v) non-fat dry milk in phosphate-buffered saline supplemented with 0.05% Tween-20 (PBS-T). Primary antibody incubation was performed for 2 h at room temperature in PBS-T supplemented with 5% non-fat dry milk. Membranes were washed with PBS-T and then incubated with secondary antibodies under the same conditions as the primary antibodies. For HRP-conjugated secondary antibodies, ECL Prime Western Blotting Detection Reagent (GE Life Sciences) was used for band detection, visualized with Hyperfilm ECL (GE Life Sciences). For IRDye-conjugated secondary antibodies, an Odyssey Imaging System (LI-COR Biosciences) was used for band detection and visualization.
Heatmaps, violin plots and metaplots were generated using GraphPad Prism. Underlying data for metaplots and coverage tracks were generated using deepTools (91).

Marker Frequency Analysis (MFAseq).
After processing, reads were compared essentially using methods described previously (39), though with modifications. Briefly, the number of reads in 0.5 kb windows along chromosomes was determined. The number of reads in each bin was then used to calculate the ratio between exponentially growing and stationary phase cells, scaled for the total size of the read library. Ratio values were converted into Z-score values in a 5 kb sliding window (steps of 500 bp), for each individual chromosome. MFAseq profiles for each chromosome were represented in graphical form using Gviz (92).

[...] and diluted back to that density every 4-5 days to complete five passages (P1 to P5); growth profile was also evaluated after cells were kept in culture for more than 15 passages (>P15); cell density was assessed every 24 h, and error bars depict standard deviation from three replicate experiments.
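The MFAseq calculation described in this section (read counts in 0.5 kb bins, a library-size-scaled ratio of exponential to stationary phase, and Z-scores over a 5 kb window moved in 500 bp steps) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, and the text does not specify exactly how the sliding window and Z-score steps are combined, so here the ratio is averaged over each 10-bin window and the windowed values are then standardised per chromosome.

```python
import numpy as np


def bin_counts(read_starts, chrom_len, bin_size=500):
    """Count reads whose start coordinate (0-based) falls in each fixed-size bin."""
    n_bins = int(np.ceil(chrom_len / bin_size))
    idx = np.asarray(read_starts, dtype=int) // bin_size
    return np.bincount(idx, minlength=n_bins)[:n_bins].astype(float)


def mfa_zscores(exp_counts, stat_counts, exp_total, stat_total, window_bins=10):
    """Per-chromosome Z-scores of the exponential/stationary read ratio.

    exp_total and stat_total are the total read library sizes used for scaling.
    window_bins=10 corresponds to a 5 kb window of 0.5 kb bins; the window
    advances one bin (500 bp) at a time. Assumed interpretation, see lead-in.
    """
    exp_norm = np.asarray(exp_counts, dtype=float) / exp_total
    stat_norm = np.asarray(stat_counts, dtype=float) / stat_total
    ratio = np.full_like(exp_norm, np.nan)
    covered = stat_norm > 0            # avoid division by zero in empty bins
    ratio[covered] = exp_norm[covered] / stat_norm[covered]

    n_windows = len(ratio) - window_bins + 1
    if n_windows < 1:
        raise ValueError("chromosome is shorter than one sliding window")
    smoothed = np.array(
        [np.nanmean(ratio[i:i + window_bins]) for i in range(n_windows)]
    )
    # Standardise within the chromosome: mean 0, standard deviation 1.
    return (smoothed - np.nanmean(smoothed)) / np.nanstd(smoothed)
```

Under this reading, positive Z-scores mark regions over-represented in replicating (exponentially growing) cells relative to stationary cells, which is how peaks at origins would appear in the plotted profiles.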

Detection of cells in S phase
Kogoma T & von Meyenburg K (1983)

Table S1: Primers used in this study

[...] of the indicated cell lines in the presence or absence of RAP; cells were seeded at 10^5 cells/ml on day 0 and re-seeded every 4-5 days to complete five passages (P1 to P5); growth profile was also evaluated after cells were kept in culture for more than 15 passages (>P15); cell density was assessed every 24 h and error bars depict standard error of the mean (S.E.M.). (B) Representative histograms from FACS analysis to determine the distribution of the cell population according to DNA content in cells kept in culture for more than 15 passages; 30,000 cells were analysed per sample; 1C and 2C indicate single DNA content (G1) and double DNA content (G2/M), respectively.

Figure S13: Genome-wide mapping of replication initiation upon RAD51 and RAD51-3 KO. Graphs show the distribution of sites of DNA synthesis initiation across the indicated chromosomes in the indicated cell lines, in each case grown in the absence (-RAP) or the presence (+RAP) of rapamycin. MFA-seq is represented by Z-scores across the chromosomes, calculated by comparing read depth coverage of DNA from exponentially growing cells relative to stationary cells; the bottom track for each chromosome displays coding sequences, with genes transcribed from right to left in red, and from left to right in blue.