Interhomolog polymorphism shapes meiotic crossover within RAC1 and RPP13 disease resistance genes

During meiosis chromosomes undergo DNA double-strand breaks (DSBs), which can produce crossovers via interhomolog repair. Meiotic recombination frequency is variable along chromosomes and concentrates in narrow hotspots. We mapped crossovers within Arabidopsis thaliana hotspots located within the RAC1 and RPP13 disease resistance genes, using varying haplotypic combinations. We observed a negative non-linear relationship between interhomolog divergence and crossover frequency, consistent with polymorphism suppressing crossover repair of DSBs. Anti-recombinase mutants fancm, recq4a recq4b, figl1 and msh2, or lines with increased HEI10 dosage, are known to show increased crossovers. Surprisingly, RAC1 crossovers were either unchanged or decreased in these genetic backgrounds. We employed deep-sequencing of crossovers to examine recombination topology within RAC1, in wild type, fancm and recq4a recq4b mutant backgrounds. The RAC1 recombination landscape was broadly conserved in anti-recombinase mutants and showed a negative relationship with interhomolog divergence. However, crossovers at the RAC1 5’-end were relatively suppressed in recq4a recq4b backgrounds, indicating that local context influences recombination outcomes. Our results demonstrate the importance of interhomolog divergence in shaping recombination within plant disease resistance genes and crossover hotspots.

[...] pathways show increased non-interfering crossovers, which are also known as class II events (Mercier et al, 2015). This likely occurs as a consequence of reduced dissolution of interhomolog strand invasion events, which are alternatively repaired by the non-interfering crossover pathway(s), including MUS81 (Berchowitz et al, 2007; Higgins et al, 2008). Hence, alternative repair pathways act on SPO11-dependent DSBs during meiosis to balance crossover and non-crossover outcomes.

Due to the formation of interhomolog joint molecules during recombination, sequence polymorphisms between recombining chromosomes can result in mismatched base pairs (Chakraborty & Alani, 2016). During the mitotic cell cycle, DNA mismatches or short insertion-deletions (indels) caused by base mis-incorporation during replication, or exogenous DNA damage, can be detected by MutS-related heterodimers (Harfe & Jinks-Robertson, 2000). MutS recognition of mismatches and the subsequent promotion of repair plays a major anti-mutagenic role in vivo (Harfe & Jinks-Robertson, 2000). MutS complexes also play anti-crossover roles during meiosis when heterozygosity leads to sequence mismatches, following interhomolog strand invasion (Alani et al, 1994; Hunter et al, 1996; Emmanuel et al, 2006). Accumulating evidence also indicates that class I and II crossover repair pathways show differential sensitivity to levels of interhomolog polymorphism. For example, Arabidopsis fancm mutations show increased crossovers in inbred, but not in hybrid contexts, whereas figl1 and recq4a recq4b mutations are effective at increasing crossovers in both inbreds and in hybrids (Ziolkowski et [...]

[...] located on the edge of pericentromeric heterochromatin, in a region of higher than average crossover frequency and interhomolog polymorphism (Fig. 1A).

We next examined the RAC1 and RPP13 regions using genome-wide maps of chromatin [...] occupancy was assessed using MNase-seq data and observed to be enriched within the gene exons and depleted within intergenic promoter, intron and terminator regions (Fig. 1B).

Both RAC1 and RPP13 show low levels of DNA methylation, in contrast to the ATENSPM3 EnSpm/CACTA (AT1TE36570) element adjacent to RAC1, which is densely cytosine methylated, nucleosome-dense and suppressed for SPO11-1-oligos (Fig. 1B). The RAC1 promoter intergenic region contains short fragments of several transposable elements, including HELITRONY3 and ATREP15 Helitrons (Fig. 1B). Transposable elements in these families have low DNA methylation, are nucleosome-depleted and show higher levels of [...]

Crossover hotspots within the RAC1 and RPP13 disease resistance genes

In order to experimentally measure crossover frequency and patterns within RAC1 and RPP13 we used pollen-typing. This method uses allele-specific PCR amplification from F1 hybrid genomic DNA extracted from gametes, in order to quantify and sequence crossover molecules (Fig. 2A [...] from individuals that are heterozygous over a known crossover hotspot (Fig. 2A). Allele-specific primers annealing to polymorphic sites flanking the region of interest are used to PCR amplify single crossover or parental molecules, using diluted DNA samples (Fig. 2A).

Titration is used to estimate the concentrations of amplifiable crossover and parental molecules. These values are used to calculate genetic distance (cM = (crossovers/(crossovers+parentals))×100) (Fig. 2A). Sanger sequencing of PCR products amplified from single crossover molecules is performed to identify internal recombination points, to the resolution of individual polymorphisms (Fig. 2A). Together this information describes the recombination rate (cM/Mb) topology throughout the PCR amplified region (Choi et al, 2017). It is also possible to mass amplify crossover molecules, which may be pooled and then used with high throughput paired-end sequencing to identify crossover locations (Fig. 2A).
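The genetic distance calculation above can be illustrated with a minimal Python sketch. The function names and the titration counts (11 crossovers and 19,989 parentals) are hypothetical and chosen only for illustration; the formula is the one given in the text, and the 0.055 cM and 5,626 bp values are those reported for the RPP13 amplicon in this study.

```python
def genetic_distance_cm(crossovers: float, parentals: float) -> float:
    """Genetic distance: cM = crossovers / (crossovers + parentals) x 100."""
    total = crossovers + parentals
    if total <= 0:
        raise ValueError("at least one amplifiable molecule is required")
    return crossovers / total * 100.0


def rate_cm_per_mb(cm: float, length_bp: int) -> float:
    """Convert a genetic distance (cM) over a region into a rate in cM/Mb."""
    if length_bp <= 0:
        raise ValueError("region length must be positive")
    return cm / (length_bp / 1_000_000)


# Hypothetical titration result: 11 crossover and 19,989 parental molecules.
example_cm = genetic_distance_cm(11, 19_989)

# Reported RPP13 values: 0.055 cM across a 5,626 bp amplicon.
rpp13_rate = rate_cm_per_mb(0.055, 5_626)

print(f"{example_cm:.3f} cM, {rpp13_rate:.2f} cM/Mb")
```

With the reported RPP13 values, the conversion gives 9.78 cM/Mb, matching the figure quoted for the amplicon.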

To investigate whether RPP13 was associated with crossover hotspots we designed and optimised Col/Ler allele-specific oligonucleotides flanking this resistance gene (Fig. S1). We performed DNA titrations to quantify crossover and parental molecules across RPP13 and observed a genetic distance of 0.055 cM, equivalent to 9.78 cM/Mb across the 5,626 bp amplicon (Table S1). To analyse Sanger sequenced crossovers we plot their frequency against panmolecules, where we include all bases from both accessions (Fig. S2 and Tables S2-S5). For example, the RPP13 amplicon is 5,431 bp in Col, 5,526 bp in Ler and 5,626 bp in the Col/Ler panmolecule, with 195 inserted bases from Ler and 100 from Col (Fig. S2 and Table S5). We sequenced 44 single crossover molecules and observed clustering of recombination events at the 5'-end of RPP13, overlapping regions encoding the coiled-coil and NB-ARC domains (Fig. 2C). RPP13 shows a peak crossover rate of 125 cM/Mb (Table S6), compared to the genome average of 4.82 cM/Mb for male Col/Ler F1 hybrids (Drouaud et al, 2007). Crossovers were also observed in the adjacent gene At3g46540 (Fig. 2C). One crossover was observed in a single 5 bp interval within At3g46540, which results in a very high recombination estimate (Table S6). However, this likely reflects a sampling artefact, rather than the presence of a true hotspot. The region of highest crossover activity within RPP13 overlaps with nucleosome-occupied, H3K4me3-modified exon sequences (Fig. 2D). In contrast, highest SPO11-1-oligos occur in flanking nucleosome-depleted intergenic regions (Fig. 2D).
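To illustrate why a single crossover in a very short interval can inflate a rate estimate, the following Python sketch computes an interval rate under one assumed convention: the amplicon genetic distance is apportioned to each interval by its share of sequenced crossovers, then divided by the interval length. This convention and the function name are assumptions made for illustration, and the output is not claimed to reproduce the values in Table S6.

```python
def interval_rate_cm_per_mb(n_in_interval: float, n_total: int,
                            amplicon_cm: float, interval_bp: int) -> float:
    """Assumed convention: interval cM = amplicon cM x (share of crossovers),
    then divide by the interval length in Mb."""
    if n_total <= 0 or interval_bp <= 0:
        raise ValueError("n_total and interval_bp must be positive")
    interval_cm = amplicon_cm * (n_in_interval / n_total)
    return interval_cm / (interval_bp / 1_000_000)


# One of 44 sequenced crossovers, 0.055 cM amplicon (values from the text).
short_interval = interval_rate_cm_per_mb(1, 44, 0.055, 5)     # 5 bp interval
long_interval = interval_rate_cm_per_mb(1, 44, 0.055, 500)    # hypothetical 500 bp interval

print(f"5 bp: {short_interval:.1f} cM/Mb, 500 bp: {long_interval:.1f} cM/Mb")
```

Under this assumption the same single event gives a rate one hundred times higher in the 5 bp interval than in a 500 bp interval, which is why the estimate is best treated as a sampling artefact.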

The RAC1 gene is located within a 9,482 bp (Col/Ler) pollen-typing PCR amplicon (Fig. 3). We previously reported analysis of 181 single crossover molecules within the RAC1 amplicon (Choi et al, 2016), which we combined with an additional 59 events here to give a total of 240 crossovers (Table S8). We observed a peak rate of 61.7 cM/Mb within RAC1 (Fig. 3A and Table S8). An adjacent gene contained within the amplicon, At1g31550 (GDSL), also showed intragenic crossovers (Fig. 3A) [...] (Fig. 3A and Tables S8-S10). In each case, we calculated the number of crossovers and polymorphisms in adjacent 500 bp windows (Fig. 4 and Table S11). SNPs were counted as one and indels were counted according to their length. In all cases, a significant negative relationship was observed between crossovers and polymorphisms (all RAC1 windows, Spearman's r=-0.685, P=1.11×10^-8) (Fig. 4 and Table S11). This was also seen when analysing the RPP13 Col×Ler data in the same manner (Spearman's r=-0.890, P=2.43×10^-4) (Fig. 4E and Table S12). We fitted a non-linear model to the data using the formula y = log(a)+b×x^(-c), where y is the number of crossovers, x is polymorphisms, a is the intercept and b and c are scale parameters. Together this shows a strong, negative, non-linear association between interhomolog polymorphism density and crossover frequency within RAC1 and RPP13.

Interestingly, the RAC1 promoter region, which contains transposon fragments, shows higher [...] (Fig. 5 and Tables S14-S17). The mean number of crossovers and parental molecules per μl were used to test for significant differences, by constructing 2×2 contingency tables and performing Chi-square tests. We compared three biological replicates of wild type Col/Ler F1 hybrids using this method, which did not show significant differences (Fig. 5A-5C and Tables S14-S17 [...] observed that RAC1 genetic distance significantly decreased in both the anti-recombination single mutants recq4a recq4b, fancm, figl1 and msh2, and the multiple mutants recq4a recq4b fancm and figl1 fancm (Fig. 5A-5C and Tables S14-S17). Furthermore, when we compared wild type with lines containing additional HEI10 copies we did not observe a significant difference in RAC1 crossover frequency (Fig. 5D and Tables S14-S17). Therefore, in backgrounds with increased class I (HEI10) or class II (fancm, figl1, recq4a recq4b) crossovers, the RAC1 hotspot is unexpectedly resistant to increasing recombination frequency.

RAC1 crossover topology in fancm and recq4a recq4b anti-recombinase mutants

To analyse RAC1 crossover distributions in wild type versus fancm, recq4a recq4b and recq4a recq4b fancm anti-recombinase mutants, we mass amplified crossovers and performed pollen-seq (Choi et al, 2017, 2016). In this approach, allele-specific PCR amplification is performed using multiple independent reactions seeded with an estimated ~1-3 crossovers per reaction (Fig. 2A). Crossover concentrations are first estimated using previous titration experiments (Fig. 5 and Table S14). Mass amplified allele-specific PCR products are then pooled, sonicated and used for sequencing library construction (Choi et al, 2017, 2016). These libraries are subjected to paired-end 2×75 bp read sequencing (Table S18). We generated two biologically independent libraries for each genotype, sampling either ~300 or ~1,000 crossovers (Fig. 6A). [...] was expected due to the allele-specific primer orientation used during pollen-typing amplification. Consistent with these read pairs representing crossover molecules, their width distributions are similar to that of the sonicated PCR amplification products, prior to adapter ligation (Fig. S4).
The crossover reads were then matched to the Col/Ler panmolecule, and counts were added to intervening sequences. These values were then normalized by dividing by the total number of crossover read pairs per library. Finally, the profiles were weighted by RAC1 genetic distance (cM), measured previously via DNA titration [...] (Fig. 6A, Fig. S5 and Table S15). Crossovers occurred mainly within the gene transcribed regions and were reduced in the highly polymorphic intergenic regions, in all genotypes (Fig. 6A and Fig. S5). In wild type, highest crossover frequency was observed at the RAC1 5'-end, with distinct peaks associated with the first and second exons, in addition to elevated crossovers occurring within the last three LRR domain-encoding exons (Fig. 6A and Fig. S5). Crossovers were also evident at the 5' and 3' ends of the adjacent gene (GDSL), although at a lower level than those observed in RAC1 (Fig. 6A and Fig. S5). In fancm the crossover profile was similar, except in the first RAC1 exon, where crossover frequency was reduced compared to wild type (Fig. 6A). In the recq4a recq4b and fancm recq4a recq4b mutants we observed that the RAC1 5' crossover peaks in exons 1 and 2 were both reduced (Fig. 6A). The RAC1 LRR crossover peaks in recq4a recq4b and fancm recq4a recq4b backgrounds were also less broad and became focused towards the end of exon 5 (Fig. 6A). Crossovers at the 5'-end of GDSL were also reduced in the recq4a recq4b and fancm recq4a recq4b mutants (Fig. 6A).
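The profile construction described at the start of this section (adding counts between read pairs, normalizing by library size and weighting by genetic distance) can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: the function name, the 1-based inclusive coordinates and the choice to apply the cM weighting as a simple multiplication are all assumptions.

```python
def crossover_profile(read_pairs, pan_length, genetic_distance_cm):
    """Build a per-base crossover profile along a panmolecule.

    read_pairs: list of (start, end) pancoordinates (assumed 1-based, inclusive)
                spanning the sequence between the two reads of a crossover pair.
    Returns a list of length pan_length.
    """
    if not read_pairs:
        raise ValueError("at least one crossover read pair is required")
    counts = [0.0] * pan_length
    for start, end in read_pairs:
        low, high = sorted((start, end))
        if low < 1 or high > pan_length:
            raise ValueError("read pair lies outside the panmolecule")
        # Add a value of 1 to every position between the paired reads.
        for position in range(low - 1, high):
            counts[position] += 1.0
    n_pairs = len(read_pairs)
    # Normalize by library size, then weight by genetic distance (assumed product).
    return [count / n_pairs * genetic_distance_cm for count in counts]


# Toy example: two read pairs on a 10 bp panmolecule, 0.5 cM (hypothetical values).
profile = crossover_profile([(2, 5), (4, 8)], pan_length=10, genetic_distance_cm=0.5)
print(profile)
```

Positions covered by both toy read pairs receive the full weight, positions covered by one receive half, and uncovered positions stay at zero.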

To investigate the relationship between crossovers and polymorphisms, we calculated recombination (crossover read pairs/cM) and polymorphism values in 250 bp windows, against the RAC1 Col/Ler panmolecule. Consistent with our previous observations, all genotypes showed a significant negative correlation between crossovers and polymorphism levels (250 bp adjacent windows, Spearman's: WT r=-0.636, fancm r=-0.580, recq4a recq4b r=-0.567, fancm recq4a recq4b r=-0.568) (Table S19). As described above, a non-linear model was fitted to the data using the formula y = log(a)+b×x^(-c), where y is the crossovers, x is polymorphisms, a is the intercept and b and c are scale parameters (Fig. 6B). Hence, the suppressive effect of polymorphisms was observed at RAC1 in both wild type and anti-recombinase mutant backgrounds.

At RAC1 and RPP13 we observe a strong non-linear, negative relationship between interhomolog polymorphism and crossover formation. Hence, polymorphisms may contribute to variation in crossover:non-crossover ratios, by suppressing crossover maturation downstream of DSB formation. This is consistent with observations made in budding yeast where progressive addition of SNPs at the URA3 hotspot reduced crossovers, with a simultaneous increase in noncrossovers (Borts & Haber, 1987 [...] acting as a hybrid-specific anti-recombinase at the megabase scale (Emmanuel et al, 2006).

The inhibitory effect of interhomolog polymorphism on crossover formation may also account for discrepancies observed between SPO11-1-oligos and crossovers at the fine-scale (Choi et al, 2018 [...]

In addition to interhomolog polymorphism, chromatin marks may differentially influence meiotic recombination pathways and alter crossover:noncrossover ratios. For example, we observe that H3K4me3 is elevated at the 5'-ends of RAC1, GDSL and RPP13, which correlates with regions of high crossover activity. It is also notable, however, that substantial 3'-crossovers occur in these genes, where H3K4me3 occurs at lower levels. Although H3K4me3 levels do not strongly correlate with SPO11-oligo levels in animals, fungi or plants (Lange et [...]

In this study we measured RAC1 crossover frequency in backgrounds with (i) elevated HEI10 dosage and thereby increased class I activity (Ziolkowski et [...]

[...] of low recombination that are under selection (Barton & Charlesworth, 1998). Therefore, the relationship between interhomolog polymorphism and crossovers is complex, with both negative and positive relationships, depending on the species and scale analysed.

Plant material

Arabidopsis lines used in this study were the HEI10 line 'C2' (Ziolkowski et al, 2017), recq4a-4 (Col, N419423) recq4b-2 (Col, N511130) (Hartung et al, 2007), recq4a (Ler W387*) (Séguéla-Arnaud et al, 2015), fancm-1 (Col, 'roco1') (Crismani et al, 2012), fancm (Ler, ml20), figl1-1 (Col, 'roco5') (Girard et al, 2015), figl1 (Ler, ml80) and msh2-1 (Col, SALK_002708) (Leonard et al, 2003). Genotyping of Col recq4a-4, Col recq4b-2, Ler recq4a and HEI10 T-DNA was performed as described (Serra et al, 2018). Col and Ler wild type or mutant backgrounds were crossed to obtain F1 hybrids, on which pollen-typing was performed. The msh2-1 allele was introduced into Ler-0 background by six successive backcrosses. Genotyping of msh2-1 was performed by PCR amplification using msh2-F (5'-AGCGCAATTTGGGCATGTCT-3') and msh2-R (5'-CCTCCCATGTTAGGCCCTGTT-3') oligonucleotides for the wild type allele, and msh2-F and msh2-T-DNA (5'-ATTTTGCCGATTTCGGAAC-3') oligonucleotides for the msh2-1 allele.

RPP13 and RAC1 pollen-typing and Sanger sequencing

Pollen-typing was performed as described [...]. Genomic DNA was extracted from hybrid F1 pollen (Col×Ler, Col×Wl or Col×Mh), and used for nested PCR amplifications using parental or crossover configurations of allele-specific oligonucleotide primers (Tables S20-[...] polymorphisms. For analysis we PCR amplified and sequenced the target regions from Col, Ler, Wl and Mh accessions, and used these data to generate Col×Ler, Col×Wl or Col×Mh panmolecules, which include all bases from both accessions (Fig. S2). To analyse the relationship between crossovers and polymorphisms we used adjacent 500 bp windows along the panmolecules and assigned crossover and polymorphism counts, where SNPs were counted as 1, and indels as their length in base pairs.
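The polymorphism counting rule just described (SNPs counted as 1, indels as their length, summed in adjacent 500 bp windows) can be sketched in Python as below. The function name and input layout are hypothetical, and assigning each indel wholly to the window containing its start position is an assumption, since the text does not state how indels spanning a window boundary were handled.

```python
def polymorphisms_per_window(polymorphisms, pan_length, window=500):
    """Sum polymorphism weights in adjacent windows along a panmolecule.

    polymorphisms: list of (position, kind, length) tuples, where position is a
                   1-based pancoordinate, kind is "snp" or "indel", and length
                   is the indel length in bp (ignored for SNPs).
    """
    n_windows = -(-pan_length // window)  # ceiling division
    totals = [0] * n_windows
    for position, kind, length in polymorphisms:
        if not 1 <= position <= pan_length:
            raise ValueError("polymorphism lies outside the panmolecule")
        if kind not in ("snp", "indel"):
            raise ValueError(f"unknown polymorphism kind: {kind}")
        weight = 1 if kind == "snp" else length
        # Assumption: the whole weight goes to the window holding the start position.
        totals[(position - 1) // window] += weight
    return totals


# Toy example on a hypothetical 1,200 bp panmolecule.
toy = [(10, "snp", 1), (499, "indel", 7), (501, "snp", 1), (1100, "indel", 3)]
print(polymorphisms_per_window(toy, pan_length=1200))
```

The resulting per-window totals are the x values against which windowed crossover counts would be compared.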
When crossover events were detected in SNP intervals that overlapped window divisions, the crossover number was divided between the windows in proportion to the length of the interval falling in each window. For example, if two crossovers were detected in a 150 bp interval, of which 50 bp were in window A and 100 bp in window B, we counted 2×(50/150) = 0.67 crossovers in window A, and 2×(100/150) = 1.33 crossovers in window B. A non-linear model was fitted to the data using the formula y = log(a)+b×x^(-c), where y is the number of crossovers, x is polymorphisms, a is the intercept and b and c are scale parameters.

RAC1 mass crossover sequencing

Multiple independent RAC1 crossover PCR amplifications were performed, where each reaction was estimated to contain between 1 and 3 crossover molecules, based on previous titration experiments. RAC1 crossover amplification products were then pooled, concentrated by isopropanol precipitation and gel purified. [...] only. To identify crossover read pairs, we filtered for read pairs having a centromere-proximal match to Col and a centromere-distal match to Ler, on opposite strands, which is consistent with RAC1 pollen-typing amplification. Read pair coordinates were then converted into pancoordinates using the Col/Ler key table (Table S2). A value of 1 was assigned to all panmolecule coordinates between each crossover read pair. This process was repeated for all read pairs, and final values were normalized by the total number of crossover read pairs and finally weighted by genetic distance (cM).

Data Access

The fastq files associated with mass crossover sequencing have been uploaded to ArrayExpress accession E-MTAB-6333 ("Meiotic crossover landscape within the RAC1 disease resistance gene").