Crystal structure of an RNA/DNA strand exchange junction

Short segments of RNA displace one strand of a DNA duplex during diverse processes including transcription and CRISPR-mediated immunity and genome editing. These strand exchange events involve the intersection of two geometrically distinct helix types—an RNA:DNA hybrid (A-form) and a DNA:DNA homoduplex (B-form). Although previous evidence suggests that these two helices can stack on each other, it is unknown what local geometric adjustments could enable A-on-B stacking. Here we report the X-ray crystal structure of an RNA-5′/DNA-3′ strand exchange junction at an anisotropic resolution of 1.6 to 2.2 Å. The structure reveals that the A-to-B helical transition involves a combination of helical axis misalignment, helical axis tilting and compression of the DNA strand within the RNA:DNA helix, where nucleotides exhibit a mixture of A- and B-form geometry. These structural principles explain previous observations of conformational stability in RNA/DNA exchange junctions, enabling a nucleic acid architecture that is repeatedly populated during biological strand exchange events.


Introduction
Although structural and mechanistic information is available for various types of DNA strand exchange processes [1-8], comparatively little is known about RNA/DNA strand exchange. In this reversible process, a strand of RNA hybridizes to one strand of a DNA duplex while displacing the other strand, requiring concomitant disruption of DNA:DNA base pairs and formation of RNA:DNA base pairs. This process occurs most notably at the boundaries of R-loops, such as those left by transcriptional machinery [9], those employed by certain transposons [10,11], or those created by CRISPR-Cas (clustered regularly interspaced short palindromic repeats, CRISPR-associated) enzymes during prokaryotic immunity or eukaryotic genome editing [12-15]. Structural insight into RNA/DNA strand exchange could therefore improve our understanding of how transcriptional R-loops are resolved and how CRISPR-Cas enzymes such as Cas9 manipulate R-loops to efficiently reject off-target DNA and recognize on-target DNA.
The defining feature of RNA/DNA strand exchange is the junction where the RNA:DNA helix abuts the DNA:DNA helix. Previous experiments on exchange junctions containing an RNA-5′ end and a DNA-3′ end (an "RNA-5′/DNA-3′ junction," which is the polarity generated by Cas9) showed the component DNA:DNA duplex to be more thermodynamically stable than a free DNA helix end, perhaps due to interhelical RNA:DNA/DNA:DNA stacking [16]. While stacking in DNA-only junctions is thought to occur as it would in an uninterrupted B-form duplex [8,17,18], an analogous structural prediction cannot be made for RNA/DNA junctions because the two component helices are predisposed to different geometries: B-form for the DNA:DNA helix and a variant of A-form for the RNA:DNA helix [19-21]. A conformation that preserves base stacking across such a junction must reconcile base pairs that are flat and centered (B-form) with base pairs that are inclined and displaced from the helical axis (A-form). While prior structural studies of Okazaki fragments reckoned with a similar geometric puzzle [22], Okazaki fragments bear an RNA-3′/DNA-5′ polarity (opposite of the polarity addressed here) and lack the strand discontinuity that defines exchange junctions. Thus, the structural basis for the putative stacking-based stability in RNA-5′/DNA-3′ junctions remains unknown.
Here we present the X-ray crystal structure of an RNA-5′/DNA-3′ strand exchange junction, which undergoes an A-to-B transition without loss of base pairing or stacking across the exchange point. This structure reveals the principles of global helical positioning and local adjustments in nucleotide conformation that allow RNA:DNA duplexes to stack on DNA:DNA duplexes in the RNA-5′/DNA-3′ polarity. This model also complements previously determined cryo-electron microscopy structures of DNA-bound Cas9 for which poor local resolution in the original maps prevented accurate modeling of the leading R-loop edge.

Results
Inspired by previous crystallographic studies of double-stranded DNA dodecamers [23,24], we designed crystallization constructs that contained a "template" DNA strand (12 nucleotides) and two "exchanging" RNA and DNA oligonucleotides that were complementary to each half of the template DNA strand. In different versions of these constructs, we varied the polarity (RNA-5′/DNA-3′ vs. RNA-3′/DNA-5′) and the internal termini, which were either flush (exchanging oligonucleotides were 6-mers) or extended with a one-nucleotide flap that was not complementary to the template strand (exchanging oligonucleotides were 7-mers, "flapped"). Only the flapped construct in the RNA-5′/DNA-3′ polarity (Fig 1A) yielded well-diffracting crystals (anisotropic resolution of 1.6 to 2.2 Å). Thus, all results discussed here describe a flapped RNA-5′/DNA-3′ strand exchange junction, which is the polarity previously observed to stabilize the component DNA:DNA duplex [16].
We determined the X-ray crystal structure of the exchange junction (Table 1, S1 Fig). In this structure, the asymmetric unit contains three molecules (a "molecule" comprises one DNA 12-mer and its complementary RNA and DNA 7-mers). The crystal lattice is largely stabilized by nucleobase stacking interactions both within and between molecules. Along one lattice direction, Molecules 1 and 2 form a continuous network of stacked helices, in which the external RNA:DNA duplex terminus of each Molecule 1 stacks on the equivalent terminus of Molecule 2, with a similar reciprocal interaction for the external DNA:DNA duplex termini (a "head-to-head" and "foot-to-foot" arrangement) (Fig 1B). Along another lattice direction, symmetry-related instances of Molecule 3 create a head-to-foot helical network (Fig 1C). Compared to Molecules 1 and 2, Molecule 3 is poorly ordered (Fig 1D), and its atomic coordinates appear less constrained by the data due to diffraction anisotropy (see Methods). In the Molecule 3 helical network, two base pairs formed between the flapped nucleotides of Molecules 1 and 2 bridge the duplex ends. The bridging nucleotides form a type I adenine-adenine (ribonucleotide) base pair and a type XV hemiprotonated cytosine-cytosine (deoxyribonucleotide) base pair [25] (Fig 1C and 1E).
The three molecules of the asymmetric unit exhibit canonical Watson-Crick base pairing at all twelve nucleotides of the template DNA strand, and they are generally similar in conformation (RMSD(Mol1,Mol2) = 0.70 Å; RMSD(Mol1,Mol3) = 1.5 Å; RMSD(Mol2,Mol3) = 1.8 Å) (Fig 2A). The most dramatic differences are between Molecules 1/2 and Molecule 3. For example, Molecule 3's flapped nucleotides form no intermolecular base pairs, and the conformation of the DNA flap is flipped relative to Molecules 1/2. Additionally, the external three base pairs of Molecule 3's DNA:DNA helix tilt slightly toward the major groove as compared to the equivalent […] (Fig 1C and 1E) and intramolecular stacking (Fig 2B), but also by hydrogen bonds between sugar hydroxyls and backbone phosphates. Specifically, at the junction-proximal phosphodiester within the DNA:DNA helix, the pro-Sp and pro-Rp oxygens are hydrogen-bonded to the terminal 3′ hydroxyl of the flapped DNA nucleotide and the terminal 5′ hydroxyl of the flapped RNA nucleotide, respectively. Additionally, the pro-Sp oxygen of the flapped DNA nucleotide is hydrogen-bonded to the 2′ hydroxyl of the flapped RNA nucleotide (Fig 2B). If the flaps were longer than one nucleotide, as would occur during biological strand exchange events, the hydrogen bonds to the terminal 3′/5′ hydroxyls would be perturbed. However, in Molecule 3, the flipped deoxycytidine conformation precludes all the mentioned extrahelical hydrogen bonds, yet the base-paired nucleotides within the junction are conformationally similar to the same region in Molecules 1 and 2 (Fig 2A). Therefore, we expect that the structural features of interest to this work (that is, the conformation of the base-paired nucleotides immediately adjacent to the junction) would be populated by junctions bearing flush RNA/DNA ends or flaps of arbitrary length.
On the other hand, the flap conformations and the intermolecular base pairs observed here are peculiarities of the crystal lattice. During biological strand exchange processes, these overhung nucleotides would be unpaired and disordered [8].
To understand the nature of the transition in helical geometry across the junction, we performed alignments of regularized A-form and B-form DNA:DNA helices with the observed RNA:DNA and DNA:DNA helices, respectively. These alignments revealed that the DNA:DNA helix closely approximates perfect B-form geometry, especially in the nucleotides closest to the junction (Fig 3A-3C). Likewise, the RNA strand of the RNA:DNA helix closely approximates A-form geometry (Fig 3A-3C). On the other hand, the DNA strand of the RNA:DNA helix deviates from its A-form trajectory in the three nucleotides that approach the exchange point, where the backbone is compressed toward the minor groove (Fig 3B and 3D).
Interestingly, calculation of zP, a geometric parameter that differentiates A-form from B-form base steps [26], indicated that the RNA:DNA base step adjacent to the exchange point is A-like, while the base steps in the center of the RNA:DNA helix are intermediate in their A/B character (Fig 4A). This result indicates an important distinction between strand trajectory (in terms of global alignment to a regularized A-form or B-form helix) and the local nucleotide conformations that underlie the trajectory. In the RNA:DNA helix, the departure from A-form trajectory observed at junction-adjacent nucleotides appears to result from non-A conformations at more junction-distal nucleotides. Other indicators of helical geometry also suggest a mixture of A and B character across the RNA:DNA helix (S2 Fig).
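As a rough illustration of how a base step might be labeled from this parameter, the following sketch averages the z-displacements of the two phosphorus atoms and applies cutoffs. The thresholds (1.5 Å and 0.5 Å) are commonly quoted guide values and are an assumption here, not values reported in this work; the function names are hypothetical.

```python
from statistics import mean

# Assumed guide values in Å; not taken from this paper.
A_FORM_MIN = 1.5   # steps at or above this are treated as A-like
B_FORM_MAX = 0.5   # steps at or below this are treated as B-like


def z_p(z_phosphorus_strand1, z_phosphorus_strand2):
    """Mean z-displacement (Å) of the two phosphorus atoms of a base step,
    measured from the step's reference xy-plane."""
    return mean([z_phosphorus_strand1, z_phosphorus_strand2])


def classify_step(zp):
    """Label a base step as A-like, B-like, or intermediate from its zP."""
    if zp >= A_FORM_MIN:
        return "A-like"
    if zp <= B_FORM_MAX:
        return "B-like"
    return "intermediate"
```

In practice, these values come directly from the output of a program such as X3DNA's "analyze"; the sketch only makes the classification logic explicit.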
To probe helical geometry with strand specificity, we calculated χ and δ, nucleotide torsion angles that differ in A-form vs. B-form helices [27]. These parameters revealed that the irregularities observed in the paired base step parameters (Fig 4A and S2 Fig) arise entirely from the template DNA strand, which flips between A- and B-like conformations within the RNA:DNA hybrid (Fig 4B and S3 Fig). In contrast, the RNA strand is entirely A-like, and all nucleotides of the DNA:DNA helix are B-like except at position 12 of the continuous strand, which is likely due to an end effect. These observations agree with the conclusions drawn from the alignments (Fig 3A), and they highlight the DNA strand of the RNA:DNA helix as the structure's most geometrically irregular region, which may enable the junction-adjacent deviation in trajectory.
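For readers unfamiliar with how such torsion angles are obtained from atomic coordinates, the sketch below computes a signed dihedral angle from four points using the standard atan2 formulation. It is an illustration only and not the procedure used in this work (which relied on X3DNA). By the usual nucleic acid convention, δ is the C5′-C4′-C3′-O3′ torsion and χ is O4′-C1′-N9-C4 for purines or O4′-C1′-N1-C2 for pyrimidines; the helper names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond.

    For delta, pass the coordinates of C5', C4', C3', O3' in that order.
    For chi, pass O4', C1', N9, C4 (purine) or O4', C1', N1, C2 (pyrimidine).
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)          # normal to the plane of p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

A δ value near 80° is typical of the C3′-endo sugars of A-form helices, whereas a value near 140° is typical of the C2′-endo sugars of B-form helices; real nucleotides scatter around these values.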
In addition to the distortions in the continuous DNA strand, the geometric switch also seems to depend on the break in the discontinuous strand, which facilitates a marked jump in the backbone trajectory across the exchange point (Fig 3C). This feature reflects a global jump in helical positioning that is visualized most clearly in the aligned regularized A-form and B-form duplexes, whose helical axes are tilted and misaligned with respect to each other (the helical axes are tilted from parallel by 14°, Mol1; 18°, Mol2; 2°, Mol3) (Figs 2A, 3B and 3C). Axis misalignment is detectable in the large positive y-displacement value across the central base step, which deviates dramatically from the expected value (0 Å) for either an A-form or B-form duplex (Fig 4C). This observation emphasizes the exchange point as a special base step with noncanonical alignment, made possible by discontinuity in the exchanging strands.

Discussion
Together, our data suggest that stacking an RNA:DNA helix on a DNA:DNA helix does not require deviation of the RNA strand or either strand of the DNA:DNA helix from their native A-form or B-form conformations, respectively. Instead, continuous stacking appears to result from a combination of three structural principles. First, alternating A-like and B-like nucleotide conformations in the hybrid's DNA strand compress the strand relative to a pure A-form trajectory (Figs 3B, 3D, 4B and 5A). Due to A-form base pair inclination (~20° from perpendicular to the helical axis) in RNA:DNA duplexes, the DNA naturally juts further along the helical axis than the RNA at the RNA-5′ end. This slanted RNA:DNA end can be stacked upon a flat DNA:DNA end through strand-specific compression, that is, compression of the hybrid's protruding DNA strand (Fig 5A). Second, an alternative to strand compression is to tilt the helical axes themselves, which occurs in Molecules 1 and 2 but not Molecule 3 (Figs 2A and 5A). Third, the helical centers are misaligned at the exchange point (Figs 3B, 3C and 4C), which effectively aligns the off-center base pairs of the A-form duplex with the centered base pairs of the B-form duplex (Fig 5B).
This new structure is best examined in the context of previous structural studies of RNA:DNA/DNA:DNA junctions emulating Okazaki fragments, which include a chimeric (covalently continuous) RNA-DNA strand. When crystallized, these fragments assumed an entirely A-form conformation, even within the DNA:DNA duplex [28-32]. However, in solution, Okazaki fragments resembled the present structure in that they were A-like within the RNA:DNA helix and B-like within the DNA:DNA helix [22,33-36]. Solution structures also exhibited a tilt between the RNA:DNA/DNA:DNA helical axes and intermediate nucleotide geometry within the DNA of the hybrid. Because intermediate geometry is a known feature of the DNA of any RNA:DNA hybrid [19,20], it may be the natural inclination of this more geometrically ambiguous strand to accommodate the A-to-B transition as it does in the present structure. Notably, dramatic misalignment of the RNA:DNA/DNA:DNA helical centers is observed only in the present structure and is likely enabled by the break in the exchanging strands, which is not a feature of Okazaki fragments.
Fig 4. (A) For a given base step, the parameter zP is the mean of the z-displacement of the two phosphorus atoms from the dimer's reference xy-plane. Note that zP is defined by a pair of dinucleotides, so there are only 11 data points for a 12-bp helix, and integral x-values lie between the base pairs in the diagram. This parameter was originally introduced for its utility in distinguishing A-form from B-form base steps. Black, DNA; red, RNA. (B) χ and δ are the two nucleotide torsion angles that best distinguish A-form from B-form geometry. Note that these torsion angles are defined for each individual nucleotide, so there are 24 data points for a 12-bp helix. Integers in red refer to individual nucleotides, as indicated in the schematic at the bottom. Dashed ellipses were drawn to match those depicted in [27]. (C) Y-displacement. Similar to zP, this parameter describes base steps (pairs of dinucleotides), not individual nucleotides. This parameter cannot distinguish A-form from B-form geometry. Instead, note that the base step across the exchange point dramatically departs from both A-form and B-form geometry. https://doi.org/10.1371/journal.pone.0263547.g004
Because stable stacking of another duplex on a DNA:DNA terminus is expected to inhibit duplex melting [37], the structural principles illuminated here may explain the rigidity that we previously observed in the DNA:DNA duplex of RNA-5′/DNA-3′ exchange junctions [16]. However, it is also possible that different sequences or environments promote different conformational preferences than those observed in this crystal structure. Previously, we also observed that the DNA:DNA duplex in junctions of the opposite polarity (RNA-3′/DNA-5′) is destabilized relative to a non-exchanging terminus [16]. Unfortunately, because that junction type failed to crystallize under our tested conditions, this odd asymmetry in junction structure remains unexplained.
Nevertheless, the stacked RNA-5′/DNA-3′ structure determined here represents a key conformation that is likely populated throughout RNA/DNA exchange events, including those mediated by the genome-editing protein Cas9. Branch migration is crucial to Cas9 target search, which involves repeated R-loop formation (RNA invades a DNA:DNA duplex) and resolution (DNA invades an RNA:DNA duplex) until the true target is located [15]. During this process, the leading R-loop edge likely passes through interhelically stacked states between base pair formation and breakage events. Consistent with this prediction, in some cryo-electron microscopy structures depicting Cas9-bound R-loops, the leading (RNA-5′/DNA-3′) R-loop edge appeared interhelically stacked [38,39]. While local resolution was insufficient to enable accurate atomic modeling of the exchange junction from the original electron microscopy maps, our high-resolution crystal structure provides a new geometric standard for modeling this kind of junction.
Importantly, exchange junctions are dynamic structures, and each time an R-loop grows or shrinks, stacking must be disrupted at the junction [8]. Thus, in addition to the stacked structure determined here, which can be interpreted as a ground state, strand exchange also requires passage through unstacked conformations, some of which may resemble the junction structures seen in other Cas9-bound R-loops [40,41]. A complete model of RNA/DNA strand exchange, then, will rely on a structural and energetic understanding of the junction in both stacked and unstacked states, and it will account for the effects of the proteins acting in R-loop formation and resolution.

Crystallization and data collection
Initial screens were performed using Nucleix and Protein Complex suites (Qiagen) in a sitting-drop setup, with 200 nL of sample added to 200 nL of reservoir solution by a Mosquito instrument (SPT Labtech) and incubated at either 4 °C or 20 °C. Several conditions yielded crystals within one day, and initial hits were further optimized at a larger scale. The crystal used for the final dataset was produced as follows: 0.5 μL of sample was combined with 0.5 μL reservoir solution (0.05 M sodium succinate (pH 5.3), 0.5 mM spermine, 20 mM magnesium chloride, 2.6 M ammonium sulfate) in a hanging-drop setup over 500 μL reservoir solution, and the tray was stored at 20 °C. Crystals formed within one day and remained stable for the 2.5 weeks between tray setting and crystal freezing. A crystal was looped, submerged in cryoprotection solution (0.05 M sodium succinate (pH 5.3), 0.5 mM spermine, 20 mM magnesium chloride, 3 M ammonium sulfate) for a few seconds, and frozen in liquid nitrogen. Diffraction data were collected under cryogenic conditions at the Advanced Light Source beamline 8.3.1 on a Pilatus3 S 6M (Dectris) detector.

Data processing, phase determination, and model refinement
Preliminary processing of diffraction images was performed in XDS [43,44]. Unmerged reflections underwent anisotropic truncation, merging, and anisotropic correction using the default parameters of the STARANISO server (v3.339) [45], and a preliminary structural model was included in the input to estimate the expected intensity profile. The best-fit cut-off ellipsoid imposed diffraction limits of 1.66 Å, 2.18 Å, and 1.64 Å based on a cut-off criterion of I/σ(I) = 1.2. The "aniso-merged" output MTZ file was used for downstream processing. Using programs within CCP4 (v7.1.015), Rfree flags were added to 5% of the reflections, and reflections outside the diffraction cut-off surface were removed.
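To make the idea of an ellipsoidal cut-off surface concrete, the following sketch tests whether a reflection falls inside an ellipsoid whose semi-axes are the reciprocals of the three diffraction limits. It is a simplified illustration, not the STARANISO algorithm: it assumes that the reflection's reciprocal-space vector is already expressed along the ellipsoid's principal axes, and the function name is hypothetical.

```python
def inside_cutoff(s, d_limits):
    """Return True if a reflection lies within the cut-off ellipsoid.

    s        -- reciprocal-space vector (1/Å), with components given along
                the ellipsoid's principal axes (an assumption of this sketch)
    d_limits -- diffraction limits (Å) along the same three axes

    The semi-axis along direction i is 1/d_i, so the ellipsoid condition
    sum((s_i / (1/d_i))**2) <= 1 becomes sum((s_i * d_i)**2) <= 1.
    """
    return sum((s_i * d_i) ** 2 for s_i, d_i in zip(s, d_limits)) <= 1.0


# Limits reported above for the best-fit ellipsoid.
LIMITS = (1.66, 2.18, 1.64)

# A reflection at 1.7 Å spacing along the first axis is retained,
# while a reflection at 2.0 Å spacing along the second axis is cut.
kept = inside_cutoff((1 / 1.7, 0.0, 0.0), LIMITS)
cut = not inside_cutoff((0.0, 1 / 2.0, 0.0), LIMITS)
```

This direction dependence is why a dataset can extend to roughly 1.6 Å in some directions while remaining incomplete in the high-resolution shells overall.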
Phases were determined by molecular replacement with Phaser-MR [46], as implemented in Phenix v1.19.2-4158 [47]. The search model comprised two components (unconstrained with respect to each other), both generated in X3DNA v2.4 [48] and each representing one half of the base-paired portion of the crystallization construct. The first component was a 6-base-pair RNA:DNA duplex with perfect A-form geometry and sequence 5′-GCUUAC-3′/5′-GTAAGC-3′ (created using the program "fiber" with the -rna option, followed by manual alteration of the DNA strand in PyMOL v2.4.1). The second component was a 6-base-pair DNA:DNA duplex with perfect B-form geometry and sequence 5′-GATGCT-3′/5′-AGCATC-3′ (created with "fiber" option -4). Successful phasing was achieved by searching for three copies of each of these components (six components total). Additional phosphodiesters and nucleotides were built in Coot v0.9.2 [49], and the model underwent iterative refinements in Phenix. Phasing and preliminary refinements were initially performed using an earlier (lower-resolution) dataset that had similar unit cell parameters to the final dataset described above.
The initial model, which was refined into a map generated from the earlier dataset, was rigid-body docked into the final-dataset-derived map and underwent further iterative refinements, beginning with resetting of the atomic B-factors, simulated annealing, and addition of ordered solvent. Non-crystallographic symmetry restraints were applied in early rounds of refinement to link the torsion angles of the three molecules within the asymmetric unit; these restraints were removed in the final rounds of refinement. TLSMD [50,51] was used to determine optimal segmentation for Translation/Libration/Screw (TLS) refinement (each 7-mer comprised a separate segment, and the 12-mers were each divided into three segments: nucleotides 1-4, 5-8, and 9-12). Refinement using Phenix's default geometry library yielded dozens of bond lengths and angles that were marked as outliers by the PDB validation server, so the faulty parameters were rigidified ad hoc (that is, their estimated standard deviation values in the library files were made smaller, with no change to the mean values). The final three cycles of refinement were performed in Phenix with adjustments to XYZ (reciprocal-space), TLS (segments as indicated above), and individual B-factors. In Table 1, STARANISO and Phenix were used to calculate the data collection statistics and the refinement statistics, respectively. The composite omit map displayed in S1 Fig was generated by Phenix's CompositeOmit job ("anneal" method; 5% of atoms omitted in each group; missing Fobs left unfilled; Rfree-flagged reflections included).
The final Rfree value (0.284) is higher than expected for a structure refined using diffraction data at a resolution of 1.6 Å [52]. However, it is important to note that the highest-resolution shell has a completeness of just 6%, and completeness only rises above 95% at ~2.3 Å, due mostly to the anisotropic nature of the diffraction data. Additionally, due to diffraction anisotropy, the 2mFo-DFc map appears distorted along certain dimensions, affecting interpretation of Molecule 3 most negatively. Therefore, the geometric details of Molecule 3's phosphate backbone are poorly constrained, and Molecule 1 or 2 should instead be considered as the most accurate representation of the structure. Anisotropy also prevented identification of water molecules around Molecule 3. Furthermore, the mFo-DFc map revealed several globular patches of positive density in the major and minor grooves of all molecules, 3.5-4 Å away from the nearest nucleic acid atom. Because these patches bore no recognizable geometric features and attempts to model them with buffer components failed to improve Rfree, they were left unmodeled. Any of the mentioned issues may contribute to the high Rfree value.
Beyond the anisotropy, the overall high B-factors in this structure produce 2mFo-DFc density that is "blurred" (S1 Fig) [53]. To enhance high-resolution features of the map for visual inspection and figure preparation, Coot's Map Sharpening tool was used. B-factor adjustments used for sharpening are reported in the figure legend. Sharpening only effectively revealed high-resolution features for Molecule 1 or 2, as density from Molecule 3 is too anisotropically distorted.
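For reference, the crystallographic R-factor compares observed and calculated structure-factor amplitudes, and the free R-factor is the same quantity evaluated only on the reflections that were flagged and withheld from refinement. The sketch below shows this definition under the assumption that the calculated amplitudes have already been placed on the same scale as the observed ones; it is not the Phenix implementation, and the function names are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections.

    Assumes f_calc is already scaled to f_obs (no scale factor is fitted).
    """
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by their free flag and return (R_work, R_free)."""
    work_obs, work_calc, free_obs, free_calc = [], [], [], []
    for fo, fc, is_free in zip(f_obs, f_calc, free_flags):
        if is_free:
            free_obs.append(fo)
            free_calc.append(fc)
        else:
            work_obs.append(fo)
            work_calc.append(fc)
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)
```

Because the free set is excluded from refinement, unmodeled density or distorted maps tend to raise the free R-factor more than the working R-factor.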

Structure analysis and figure preparation
Structural model and map figures were prepared in PyMOL. Alignments were performed using PyMOL's "align" function without outlier rejection. Regularized A-form and B-form DNA:DNA duplexes were prepared using X3DNA's "fiber" program (options -1 and -4, respectively), using the same sequence present in the helical portion of the crystallization construct (except RNA was modeled as the corresponding DNA sequence). While the A-form DNA:DNA helix may not perfectly represent a regularized version of the RNA:DNA helix with our sequence [19,20], "fiber" does not permit generation of RNA:DNA helices with generic sequence, and the general geometric features of A-form DNA:DNA vs. A-form RNA:DNA are expected to be similar enough to support the conclusions drawn in this work. Base step and nucleotide geometric parameters were calculated using the "find_pair" and "analyze" programs within X3DNA. On graphs of these parameters, dashed lines indicating the expected value for A-form or B-form DNA were calculated by performing an equivalent analysis on the X3DNA-generated regularized A-form/B-form helices and taking the average across all base steps/nucleotides, unless indicated otherwise. Nucleotides with A/B character exhibit a spread of values around those indicated by the dashed lines (as represented more accurately by the dashed ellipses in Fig 4B), and the dashed lines are drawn merely to guide the reader's eye to general trends. Angles between the helical axes of the DNA:DNA and RNA:DNA duplex were calculated as the angle between the helical axis vectors of the aligned regularized A-form and B-form helices. Graphs were prepared using matplotlib v3.3.2 [54]. Final figures were prepared in Adobe Illustrator v25.4.1.
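The interhelical tilt described above reduces to the angle between two direction vectors. A minimal sketch of that calculation follows, assuming the helical axis vectors of the two aligned regularized helices have already been extracted; the function name and the example vectors are hypothetical.

```python
import math


def axis_angle_deg(u, v):
    """Angle in degrees between two helical axis vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))


# Example with made-up vectors: a B-form axis along z and an A-form axis
# tilted away from it by 14 degrees in the yz-plane.
b_axis = (0.0, 0.0, 1.0)
a_axis = (0.0, math.sin(math.radians(14.0)), math.cos(math.radians(14.0)))
tilt = axis_angle_deg(a_axis, b_axis)
```

Note that this angle captures only the tilt between the axes; the lateral offset between the helical centers is a separate quantity that is reflected in the y-displacement of the central base step.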