DNA recognition by an RNA-guided bacterial Argonaute

Argonaute (Ago) proteins are widespread in prokaryotes and eukaryotes and share a four-domain architecture capable of RNA- or DNA-guided nucleic acid recognition. Previous studies identified a prokaryotic Argonaute protein from the eubacterium Marinitoga piezophila (MpAgo), which binds preferentially to 5′-hydroxylated guide RNAs and cleaves single-stranded RNA (ssRNA) and DNA (ssDNA) targets. Here we present a 3.2 Å resolution crystal structure of MpAgo bound to a 21-nucleotide RNA guide and a complementary 21-nucleotide ssDNA substrate. Comparison of this ternary complex to other target-bound Argonaute structures reveals a unique orientation of the N-terminal domain, resulting in a straight helical axis of the entire RNA-DNA heteroduplex through the central cleft of the protein. Additionally, mismatches introduced into the heteroduplex reduce MpAgo cleavage efficiency with a symmetric profile centered around the middle of the helix. This pattern differs from the canonical mismatch tolerance of other Argonautes, which display decreased cleavage efficiency for substrates bearing sequence mismatches to the 5′ region of the guide strand. This structural analysis of MpAgo bound to a hybrid helix advances our understanding of the diversity of target recognition mechanisms by Argonaute proteins.


Introduction
Argonaute (Ago) proteins exist in all three domains of life [1,2]. In eukaryotes, Argonautes are the core component of the RNA interference (RNAi) effector complex. RNAi utilizes RNA-guided messenger RNA (mRNA) binding to regulate gene expression at the transcriptional, post-transcriptional and translational levels [3-5]. Ago proteins form the RNA-induced silencing complex (RISC) of the RNAi pathway by binding to 20-30 nt microRNAs (miRNAs) or small interfering RNAs (siRNAs) for silencing of mRNAs. This form of post-transcriptional gene regulation occurs either by Ago-catalyzed cleavage of targeted transcripts or by translational silencing through the recruitment of proteins for deadenylation and mRNA decay [6]. In prokaryotes, […]

[…] ssDNA recognition and cleavage, we crystallized MpAgo bound to a 5′-hydroxylated 21-nucleotide RNA guide and a complementary 5′-phosphorylated 21-nucleotide DNA target to a resolution of 3.2 Å (PDB ID: 5UX0) (Fig 1A and Table 1). In order to capture a target-bound structure, we mutated an aspartate residue to an alanine (D516A) in the enzyme's catalytic pocket to prevent DNA cleavage. The resulting structure revealed a conserved bi-lobed architecture formed by the N-terminal (green), PAZ (pink), MID (purple), and PIWI (blue) domains, and Linkers L1 (grey) and L2 (yellow) [8]. Nucleotides 1-20 of the guide RNA (orange) and 2-21 of the target DNA (red) are ordered with a straight helical axis passing through the central cleft of the protein (Fig 1B). The heteroduplex bound by MpAgo contains more ordered nucleotides, with 20 base pairs modeled within the duplex, than previously crystallized Ago complexes (S1 Fig) [31].
Comparing the ternary MpAgo complex to the previously crystallized binary complex reveals prominent conformational changes [12]. Recognition and subsequent binding of the DNA target results in movement of the N-lobe away from the C-lobe to accommodate the hybrid helix (Fig 2A). Within the MpAgo-RNA complex, the 5′ end of the guide RNA is anchored into the MID domain, while the 3′ end is bound to the PAZ domain. The MpAgo ternary complex shows that the 5′ nucleotide of the guide remains bound to the MID domain and unpaired to the 3′ terminal nucleotide C21 of the DNA target (Fig 2B). The insertion of F410 between C20 and C21 disrupts the helical base stacking, splaying the terminal base away from the hybrid helix, while K279 of Linker L2 interacts with the phosphate backbone to stabilize the contorted 3′ end of the DNA target. A similar disruption at the 3′ end of the target strand by an aromatic residue is seen in both the RsAgo and TtAgo ternary complex structures (S2 Fig) [18,31]. In contrast to the 5′ end of the guide, the 3′ end is released from the PAZ domain to enable binding to the 5′ end of the target. The unique kink that was identified in the MpAgo-RNA complex is no longer seen in the hybrid helix [12].
The PIWI domain of Argonaute proteins contains a conserved DEDX catalytic tetrad (where X can be His, Asp, or Asn) [22]. In the guide-bound state, the glutamate residue (called a glutamate finger) is positioned in a flexible loop between β-strand 3 and α-helix 1 of the PIWI domain. Upon target binding, a conformational change repositions the glutamate finger to complete the catalytic tetrad for subsequent target cleavage. In the pre-cleavage state, β-strands 1 and 2 of the PIWI domain block the entry of the glutamate into the catalytic site. The MpAgo ternary structure shows that β-strands 1 and 2 twist away from β-strand 3 to create a path for the glutamate finger to complete the catalytic tetrad (S3 Fig).
The long length of this heteroduplex revealed that the 5′ region of the target DNA (positions 5-8, counting from the 5′ end) is stabilized by a positively charged groove between the N-terminal and PAZ domains (S4 Fig). The geometry of the heteroduplex within the MpAgo ternary complex aligns more closely with a perfect B-form helix than with an A-form helix (S5 Fig). With a modeled A-form helix, the target strand is no longer positioned near the charged cleft between the N-terminal and PAZ domains. The formation of the charged groove within the N-lobe is established by a unique repositioning of the N-terminal domain closer to the PAZ domain. This N-lobe arrangement allows a unique, straight orientation of the hybrid helix not seen in other Argonaute ternary complexes.
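To give a rough sense of why an A-form model would place the target strand differently from a B-form one, the sketch below compares the axial length and number of turns of idealized 20-base-pair helices. The rise and base-pairs-per-turn values are approximate textbook figures for canonical helices, not measurements from the MpAgo structure, and the function names are illustrative only.

```python
# Approximate textbook parameters for idealized helices.
# These are assumptions for illustration, not values measured from PDB 5UX0.
HELIX_FORMS = {
    "A": {"rise_angstrom": 2.6, "bp_per_turn": 11.0},
    "B": {"rise_angstrom": 3.4, "bp_per_turn": 10.5},
}


def axial_length(n_bp, form):
    """Axial distance (in angstroms) spanned by n_bp stacked base pairs."""
    if n_bp < 1:
        raise ValueError("n_bp must be at least 1")
    # n_bp base pairs are separated by (n_bp - 1) base-pair steps.
    return (n_bp - 1) * HELIX_FORMS[form]["rise_angstrom"]


def helical_turns(n_bp, form):
    """Number of helical turns spanned by n_bp stacked base pairs."""
    if n_bp < 1:
        raise ValueError("n_bp must be at least 1")
    return (n_bp - 1) / HELIX_FORMS[form]["bp_per_turn"]


if __name__ == "__main__":
    for form in ("A", "B"):
        print(form, round(axial_length(20, form), 1), round(helical_turns(20, form), 2))
```

Under these idealized parameters, a 20-base-pair B-form helix is noticeably longer along its axis than an A-form helix of the same sequence length, which is consistent with the two models positioning the distal part of the target strand differently.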

N-terminal domain stabilizes linear conformation of RNA-DNA heteroduplex
The Argonaute protein from Rhodobacter sphaeroides (RsAgo) displays a preference for RNA guides and DNA substrates similar to MpAgo [31]. Comparison of our MpAgo ternary complex to the crystal structure of RsAgo bound to a hybrid helix reveals differences in both the orientation of the N-lobe and the trajectories of their guide-substrate heteroduplexes. Aligning MpAgo (colored by domain) and RsAgo (grey) relative to their PIWI domains, the most structurally conserved domain of Argonaute proteins, shows analogous positioning of the MID and PAZ domains, and Linkers L1 and L2 (Fig 3A). […] shown that the N-terminal domain can act as a wedge and promote duplex unwinding [32]. In contrast, the N-terminal domain of MpAgo does not split the guide and target strands, but instead the helix remains intact through the central channel of the protein. We postulate that a combination of the unique orientation of the N-terminal domain and conserved DNA target-interacting residues results in a straight conformation of the heteroduplex for the 5′-hydroxylated guide RNA binding family of Ago proteins. In contrast, after two helical turns the directionality of the duplex held by RsAgo is diverted ~40° relative to the duplex of MpAgo (Fig 3C).
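A divergence such as the ~40° difference between the RsAgo and MpAgo duplexes can be expressed as the angle between two helical-axis direction vectors. The following is a minimal sketch of that calculation; the axis vectors are hypothetical placeholders constructed for illustration, not directions fitted to the deposited coordinates, and the function name is an assumption.

```python
import math


def axis_angle_deg(u, v):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("axis vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical axis directions (not derived from any PDB entry):
# a straight axis along z and an axis tilted by 40 degrees in the x-z plane.
straight_axis = (0.0, 0.0, 1.0)
bent_axis = (math.sin(math.radians(40.0)), 0.0, math.cos(math.radians(40.0)))

if __name__ == "__main__":
    print(round(axis_angle_deg(straight_axis, bent_axis), 1))
```

In practice the axis vectors would first have to be fitted to the duplex segments before and after the bend, a step this sketch does not cover.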

MpAgo displays a symmetric tolerance for mismatches
Previous studies of MpAgo showed that single nucleotide mismatches between the guide and substrate strands at positions 5, 7, and 8 within the guide sequence reduced cleavage efficiency of a DNA target [12]. This suggested that the seed region of MpAgo may differ from the canonical seed region of other Ago proteins, nucleotides 2-8 of the guide strand [25]. To investigate this noncanonical seed region and mismatch tolerance, we introduced dinucleotide mismatches individually along the length of the guide RNA sequence (S1 Table). Dinucleotide mismatches up to positions 3 and 4 were tolerated with minimal to no decrease in cleavage efficiency, defined as the percent of DNA target cleaved after 30 minutes (Fig 4A). Introducing mismatches at positions 4 and 5 decreased cleavage efficiency to ~10% of that observed for a fully matched guide-substrate duplex, which gradually dropped to ~0% with mismatches introduced around the cleavage site, between positions 10 and 11 of the substrate DNA. The 3′ half of the guide strand displayed a symmetric mismatch tolerance profile similar to that of the 5′ half. Cleavage efficiency gradually increased to ~15% at positions 15-16, after which any dinucleotide mismatch did not significantly impact cleavage efficiency. The rate of target cleavage followed a similar symmetric profile, only differing with a decrease in cleavage rate with mismatches at positions 2-3 and 3-4 (S11 Fig and S2 Table). Comparing MpAgo to the mismatch tolerance of TtAgo suggests that the straight orientation of MpAgo's heteroduplex may play a role in the cleavage efficiency with mismatches [15]. Similar to MpAgo, single nucleotide mismatches around position 9-10 abolish cleavage efficiency of an RNA target. In contrast, mismatches after position 11 do not affect target cleavage.
The guide-substrate duplexes of these two Ago complexes diverge structurally after position 11 of the TtAgo DNA guide, with MpAgo maintaining a linear conformation compared to the angled conformation observed for TtAgo (S8 Fig). Reduced cleavage efficiency occurs when dinucleotide mismatches are introduced at positions 5-15 of the MpAgo RNA guide. This region of the guide is positioned underneath Linker L2 and the PAZ domain, suggesting that Linker L2 and the PAZ domain assist in stabilizing the heteroduplex in the appropriate position for cleavage (Fig 4B).
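The tiling of dinucleotide mismatches along the guide described in this section can be sketched as follows. The guide sequence is an arbitrary 21-nucleotide placeholder rather than a sequence from S1 Table, and the substitution rule (replacing each guide base with its Watson-Crick complement so that it can no longer pair with the unchanged target) is an assumption made for illustration; the substitutions actually used are listed in S1 Table.

```python
# Substituting a guide base with its own complement guarantees a mismatch
# against the unchanged target (e.g. guide A -> U now faces target T).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def tile_dinucleotide_mismatches(guide):
    """Return {(i, i + 1): variant} for every adjacent position pair.

    Positions are 1-based and counted from the 5' end of the guide RNA.
    """
    guide = guide.upper()
    unknown = set(guide) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected RNA characters: {sorted(unknown)}")
    variants = {}
    for i in range(len(guide) - 1):
        swapped = COMPLEMENT[guide[i]] + COMPLEMENT[guide[i + 1]]
        variants[(i + 1, i + 2)] = guide[:i] + swapped + guide[i + 2:]
    return variants


# Hypothetical 21-nt guide used only to exercise the function.
EXAMPLE_GUIDE = "AUGCUAGCUAGGCUAACGUAC"

if __name__ == "__main__":
    for positions, variant in tile_dinucleotide_mismatches(EXAMPLE_GUIDE).items():
        print(positions, variant)
```

A 21-nucleotide guide yields 20 variants, one for each adjacent pair from positions 1-2 through 20-21, each differing from the parent guide at exactly two neighboring positions.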

Discussion
Although Argonaute proteins across all three domains of life share a conserved, four-domain architecture, their endogenous functions and mechanisms of action appear to differ. Crystal structures of the bacterial Argonautes TtAgo and RsAgo, and of human Argonaute hAgo2, bound to their respective guide-substrate homo- or heteroduplexes have provided structural insights into preferential binding of guide and target strands. Here we present a structural analysis of a divergent bacterial Argonaute, MpAgo, bound to an RNA-DNA helix, revealing unique domain orientations and a distinct conformation of the helical substrate. These features extend the diversity of target recognition mechanisms observed for Argonaute proteins.
Argonaute proteins exhibit preferential binding to either RNA or DNA guide and substrate strands through various specific and non-specific interactions. Previous biochemical experiments showed that MpAgo binds RNA guides and preferentially targets DNA [12]. Analysis of the MpAgo ternary complex crystallized in this study revealed that the bound RNA-DNA heteroduplex adopts a B-form-like helical geometry. RNA-DNA hybrids naturally adopt an A-form geometry, which is less energetically stable than the A-form geometry adopted by RNA-RNA duplexes [33-35]. This implies that MpAgo deforms the RNA-DNA heteroduplex into a B-form-like geometry, and may explain why RNA targeting displays decreased cleavage efficiency relative to DNA targets. The target strand of the helix appears to interact with a positively charged cleft at the interface of the N and PAZ domains. Charged residues within this groove interact with the backbone of the DNA target, whereas an A-form-like helix may not be appropriately positioned for these stabilizing interactions. Although RsAgo also uses RNA guides to target DNA, the heteroduplex of the RsAgo ternary complex does not adopt a linear conformation. We speculate that this may be due to the position of the RsAgo N domain closer to the PIWI domain, which stabilizes a bent conformation of the heteroduplex. RsAgo may also not require the same mismatch tolerance as MpAgo, which we suggest is affected by the linear versus bent conformation of the duplex.
In addition to generating a charged cleft that stabilizes a DNA target strand, the positioning of the N domain close to the PAZ domain also induces a unique linear helical axis of the heteroduplex. In contrast, both TtAgo and RsAgo ternary complexes have N domains that are angled away from the PAZ domain. The helical substrates of these Argonautes bend after the second helical turn, which corresponds with the position of the N-terminal domains. In the case of TtAgo, the N domain acts as a wedge to block guide-target pairing and divert the target strand after position 11 [17]. This "passive" form of wedging has been proposed to correctly position target strands for cleavage, while an "active" form of wedging assists in separating miRNA duplexes bound to hAgo2 [32]. Similar to RsAgo, the N domain of MpAgo does not splay the heteroduplex, but instead stabilizes the helix through interactions with the target strand [31]. The absence of wedging by the MpAgo N domain may be necessary for the appropriate positioning of the target strand for cleavage. Alternatively, the endogenous guides of MpAgo may be loaded as single-stranded RNA (ssRNA) and thus wedging would not be necessary to actively unwind a duplex for passenger strand removal.
When tiling dinucleotide mismatches along the length of the RNA-DNA hybrid helix, we observe that mismatches around the center of the guide RNA (positions 5 to 15) significantly inhibit cleavage efficiency, whereas the 5′ and 3′ ends display a strong tolerance for mismatches. The 3′ supplementary region (positions 13-16) of guide RNAs has been shown to be important for target recognition [36]. Our mismatch data are consistent with this observation, with mismatches in this region exhibiting decreased cleavage efficiency.
In contrast, TtAgo displays a less symmetric mismatch tolerance profile, where mismatches within the 5′ region (positions 4 to 11) of the guide reduce cleavage efficiency and mismatches within the 3′ region (positions 13-19) show no effect [15]. The region where mismatches do not affect cleavage occurs within the bent portion of the helix. We hypothesize that the symmetric mismatch profile of MpAgo may be a result of the linear orientation of the heteroduplex, which places nucleotides 5 to 15 of the guide strand underneath the PAZ domain and Linker L2.
The target-bound MpAgo structure extends our current understanding of Argonaute diversity. This and related structures also highlight the differences between Argonaute proteins and the RNA-guided CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) effector proteins [37]. CRISPR-Cas enzymes have been widely adopted for applications involving RNA-guided nucleic acid recognition and cleavage [38], raising the possibility of similar biotechnological adaptation of Argonautes [39]. In contrast to CRISPR-Cas enzymes, however, Argonaute enzymes including MpAgo do not catalyze guide-directed dsDNA cleavage and they cannot unwind or displace a duplex substrate. Nonetheless, Argonaute proteins have the potential to be employed for ssDNA and ssRNA detection and cleavage, and may have different tolerance for guide strand length and mismatches to substrate strands based on available data. Additionally, despite a preference for DNA targeting, the ability to bind RNA targets may enable use of MpAgo and related enzymes for intracellular RNA tracking and RNA pulldowns. The natural diversity of Argonaute proteins and their widespread occurrence across phylogeny implies adaptation for a variety of biological functions that have yet to be determined.

Cloning and purification of MpAgo
The sequence encoding M. piezophila Argonaute (MpAgo) was codon-optimized for expression in E. coli and cloned into a custom pET-based expression vector using ligation-independent cloning. The cloned construct encodes a fusion protein containing an N-terminal His10-tag followed by an Asn10-linker, Maltose Binding Protein (MBP), and a PreScission protease cleavage site. For crystallography, the D516A mutation was introduced using QuikChange site-directed mutagenesis and verified by DNA sequencing.
The wildtype and mutant proteins were expressed in E. coli strain BL21(DE3) (New England Biolabs). For protein expression, cells were grown in TB medium to an OD600 of 0.8, expression was induced by addition of IPTG to 0.5 mM final concentration, and cells were incubated at 16 °C while shaking for 16 h. The cell pellets were resuspended in 50 mM Tris-HCl (pH 7.5), 300 mM NaCl, 1 mM TCEP, 0.5% (v/v) Triton X-100, and 10 mM imidazole, supplemented with Complete protease inhibitor cocktail tablets (Roche). Cells were lysed via sonication and clarified lysate was bound in batch to Ni-NTA agarose (Qiagen). The resin was washed with 50 mM Tris-HCl (pH 7.5), 300 mM NaCl, 1 mM TCEP, and 10 mM imidazole, and bound protein was eluted in wash buffer containing 300 mM imidazole. The His10-MBP affinity tag was removed by cleavage with PreScission protease, while the protein was dialyzed overnight at 4 °C against 50 mM Tris-HCl (pH 7.5), 300 mM NaCl, 1 mM TCEP, 5% (v/v) glycerol, and 10 mM imidazole. The cleaved MpAgo protein was separated from the fusion tag by ortho/reverse Ni-NTA. The protein was dialyzed into 50 mM Tris-HCl (pH 7.5), 150 mM NaCl, 1 mM TCEP, and 5% (v/v) glycerol, applied to a 5 ml Heparin HiTrap column (GE Life Sciences), and eluted with a linear gradient of 0.15-1.2 M NaCl. Final purification was achieved by size exclusion chromatography using a HiLoad 16/60 Superdex 200 column (GE Life Sciences) in 50 mM Tris-HCl (pH 7.5), 300 mM NaCl, 1 mM TCEP, and 5% (v/v) glycerol. Eluted protein was concentrated, flash-frozen in liquid nitrogen, and stored at −80 °C.

Oligonucleotide purification
All DNA and short RNA oligonucleotides were purchased from Integrated DNA Technologies (IDT). All synthetic oligonucleotides used for cleavage assays and crystallography experiments were gel-purified and quality-checked by Urea-PAGE prior to use.

In vitro cleavage assays
Purified oligonucleotide target DNA (10 pmol) was radiolabeled using T4 polynucleotide kinase (PNK) (NEB) and [γ-32P] ATP (Perkin Elmer) in 1× T4 PNK buffer (NEB) at 37 °C for 30 min. The T4 PNK was heat-inactivated at 65 °C for 20 min. The labeling reactions were purified with illustra MicroSpin G-25 columns (GE Life Sciences). For single turnover experiments, MpAgo-RNA complexes were reconstituted by mixing 1 nM MpAgo with 1 nM guide strand in 10 mM Tris-HCl (pH 7.5), 150 mM NaCl, 2 mM MnSO4, 2 mM DTT, and 5% (v/v) glycerol and incubating at 37 °C for 30 min. Cleavage reactions were initiated by addition of 0.1 nM radiolabeled DNA or RNA substrates and performed at 60 °C. Aliquots of 10 µl were removed at various time points and quenched by mixing with an equal volume of formamide gel loading buffer supplemented with 50 mM EDTA. Cleavage products were resolved by 12% (v/v) Urea-PAGE and visualized by phosphorimaging. Each cleavage assay was performed in three independent experiments. Percentage of cleavage was analyzed by densitometry using ImageQuant (GE Healthcare), and the average of three independent experiments was plotted against time (Prism).

X-ray diffraction data were processed with XDS and merged in AIMLESS [40] using the SSRL autoxds script (A. Gonzalez, SSRL). Indexed crystals belonged to the space group P2₁2₁2₁ with two copies of MpAgo in the asymmetric unit. The MpAgo-RNA complex (PDB ID: 5I4A) was used as a model for molecular replacement using the Phaser-MR program within PHENIX [41,42]. An initial electron density map was used for iterative building with Coot and refinement with PHENIX until all interpretable electron density was modeled [43].
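The densitometry step of the cleavage assays, in which the percentage of cleavage is averaged over three independent experiments at each time point, can be sketched as below. The formula (product band intensity divided by the sum of product and remaining substrate intensities) is an assumed, commonly used definition rather than one stated in the text, and the intensities, time points, and function names are hypothetical.

```python
from statistics import mean, stdev


def percent_cleaved(product, substrate):
    """Percent of target cleaved from two band intensities (assumed formula)."""
    total = product + substrate
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * product / total


def summarize_time_course(replicates):
    """Average percent cleaved across replicates at each time point.

    replicates: list of dicts mapping time (min) to a tuple of
    (product intensity, substrate intensity).
    Returns {time: (mean percent, sample standard deviation)}.
    """
    summary = {}
    for t in sorted(replicates[0]):
        values = [percent_cleaved(*rep[t]) for rep in replicates]
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[t] = (mean(values), spread)
    return summary


# Hypothetical band intensities for three independent experiments.
EXAMPLE_REPLICATES = [
    {0: (0.0, 100.0), 5: (20.0, 80.0), 30: (90.0, 10.0)},
    {0: (0.0, 100.0), 5: (30.0, 70.0), 30: (80.0, 20.0)},
    {0: (0.0, 100.0), 5: (25.0, 75.0), 30: (85.0, 15.0)},
]

if __name__ == "__main__":
    for t, (avg, sd) in summarize_time_course(EXAMPLE_REPLICATES).items():
        print(f"{t} min: {avg:.1f}% +/- {sd:.1f}")
```

The value at the 30-minute time point corresponds to the cleavage efficiency as defined for the mismatch experiments; estimating cleavage rates would require fitting a kinetic model to the full time course, which is outside this sketch.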