The Structure of an RNAi Polymerase Links RNA Silencing and Transcription

RNA silencing refers to a group of RNA-induced gene-silencing mechanisms that developed early in the eukaryotic lineage, probably for defence against pathogens and regulation of gene expression. In plants, protozoa, fungi, and nematodes, but apparently not insects and vertebrates, it involves a cell-encoded RNA-dependent RNA polymerase (cRdRP) that produces double-stranded RNA triggers from aberrant single-stranded RNA. We report the 2.3-Å resolution crystal structure of QDE-1, a cRdRP from Neurospora crassa, and find that it forms a relatively compact dimeric molecule, each subunit of which comprises several domains with, at its core, a catalytic apparatus and protein fold strikingly similar to the catalytic core of the DNA-dependent RNA polymerases responsible for transcription. This evolutionary link between the two enzyme types suggests that aspects of RNA silencing in some organisms may recapitulate transcription/replication pathways functioning in the ancient RNA-based world.


Introduction
RNA silencing, or RNA interference (RNAi), refers to a group of RNA-induced gene silencing mechanisms that developed early in the eukaryotic lineage and play essential roles in cellular immunity, modulation of chromatin structure, and development [1-3]. RNAi can induce transcriptional gene silencing (TGS) via chromatin repression or posttranscriptional gene silencing (PTGS) by degradation of target RNAs. RNA silencing pathways use double-stranded RNA (dsRNA) triggers, processed by Dicer, to form short interfering RNAs (siRNAs) of 21-25 nucleotides [4]. One siRNA strand is recruited by an effector complex containing the Argonaute protein and used as a guide for sequence-specific degradation of target mRNAs (in PTGS) or directed silencing of cognate chromatin domains (in TGS) [5,6]. A cell-encoded RNA-dependent RNA polymerase (cRdRP) is also involved, in plants, fungi, protozoa, and certain animals, but apparently not insects and vertebrates, producing dsRNA triggers and hence amplifying the PTGS response [7-9]. Furthermore, cRdRPs may also interact with the cellular transcription apparatus and effect chromatin silencing [10].
To understand the function of QDE-1 in more detail, we have determined its high-resolution crystal structure. This reveals the molecule to be dimeric and to contain at its core two subdomains responsible for catalysis. These subdomains each have the topology of double-psi β-barrels (DPBBs), and are similar to (and disposed in a similar fashion to) two separate subunits in the DNA-dependent RNA polymerases (DdRPs) that perform transcription across the domains of life. The structure not only suggests how the molecule might efficiently produce dsRNA triggers, but may also add a piece to the jigsaw that relates the RNA world to the DNA world, and provide a model for all cellular RdRPs.

QDE-1 ΔN Is a Dimer with an Active Site Formed from DPBBs
The structure of QDE-1 ΔN was initially solved by multiwavelength anomalous diffraction (MAD) analysis of a crystal of selenomethionated protein, expressed in yeast, to a resolution of 3.2 Å [13]. The structure was then refined against a higher resolution dataset extending to 2.3 Å. These crystals belong to space group P2₁, and the crystallographic asymmetric unit contains two subunits, which together form a compact pyramidal object with dimensions of 90 × 70 × 127 Å³. The two constituent subunits are related by an approximate 2-fold axis and are in such intimate contact (over 2,000 Å² of contact area per subunit) that we would expect this to represent a functional dimer [14]. This was confirmed by gel filtration and sedimentation assays, with a molecular weight (MW) approximately equal to 230 kDa (compared to a predicted ~120 kDa/monomer). QDE-1 ΔN has 1,026 residues, and the final model contains 933 residues in subunit A and 930 residues in subunit B (residues are missing from the N- and C-termini and from some flexible loops, see Figure 1). The refined structure at 2.3 Å has an R_work of 21.7% (R_free = 26.4%), and the stereochemistry is good (root mean square deviation [rmsd] bond = 0.013 Å, rmsd angle = 1.9°, 2.3% of residues are in disallowed regions of the Ramachandran plot; Table 1).
Figure 1 (caption, panels C and D). (C) View of presumed active site. The two DPBBs that form the active cleft, DPBB1 (residues 680-782) and DPBB2 (residues 916-1,018), are labelled. Sequence motifs conserved across other cRdRPs are highlighted: motif 1, red; 2, orange; 3, dark yellow; 4, purple; 5, dark pink; 6, bright pink; and 7, blue (see Figure S1). Invariant residues in cRdRPs are shown in ball-and-stick representation (O, red; and N, blue); Mg²⁺ ion shown as green sphere. (D) Zoom of active site region, with conserved residues labelled. Representation is as for (C). DOI: 10.1371/journal.pbio.0040434.g001
In addition, a lower resolution structure (3.5-Å resolution) was determined in space group C2 (see Table 1). In space group C2, the dimer is formed by two subunits related by exact crystallographic 2-fold symmetry. Both crystal forms use the same residues to stabilize the dimer, and these residues are primarily contributed by the head domains (head domain contacts account for 1,720/1,710 Å² of the contact area of 2,215/2,100 Å² per subunit in the P2₁/C2 space groups, respectively [15]). Each QDE-1 subunit contains 41 α-helices and 25 β-strands, creating a four-domain fold previously undescribed for RdRPs (Figure 1A and 1B). Distal to the molecular twofold axis is a mixed α/β "slab" domain composed of the approximately 250 N-terminal residues (390-646). The polypeptide chain then leads into a "catalytic" domain (residues 647-807; 914-1,161) that houses the three proposed catalytic aspartic acid residues [9] (D1007, D1009, and D1011; Figure 1C and 1D) within an active site cleft formed between two DPBB [16] subdomains (DPBB1, residues 690-792, and DPBB2, residues 916-1,018). The catalytic domain also contains a separate, mainly α-helical, "flap" subdomain (residues 1,025-1,161), peripheral to the active site cleft. The "neck" domain comprises three long α-helices (residues 808-836, 889-913, and 1,162-1,195), which lie close to the molecular twofold axis, connecting the catalytic domain to the mainly α-helical "head" domain (residues 837-888 and 1,196-1,372; Figure 1A and 1B).
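The domain boundaries quoted above can be collected into a small lookup table. The following is a minimal sketch for illustration only: the names `DOMAINS`, `domain_of`, and `tiles_without_gaps` are hypothetical and not part of any published tool. The second neck segment is entered as 889-913, the range that lets the listed segments tile residues 390-1,372 without gaps or overlaps; treat that boundary as an assumption.

```python
# Domain segments of one QDE-1 subunit as inclusive residue ranges.
# Ranges follow the text; the neck segment 889-913 is an assumption
# chosen so that the segments tile the chain contiguously.
DOMAINS = {
    "slab": [(390, 646)],
    "catalytic": [(647, 807), (914, 1161)],
    "neck": [(808, 836), (889, 913), (1162, 1195)],
    "head": [(837, 888), (1196, 1372)],
}


def domain_of(residue):
    """Return the name of the domain containing `residue`, or None."""
    for name, segments in DOMAINS.items():
        for start, end in segments:
            if start <= residue <= end:
                return name
    return None


def tiles_without_gaps(domains):
    """True if all segments, sorted by start, abut with no gap or overlap."""
    segments = sorted(seg for segs in domains.values() for seg in segs)
    return all(
        prev_end + 1 == next_start
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    )


# The three proposed catalytic aspartates all fall in the catalytic domain.
for asp in (1007, 1009, 1011):
    print(asp, domain_of(asp))
```

A table of this kind makes it easy to confirm that residues discussed elsewhere in the text (for example the invariant residues of DPBB1) fall in the domain to which they are assigned.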
Multiple alignment of cRdRP amino acid sequences reveals the presence of seven motifs containing invariant residues: motifs 1-3 map to DPBB1, 4-6 to DPBB2, and 7 corresponds to α-helices 29 and 30 at the inner face of DPBB2 (Figures 1B, 1D, and S1). The conserved motifs therefore cohere in three dimensions, with at their heart the proposed catalytic aspartates that reside in a loop (residues 1,007-1,011, motif 6) in DPBB2 at the interface of the two β-barrels. Analysis of electron density maps showed a Mg²⁺ ion coordinated by these aspartic acid side chains (the functional significance of this is underscored by the observation that the QDE-1 polymerase activity is dependent on divalent cations [9]). DPBB1 contributes several positively charged residues to the active cleft, which include three invariant residues: Q736, K743 (motif 2), and K767 (motif 3). Together these establish a network of hydrogen bonds with water molecules, linking the two DPBB subdomains (Figure 1D). Although we have been unable to determine structures for complexes of QDE-1 with RNA and/or nucleotides (NTPs), we are able to infer functional aspects from the architecture of the uncomplexed molecule and its unexpected similarity to other polymerases, as discussed below.
The QDE-1 ΔN molecule has several distinct channels and cavities. First, there is a channel formed between the slab and head of each subunit, which is highly positively charged and leads to the active site (Figure 2A); we propose that these channels accommodate dsRNA product. Second, there is a small, negatively charged tunnel at the bottom of each subunit (formed between the flap and DPBB subdomains), which communicates with the active site and may be a route of entry for NTPs (Figure 2A). Third, there is a single tunnel, for which we cannot propose a function, formed between the neck domains and the catalytic domains, bridging the two active sites. The proposed dsRNA product binding channels are not identical in the two subunits, because the disposition of the domains is different: subunit A has a "closed" conformation, with the head and slab clamped down on the active site cleft, whereas an 11° rotation of the head and 2° rotation of the slab render the B subunit more open and provide space for an RNA duplex (Figure 2B and 2C). In the lower resolution C2 crystal form, the molecule forms a symmetric dimer. Here, the subunits both assume a partially closed conformation (the head is rotated by 4° relative to A and 8° to B, and the slab is displaced upwards and outwards by 4° relative to A and 2° to B). Overall, QDE-1 exists as a compact, but flexible, dimeric enzyme with metal binding sites confirming positions of the catalytic sites but with extensive additional structure whose biological functions are less immediately clear.
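Domain rotations such as the head and slab rotations quoted above are commonly read off the rotation matrix that relates a domain in one subunit to the same domain in the other after superposition. The text does not state how the authors derived their angles, so the following is only a sketch of the standard relation between a rotation matrix and its angle, θ = arccos((tr R − 1)/2). The function names and the 10° test rotation are illustrative assumptions.

```python
import math


def rotation_angle_deg(r):
    """Rotation angle in degrees of a 3x3 rotation matrix (nested lists)."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg):
    """Rotation matrix for a rotation of `angle_deg` about the z axis."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy example: recover the angle of a known rotation.
print(round(rotation_angle_deg(rotation_about_z(10.0)), 6))
```

In practice the rotation matrix would come from a least-squares superposition of the equivalent domains, which is outside the scope of this sketch.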

QDE-1 Active Site Is Closely Similar to Those of DdRPs
A search using the QDE-1 ΔN active site residues (ASSAM [17]) identified similarities with the yeast [18] and bacterial [19] DdRPs. Superposition revealed that the DPBB subdomains in QDE-1 and the DdRPs are structurally very similar and almost identically disposed (~10° change in the relative positions of the subdomains; Figure 3). The DdRPs have a DPBB in each of the two largest subunits (β′ and β subunits in the bacterial enzyme, and Rpb1 and Rpb2 subunits in yeast RNA pol II), with the first contributing the catalytic aspartates to the active site and the other a set of positively charged residues [18]. In QDE-1 ΔN, the two subdomains have a similar segregation of chemical roles, but they are arranged sequentially on a single polypeptide chain. The similarity of the DPBB bearing the catalytic aspartates between the cRdRPs and DdRPs had been predicted from sequence analysis [20]; however, there is no sequence homology detectable in the second DPBB or elsewhere in the molecule (see Figure 4). Superposition matches 81 residues of DPBB2 in QDE-1 ΔN with the bacterial β′ DPBB and 85 residues with the yeast Rpb1 DPBB (2.2 and 2.1 Å rmsd in Cα atoms, respectively). DPBB1 is somewhat less similar to the homologous domain in the DdRPs (74 and 67 residues matched with rmsd of 3.0 and 3.1 Å for bacterial and yeast, respectively; Figure 3). The QDE-1 ΔN catalytic aspartates lie within 1.4 Å of the bacterial and yeast catalytic residues, with equivalent Mg²⁺ coordination to metal A in the yeast enzyme [18].
Figure 3 (caption fragment). Structurally equivalent residues in QDE-1 ΔN are coloured dark purple (non-equivalent residues in light purple). Equivalent residues in yeast DPBBs are coloured green (non-equivalent residues in semi-transparent grey). QDE-1 ΔN and yeast (D481, D483, and D485) active site aspartates are coloured yellow and green, respectively. DOI: 10.1371/journal.pbio.0040434.g003
Moreover, the bridge helices in DdRPs, proposed to be important for nucleic acid-protein interactions during translocation of the duplex [18,21], are structurally equivalent to helices 27 and 28 in QDE-1 ΔN, suggesting a similar role. Superposition onto QDE-1 of the structure of yeast DdRP complexed with an RNA-DNA duplex [21] maps the duplex into the putative RNA product groove in QDE-1 (Figure 2B), and the proposed NTP tunnel matches well with that proposed in yeast RNA pol II [22]. Intriguingly, only about ten base pairs of duplex RNA can be modelled into this groove without severe steric clashes with the head domain, suggesting a steric basis for modulating the length of RNA synthesised (Figure 2C).
Overall, the DdRP structures are much more elaborate than QDE-1 ΔN, and we can detect no significant structural similarity beyond the vicinity of the active site. Nevertheless, there may be some functional relationship at the level of protein domains. The flexible head that forms part of the proposed QDE-1 product groove might be equivalent to the clamp in yeast Rpb1 [18], with the head domain closing down on the slab to stabilize the RNA product during polymerisation (Figure 2B). The yeast Rpb2 protrusion-lobe is also displaced during transcription [18] to accommodate and stabilize the RNA-DNA product, and the somewhat flexible QDE-1 slab may play a similar role. These observations suggest a similar mode of action for the DdRPs and cRdRPs, a view reinforced by the proposal that a human DdRP, Pol II, is involved in replication of hepatitis D RNA [23].
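The rmsd values quoted in this section summarise how far matched Cα atoms lie from one another once two structures have been superposed. As a minimal illustration of that statistic (not the SHP or ASSAM procedures used here, and with made-up toy coordinates), the root mean square deviation over N matched atom pairs is the square root of the mean squared distance between partners. The function name `rmsd` and the input format are assumptions for this sketch.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumes the two coordinate sets are already superposed and that
    entry i in one list is matched to entry i in the other.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (in Å): one atom pair coincides, the other is 2 Å apart.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]
print(rmsd(set_a, set_b))
```

Because the value depends on which residues are counted as matched, an rmsd is only meaningful alongside the number of equivalent residues, which is why both are reported together in the text.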

QDE-1 May Act as a Two-Stroke Motor for the Production of dsRNA Triggers
QDE-1 ΔN is active in solution [9], and we have shown here that this form is dimeric. What, if anything, might the functional significance of the dimer be? Three roles for cRdRPs have been discussed: (1) conversion of aberrant RNAs into dsRNA triggers for PTGS, (2) primer-dependent amplification of RNAi triggers, and (3) chromatin silencing [2,8-10]. Reaction (1) presumably involves initiation at the 3′ end of aberrant mRNA and processive RNA synthesis, whereas reactions (2) and (3) would require internal recruitment of the polymerase to its RNA targets. Both initiation modes function in vitro [9,24]. The more open conformation of subunit B would allow internal initiation, but conversely, the partial closure of the RNA product groove in the dimer might favour initiation at the 3′ end of an RNA template. It seems likely that, in the dimer, only one catalytic site is active at any given time, represented by the closed conformation of subunit A, with the inactive subunit being held open. This suggests a mechanism whereby binding to one active site primes the other. Thus, the molecular architecture might favour the molecule working as a "two-stroke motor," with facilitated active site switching as the subunits cycle back and forth between conformations in response to RNA binding. This model has two attractive features: first, by tethering the RNA template to the dimer, re-initiation will be efficient, and second, initiation at active site B can be coupled to the activity at active site A, by molecular switching, driven by the steric clashes with the rather stiff dsRNA product suggested by our modelling studies. Overall, this mechanism might lead to the effective production of appropriate-length dsRNA triggers.

A Polymerase for the RNA World
From the structure of QDE-1, it appears that all cellular RdRPs as well as DdRPs are related but structurally distinct from viral RdRPs, which have the "right-hand" architecture [25-30]. The similarity revealed here between the active sites of cRdRPs and the DdRPs indicates that they share a common ancestor. We refer to this family of enzymes as the "double-barrel" polymerases. The RNAi polymerases present the entire active site on a single polypeptide chain, suggesting that they are closer to the ancestral protein than the present day transcription polymerases, which are composed of up to 12 separate polypeptide chains, and in general bear the two active site β-barrels on separate chains. Interestingly, the DdRP of Helicobacter pylori also has its active site on a single chain, due to the fusion of the rpoB and rpoC genes, coding for the β and β′ subunits [31]. Indeed, it has been shown in vitro that a fused Escherichia coli β-β′ protein can assemble into a functional polymerase, although there is presumably some advantage in vivo for separate β and β′ chains [32]. Classifying the evolutionary relatedness of polymerases on the basis of their structural similarity reveals that the "right-handed" and "double-barrel" polymerases are tightly grouped and quite separate from each other and also from Pol β (Figure 5A). The implied path for the evolution of the double-barrelled RNA polymerases is shown in Figure 5B. The original enzyme presumably possessed a single DPBB, bearing the essential catalytic apparatus. This molecule may have taken over the role of RNA polymerisation, presumably from an RNA molecule, very early in evolution. Gene duplication led to a molecule with two DPBBs on a single polypeptide chain, which differentiated to produce a QDE-1-like molecule capable of efficient RNA polymerization, a key development in the elaboration of the RNA-based world.
During this process, DPBB2, containing the active site aspartate residues, was conserved relatively strongly, whereas DPBB1, which acts as an accessory domain, essentially lost any detectable sequence conservation (Figure 4). The ability of such molecules to act on DNA could then underpin the switch to DNA as the repository of genomic information. Splitting of the DPBBs onto separate subunits would facilitate the radical evolution of this first DdRP into the complex, highly regulated, transcription machines with particular functions delegated to specialist subunits, which we observe today. In summary, it seems that this polymerase part of the RNA silencing machinery may give us a glimpse far back in time, providing insight into the evolution of a protein-based mechanism for the transmission of RNA genomic information in an RNA-based world.

Materials and Methods
Structure determination. The protein expression in yeast, purification, crystallization, and data collection of a cryo-cooled selenomethionated QDE-1 ΔN crystal (space group P2₁), and two native datasets (space group P2₁ and C2) have been described previously [13]. Briefly, a three-wavelength MAD experiment was performed using a cryo-cooled selenomethionated crystal of QDE-1 ΔN, to a resolution of 3.2 Å, on the Medical Research Council (MRC) MAD beamline, BM14, at the European Synchrotron Radiation Facility (ESRF) (Grenoble) using a MarCCD detector (Mar USA, Evanston, Illinois, United States). Subsequently, a native dataset to 2.3-Å resolution was collected on BM14 using a MarMosaic 225 CCD detector (Mar USA). The datasets from the three-wavelength MAD experiment were scaled and merged together and used in SnB [33] to solve the selenium substructure. The top 54 Se sites from SnB were refined using SOLVE [34] to obtain initial phases, and anomalous difference Fourier maps allowed the identification of a subset of 46 correct Se atoms that were refined using SHARP [35]. Phase improvement using density modification with RESOLVE [34] led to maps which, in combination with Se positions, revealed two articulated subunits, related by two slightly displaced non-crystallographic twofold axes, which precluded simple averaging. The initial RESOLVE model was completed using manual model building in O [36] and Coot [37], and automated model building and water placement with ARP/wARP [38]. The model was refined with REFMAC5 [39], using TLS refinement and imposing non-crystallographic restraints for the core regions of subunits, against the native P2₁ data to 2.3-Å resolution to yield the final model described in the main text and Table 1. The C2 crystal form with one molecule in the asymmetric unit was solved by molecular replacement and refined, keeping the domains as rigid bodies (AMORE [40], Table 1).
Structural alignments. The program ASSAM [17] was used to find structural similarities with the proposed three catalytic aspartates and the coordinating magnesium ion. Structures of DNA-directed RNA polymerase II largest subunit were amongst the best matches. Superimposition operators for yeast and bacterial DdRP models onto QDE-1 ΔN were optimized using SHP [41].
Figure S1. Conserved Sequence Motifs in Cellular RdRPs
Multiple sequence alignment of a representative subset of cRdRPs. Amino acid sequences of 30 cRdRPs from fungi from the groups of Ascomycota (Schizosaccharomyces pombe, Spo; Neurospora crassa, Ncr; and Gibberella zeae, Gze) and Basidiomycota (Cryptococcus neoformans, Cne), slime molds (Dictyostelium discoideum, Ddi), dicot plants (Arabidopsis thaliana, Ath; Solanum tuberosum, Stu; and Nicotiana tabacum, Ntu), monocot plants (Oryza sativa, Osa), protozoa (Entamoeba histolytica, Ehi), and nematodes (Caenorhabditis elegans, Cel) were aligned using standard settings of the ClustalW algorithm. Local alignment was improved by manual editing. The N. crassa QDE-1 protein sequence is shown on the top. N. crassa contains two additional non-allelic cRdRP genes, SAD-1 (essential for meiotic silencing by unpaired DNA) and RdRP-3, that likely function in distinct cellular pathways. Invariant residues are shaded in black; other residues with 80% or more conservation are shaded in grey. Conserved sequence motifs comprising invariant residues are outlined: motif 1, red; motif 2, orange; motif 3, dark yellow; motif 4, purple; motif 5, violet; motif 6, light pink; and motif 7, blue. QDE-1 secondary structure elements are shown on top, coloured according to domain definition (slab, blue; catalytic, deep purple; neck, pink; and head, red). The identified double-psi β-barrels DPBB1 and DPBB2 are outlined by deep purple boxes. The flap subdomain and the potential "bridge helices" are also represented by boxes, coloured light purple and grey, respectively. Found at DOI: 10.1371/journal.pbio.0040434.sg001 (914 KB DOC)

Accession Numbers
Coordinates and structure factors have been deposited in the Protein Data Bank (PDB; http://www.rcsb.org/pdb) as accession numbers 2J7N and 2J7O. The GenPept accession numbers for the genes and gene products mentioned in this paper are N. crassa QDE-1 protein sequence (EAA29811); RdRP-3 (EAA34169); and N. crassa SAD-1 (AAK31733).