Molecular Basis for Nucleotide Conservation at the Ends of the Dengue Virus Genome

The dengue virus (DV) is an important human pathogen from the Flavivirus genus, whose genome- and antigenome RNAs start with the strictly conserved sequence pppAG. The RNA-dependent RNA polymerase (RdRp), a product of the NS5 gene, initiates RNA synthesis de novo, i.e., without the use of a pre-existing primer. Very little is known about the mechanism of this de novo initiation and how conservation of the starting adenosine is achieved. The polymerase domain NS5PolDV of NS5, upon initiation on viral RNA templates, synthesizes mainly dinucleotide primers that are then elongated in a processive manner. We show here that NS5PolDV contains a specific priming site for adenosine 5′-triphosphate as the first transcribed nucleotide. Remarkably, in the absence of any RNA template the enzyme is able to selectively synthesize the dinucleotide pppAG when Mn2+ is present as catalytic ion. The T794 to A799 priming loop is essential for initiation and provides at least part of the ATP-specific priming site. The H798 loop residue is of central importance for the ATP-specific initiation step. In addition to ATP selection, NS5PolDV ensures the conservation of the 5′-adenosine by strongly discriminating against viral templates containing an erroneous 3′-end nucleotide in the presence of Mg2+. In the presence of Mn2+, NS5PolDV is remarkably able to generate and elongate the correct pppAG primer on these erroneous templates. This can be regarded as a genomic/antigenomic RNA end repair mechanism. These conservational mechanisms, mediated by the polymerase alone, may extend to other RNA virus families having RdRps initiating RNA synthesis de novo.


Introduction
Most RNA viruses maintain the specific sequences present at the ends of their genomes. The 5′ genome end may carry a cap structure to ensure both genome stability and efficient translation [1]. The 3′-end may carry a poly(A) tail or adopt specific 3′-end sequences required for viral replication [2,3]. These terminal sequences are generally copied exactly to avoid loss of genetic information, and have supposedly evolved towards optimum replication efficiency. Terminal genome damage can be caused by errors introduced by the viral polymerase during initiation and termination, or by cellular ribonucleases [4]. In addition to special mechanisms to ensure efficient initiation of RNA synthesis, viruses have evolved mechanisms to repair or correct damaged extremities, such as the use of abortive transcripts as primers, the generation and use of non-templated primers, and the addition of one or a few non-templated nucleotides to the 3′-end by a terminal transferase activity [4]. However, our knowledge about these mechanisms is still very limited. Many RNA virus polymerases, which do not use a primer and thus initiate RNA synthesis de novo, generate abortive transcripts during the initiation phase of RNA synthesis [5,6,7]. Primer-mediated repair of template extremities has so far been demonstrated only for the positive-strand RNA (+RNA) turnip crinkle virus (TCV) [8]. Non-templated primer synthesis by the viral polymerase might be involved in the repair mechanism of TCV [9]. Such a mechanism was also proposed as the molecular basis of the reconstitution of 5′-ends of negative-strand RNA (-RNA) respiratory syncytial virus (RSV) replicons [10]. In this study we demonstrate how the dengue virus (DV) RNA-dependent RNA polymerase (RdRp), which starts RNA synthesis de novo, plays a decisive role in the nucleotide conservation of viral RNA ends.
DV belongs to the Flavivirus genus within the +RNA virus family of Flaviviridae together with viruses of the genera Hepacivirus and Pestivirus [11]. The Flavivirus genus comprises around 50 virus species [12] including major human pathogens such as DV, yellow fever virus (YFV), West Nile virus (WNV) and Japanese encephalitis virus (JEV). Flaviviruses harbour the RdRp activity in the C-terminal domain (amino acids 272-900) of non-structural protein NS5 [13,14,15,16,17]. The N-terminal domain contains methyltransferase activities involved in RNA capping [18,19]. Evidence has been presented that the N-terminal domain of NS5 also harbours the central RNA capping guanylyltransferase activity [20]. The structure of full-length NS5 is not known but several structures of methyltransferase domains have been determined (for review see [21]). Likewise, crystal structures of Flavivirus NS5 RdRp domains have been determined for DV [16] and WNV [22]. All viral RdRps structurally characterized so far adopt the basic fold of the SCOP superfamily of DNA/RNA polymerases. As for the other subgroups of this superfamily, DNA-dependent DNA polymerases (DdDp, prototype Klenow fragment of the E. coli DdDp I), RNA-dependent DNA polymerases (prototype HIV reverse transcriptase) and DNA-dependent RNA polymerases (DdRp, prototype bacteriophage T7 DdRp), their apo-structure is usually likened to a right hand comprising fingers, palm and thumb subdomains. Viral RdRps contain an encircled active site having connecting elements between the fingers and thumb subdomains. Active sites of viral RdRps performing de novo RNA synthesis are additionally closed in their initiation conformation due to the existence of structural elements allowing the stable positioning of the first NTP into a priming site [23,24]. All Flaviviridae RdRps studied so far initiate RNA synthesis de novo. Accordingly, Flavivirus RdRp domain structures contain a "priming loop" in the thumb subdomain closing the catalytic site [16,22].
The putative priming loop of DV RdRp was defined as comprising residues 792 to 804. Of particular interest are two aromatic residues near the tip of the loop, W795 and H798, which are conserved in all Flavivirus RdRps. They might play the role of an initiation platform onto which the base of the priming NTP stacks, as was shown for bacteriophage φ6 [23] and proposed for HCV and BVDV RdRps [25,26]. Structures of DV RdRp in complex with 3′dGTP as well as two models of de novo initiation complexes of DV and WNV RdRps favor W795 in the role of the initiation platform [16,22].
Genomes of Flaviviridae lack a poly(A) tail at the 3′-end. A remarkable trait of Flavivirus genomes is the strict conservation of the 5′- and 3′-end dinucleotides as 5′ AG…CU 3′. The molecular basis for this strict conservation of the 5′- and 3′-end dinucleotides and/or the use of the same starting nucleotide for +RNA and -RNA strand synthesis by the viral polymerases is not known. The polymerases of the Hepacivirus and Pestivirus genera have to display higher nucleotide tolerance. They are able to initiate with (A/G)C and G(G/U), respectively, since the 5′- and 3′-ends of Hepacivirus genomes of different genotypes correspond to 5′ (A/G)C…GU 3′ and the genomes of pestiviruses to 5′ GU…CC 3′. Interestingly, genomes and antigenomes of non-segmented -RNA (ns-RNA) paramyxoviruses, whose RdRps perform de novo RNA synthesis, start with a conserved 5′-AC [10].
Here we show that the strict sequence conservation of Flavivirus genome ends is entirely polymerase-encoded. We demonstrate ATP-specific de novo initiation using the RdRp domain of DV protein NS5 (NS5PolDV) and specific 10-mer oligonucleotide RNA templates corresponding to the 3′-end of genomic +RNA and -RNA. We document the existence of a built-in ATP-specific priming site of NS5PolDV. This specific site is one of the means by which NS5PolDV ensures that the DV genome and antigenome start with an A, the others being several correction mechanisms including the generation of non-templated pppAG primers as well as the preferential formation and elongation of pppAG even on templates with non-cognate 3′-ends. Finally, we show that the ATP-specific priming site is part of the putative priming loop coming from the thumb subdomain. There, residue H798, and not W795, is essential for de novo initiation and may act as a priming platform stabilizing the ATP priming nucleotide. DV RdRp is actively involved in the conservation of the correct ends of the genome, thus providing a direct example of how RNA viruses maintain the integrity of their genomes. The mechanisms described here may more broadly apply to other RNA viruses having viral RdRps able to initiate RNA synthesis de novo.

Results
NS5PolDV generates pppAG by abortive de novo initiation on short RNA templates
We set out to study primer synthesis by the RdRp domain of dengue virus protein NS5 (NS5PolDV) using small specific templates corresponding to the 3′-ends of the genome (+RNA) and the antigenome (-RNA). Templates consist of 10 nucleotides and are predicted to be devoid of stable secondary structure (see Materials and Methods). Both templates end with the dinucleotide 5′-CU-3′. Product formation over time was followed using either ATP and GTP, or all NTPs needed to form a full-length product when synthesis is started precisely at the 3′-end of the template. Figure 1 shows reaction kinetics of RNA synthesis on DV10 3′+, corresponding to the 3′-end of the RNA genome 5′-AACAGGUUCU-3′ (left), and on DV10 3′-, corresponding to that of the antigenome 5′-ACUAACAACU-3′ (right). We used either [α-32P]-GTP (αGTP, panel A) or [γ-32P]-ATP (γATP, panel B) as the radioactive nucleotide. For the catalytic ion, either Mg2+ (panel A) or Mg2+ supplemented with Mn2+ (panel B) was used at the optimum concentrations of 5 mM for Mg2+ and 2 mM for Mn2+ [14]. Reactions with ATP and GTP yield a time-dependent accumulation of a short product migrating below the marker G2 (see panel B). Comparison with authentic unlabeled pppAG (see Materials and Methods) visualized using UV-shadowing indicated that it indeed corresponds to pppAG (not shown), the expected product of the first step of de novo RNA synthesis. When DV10 3′+ is used as a template, pppAG is formed as well as pppAGA and pppAGAA. When all NTPs are used, pppAG accumulates with time as does pppAGA in the case of DV10 3′+ and pppAGU in the case of DV10 3′-. After the synthesis of trinucleotides NS5PolDV adopts a processive RNA synthesis elongation mode to continue synthesis up to full-length products (labeled by asterisks in Figure 1).
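The product pattern reported for the two 10-mer templates follows from simple base-pairing bookkeeping, and a short sketch can make that explicit. The function name `predicted_products` is hypothetical, and the model assumes strict Watson-Crick pairing, initiation opposite the very 3′-terminal nucleotide, no misincorporation, and a stop at the first position whose complementary NTP is not supplied. It describes only which products are possible, not their relative abundance.

```python
# Watson-Crick partners for RNA (assumption: no wobble pairs, no misincorporation).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def predicted_products(template_5to3, ntps):
    """List the products (written 5'->3', at least 2 nt long) expected when
    synthesis starts opposite the 3'-terminal template nucleotide and halts
    at the first position whose complementary NTP is missing from `ntps`."""
    product = ""
    products = []
    # The polymerase reads the template 3'->5', so walk the reversed string.
    for template_nt in reversed(template_5to3):
        incoming = PAIR[template_nt]
        if incoming not in ntps:
            break
        product += incoming
        if len(product) >= 2:
            products.append("ppp" + product)
    return products

DV10_PLUS = "AACAGGUUCU"   # 3'-end of the genome, as given in the text
DV10_MINUS = "ACUAACAACU"  # 3'-end of the antigenome, as given in the text

print(predicted_products(DV10_PLUS, "AG"))         # ['pppAG', 'pppAGA', 'pppAGAA']
print(predicted_products(DV10_MINUS, "AG"))        # ['pppAG']
print(predicted_products(DV10_MINUS, "ACGU")[:2])  # ['pppAG', 'pppAGU']
```

With only ATP and GTP available, the genomic template can direct synthesis up to pppAGAA before a CTP is required, whereas the antigenomic template stalls at pppAG, which is consistent with the short products described above.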
As we had observed before [14], when using Mn2+ the reaction is much more efficient and allows for the use of [γ-32P]-ATP (γATP) as radiolabeled nucleotide in order to visualize exclusively de novo RNA synthesis products starting with ATP. The pattern observed with Mg2+ is reproduced when Mn2+ is present (Figure 1B). One difference is that the use of Mn2+ results in longer full-length products, which might be caused by an alteration of the terminal nucleotide transferase activity of NS5PolDV [14,27,28].
In conclusion, using RNA templates mimicking viral sequences, dinucleotide and trinucleotide products are formed during initiation and before processive RNA elongation, the most abundant being the dinucleotide pppAG.

Author Summary
The 5′- and 3′-ends of RNA virus genomes have evolved towards efficient replication, translation, and escape from defense mechanisms of the host cell. Little is known about how RNA viruses conserve or restore the correct ends of their genomes. The Flavivirus genus of positive-strand RNA viruses contains important human pathogens such as yellow fever virus, West Nile virus, Japanese encephalitis virus and dengue virus (DV). The Flavivirus genome ends are strictly conserved as 5′-AG…CU-3′. We demonstrate here the primary role of the DV polymerase in the conservation of the first and last genomic residue. We show that DV polymerase contains an ATP-specific priming site, which imposes a strong preference for the de novo synthesis of a dinucleotide primer starting with an ATP. Furthermore, the polymerase is able to indirectly correct erroneous sequences by producing the correct primer in the absence of template and on templates containing incorrect nucleotides at the 3′-end. The correct primer is productively elongated on either correct or incorrect templates. Our findings provide a direct demonstration of the implication of a viral RNA polymerase in the conservation and repair of genome ends. Other polymerases from other RNA virus families are likely to employ similar mechanisms.

NS5PolDV contains a built-in ATP-specific priming site for de novo RNA synthesis initiation
The first nucleotide of Flavivirus genomes is an adenosine, followed by a guanosine. This 5′-pppAG sequence is strictly conserved across the Flavivirus genus. To determine whether the polymerase (and/or the correct template) is at the origin of the conservation of the first nucleotide, we tested a set of DV10 3′- variants with different 3′-ends. In addition to the correct DV10 3′-CU, we used DV10 3′-CC, DV10 3′-CA and DV10 3′-CG in the presence of the corresponding priming NTP and GTP. The expected primer products are pppAG, pppGG, pppUG and pppCG, respectively. Figure 2A compares end points of reactions performed in the presence of αGTP and Mg2+ as the catalytic ion. Remarkably, only the CU template supports product synthesis (pppAG). RNA primer synthesis on the other templates is almost undetectable. We conclude that in the presence of Mg2+ as the catalytic ion the DV RdRp priming site accommodates exclusively ATP.
To our surprise, when Mn2+ was used instead of Mg2+, the pppAG primer was generated even in the absence of the template, albeit to a lower extent (Figure 2B). This is not the case in the presence of Mg2+ even at ten-fold higher enzyme concentration (see below, Figure 3B). When using Mn2+ and the DV10 3′- template variants, we therefore included control reactions in the absence of the corresponding templates and in the presence of γGTP, which allows exclusive detection of dinucleotides starting with pppG. Figure 2C shows the corresponding reaction kinetics with Mn2+ as the catalytic ion in the absence or the presence of templates using αGTP or γGTP as the radioactive nucleotide. Again, using DV10 3′-CU and ATP/GTP, NS5PolDV generates pppAG to a higher extent than without template. Note that no pppGA product is generated. When DV10 3′-CC and GTP are used, NS5PolDV synthesizes pppGG in the presence of the template only. DV10 3′-CA, UTP, and GTP lead to the formation of pppUG and pppGU (see γGTP control reaction), the latter by initiation internal to the template. No product is formed in the absence of the template. Finally, DV10 3′-CG allows formation of pppCG, which is not formed in the absence of the template. In conclusion, NS5PolDV keeps the strict preference for ATP as the priming nucleotide in the presence of Mn2+ when no template is present. Nevertheless, the use of templates with an altered 3′-nucleotide can force NS5PolDV to start de novo RNA synthesis with the corresponding base-paired priming nucleotide, and also allows internal initiation.
Collectively, these observations confirm that the priming site of NS5PolDV has a marked specificity for ATP. This preference is strict in the presence of Mg2+. It is equally strict for dinucleotide synthesis in the presence of Mn2+ and in the absence of template. The specificity for ATP as the starting nucleotide is lost when Mn2+ is used in the presence of templates with incorrect 3′-ends; only then is NS5PolDV able to form pppNG products as efficiently as pppAG.
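The templated dinucleotides expected on each 3′-end variant, at the very 3′-end and one position internal to it, can be tabulated with a small sketch. It assumes that the variants differ from DV10 3′- only in the 3′-terminal nucleotide and that pairing is strictly Watson-Crick; the helper name `templated_dinucleotide` is hypothetical. The sketch lists only what a template can encode and says nothing about which product the enzyme actually prefers.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def templated_dinucleotide(template_5to3, offset=0):
    """Dinucleotide (5'->3') encoded by the two template nucleotides that
    start `offset` positions from the 3' end (offset=0: the very 3' end,
    offset=1: initiation internal to the template by one nucleotide)."""
    read_order = template_5to3[::-1]  # template is read 3'->5'
    first, second = read_order[offset], read_order[offset + 1]
    return "ppp" + PAIR[first] + PAIR[second]

# Assumption: the first eight nucleotides are those of DV10 3'- (ACUAACAACU).
BODY = "ACUAACAA"

for end in ("CU", "CC", "CA", "CG"):
    template = BODY + end
    print(end,
          templated_dinucleotide(template),            # terminal initiation
          templated_dinucleotide(template, offset=1))  # internal initiation
# CU pppAG pppGU
# CC pppGG pppGU
# CA pppUG pppGU
# CG pppCG pppGU
```

Terminal initiation reproduces the expected primers pppAG, pppGG, pppUG and pppCG, and internal initiation gives pppGU on every variant because the templates share the same penultimate nucleotides.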
The ATP-specific priming site enables NS5PolDV to produce and elongate the correct primer pppAG on viral templates with non-canonical 3′-nucleotides
In the presence of Mg2+ and/or Mn2+ the built-in ATP-specific priming site drives NS5PolDV-mediated RNA synthesis starting with pppA. The dinucleotide pppAG accumulates during RNA synthesis on templates with the correct 3′-end (see Figure 1). Using Mn2+ this pppAG primer is also formed in the absence of an RNA template. We asked whether NS5PolDV forms and/or elongates pppAG even on templates with incorrect 3′-nucleotides, thus enabling the repair of incorrect 3′-ends.
Figure 2 (caption, continued). Reactions were started by the addition of MnCl2 and incubated for 2 h. The identity of product bands is given on the right. (C) Specific non-templated pppAG dinucleotide formation and non-specific NG dinucleotide formation on DV10 3′- template variants (see under A) in the presence of Mn2+. Reaction mixtures contained 2 mM MnCl2, 500 nM NS5PolDV, 1 µM template, 500 µM of NTPs, which were not labeled, and 100 µM GTP (containing either αGTP or γGTP as outlined below the gel) and either no template or DV10 3′- variants (given below the gel). Reactions were started by the addition of MnCl2 and samples were taken at given time points. The identity of product bands is given on the right side of the reaction kinetics. doi:10.1371/journal.ppat.1002912.g002
First, pppAG formation was tested on the four DV10 3′- variants in the presence of only ATP and GTP. Figure 3A shows that NS5PolDV is indeed able to form pppAG in the presence of templates with any 3′-nucleotide and Mn2+. In contrast, in the presence of Mg2+ only the natural DV10 3′-CU template supports pppAG formation, even at an increased concentration of NS5PolDV (Figure 3B). We then tested pppAG formation exclusively in the presence of Mn2+ on all DV10 3′- variants in the presence of all nucleotides, a scenario putatively mimicking the situation within the replication complex. Figure 3C shows that pppAG is always formed in parallel to the dinucleotide that corresponds to the template. In the case of the template variant with a -CG 3′-end, pppAG is produced with even higher efficiency than the base-paired dinucleotide. Note that the dinucleotide pppGU is also produced on all templates by internal initiation. For the reaction in the presence of all templates and all nucleotides, we quantified all products that were initiated de novo over the very 3′-end, and found that pppAG is formed as the prominent product (32.3 ± 1.5%, three independent reactions). Note that all templates are present at the same concentration, which should not correspond to the situation in vivo. We conclude that in the presence of incorrect templates and Mg2+, NS5PolDV discriminates against these templates and forms pppAG only on the correct template (see also Figure 2A). In contrast, Mn2+ ions enable NS5PolDV to preferentially generate pppAG even in the presence of incorrect templates, which could represent an indirect way of 3′-end repair.
We then considered the elongation of the correct pppAG primer over templates with incorrect 3′-ends. We thus tested the elongation of a chemically synthesized pppAG primer (see Materials and Methods) either without template or in the presence of the four DV10 3′- variants (Figure 4). The most prominent result is that NS5PolDV is able to productively elongate pppAG on the correct template in the presence of Mn2+ (Figure 4A) and Mg2+ ions (Figure 4B). We also observe that NS5PolDV in the presence of Mn2+ is able to productively elongate pppAG on incorrect templates (Figure 4A), thus demonstrating that the enzyme is able to indirectly correct the error in the template and conserve the 5′-end of the DV genome. Note that, as expected, no primer elongation is detectable in the absence of a template.
The predicted T794-A799 priming loop of NS5PolDV provides the built-in ATP-specific priming site
NS5PolDV harbors an ATP-specific priming site, which is essential for the formation, accumulation, and elongation of the correct primer pppAG. Which elements of NS5PolDV form this site? The crystal structure of NS5PolDV (Figure 5A) allowed the prediction of a priming loop comprising residues 792 to 804 [16], which is expected to provide the priming site during de novo RNA synthesis initiation. We generated a deletion mutant (NS5PolDV TGGK) by replacing residues T794-A799 between T793 and K800 by two glycines (see close-up in Figure 5A). The overall correct folding of the purified, recombinant mutant protein was verified by a fluorescent thermal shift assay giving comparable temperatures of denaturation (melting temperature Tm) for both proteins (wild type (wt) NS5PolDV Tm 49.0 °C ± 0.5 °C, NS5PolDV TGGK Tm 48.4 °C ± 0.05 °C).
The TGGK mutant is expected to have an open active site, which impedes correct ATP-specific de novo initiation over the 3′-end of a single-stranded RNA template but may favor the accommodation of double-stranded RNA. Its RNA synthesis initiation and elongation activity was first tested using a "minigenomic" RNA template consisting of 224 nucleotides of the 5′-end of the DV genome fused to 492 nucleotides of the 3′-end [14]. It has been shown before using this template and analyzing the products on a denaturing agarose-formaldehyde gel [29] that two types of product are formed (see wt reaction kinetics in the center panel of Figure 5B). Firstly, the de novo RNA synthesis product is generated corresponding to the size of the template. Secondly, an elongation product is generated by back-primed RNA synthesis. There, the 3′-end (…AACAGGUUCU-3′) forms a short hairpin annealing the last dinucleotide to nucleotides -6 and -7 (underlined in the sequence) and is then elongated [29]. The length of the product is thus approximately twice the size of the template. Reactions were carried out using either Mg2+ or Mn2+ as catalytic ions. The left and right panels of Figure 5B show that in both cases the mutant TGGK shows an increased overall activity on this template compared to wt activity. The center panel shows that this is mainly caused by increased back-priming. Interestingly, instead of one product species of twice the template size NS5PolDV TGGK produces a range of elongated products of different lengths. This might be due to the accommodation of long hairpins, which then create longer products than the template but shorter than the elongation product of wt NS5PolDV.
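The back-priming hairpin described for the genomic 3′-end can be checked with a few lines of code. This is a minimal sketch that assumes a 2-base-pair stem closed by a 3-nucleotide loop (inferred from the stated pairing of the last dinucleotide with nucleotides -6 and -7) and strict Watson-Crick pairing; the function name `terminal_hairpin_possible` and its parameters are hypothetical.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def terminal_hairpin_possible(seq_5to3, stem=2, loop=3):
    """Return True if the last `stem` nucleotides can pair, antiparallel,
    with the `stem` nucleotides located just upstream of a `loop`-nt loop."""
    n = len(seq_5to3)
    for k in range(stem):
        three_prime_nt = seq_5to3[n - 1 - k]            # positions -1, -2, ...
        upstream_nt = seq_5to3[n - 2 * stem - loop + k]  # positions -7, -6, ...
        if PAIR[three_prime_nt] != upstream_nt:
            return False
    return True

# 3'-terminal ten nucleotides of the genome, as quoted in the text.
print(terminal_hairpin_possible("AACAGGUUCU"))  # True
```

For this sequence the terminal U (-1) pairs with A (-7) and C (-2) pairs with G (-6), leaving UUG as the loop, which is the geometry needed for the 3′-end to prime its own elongation.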
De novo RNA synthesis initiation by wt NS5PolDV and the TGGK mutant was then tested on DV10 3′-, in the absence of a template and on DV10 3′+ using Mn2+ as the catalytic ion, ATP and GTP containing αGTP. Figure 5C (panel 1) shows that in contrast to wt NS5PolDV, NS5PolDV TGGK is not able to catalyze de novo initiation on DV10 3′-. Secondly, NS5PolDV TGGK does not catalyze pppAG formation without template (panel 2). In contrast, it is able to catalyze de novo initiation on DV10 3′+, presenting ca. 32% of wt activity (panel 3). In order to understand this apparent contradiction, we used γATP instead of αGTP as radioactive NTP. It became clear that NS5PolDV TGGK was unable to generate the pppAG primer product (panel 4). We conclude that the product observed with αGTP corresponds to pppGA formed by internal de novo initiation, which is only possible on DV10 3′+. When using Mg2+ as catalytic ion, we again did not observe formation of the de novo RNA synthesis initiation product pppAG on either template (for DV10 3′- see below, Figure 6B). We conclude that NS5PolDV TGGK is unable to pre-form the ATP-specific priming site necessary for de novo RNA synthesis initiation at the very 3′-end. The predicted priming loop indeed plays an essential role in providing the correct priming site. We explain the increased activity of NS5PolDV TGGK on minigenomic RNA templates by its increased propensity to catalyze back-priming due to its more accessible catalytic site, i.e. to harbor the minigenome in different hairpin conformations allowing 3′ elongation.
Figure 5 (caption, continued). (C) De novo initiation of wt NS5PolDV and its deletion mutant TGGK was followed in the presence of Mn2+ using either 1 µM DV10 3′-, in the absence of a template, or 1 µM DV10 3′+ (from left to right as indicated). Reaction mixtures also contained 2 mM MnCl2, 500 nM enzyme, 500 µM of NTPs, which were not labeled, and 100 µM labeled NTP (containing αGTP or γATP as indicated). Reactions were started by addition of MnCl2 and samples were taken at given time points. Identities of labeled product bands are given on the right and left side of the reaction kinetics. pppGA and pppGAA internal de novo initiation products on DV10 3′+ are labeled by an asterisk. doi:10.1371/journal.ppat.1002912.g005

Residue H798 and not W795 is important for ATP-specific de novo initiation
Two aromatic residues, W795 and H798, within the priming loop were proposed to play a particular role in providing an initiation platform with which the base of the priming ATP could establish a stacking interaction [16]. Residue W795 was given special attention because it was found near the triphosphate moiety of a 3′-dGTP bound to NS5PolDV [16]. In addition, this tryptophan was better placed than the histidine for stacking a priming ATP in two models of de novo RNA synthesis initiation complexes of NS5PolDV and NS5PolWNV [16,22]. We generated two mutants of NS5PolDV, W795A and H798A. Overall correct folding of the purified recombinant mutants was likewise verified by a fluorescent thermal shift assay giving Tm values corresponding to the wt protein (wt NS5PolDV Tm 49.0 °C ± 0.5 °C, W795A mutant Tm 48.6 °C ± 0.6 °C, H798A mutant Tm 48.1 °C ± 0.04 °C).
The RNA initiation and elongation activities of wt NS5PolDV and the W795A and H798A mutants were tested using the minigenomic RNA template and either Mg2+ or Mn2+ as catalytic ions (Figure 6A). In both cases the H798A mutant shows an increased activity on this template whereas W795A shows a similar overall activity compared to wt NS5PolDV. Figure 6B shows the analysis of the reaction products on a denaturing agarose-formaldehyde gel. The W795A mutant indeed behaves like wt NS5PolDV: the percentage of the de novo RNA synthesis initiation product of template size is unchanged. In contrast, the H798A mutant generates considerably less de novo RNA synthesis product whereas the yield of RNA elongation products is higher.
We then compared the capacities of wt and all mutant NS5PolDV proteins to catalyze de novo RNA synthesis initiation on DV10 3′-, without template and on DV10 3′+ using Mn2+ as catalytic ion (Figure 6C panels 1, 3 and 4). Indeed, the H798A mutant is considerably less capable of correct de novo RNA synthesis initiation than wt NS5PolDV whereas W795A behaves as wt NS5PolDV. Note that the product formed by NS5PolDV TGGK on DV10 3′+ (panel 4) corresponds to pppGA generated by internal RNA synthesis initiation (see also Figure 5C); therefore part of the product formed by the H798A mutant may also correspond to pppGA. When Mg2+ is used on both templates, the same results are obtained (Figure 6C panel 2 for template DV10 3′-). We thus conclude that residue H798 is essential for the formation of the correct ATP-specific priming site and may act as a priming platform.

Discussion
In this study, we present evidence that the dengue virus NS5 polymerase domain (NS5PolDV) alone is responsible for maintenance of A and U as first and last nucleotides of the DV genome, respectively. NS5PolDV was used instead of full-length NS5 in this study in order to avoid any interference of the RNA-binding, NTP-binding, or enzymatic activities of the N-terminal domain of NS5. We report that NS5PolDV is endowed with several structural and mechanistic features converging to the specific de novo synthesis and elongation of the correct ATP-initiated primer even on templates that lack the correct corresponding U at the 3′-end. The first and last nucleotides of the genome are strictly conserved in the genus Flavivirus; thus the results presented here may apply to the entire genus. We demonstrate the generation of a dinucleotide primer pppAG on both genomic and antigenomic RNA templates. We have previously observed the production of such a dinucleotide primer on homopolymeric templates [14]. In the following step pppAG(A/U) trinucleotides are formed before processive RNA elongation occurs. During the latter, NS5PolDV continues RNA synthesis to the very end of the template. We do not know whether the di- and tri-nucleotide primers detected in the reaction originate from a slow but processive RNA synthesis reaction, or are actually released from the complex and re-used by the polymerase acting in a distributive RNA synthesis mode. We also show that the pppAG primer is effectively elongated in the presence of Mg2+ or Mn2+ and the correct template. Thus, after initial phosphodiester bond synthesis, the pppAG primer is aligned at the correct position in order to be elongated. The efficient use of the short primer pppAG reported here is in apparent contrast to the inefficient use of the 5′-OH-AG dinucleotide previously reported [13,30]. The 5′-triphosphate moiety of the chemically synthesized pppAG primer is most probably an important binding determinant allowing efficient elongation (see discussion of the proposed de novo initiation complex, Figure 7).
We then demonstrate that in its de novo RNA synthesis initiation state NS5PolDV contains a built-in ATP-specific priming site. Major structural elements of NS5PolDV contributing to this site reside within residues T794 to A799. Their deletion forces NS5PolDV to initiate de novo RNA synthesis internal to the template using GTP as the first nucleotide (Figure 5C panel 1) and to perform primer-dependent RNA synthesis (Figure 5B). In analogy to the structure of HCV NS5B in complex with a nucleotide in its priming site [31] and because of the amino acid conservation observed within a larger group of de novo RdRps [25], we expect that NS5PolDV residues R472 (RdRp catalytic motif F3, see [14]) as well as S710 and R729 (motif E) are involved in triphosphate binding. This might explain why de novo RNA synthesis initiation by the loop-deleted mutant is still possible, albeit internal to the template. We conclude that the T794-A799 loop indeed plays a major role both in correct de novo initiation and in shaping the priming site. Within the priming loop, residue H798 is essential for primer synthesis (Figure 6). We propose that H798 provides the initiation platform against which the priming nucleotide ATP is stacked. Using the structure of the de novo initiation complex of the RdRp of bacteriophage φ6 [23] as a starting point, we generated a model of the initiation complex of DV serotype 2 RdRp in complex with the 3′-end of the genome UUCU and both ATP and GTP as first and second nucleotide, respectively (Figure 7). In this model, the triphosphate moiety of ATP indeed interacts with residues S710, R729 and R737 of the thumb subdomain of NS5PolDV. The aromatic ring of H798 stacks against the adenine nucleobase of ATP in a position similar to that of a φ6 RdRp tyrosine residue against which the guanine nucleobase of its priming GTP is stacked. In several protein complex structures histidine has been shown to bind an adenine nucleobase by stacking interactions [32]. Nevertheless, histidine does not seem to provide any specificity towards adenine versus guanine [33]. Our model does not propose any obvious specific interaction with the adenine base. This might be due to the fact that the structure of NS5PolDV has been captured in a pre-initiation state. In this state, motif F, which provides the upper part of the NTP entry tunnel in the active initiation and elongation conformation of viral RdRps, is not yet correctly positioned [34]. The fine characterization of the ATP-specific built-in priming site of NS5PolDV awaits the crystal structure of a de novo RNA synthesis initiation complex.
We provide a mechanistic basis for the conservation of nucleotides A and U as the first and last nucleotides of the DV genome, respectively. Figure 8 summarizes the different levels of control that ensure ATP-specific de novo RNA synthesis initiation. Firstly, NS5PolDV generates and elongates the bona fide pppAG primer (red arrows and green arrows on the right). Even in the absence of any template and in the presence of Mn2+ (Figure 8 left red arrow) NS5PolDV is able to exclusively synthesize the pppAG primer (Figure 2B and C, Figure 3A and C). Note that we have also observed pppAG synthesis by full-length NS5 in the absence of a template (not shown). Since a sufficiently high Mn2+ concentration is present in the cell (0.1 µM to 40 µM Mn2+ in blood, brain, and other tissues [35]), NS5 in the replication complex might already be loaded with pppAG and thus be ready to elongate pppAG on the viral template. The same pppAG primer is preferentially synthesized in the presence of the correct template irrespective of the metal ion present at the polymerase active site (Figure 8 right red arrows, Figure 2A and B, Figure 3). In the presence of Mg2+, NS5PolDV supports neither formation nor elongation of pppAG on incorrect templates (Figure 8 blue blocked arrow, Figure 4B). In the presence of Mn2+, NS5PolDV is able to synthesize cognate dinucleotides on incorrect templates (Figure 2C), but in the presence of all nucleotides and all templates (a probably biased and more unfavorable set-up compared to the situation in the replication complex in vivo), pppAG is still a major product (Figure 3C). Remarkably, the pppAG/Mn2+-loaded polymerase is able to mismatch and extend pppAG in order to restore the correct 5′-end (Figure 8 blue arrows, Figure 4). The selective extension reaction thus restrains the synthesis of incorrect RNAs that could occur in the presence of incorrect templates. All these reactions converge to the formation of pppAG and the conservation of A as the starting nucleotide at the 5′-end of viral genomic and antigenomic RNAs. Note that the mechanistic basis of the conservation of the second nucleotide G is beyond the scope of this study. Preliminary results generated in our laboratory indicate that both template and polymerase are important to ensure the specific incorporation of a G as the second nucleotide (not shown).
Figure 7 (caption, continued). … Figure 5A. Only the 3′-end CU of the RNA template is shown; it base pairs with the initiating ATP and the second nucleotide GTP. The triphosphate of ATP contacts residues S710, R729 and R737 of the thumb subdomain motif E. The 3′-O atom of the ribose of ATP lies near the α-phosphate of GTP, which is coordinated to Mg1 bound to D534 of motif A and D663 of motif C. Mg2 is coordinated to the leaving pyrophosphate of GTP. The proposed priming platform H798 stacks against the adenine base of the priming ATP. doi:10.1371/journal.ppat.1002912.g007
Several ways of viral RNA genome maintenance and repair concerning terminal damage have been discussed [4], among others the generation of "non-templated" primers and the use of abortive transcripts as primers. Here we demonstrate that NS5PolDV uses these two mechanisms. Non-templated primers are generated only in the presence of Mn2+. Abortive transcripts are used as primers in the presence of either Mg2+ or Mn2+. A third mechanism observed here is the discrimination against an incorrect template in the presence of Mg2+. In addition, in the case that a 3′-end has been shortened, the correction upon de novo initiation should be preceded by the addition of one or more nucleotides by the terminal transferase activity of NS5. This activity has also been listed as another way of repairing terminal damage of viral RNA genomes [4]. For NS5PolDV we have observed this activity before [14] and now again in the presence of Mn2+ (Figure 1B).
The DV polymerase is endowed with several of the proposed mechanisms to maintain the correct 5′- and 3′-ends of the DV genome and antigenome. The ability of DV and WNV to restore a U at the very 3′-end of genomes with 3′-end deletions has been demonstrated [2,36]. This observation is in accordance with the existence of an ATP-specific priming site in NS5PolDV. Tilgner et al. [2,36] reported the complete reversion of WNV replicon CA and CG 3′-ends to CU whereas CC was only partially reverted. Since we have not seen preferential de novo RNA synthesis initiation starting with GG in comparison to UG or CG (all three are possible in the presence of Mn2+, Figure 2), the partial reversion of CC might be due to an intrinsic difference between DV and WNV RdRp or caused by different propensities of the erroneous templates to allow pppAG elongation. Indeed, CA and CG 3′-ends allow pppAG elongation more readily than the CC 3′-end (Figure 4; two independent reactions were performed). The CC 3′-end might therefore take longer to revert. Furthermore, Teramoto et al. [2,36] observed the correction of the 5′-end from pppGAG to pppAG. Our work provides a mechanistic explanation for their observation.
The observation of non-templated pppAG formation in the presence of Mn2+ by a viral RdRp has not been reported before using recombinant RdRp assays. However, previous reports suggest the occurrence of non-templated dinucleotide formation. RSV, a member of the ns-RNA virus family Paramyxoviridae, restores the correct 5′-pppA although minireplicons did not encode the correct 3′-U [10]. The authors propose that RSV RdRp contains a built-in ATP-specific priming site and cite the observation that the RdRp of the related ns-RNA vesicular stomatitis virus (VSV, Rhabdoviridae) contains a specific ATP-binding site [37] as an argument in favor of their proposition. When VSV RdRp assays were carried out using recombinant RdRp in the presence of Mg2+, non-templated 5′-initiation was not observed [6]. Either RSV and VSV, which belong to two different ns-RNA virus families, developed different strategies or, in analogy to our results, their RdRps use Mn2+ to correctly initiate RNA synthesis on erroneous templates, as observed here for NS5PolDV. It is generally believed that Mg2+ is the activating cofactor of polymerases in vivo because viral RdRp properties observed with Mg2+ in vitro are more consistent with properties observed biologically. A second reason for giving preference to Mg2+ is its cellular abundance in comparison to Mn2+ (i.e., 0.5 mM free Mg2+ versus 0.7 μM free Mn2+ in rat hepatocytes [38,39], and 0.1 μM to 40 μM Mn2+ in blood, brain and other tissues [35] versus 0.2 to 0.7 mM Mg2+ in human blood [40]). Nevertheless, some events especially involved in correct and efficient de novo RNA synthesis initiation may require the specific use of Mn2+ by viral RdRps under physiological conditions (our study and [10,36,41,42]).
The pppAG primer synthesis by the DV RdRp can be considered as the first line of control of the conservation of Flavivirus genome and antigenome ends. However, there might be other mechanisms to tighten the selection. The first one could be the base pairing of the genome ends maintaining specific RNA secondary structures, which are necessary to recruit the replication machinery. Computer simulations of such structures [43] indicate that the last U of the 3′-end of the genome may be unpaired or paired (structure I or II, respectively, in [43]). Thus, the required base pairing may exert selective pressure to keep a U at the end of the Flavivirus genome. Another selection level concerns only the 5′-end of the genome and is due to the counterselection of incorrect 5′-ends through the NS5 RNA-cap methyltransferase. Indeed, several crystal structures of the cap-dependent bi-functional methyltransferase domain of NS5 show that specific binding of the 5′-cap involves specific recognition of the first transcribed 5′-adenosine through its N1 position and residue Asn18 [44,45]. Therefore, for the genomic strand, methylation at the cap N7-guanine and subsequently at the 2′-O position of the first transcribed adenosine should be efficiently achieved only when ATP is the starting 5′-nucleotide. Finally, cap addition seems to involve 5′-ATP selectivity as well [20]. Collectively, we propose that the RdRp of flaviviruses is the first actor responsible for the conservation of the correct ends of their genome, and that other mechanisms such as genome cyclization and the specificity of guanylyltransferase and methyltransferase activities add to the selective pressure. These mechanisms of maintenance might also apply to other RNA virus genera with conserved genome ends and viral RdRps initiating RNA synthesis de novo.

Large-scale chemical synthesis and purification of pppAG
Chemical synthesis of AG on solid support. Chemical synthesis of the diribonucleotide AG was performed on an ABI 394 synthesizer (Applied Biosystems) from commercially available (Link Technologies) long chain alkylamine controlled-pore glass (LCAA-CPG) solid support with a pore size of 1000 Å derivatized through the succinyl linker with 5′-O-dimethoxytrityl-2′-O-acetyl-N2-dimethylformamide guanosine. The dinucleotide AG was assembled at an 8-μmol scale in Twist oligonucleotide synthesis columns (8 × 1-μmol scale) (Glen Research) using [5′-O-DMTr-2′-O-pivaloyloxymethyl-N6-phenoxyacetyladenosine]-3′-O-(O-cyanoethyl-N,N-diisopropyl-phosphoramidite) (Chemgenes) and following a previously described procedure [47]. After assembly completion, the CPG beads in the eight columns were dried under a stream of argon. The beads were pooled and divided in two, and around 4 μmol AG was transferred into each of two Twist oligonucleotide synthesis columns (sized for the 10-μmol scale). The 5′-functionalization of AG with the triphosphate moiety was performed in parallel on both columns following previously described conditions [48].
5′-Triphosphorylation of solid-supported AG. A solution (8 ml) of 1 M diphenyl phosphite (1.6 ml) in dry pyridine (6.4 ml) was manually passed with a glass syringe through the columns containing AG still attached to the solid support and left to stand for 30 minutes at room temperature. After several washings, the oxidation solution containing imidazole (375 mg, 5 mmol) in N,O-bis-trimethylsilylacetamide (1 ml, 4.1 mmol), CH3CN (1.875 ml), CCl4 (1.875 ml) and triethylamine (0.25 ml) was added under argon and left to react for 5 h at 30 °C. After washing and drying the support, the TBAPP solution (0.23 M, 2 ml) was applied to the column and left to react for 18 h at 30 °C. The solution was removed and the support was washed with dry CH3CN (4 × 8 ml). Finally, the column was dried by a 1-min argon flush.
Deprotection and release of solid-supported pppAG. A 0.1 M solution of 1,8-diazabicyclo[5.4.0]undec-7-ene (DBU) (1.2 ml) in anhydrous CH3CN (6.8 ml) was applied to each column for 3 min. Subsequently, a 30% aqueous ammonia solution was applied to each column in three batches (6 ml, 4 ml, 2 ml) for 30 min each. The three ammonia fractions were collected in screw-capped glass vials and were left to react at 30 °C for 1.5 h. The fully deprotected pppAG was transferred to 50 ml round-bottomed flasks and isopropylamine (15% of total volume) was added to the solutions. Then the mixtures were evaporated under reduced pressure at 30 °C until the volumes were reduced to 0.5 ml. The residues were redissolved in water (1.5 ml), transferred to 2 ml Eppendorf vials and then lyophilized.
Analysis and purification of pppAG by reverse-phase HPLC. Analytical and semi-preparative HPLC was performed on a Dionex DX 600 HPLC system equipped with reverse-phase columns (Nucleodur C18, 100 Å, 3 μm, 4.6 × 70 mm for analysis and Nucleodur C18, 100 Å, 7 μm, 8 × 125 mm for purification, Macherey Nagel). The following solvent system was used: 5% CH3CN in 50 mM TEAB buffer, pH 8 (buffer A) and 80% CH3CN in 50 mM TEAB buffer, pH 8 (buffer B). Flow rates were 1 ml·min⁻¹ and 2 ml·min⁻¹ for analysis and semi-preparative purposes, respectively. Elution was performed with a linear gradient of 0% to 10% buffer B in buffer A in 20 min. The fractions containing the pure pppAG were pooled in a 100 ml round-bottomed flask and were concentrated to a volume of 0.5 ml under reduced pressure at 30 °C. The residue was coevaporated ten times with 2 ml of water. The residue was redissolved in 1.5 ml water, transferred to 2 ml Eppendorf vials and lyophilized. MALDI-TOF mass spectra were recorded on a Voyager-DE spectrometer (Perseptive Biosystems, USA) using a 10:1 (m/m) mixture of 2,4,6-trihydroxyacetophenone/ammonium citrate as a saturated solution in acetonitrile/water (1:1, v/v) for the matrix. Analytical samples were mixed with the matrix in a 1:5 (v/v) ratio, crystallized on a 100-well stainless steel plate and analyzed. UV quantitation of pppAG was performed on a Varian Cary 300 Bio UV/Visible spectrometer by measuring absorbance at 260 nm. Two μmol of pure pppAG were obtained, corresponding to 25% total yield. Lyophilized aliquots of 100 nmol have been stored at −20 °C for several months without any sign of degradation.
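The gradient and yield figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the published protocol: the function names are illustrative assumptions, and only the solvent compositions (5% and 80% CH3CN), the 0% to 10% buffer B gradient over 20 min, and the 25% yield come from the text.

```python
# Solvent compositions from the text; all names below are illustrative.
ACN_IN_A = 5.0    # % CH3CN in buffer A
ACN_IN_B = 80.0   # % CH3CN in buffer B


def percent_b(t_min, start_b=0.0, end_b=10.0, duration_min=20.0):
    """Percentage of buffer B at time t for a linear gradient.

    Times outside the gradient window are clamped to its ends.
    """
    t = min(max(t_min, 0.0), duration_min)
    return start_b + (end_b - start_b) * t / duration_min


def percent_acetonitrile(t_min):
    """Overall CH3CN content (%) of the mobile phase at time t."""
    b = percent_b(t_min)
    return (ACN_IN_A * (100.0 - b) + ACN_IN_B * b) / 100.0


def percent_yield(obtained, scale):
    """Yield in percent; both amounts must be in the same unit."""
    return 100.0 * obtained / scale


if __name__ == "__main__":
    for t in (0, 10, 20):
        print(t, "min:", percent_acetonitrile(t), "% CH3CN")
    # 2 units of pppAG recovered from an 8-unit synthesis scale
    print("yield:", percent_yield(2, 8), "%")
```

Under these assumptions the mobile phase runs from 5% to 12.5% CH3CN over the gradient, which illustrates how shallow the elution conditions are, and the ratio of recovered product to synthesis scale reproduces the stated 25% yield.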

Protein expression and purification
The gene coding for N-terminal His6-tagged NS5PolDV (serotype 2, New Guinea C) as defined in [14], cloned in a pQE30 plasmid, was expressed in E. coli (Tuner (Novagen) or NEB Express (New England Biolabs)) cells carrying the helper plasmid pRare2LacI (Novagen). Expression was carried out in Luria broth overnight at 17 °C after induction with 50 μM IPTG, addition of 2% EtOH and a cold shock (2 h at 4 °C). Sonication was done in 50 mM sodium phosphate lysis buffer, pH 7.5, 500 mM NaCl, 20% glycerol, 0.8% Igepal (10 ml of this lysis buffer for around 2 g cell pellet from 1 l culture) in the presence of DNase I (22 μg/ml), 0.2 mM benzamidine, protease inhibitor cocktail (SIGMA), 5 mM β-mercaptoethanol and 1 mg/ml lysozyme after 30 min incubation at 4 °C. After centrifugation the soluble fraction was incubated in batch with 2 ml TALON metal-affinity resin slurry (Clontech) for 40 min at 4 °C. Protein bound to the beads was washed once with 10 volumes of sonication buffer containing 1 M NaCl and 10 mM imidazole and once with the former buffer without Igepal. Protein fractions were then eluted with sonication buffer containing 250 mM imidazole, no Igepal and 250 mM glycine. After dialysis into 10 mM Tris buffer, pH 7.5, containing 300 mM NaCl, 20% glycerol, 250 mM glycine and 1 mM DTT, the protein was diluted with the same volume of this buffer without NaCl and loaded onto a HiTrap heparin column (GE Healthcare). Pure NS5PolDV was then eluted in a single peak by applying a gradient from 150 mM to 1 M NaCl. Alternatively, gel filtration was used as a second purification step using a Superdex 75 HR 16/60 column (GE Healthcare) and the dialysis buffer. NS5PolDV was stored at −20 °C at a concentration of 40 to 60 μM after a final extensive dialysis into 10 mM Tris buffer, pH 7.5, containing 300 mM NaCl, 40% glycerol and 1 mM DTT. Purity was higher than 98% as judged by SDS-PAGE.

Mutant NS5PolDV genes and proteins
Mutant TGGK, W795A and H798A NS5PolDV expression plasmids were generated using the QuikChange kit (Stratagene). Protein expression and purification were done as for the wt protein. Analysis by gel filtration showed a single peak eluting at the same volume as wt NS5PolDV.

Determination of Tm values
Melting temperature (Tm) values of wt and mutant NS5PolDV were determined using a thermofluor-based assay [49]. In 96-well thin-wall PCR plates, 3.5 μl of a fluorescent dye (Sypro Orange, Molecular Probes, 714-fold diluted in H2O) was added to 21.5 μl protein solutions at a concentration of 0.5 or 1 mg/ml (6.7 or 13.4 μM) in storage buffer. Thermal denaturation of the proteins was followed by measuring fluorescence emission at 575 nm (excitation 490 nm). Tm values were calculated using GraphPad Prism software and the Boltzmann equation as in [49].
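To illustrate how a melting temperature can be extracted from a thermal denaturation curve with the Boltzmann equation, the sketch below fits a Boltzmann sigmoid to synthetic data by a coarse least-squares grid search. It is an assumption-laden stand-in for the GraphPad Prism fit used in the study, not a reproduction of it: the parameter names, the grid ranges, the fixing of the plateaus to the data minimum and maximum, and the example curve are all hypothetical.

```python
import math


def boltzmann(temp, bottom, top, tm, slope):
    """Boltzmann sigmoid: signal as a function of temperature.

    The signal equals (bottom + top) / 2 when temp == tm.
    """
    return bottom + (top - bottom) / (1.0 + math.exp((tm - temp) / slope))


def fit_tm(temps, signal, tm_step=0.1, slopes=None):
    """Coarse least-squares grid search for Tm and the slope factor.

    Simplification: bottom and top are fixed to the minimum and maximum
    of the signal, whereas dedicated software fits all four parameters.
    """
    bottom, top = min(signal), max(signal)
    if slopes is None:
        slopes = [0.5 + 0.25 * i for i in range(19)]  # 0.5 to 5.0 degrees
    lo, hi = min(temps), max(temps)
    n_tm = int(round((hi - lo) / tm_step)) + 1
    best = None
    for i in range(n_tm):
        tm = lo + tm_step * i
        for slope in slopes:
            sse = sum(
                (boltzmann(t, bottom, top, tm, slope) - y) ** 2
                for t, y in zip(temps, signal)
            )
            if best is None or sse < best[0]:
                best = (sse, tm, slope)
    return best[1], best[2]


# Synthetic, noise-free curve (hypothetical values, not data from the study)
temps = [25.0 + i for i in range(51)]  # 25 to 75 degrees in 1-degree steps
signal = [boltzmann(t, 100.0, 1000.0, 48.0, 2.0) for t in temps]

tm, slope = fit_tm(temps, signal)
print(f"Tm = {tm:.1f}, slope = {slope:.2f}")
```

On real thermofluor traces the fluorescence usually drops again after the transition, so the fit would normally be restricted to the rising part of the curve before applying a model of this kind.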

In vitro RdRp assays on DV10 3′+ and DV10 3′−
Reactions were done in 50 mM HEPES buffer, pH 8.0, containing 10 mM KCl, 10 mM DTT and template, NS5PolDV, non-labeled NTPs, and catalytic ions at final concentrations as given in the figure legends. Radiolabeled [γ-32P]-ATP, [α-32P]-GTP, or [α-32P]-UTP was used at 0.4 μCi per μl reaction volume (3000 Ci/mmol, Perkin-Elmer). Reactions were started by addition of a mixture of HEPES buffer, KCl, catalytic ions, and UTP and CTP when used (given in the Figures). At given time points samples were taken and reactions were stopped by adding an equal volume of formamide/EDTA gel-loading buffer. Reaction products were separated using sequencing gels of 20% acrylamide-bisacrylamide (19:1), 7 M urea with TTE buffer (89 mM Tris pH 8.0, 28 mM taurine (2-aminoethanesulfonic acid), 0.5 mM EDTA). RNA product bands were visualized using photostimulated plates and the Fluorescent Image Analyzer FLA3000 (Fuji) and quantified using Image Gauge (Fuji). The oligoG marker was produced as explained in [14].

In vitro RdRp assays on minigenomic template
The minigenomic template was produced by in vitro transcription and tests carried out as described in [14].
Reactions analyzed by filter-binding and liquid scintillation counting contained 50 mM HEPES buffer, pH 8.0, 10 mM KCl, 10 mM DTT, 100 nM RNA template, 200 nM NS5PolDV, 500 μM NTP except for UTP (4 μM), [3H]-UTP at 0.2 μCi/μl and either 5 mM MgCl2 or 2 mM MnCl2. Reactions were started by the addition of a mixture of HEPES, KCl, catalytic ions, CTP, and UTP. After 30, 60, 90, and 120 min, 10-μl samples were taken and diluted into 50 μl of 100 mM EDTA, pH 8.0, to quench the reaction. Samples were then transferred onto a DEAE filter mat. Non-incorporated [3H]-UTP was removed by washing with 300 mM ammonium formate and the radioactively labeled product was quantified in counts per minute (cpm) using liquid scintillation counting. Product formation was then plotted against time and initial velocities were calculated in cpm/min.
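The text does not state how the initial velocities were derived from the time courses. One common approach, assumed here purely for illustration, is an ordinary least-squares line through the four time points, with the slope taken as the velocity in cpm/min. The function name and the cpm values below are hypothetical.

```python
def initial_velocity(times_min, cpm):
    """Least-squares slope (cpm/min) and intercept (cpm) of cpm versus time.

    Assumes product formation is linear over the sampled window.
    """
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(cpm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, cpm))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


# Sampling times from the text; cpm readings are made-up example values.
times = [30, 60, 90, 120]
cpm = [1500, 3100, 4400, 6000]

velocity, background = initial_velocity(times, cpm)
print(f"initial velocity = {velocity:.1f} cpm/min, intercept = {background:.1f} cpm")
```

If the later time points flatten out, for example because substrate is depleted, restricting the regression to the early points would give a better estimate of the initial rate.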

Modeling of the NS5PolDV initiation complex
A homology model of NS5PolDV serotype 2 strain New Guinea C was generated using the Swiss-model server [50] and the X-ray structure of NS5PolDV serotype 3 (PDB code 2J7W [16]). NS5PolDV and the RdRp of bacteriophage φ6 in complex with a template RNA strand and initiating NTPs (PDB code 1HI0) were then superimposed using the three catalytic aspartate residues of both proteins. The structural model of the initiation complex of NS5PolDV serotype 2 was then generated by changing the RNA template to UUCU (3′-end of the DV genome) and the initiating NTP to ATP, and by manually adapting the conformation of the priming loop using the UCSF Chimera software [51]. Subsequently, the computed free energy of the model was minimized using the same program.