Complex and simple translational readthrough signals in pea enation mosaic virus 1 and potato leafroll virus, respectively

Different essential viral proteins are translated via programmed stop codon readthrough. Pea enation mosaic virus 1 (PEMV1) and potato leafroll virus (PLRV) are related positive-sense RNA plant viruses in the family Solemoviridae, and are type members of the Enamovirus and Polerovirus genera, respectively. Both use translational readthrough to express a C-terminally extended minor capsid protein (CP), termed CP-readthrough domain (CP-RTD), from a viral subgenomic mRNA that is transcribed during infections. Limited incorporation of CP-RTD subunits into virus particles is essential for aphid transmission; however, the functional readthrough structures that mediate CP-RTD translation have not yet been defined. Through RNA solution structure probing, RNA secondary structure modeling, site-directed mutagenesis, and functional in vitro and in vivo analyses, we have investigated in detail the readthrough elements and complex structure involved in expression of CP-RTD in PEMV1, and assessed and deduced a comparatively simpler readthrough structure for PLRV. Collectively, this study has (i) generated the first higher-order RNA structural models for readthrough elements in an enamovirus and a polerovirus, (ii) revealed a stark contrast in the complexity of readthrough structures in these two related viruses, (iii) provided compelling experimental evidence for the strict requirement for long-distance RNA-RNA interactions in generating the active readthrough signals, (iv) uncovered what could be considered the most complex readthrough structure reported to date, that for PEMV1, and (v) proposed plausible assembly pathways for the formation of the elaborate PEMV1 and simple PLRV readthrough structures. These findings notably advance our understanding of this essential mode of gene expression in these agriculturally important plant viruses.

To date, only limited information is available about the regulation of readthrough-mediated translation of CP-RTD from subgenomic (sg) mRNAs in the Luteovirus, Polerovirus and Enamovirus genera [12,13]. The luteovirus barley yellow dwarf virus (BYDV) was the first virus shown to require RNA sequences both proximal (i.e. PRTE) and distal (i.e. DRTE) to its CP stop codon for efficient readthrough [12] and, subsequently, the polerovirus potato leafroll virus (PLRV) was shown to have similar requirements [13]. Despite the existence of complementarity between the PRTEs and DRTEs in these and other related viruses, no experimental evidence confirming the functional importance of such long-distance interactions has been reported, nor have there been any studies investigating the structural nature of the functional readthrough-promoting RNA signals.
Pea enation mosaic virus 1 (PEMV1) is an enamovirus with a 5.7 kb-long plus-strand RNA genome that contains a 5′ viral protein genome-linked (VPg) and no 3′ poly(A) tail (Fig 1A) [32]. The enamovirus protein coding scheme is very similar to that of poleroviruses, like PLRV, except that poleroviruses encode a few additional smaller proteins [18]. The PEMV1 genome codes for three 5′-proximally encoded non-structural proteins, p0, p1, and p1/2 (RdRp, expressed via frameshifting), all of which are translated from the genome (Fig 1A) [32]. Structural proteins, CP and its readthrough product CP-RTD, are expressed from a 1.8 kb-long sg mRNA that is transcribed during infections (Fig 1B). Similar to poleroviruses and luteoviruses, the sg mRNAs of enamoviruses were predicted to contain complementary readthrough-promoting PRTEs and DRTEs [13]. However, although the interactions proposed for PEMV1 were in the right general areas of the RTD (Fig 1B and 1C, red circles), the base-pairing partner sequences that were suggested were not correct [13], as revealed by results presented herein.
In this study we investigated, in detail, the CP readthrough signal in PEMV1's sg mRNA and determined that it adopts an elaborate RNA structure, which contrasts with the simple readthrough structure that was deduced for PLRV. We also confirmed the functional requirement for long-distance RNA-RNA interactions between the PRTEs and DRTEs in both PEMV1 and PLRV. Lastly, we propose putative folding pathways for formation of the complex PEMV1 and simple PLRV readthrough structures.

Secondary structure analysis of the PRTE and DRTE in PEMV1 sg mRNA
Prior to investigating the functional involvement of the PRTE and DRTE in regulating PEMV1 CP stop codon readthrough (Fig 1B and 1C), the local RNA secondary structures in these regions were analyzed via selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) [33]. SHAPE was conducted on in vitro synthesized transcripts of the full-length wild type (wt) PEMV1 sg mRNA and the reactivity data gathered (SHAPE reactivity correlates with flexibility of the corresponding nucleotide) were integrated into the RNAStructure folding program to predict the most probable secondary structure [34]. The results for the PRTE region revealed the presence of two small RNA stem-loop (SL) structures, termed SL1 and SL3, located downstream of the CP stop codon (Fig 2A, left). Interestingly, an alternative fold was also possible for the sequence between the stop codon and SL3, in which SL1 is replaced by a mutually-exclusive SL2 (Fig 2A, right). Notably, SL2 contains four cytidylate residues (red) in its terminal loop (Fig 2A, right), which are complementary to four guanylate residues (red) in the terminal loop of the SHAPE-predicted SL4 in the DRTE (Fig 2B). Consequently, the complementary terminal loops of SL2 and SL4 could potentially engage in a kissing-loop base-pairing interaction and nucleate contact between the PRTE and DRTE. Subsequent to this initial interaction, additional regions of complementarity, such as the identified orange-highlighted segments, would then be able to pair (Fig 2A and 2B).
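The complementarity argument for the SL2 and SL4 terminal loops can be illustrated with a minimal sketch. It assumes only the four-nucleotide red tracts named in the text (CCCC and GGGG); the function name `can_pair` and the decision to allow G-U wobble pairs alongside Watson-Crick pairs are illustrative choices, not part of the published analysis.

```python
# Base pairs permitted in this sketch: Watson-Crick plus G-U wobble (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(strand_5to3: str, partner_5to3: str) -> bool:
    """Return True if two equal-length RNA segments can form a fully
    paired antiparallel duplex (both sequences written 5' to 3')."""
    if len(strand_5to3) != len(partner_5to3):
        return False
    # Antiparallel pairing: the first base of one strand faces the last base of the other.
    return all((a, b) in PAIRS for a, b in zip(strand_5to3, reversed(partner_5to3)))


sl2_loop_tract = "CCCC"  # red tract in the SL2 terminal loop (from the text)
sl4_loop_tract = "GGGG"  # red tract in the SL4 terminal loop (from the text)

print(can_pair(sl2_loop_tract, sl4_loop_tract))  # True
```

A check of this kind only tests sequence complementarity; whether the loops are actually accessible for pairing depends on the local fold, which is what the SHAPE data address.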
The sequence in the PRTE between the CP stop codon and SL3 is highly conserved in enamoviruses. However, one genus member, citrus vein enation virus (CVEV), was found to differ significantly in this region [35]. In CVEV, a SL2 equivalent with a tandem base pair covariation (boxed) in the center of its stem was predicted; however, a corresponding SL1 could not be identified (Fig 2C). Additionally, CVEV has a comparable, but distinct, SL4 in its DRTE (described in a later section) that contains a complementary terminal loop sequence for CCCCA (red) in SL2 (Fig 2C). These comparative observations support the existence and proposed relevance of PEMV1's SL2 in mediating the initial union of PRTE and DRTE via a SL2/SL4 kissing-loop interaction.

PLOS PATHOGENS
Translational readthrough signals in PEMV-1 and PLRV

Fig 1. … (grey boxes) for p0, p1, p2, coat protein (CP) and coat protein-readthrough domain (CP-RTD). Proteins translated from the genome are shown beneath it as tan and green bars. P1/2 RdRp protein is expressed via programmed -1 frameshifting within the p1 ORF. Black arrow beneath the genome indicates the transcription initiation site for the subgenomic (sg) mRNA. The black square at the 5′-end of the genome represents the VPg. (B) PEMV1 sg mRNA encoding CP and CP-RTD. Corresponding translation products are indicated below as blue bars. CP-RTD is expressed via programmed readthrough of the CP UGA stop codon. Relative positions of the proposed readthrough-regulating proximal readthrough element (PRTE) and distal readthrough element (DRTE) are shown as red circles. (C) RNA secondary structure model of full-length PEMV1 sg mRNA, as predicted by RNAStructure using default settings [34] and rendered using RNA2Drawer [57]. Labelled are the 5′ and 3′ ends, PRTE, DRTE, CP stop codon, SL1, SL3, SL4 and the pink-orange intervening (POI) domain. The red circles on the folded structure correspond to the regions circled on the linear sg mRNA in panel B. The proposed long-distance RNA-mediated interaction between PRTE and DRTE is indicated by a red double-headed arrow, and spans approximately 700 nt.
https://doi.org/10.1371/journal.ppat.1010888.g001

Functional analysis of the CCCC/GGGG (red) interaction in PEMV1 sg mRNA
A wheat germ extract (wge) in vitro translation system was employed to assess modulation of PEMV1 CP readthrough by the identified RNA elements, starting with the red partner sequences (Fig 3A). To accurately assign the identity of translational products, the wt sg mRNA, a sg mRNA with the CP start codon inactivated by changing AUG to CAG (mutant sg1), and a sg mRNA with the UGA CP stop codon altered to glycine-coding GGA (mutant sg2) were tested. The results showed that both CP and CP-RTD were produced from wt sg mRNA, as confirmed by the absence of the former in the CP AUG knockout (sg1) and the increased levels of the latter in the CP UGA knockout (sg2) (Fig 3B). Two smaller minor products were also generated from wt sg mRNA (Fig 3B, denoted by X). These bands likely represent translational initiation at in-frame downstream AUGs in the CP open reading frame (ORF), because in mutant sg1 (CP AUG knockout) their accumulation increased and a corresponding smaller readthrough product(s), denoted by an arrowhead, appeared (Fig 3B).
Having established that the wge system yielded readily detectable amounts of CP-RTD, the proposed red CCCC/GGGG interaction between SL2 and SL4 was investigated (Fig 3A). Sets of compensatory substitutions were introduced individually at two different nucleotide positions in the complementary red sequences (Fig 3C and 3E). In vitro translation analysis of corresponding wt and mutant sg mRNAs showed that the relative readthrough levels correlated with base-pairing capacity between the terminal loop sequences (Fig 3D and 3F). That is, when base pairing was disrupted, relative readthrough levels dropped below 20% that of wt (Fig 3D, mutants sg4 and sg5; Fig 3F, mutants sg7 and sg8), while restoration of base pairing in compensatory mutants sg6 and sg9 rescued relative readthrough to wt levels (Fig 3D and 3F).
To determine if the results obtained from in vitro translation assays reflected activity in corresponding in vivo viral infections, an N-terminal triple-HA tag was introduced into the CP ORF in the full-length PEMV1 genome (creating gHA), thus allowing for immunological detection of CP and CP-RTD. The same red sequence-targeting mutations in sg mRNA mutants sg7, sg8 and sg9 (Fig 3E) were then introduced into the gHA genomic context, creating gHA7, gHA8, and gHA9, and the tagged viral genomes were transfected into pea protoplasts. Infections also included gHAns as a control, which was a mutant genome in which the CP stop codon was converted to a glycine sense codon (UGA → GGA). Northern blot analysis revealed that HA-tagged genomes and sg mRNAs accumulated to lower levels than their wt counterpart (Fig 3G), likely due to the tag interfering with virus packaging and/or other intracellular viral processes. However, the accumulation levels of the sg mRNAs in the tagged virus infections were reasonably comparable, and examination of corresponding relative readthrough levels revealed results that were consistent with those from in vitro assays (compare Fig 3H and 3F). Combined, these in vitro and in vivo findings provide compelling evidence for the requirement of the red CCCC/GGGG interaction for optimal readthrough and validate use of the wge system for further analysis.

Fig 3. … and (F) In vitro translation analyses of the sg mRNAs shown in panels C and E, respectively. Average relative readthrough (Rel. RT) levels (±SE) calculated from three independent trials are shown below each lane. (G) Northern blot analysis of total nucleic acids isolated from pea protoplasts transfected with wt and HA-tagged mutant PEMV1 genomes. gHA, gHA7, gHA8, gHA9 and gHAns each contain a triple HA tag inserted 6 amino acids from the CP N-terminus. Tagged genomic mutants gHA7, gHA8 and gHA9 contain the same compensatory mutations as shown in panel E, and genomic mutant gHAns has the same CP stop codon knockout substitution as mutant sg2 in panel B. Substitutions in the DRTE in gHA8 and gHA9 lead to an arginine to serine amino acid change in CP-RTD. Positions of the genome (g) and sg mRNA (sg) are shown on the left side of the blot. Average sg levels (±SE) were calculated from three independent trials and are displayed below each lane. An ethidium bromide-stained rRNA loading control is shown below the Northern blot. (H) Western blot analysis of total proteins extracted from the same pea protoplast infections as in panel G. Identities of the detected viral proteins are indicated on the left and averaged Rel. RT levels (±SE) from three independent trials are shown under each lane. Ponceau S-stained loading control of the blot is shown below.
https://doi.org/10.1371/journal.ppat.1010888.g003

Additional RNA elements are required for efficient readthrough
Formation of the long-distance red CCCC/GGGG interaction would position the identified complementary orange sequences in the PRTE and DRTE in close proximity (Fig 4A). In vitro translation of compensatory mutants targeting two different base pairs in the orange partner sequences in the sg mRNA (Fig 4B and 4D) supported functional base-pairing (Fig 4C and 4E). Notably, in both cases, only partial rescue of readthrough (~45-50% of wt) was observed for compensatory mutants sg56 and sg59 (Fig 4C and 4E). This lower level of rescue could be related to the substitutions in the orange sequence in the PRTE interfering with presentation of the red CCCC in the terminal loop of SL2, because the orange sequence forms the 5′ half of the stem in SL2 (Fig 4A). Regardless, the obtained results support an important role for the orange interaction in promoting readthrough efficiency.
The importance of SL3 (blue), localized within the PRTE region, was also assessed due to its proximity to the other functionally relevant PRTE sequences (i.e. orange and red) and its conservation among enamoviruses (Fig 5A). Regarding the latter point, four enamoviruses contain a U-to-C substitution in the stem of their SL3s that maintains pairing (Fig 5A), while CVEV contains a SL3 with multiple covariant base pairs (Fig 5B, boxes). Compensatory mutations in sg mRNAs were designed to simultaneously target three base pairs in the GC-rich stem of PEMV1's SL3 (Fig 5C) and results from wge assays indicated that stability of the stem contributes to CP stop codon readthrough (Fig 5D), albeit to a lesser degree than the associated long-distance interactions.

Formation of the two long-distance RNA-RNA interactions (red and orange) between the PRTE and DRTE would lead to an RNA structure with a large intervening sequence (659 nt) (Fig 5E). In this structure, the red helix would likely coaxially stack on the orange helix below, with the red helix separated from SL3 (blue) by an 8 nt long intervening sequence (pink) (Fig 5E). The location of this small linker sequence between two functionally important structures suggested that it too could be important for readthrough. Consequently, two separate single nucleotide substitutions were introduced into the intervening pink sequence (Fig 5E). Results from translational assays revealed that both substitutions had notable detrimental effects on relative readthrough levels (Fig 5F), confirming an important role for the pink linker sequence.

A third long-distance RNA-RNA interaction is required for readthrough
Like the orange and red sequences in the PRTE, we reasoned that the pink sequence (Fig 6A, top left) could also function by pairing with a complementary sequence. Potential base-pairing partner sequences for the PRTE's pink segment were initially sought close to the red and orange sequences in the DRTE. Although complementary sequences were identified nearby, none proved to be functionally relevant. A continued search ultimately identified a partially complementary 5 nt long sequence (pink) located some 170 nucleotides upstream from the orange segment in the DRTE (Fig 6A, bottom right). Compensatory mutagenesis of two different base pairs followed by translational analyses revealed a critical role for the pink PRTE-DRTE long-distance interaction in facilitating optimal CP readthrough (Fig 6B-6E). As with the orange interaction (Fig 4), the inability to recover full activity with restored pink pairing may be related to concurrent destabilization of the stem of SL2 and reduced presentation of the red CCCC (Fig 6A). Notably, although the 5 nt long pink sequence is located 170 nucleotides upstream from the orange and red in the DRTE, the intervening 170 nucleotides are predicted, in the context of the full-length wt sg mRNA (Fig 1C), to fold into a small RNA domain, herein termed the pink-orange intervening (POI) domain (Fig 6A, grey shading). Formation of the POI domain would colocalize the red, orange, and pink sub-elements of the DRTE (Fig 6A, bottom right), thereby facilitating their simultaneous interaction with their corresponding localized partner sequences in the PRTE. Collectively, the results show that optimal PEMV1 CP stop codon readthrough depends on three long-distance RNA-RNA interactions (red, orange, and pink) and a local stem-loop structure, SL3 (blue).
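The compensatory mutagenesis logic used for the red, orange, and pink interactions (disrupt one partner, disrupt the other, then combine both changes to restore pairing) can be sketched as follows. This is an illustration only: the helper names are hypothetical, strict Watson-Crick pairing is assumed, and the substituted position and base are placeholders rather than the substitutions actually used in the study. The CCCC/GGGG tracts are taken from the text.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def substitute(seq: str, pos: int, base: str) -> str:
    """Return seq with the nucleotide at index pos replaced by base."""
    return seq[:pos] + base + seq[pos + 1:]


def paired_positions(a_5to3: str, b_5to3: str) -> int:
    """Count Watson-Crick pairs in an antiparallel alignment of two equal-length segments."""
    return sum(COMPLEMENT[x] == y for x, y in zip(a_5to3, reversed(b_5to3)))


def compensatory_set(prte: str, drte: str, prte_pos: int, new_base: str) -> dict:
    """Build wt, two disruptive mutants, and the restoring double mutant for one base pair."""
    drte_pos = len(drte) - 1 - prte_pos  # antiparallel partner of prte_pos
    prte_mut = substitute(prte, prte_pos, new_base)
    drte_mut = substitute(drte, drte_pos, COMPLEMENT[new_base])
    return {
        "wt": (prte, drte),
        "disrupt_prte": (prte_mut, drte),
        "disrupt_drte": (prte, drte_mut),
        "restore": (prte_mut, drte_mut),
    }


# Hypothetical substitution at the second position of the red tract.
mutants = compensatory_set("CCCC", "GGGG", prte_pos=1, new_base="G")
for name, (prte_seq, drte_seq) in mutants.items():
    print(name, prte_seq, drte_seq, paired_positions(prte_seq, drte_seq))
```

In this scheme, reduced readthrough in both single mutants together with recovery in the double mutant is the pattern expected when pairing, rather than nucleotide identity, is the functional requirement.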

Role of DRTE's SL4 and a potential fourth long-distance interaction
Simultaneous base pairing between complementary red, orange, and pink sequences would collectively lead to the assembly of an extended quasi-continuous helix (Fig 7A), with the 170 nt long POI domain and a larger 482 nt long domain extending from the helical intersections. The junctions of the adjacent helices are likely stabilized via coaxial stacking, which for the blue-pink and pink-red helical joints could involve non-canonical base pairs forming above (AG, CA) and below (CC) the pink helix (Fig 7A). In this structure, SL4's stem could, as shown, remain paired while its loop interacts with its red partner sequence in the PRTE (Fig 7A). However, an alternative long-distance interaction (green) was noted in which the 5′-portion of SL4's stem could base-pair with a 6 nt complementary sequence immediately downstream from the CP stop codon (Fig 7A, green). Thus, the stem sequence in SL4 could first function locally in the DRTE to present the GGGG (red) sequence and subsequently participate in a fourth PRTE/DRTE (green) interaction. SL4's role in presenting GGGG (red) is strongly supported by comparative structural analysis among enamoviruses, which revealed covariation within the stem (alfalfa enamovirus-2, AEV-2; bird's-foot trefoil enamovirus, BFTV-2; and red clover enamovirus-1, RCEV-1) and alternative SL4 folds (Bean enamovirus-1, BEnV-1 and citrus vein enation virus, CVEV) (Fig 7B). Additionally, the analysis of PEMV1 sg mRNAs with compensatory mutations in the stem of SL4 (Fig 7C, right) confirmed the importance of pairing in its stem (Fig 7D, boxed lanes).

Comparative structural analysis of the potential long-distance green interaction revealed that for most enamoviruses (except CVEV) the green sequence in the PRTE is strictly conserved, with substitutions in partner green sequences in their DRTEs that generally maintained complementarity (nucleotides in green), or generated non-canonical GA or AG pairs (nucleotides in red) (Fig 7E). CVEV's green sequence in its PRTE contains two substitutions (boxed) compared to that of the other enamoviruses (Fig 7E), and collectively maintains a potential green interaction that could include GA and AG pairs [36-38]. Accordingly, the structural comparisons suggest the possibility of a fourth functionally relevant long-distance green PRTE/DRTE interaction. Indeed, if sterically feasible, the green interaction would extend the quasi-continuous helix at the base and presumably further enhance the structure's stability (Fig 7A). To address this possibility, compensatory mutations were introduced into the green partner sequences in sg mRNAs and tested in translational assays (Fig 7F and 7G). Disruptive mutants (sg22 and sg23) notably decreased readthrough, while the restorative mutant (sg24) caused further reduction (Fig 7G). Moreover, combining the green interaction-restoring changes in mutant sg24 with an additional substitution (Fig 7C, sg63) that simultaneously restored pairing in the stem of SL4 (thereby generating sg65) did not lead to recovery of readthrough (Fig 7D). Therefore, nucleotide identity within the PRTE's green sequence is important, but its role may be independent of pairing with the DRTE's complementary green sequence. Accordingly, while our results corroborate an important role for the stem of SL4 in presenting the red GGGG partner sequence in the DRTE, they do not support, but also do not conclusively preclude, its involvement in a fourth long-distance green PRTE/DRTE interaction.

PLRV readthrough signal involves a long-distance RNA-RNA interaction
A previous study identified a PRTE and DRTE in PLRV that were both shown to be essential for efficient CP-RTD production [13]. Although these sequences exhibited notable complementarity, efforts to experimentally demonstrate a PRTE/DRTE pairing requirement for readthrough were unsuccessful [13]. Due to PLRV's close relationship to PEMV1, we sought to assess the necessity for such pairing and deduce the RNA structure formed. Initially, the secondary structure of wt PLRV sg mRNA was modeled using the RNAStructure folding program [34]. Interestingly, within the full-length sg mRNA fold, the previously identified PRTE and DRTE sequences (orange) were predicted to be paired to each other at the base of a large RNA domain (Fig 8A). In the prior attempt to generate informative sg mRNA compensatory mutants, several nucleotides were targeted simultaneously for substitution [13]. We reasoned that this approach likely hindered important local folding in one or both regions and/or the modified PRTE or DRTE inadvertently bound to non-cognate partner sequences elsewhere in the sg mRNA. We therefore designed our compensatory mutations as single nucleotide changes that would disrupt the bottom of the proposed structure while minimally altering the partner sequences. This strategy would both destabilize the overall structure and alter the functionally important distance between the UAG and the base of the readthrough-promoting structure (Fig 8B and 8C). Also, contrary to prior reports [13,39], we were able to detect synthesis of a PLRV CP-RTD product using wge assays, as confirmed by its level increasing upon knockout of the CP stop codon in sg mRNA mutant PLns (Fig 8D). Using the wge system to test wt and mutant PLRV sg mRNAs, we observed that both sets of compensatory mutants yielded results consistent with base pairing of the orange sequences in the PRTE and DRTE being required for optimal CP-RTD production (Fig 8E and 8F). 
These results demonstrate that the previously proposed long-distance interaction in PLRV [13] is indeed essential for optimal readthrough of its CP stop codon.
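The relative readthrough (Rel. RT) values cited for the wge assays are reported as averages (±SE) from three independent trials. The exact quantification formula is not given in this section, so the sketch below rests on stated assumptions: readthrough in a lane is taken as CP-RTD / (CP + CP-RTD) from band intensities, each mutant lane is normalized to the wt lane of the same trial (wt = 100), and SE is the sample standard deviation divided by the square root of the number of trials. Function names and all intensity values are hypothetical placeholders, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def readthrough_fraction(cp: float, cp_rtd: float) -> float:
    """Fraction of translation products that are readthrough product (assumed definition)."""
    return cp_rtd / (cp + cp_rtd)


def relative_readthrough(mutant_trials, wt_trials):
    """Return (mean, SE) of mutant readthrough as a percentage of wt.

    Each trial is a (CP, CP-RTD) pair of band intensities; trials are paired by index.
    """
    rel = [
        100 * (readthrough_fraction(*m) / readthrough_fraction(*w))
        for m, w in zip(mutant_trials, wt_trials)
    ]
    se = stdev(rel) / sqrt(len(rel))
    return mean(rel), se


# Placeholder band intensities for three trials (arbitrary units, illustrative only).
wt = [(90.0, 10.0), (88.0, 12.0), (91.0, 9.0)]
mutant = [(95.0, 1.5), (93.0, 2.0), (96.0, 1.2)]

avg, se = relative_readthrough(mutant, wt)
print(f"Rel. RT = {avg:.1f} ± {se:.1f} (% of wt)")
```

Normalizing within each trial before averaging keeps trial-to-trial differences in overall translation efficiency from inflating the reported error.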

Discussion
Survival of PEMV1 and PLRV depends on aphid-mediated host-to-host transmission, which is conferred by their CP-RTD minor capsid proteins generated via programmed ribosome readthrough [40]. In this study we performed a detailed investigation of the regulation of CP-RTD production in PEMV1 and developed an elaborate multi-helix model for the readthrough structure. In contrast, our assessment of the PLRV readthrough signal indicated a simple single-helix RNA structure. Below, different readthrough structures are discussed, the PEMV1 and PLRV readthrough structures are compared, and hypothetical models for the assembly of PEMV1 and PLRV readthrough signals are proposed.

Long-distance readthrough structures in other viruses
Programmed stop codon readthrough is commonly used by RNA viruses to produce their RdRps or minor CPs [1]. In some cases, readthrough-stimulating signals are localized immediately downstream from corresponding stop codons. Murine leukemia retrovirus relies on a compact RNA pseudoknot structure situated 8 nt downstream from its gag stop codon for pol translation [41], while in tobacco mosaic virus a 6 nt-long linear sequence directly after the stop codon promotes readthrough production of its RdRp [9]. In other viruses, bipartite readthrough signals, separated by intervening sequences, are employed. For example, alphaviruses utilize a simple helical readthrough structure, similar to that in PLRV (Fig 9A), for production of their RdRps [5]. However, although comparable with respect to their basic stem structures, the intervening sequences in alphaviruses are considerably shorter than that in PLRV (i.e. 100-150 nt versus ~670 nt, respectively).
Arguably the best studied viruses employing long-distance interactions for readthrough are members of genera in the family Tombusviridae (Tombusvirus, Betanecrovirus and Alphacarmovirus), all of which use long-distance RNA-RNA base pairing (spanning kilobases) to mediate readthrough expression of their RdRps [6-8]. In contrast to readthrough in the sg mRNAs in PEMV1 and PLRV, readthrough in tombusvirids occurs in the full-length viral genomes, with corresponding DRTEs located in their genomic 3′ UTRs. This placement coincides with genomic replication elements, allowing for potential crosstalk between the two processes. For instance, the DRTE of the tombusvirus carnation Italian ringspot virus is integrated with a genome replication element in the genomic 3′ UTR and, importantly, the functional structures of the DRTE and replication element are mutually-exclusive RNA conformations. This overlapping arrangement acts as an RNA switch that dictates whether genomic minus-strand synthesis or translational readthrough proceeds, thereby coordinating these two opposing processes [6]. In contrast, the DRTEs for PEMV1 and PLRV are positioned centrally in the coding regions of their RTDs (Fig 1B). Thus, although possible, the remote locations of these DRTEs are likely not related to regulation of other sg mRNA processes. Instead, their positions are more likely the consequence of random but productive (for readthrough) initial long-distance interactions, which were maintained and further optimized.
The PRTEs of some tombusvirids can assume alternate structures or have flexible adjacent structures important for readthrough efficiency. The PRTE of the alphacarmovirus turnip crinkle virus can adopt two alternative structures, one which is nonfunctional and the other that is functional [8]. In tobacco necrosis betanecrovirus, a downstream PRTE-adjacent structure that influences readthrough efficiency has both active and inactive conformations [42]. Alternative RNA conformations such as these provide additional avenues for regulating readthrough, and illustrate the importance of considering local context and structural flexibility when investigating regulatory RNA elements. Indeed, as alluded to earlier (Fig 2A), alternative local conformations are also likely relevant in PEMV1's PRTE.
For both betanecroviruses and tombusviruses, in addition to their PRTE/DRTE interactions, efficient RdRp readthrough expression requires an extra long-distance RNA-RNA interaction, termed the upstream linker/downstream linker (UL/DL) interaction [6,43], which is also essential for viral genome replication [44]. Accordingly, these viruses employ two distinct long-distance interactions for readthrough, one involved in forming the readthrough structure (PRTE/DRTE) and another that serves an essential accessory role (UL/DL). Since the UL/DL interactions reside within the ~3 kb intervening sequence between the PRTE and DRTE partner sequences, it was proposed that they likely function to help unite the PRTE and DRTE [6,43]. In this regard, the possibility of intervening sequence assisting in the formation of the PRTE/DRTE interaction in PLRV is discussed in the next section.

The PEMV1 readthrough structure versus PLRV's
Enamovirus, Luteovirus, and Polerovirus genera are related based on amino acid conservation of their CP and CP-RTD [32,45]. Members of these genera are also predicted to contain bipartite readthrough regulatory signals separated by ~600 to ~800 nt [12,13]. Notably, they all have the same relative positioning of their PRTEs and DRTEs in the CP-RTD coding region [13]. This suggests that CP/CP-RTD coding and associated readthrough signal were adopted by an enamo/polerovirus common ancestor prior to its divergence into two distinct genera, while a recombination event introduced the 3′-proximal structural gene cassette into luteoviruses, which contain tombus-like polymerases [46-48]. Despite their distinct evolutionary histories, these genera have maintained commonalities in their strategies for mediating readthrough.
Of the three genera, poleroviruses and enamoviruses are most similar [16]. Yet a comparison of the prototype species, PLRV and PEMV1, revealed clear differences in their approach to inducing readthrough (Fig 9A and 9B). PLRV's CP ORF and those of all known poleroviruses terminate with a UAG stop codon, while PEMV1 and all known enamoviruses (except for CVEV) use UGA. Proteomic analysis of PLRV's CP-RTD revealed that the UAG is decoded 89% of the time by tRNA(Gln) [13]. The tRNA responsible for decoding PEMV1's UGA is currently unknown. Corresponding PRTEs and DRTEs in PLRV and PEMV1 do not share any noteworthy sequence identity (Fig 9A and 9B). Dissimilarity also extends to the predicted local RNA secondary structures at these two locations. For PLRV, prior solution structure probing and mutational analyses [13] determined that the orange DRTE sequence involved in forming the readthrough structure resides in a local stem-loop structure, with most of the orange nucleotides paired (Fig 9C, right). The local structure in the PRTE region was not investigated [13], but thermodynamic predictions suggest that this segment likely includes a small RNA stem-loop that sequesters most of the PRTE's orange sequence (Fig 9C, left). Based on these predictions, the PRTE and DRTE regions do not adopt conformations that would effectively nucleate PLRV's orange interaction. This suggests that PLRV uses a different strategy for uniting these sequences, and secondary structure predictions of the full-length PLRV sg mRNA indicate that this could be accomplished through global folding, where PRTE and DRTE form the closing ends of a large RNA domain (Fig 8A). That is, the folding of subdomains within the large domain would act to bring the partner sequences together. In contrast, folding predictions for PEMV1 sg mRNA indicate that the PRTE and DRTE are located in different RNA domains (Fig 1C), thus a unification mechanism akin to that suggested for PLRV would be less likely. Accordingly, the differences in sequence and predicted RNA structures for PLRV and PEMV1 indicate that the former likely mediates formation of its readthrough structure primarily through the folding of an independent RNA domain, while the latter initiates readthrough structure formation by stochastic nucleation of key partner sequences (i.e. red) located in different RNA domains (see next section for details).
The proposed readthrough signal for PLRV, a contiguous helix, is relatively simple (Fig 9A). In comparison, the PEMV1 readthrough structure is considerably more complex, consisting of a quasi-contiguous helix stabilized by coaxial stacking at stem junctions and assembled via multiple long-distance interactions involving different regions (Fig 9B). Though these structures differ greatly, they are both able to direct production of the requisite amounts of CP-RTD. It is intriguing that two closely related viruses have found such radically different structural solutions for readthrough. These differences are presumably the consequence of repeated sequential sampling of distinct structural variants, resulting in maintenance of those that adequately addressed functional requirements. The net result is that these viruses have evolved via divergent pathways to give rise to secondary structures of vastly contrasting complexity. Considering these extreme examples, and the predicted variability of PRTE/DRTE interactions [13], we anticipate the existence of a range of readthrough structures with different levels of complexity within the expansive and diverse polerovirus and luteovirus genera [49].

An assembly model for PEMV1 readthrough structure
SHAPE data indicated that the default structure of the PRTE comprises SL1 and SL3 (Fig 2A, left). Importantly, although the orange sequence is predicted by SHAPE to be single stranded in the loop of SL1 (Fig 2A), its orange partner sequence in the DRTE is predicted to be paired (i.e. low SHAPE reactivity) and thus unavailable for pairing (Fig 2B). The latter interpretation is supported by the prediction that, in the context of the full-length sg mRNA, the orange sequence in the DRTE is paired with the DRTE's pink sequence (Fig 6A, bottom right). Accordingly, SL1 would be limited in its ability to nucleate the PRTE/DRTE interaction via an orange pairing interaction. In the alternative PRTE fold, where SL2 forms and presents the red sequence in its loop, the pink and orange sequences are paired in its stem (Fig 6A, top left) and thus would not be available for long-distance base-pairing with their partner sequences in the DRTE, which are also predicted to be paired and unavailable (Fig 6A, bottom right). Consequently, the predicted local structural contexts in the alternatively folded PRTE and the DRTE would favor the red CCCC/GGGG kissing-loop interaction, and concurrently impede the orange and pink interactions (Fig 6A).
Based on our experimental results, we propose a theoretical model for the assembly of the PEMV1 readthrough structure (Fig 10). In the PRTE, the red CCCC sequence is initially paired in the stem of SL1 and is therefore not available for pairing with its accessible red GGGG partner sequence in SL4 of the DRTE (Fig 10A). However, the unfolding of SL1 by the helicase activity of terminating ribosomes [50] would facilitate an SL1-to-SL2 conversion (Fig 10A).
Refolding of the PRTE sequence into the alternative CCCC-presenting SL2 (Fig 10B, i) would then allow for a red CCCC/GGGG kissing-loop interaction with SL4 in the DRTE (Fig 10B, ii). In this model, SL1 acts as an attenuator of readthrough structure formation in the absence of CP translation and presumably contributes to the regulation of readthrough levels. Following the red-mediated nucleation of the interaction, additional secondary interactions, such as the orange (Fig 10C) or the pink pairing, would form in turn and lead to the assembly of an active readthrough structure (Fig 10D).
Not depicted in Fig 10D is the potential formation of an additional interaction involving the green partner sequences in the PRTE and DRTE. This pairing would extend the helical region at the base and could help to stabilize the structure via coaxial stacking with the orange helix (Fig 10D). However, this green interaction would need to be temporary and disengage during ribosome readthrough, so as to allow for the necessary spacer distance (~7-9 nt) between the stop codon and the base of the readthrough structure [1]. With or without the involvement of this latter interaction, the active RNA structure, postulated to be that depicted in Fig 10D, would then be able to efficiently trigger CP stop codon readthrough, presumably by increasing utilization of near-cognate tRNAs or decreasing recruitment of release factors by an unknown mechanism [1]. Active translation of the RTD coding region would cause disruption of PRTE/DRTE interactions and their local RNA structures. Accordingly, for subsequent rounds of readthrough to occur, ribosome-mediated conversion of SL1 to SL2 would again be required to initiate assembly of an active readthrough structure (Fig 10A and 10B). The readthrough structure folding process described could also involve other protein factors, such as RNA chaperones and/or RNA helicases.

Conclusion
This study has provided the first higher-order RNA models for readthrough structures in the Enamovirus and Polerovirus genera. Compelling experimental evidence demonstrating the importance of long-distance RNA-RNA interactions in the formation of these structures was also presented. The proposed structure for PEMV1 is arguably the most elaborate readthrough signal reported to date, and its suggested folding pathway, together with that for PLRV, provides new insights into readthrough structure assembly. Collectively, these findings significantly advance our understanding of the strategies used by viruses to mediate the production of essential readthrough proteins.

cDNA preparation
Standard PCR-based site-directed mutagenesis was used to introduce nucleotide substitutions in different parts of the full-length PEMV1 genome (GenBank: NC_003629.1) and PEMV1 sg mRNA. Cloned cDNA of the full-length PEMV1 genome [32,45] (kindly provided by W. Allen Miller, Iowa State University) was used to create genomic mutants, as well as wt PEMV1 sg mRNA and its mutant derivatives. All viral mutants utilized in this study were sequenced to confirm that only the intended modifications were present.
The full-length PEMV1 genome construct gHA contained three tandem HA-tag sequences (UACCCAUACGAUGUUCCAGAUUACGCU) introduced at the N-terminal region of the CP ORF (genome coordinates 4015-4095, immediately downstream of the first 6 codons of CP). gHA was then used as a backbone for inserting PRTE-DRTE compensatory nucleotide substitutions, thereby creating gHA7, gHA8 and gHA9.
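As a quick consistency check on the tag insertion described above, the following minimal sketch translates the quoted sequence and compares the expected insert length with the stated coordinate range. It assumes that the quoted 27-nt sequence represents a single HA unit and that the genome coordinates are inclusive; the names `CODON_TABLE`, `HA_TAG_RNA` and `translate` are illustrative only and are not taken from the study.

```python
# Subset of the standard genetic code covering only the codons in the HA tag.
CODON_TABLE = {"UAC": "Y", "CCA": "P", "GAU": "D", "GUU": "V", "GCU": "A"}

# Sequence quoted in the text (assumed here to be one HA unit).
HA_TAG_RNA = "UACCCAUACGAUGUUCCAGAUUACGCU"


def translate(rna):
    """Translate an RNA string codon by codon using CODON_TABLE."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


ha_peptide = translate(HA_TAG_RNA)

# Three tandem copies of the tag, in nucleotides.
insert_length = 3 * len(HA_TAG_RNA)

# Length of the stated coordinate range, counting both end positions.
coordinate_span = 4095 - 4015 + 1

print(ha_peptide, insert_length, coordinate_span)
```

Under these assumptions, the three tandem copies account for the full coordinate range given for the insertion.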
Mutants constructed to investigate CP-RTD production from the PLRV sg mRNA [51] were derived from PLRV genome cDNA (GenBank: KP090166.1) that was kindly provided by Michelle Heck (Cornell University).

Synthesis of viral RNAs in vitro
All of the PEMV1 genome and sg mRNA constructs investigated in this study contained a T7 promoter at the 5′-end of the viral sequence and a unique PstI restriction enzyme cut site at its 3′-end. PstI-linearized wt and mutant clones were treated with T4 DNA polymerase (NEB) to remove the 3′-overhang left after PstI cleavage and then were transcribed in vitro using AmpliCap-Max T7 High Yield Message Maker Kit (Cellscript) to create 5′-capped sg RNAs and MessageMax T7 ARCA-Capped Message Transcription Kit (Cellscript) to create 5′-capped genomic RNAs, both with authentic viral 3′ ends.
The PLRV sg mRNA constructs utilized in this study contained a T7 promoter at the 5′-end of the viral sequence and a unique 3′-terminal ScaI restriction enzyme cut site. ScaI-linearized wt and mutant cDNAs were transcribed in vitro using AmpliCap-Max T7 High Yield Message Maker Kit (Cellscript) to create 5′-capped sg mRNAs with authentic viral 3′ ends.

In vitro translation assays
To test readthrough levels of CP-RTD, 0.5 pmol of 5′-capped transcripts of wt or mutant PEMV1 sg mRNAs (sub-saturating levels) were incubated in wheat germ extract (wge, Promega) in the presence of [35S]-methionine at 25°C for 1 hr according to the manufacturer's instructions, except that the concentration of KOAc was increased to 133 mM for each reaction to optimize translation and readthrough efficiency. The viral proteins translated during the incubation were detected and quantified through 12% SDS-PAGE and phosphorimaging, respectively [52,53]. Imaging was carried out using a Typhoon FLA 9500 Variable Mode Imager (GE Healthcare). QuantityOne software (BioRad) was used to quantify protein bands, from which ratios of the readthrough product CP-RTD to the pre-readthrough product CP were calculated for each tested mRNA. Percentages of the mutant ratios relative to the wt ratio were determined and used as relative readthrough levels (Rel. RT). Three independent repeats were carried out for each in vitro translation experiment, and means with standard errors (SE) were calculated.
The same steps were followed as above for obtaining readthrough levels of CP-RTD from PLRV sg mRNAs in vitro, except that 0.4 pmol of 5′-capped transcripts (sub-saturating levels) was used per in vitro translation reaction.
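The Rel. RT calculation described above can be sketched as follows. This is an illustrative sketch only: the band intensities are hypothetical, each mutant ratio is assumed to be compared with the wt ratio from the same repeat, and SE is assumed to be the sample standard deviation divided by the square root of the number of repeats. None of these details, nor the function names, are specified in the text.

```python
from math import sqrt
from statistics import mean, stdev


def readthrough_ratio(cp_rtd, cp):
    """Ratio of readthrough product (CP-RTD) to pre-readthrough product (CP)."""
    return cp_rtd / cp


def relative_readthrough(mutant, wt):
    """Mutant ratio as a percentage of the wt ratio; inputs are (cp_rtd, cp)."""
    return 100.0 * readthrough_ratio(*mutant) / readthrough_ratio(*wt)


def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of a list of values."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical band intensities (arbitrary units) from three repeats.
repeats = [
    {"wt": (10.0, 1000.0), "mut": (5.0, 1000.0)},
    {"wt": (12.0, 1000.0), "mut": (6.6, 1100.0)},
    {"wt": (8.0, 800.0), "mut": (4.0, 1000.0)},
]

rel_rt = [relative_readthrough(r["mut"], r["wt"]) for r in repeats]
rel_rt_mean, rel_rt_se = mean_and_se(rel_rt)
print(f"Rel. RT = {rel_rt_mean:.1f} +/- {rel_rt_se:.1f} % of wt")
```

Expressing each mutant as a percentage of wt within the same repeat would reduce the influence of gel-to-gel variation in absolute signal, which is one plausible reason for reporting relative rather than absolute readthrough values.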

Pea protoplast transfection
Pea protoplasts were isolated from 12-day-old, fully expanded Pisum sativum leaves by first removing the lower epidermis and then incubating the remaining tissue in a cellulase mixture at 26°C for 4 hours [54,55]. Two million protoplasts were transfected with 20 μg of 5′-capped PEMV1 transcripts using polyethylene glycol (PEG 1450) and CaCl2 and incubated at 22°C for 40 hours under constant fluorescent light [55]. After the incubation, one half of each infection was used for total protein isolation and western blotting and the other half for total nucleic acid extraction and Northern blotting.

Western blotting
Total proteins were separated by 12% SDS-PAGE and transferred to a PVDF membrane (Amersham Hybond P 0.45). Ponceau S staining was carried out to visualize total proteins and confirm equal loading and transfer prior to proceeding with blotting. HA-tagged CP and CP-RTD were detected by blotting with Anti-HA-peroxidase high affinity (3F10) rat monoclonal antibodies (Roche) at 1:2000 dilution. CP and CP-RTD bands were detected using ECL Select western blotting detection reagent (GE Healthcare) and captured with a MicroChemi imager (DNR Bio-Imaging Systems). Detected viral protein bands were quantified using QuantityOne software. Three independent repeats of pea protoplast infections/western blotting were carried out and means with SE were calculated. Rel. RT levels were calculated as described for in vitro translation assays.

SHAPE RNA structure analysis
Selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) was performed and the data were used to model the RNA secondary structures of PRTE and DRTE regions in full-length PEMV1 sg mRNA, as described previously [33,6,53]. SHAPE was carried out using 1-methyl-7-nitroisatoic anhydride (1M7), which modifies flexible (i.e. single stranded) nucleotides. Two primers, fluorescently labeled at their 5′-ends, one complementary to a region downstream from the PRTE (genome coordinates 4828-4857) and the other to a region downstream from the DRTE (genome coordinates 5504-5533), were used for primer extension reactions following 1M7 treatment of wt PEMV1 sg mRNA. After fluorescent capillary electrophoresis of the products of primer extension, the raw data were analyzed using the ShapeFinder software [56] to generate relative reactivities for each nucleotide. These reactivity values were normalized against the ten highest reactivities in the pool. The SHAPE experiment was performed twice, with consistent results, and averaged values of the two repeats were used for secondary structure prediction. The RNAStructure web server was used [34] to combine SHAPE reactivity data (slope = 1.8 kcal/mol; intercept = -0.6 kcal/mol) with thermodynamic prediction to generate secondary structure models of PEMV1 PRTE and DRTE in the sg mRNA context. RNA2Drawer software was utilized to draw RNA secondary structure models depicted throughout the paper [57].
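To illustrate how the stated slope and intercept enter SHAPE-directed folding, the sketch below applies the commonly used pseudo-free energy term, ΔG_SHAPE(i) = m ln[reactivity(i) + 1] + b, to normalized reactivities. Several points are assumptions rather than details from the text: normalization is taken to mean division by the mean of the ten highest values, negative reactivities are clipped to zero for simplicity, and the example reactivities and function names are hypothetical.

```python
import math

SLOPE = 1.8       # m, kcal/mol (value stated in the text)
INTERCEPT = -0.6  # b, kcal/mol (value stated in the text)


def normalize(reactivities, top_n=10):
    """Scale reactivities by the mean of the top_n highest values (assumed scheme)."""
    top = sorted(reactivities, reverse=True)[:top_n]
    scale = sum(top) / len(top)
    return [r / scale for r in reactivities]


def average_repeats(rep1, rep2):
    """Per-nucleotide average of two repeat experiments."""
    return [(a + b) / 2.0 for a, b in zip(rep1, rep2)]


def pseudo_energy(reactivity):
    """Pseudo-free energy term m*ln(reactivity + 1) + b for a paired nucleotide.

    Negative reactivities are clipped to zero (a simplification for this sketch).
    """
    return SLOPE * math.log(max(reactivity, 0.0) + 1.0) + INTERCEPT


# Hypothetical raw reactivities for twelve nucleotides in two repeats.
rep1 = [0.0, 0.5, 1.0, 2.0] * 3
rep2 = [0.1, 0.4, 1.1, 1.9] * 3

averaged = average_repeats(normalize(rep1), normalize(rep2))
for position, value in enumerate(averaged, start=1):
    print(position, round(value, 3), round(pseudo_energy(value), 3))
```

With these parameters, an unreactive nucleotide (reactivity 0) receives a bonus of -0.6 kcal/mol for being paired, whereas higher reactivities yield increasingly positive terms that penalize pairing at that position.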