Flexibility and modulation of translation initiation in enterovirus genomes

  • Rhian L. O'Connor,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

  • Georgia M. Cook,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Resources, Writing – review & editing

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

  • Jacqueline Hankinson,

    Roles Formal analysis, Investigation, Writing – review & editing

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

  • Ksenia Fominykh,

    Roles Investigation

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

  • Samantha H. Cheng,

    Roles Investigation

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

  • Daniel A. Nash,

    Roles Methodology, Resources

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

  • Aurélie Cenier,

    Roles Methodology, Resources

    Affiliations Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom, Department of Paediatrics, University of Cambridge, Cambridge, United Kingdom

  • Komal M. Nayak,

    Roles Methodology, Resources

    Affiliations Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom, Department of Paediatrics, University of Cambridge, Cambridge, United Kingdom

  • Stephen C. Graham,

    Roles Funding acquisition, Methodology, Resources, Supervision, Writing – review & editing

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

  • Janet E. Deane,

    Roles Funding acquisition, Methodology, Resources, Supervision, Writing – review & editing

    Affiliation Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom

  • Matthias Zilbauer,

    Roles Funding acquisition, Methodology, Resources, Supervision

    Affiliations Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom, Department of Paediatrics, University of Cambridge, Cambridge, United Kingdom

  • Andrew E. Firth,

    Roles Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Visualization, Writing – review & editing

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

  • Valeria Lulla

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    vl284@cam.ac.uk

    Affiliation Department of Pathology, University of Cambridge, Cambridge, United Kingdom

Abstract

Enteroviruses comprise a large group of mammalian pathogens that often utilize two open reading frames (ORFs) to encode their proteins: the upstream protein (UP) and the main polyprotein. In some enteroviruses, in addition to the canonical upstream AUG (uAUG), there is another AUG that may represent an alternative upstream initiation site. An analysis of enterovirus sequences containing additional upstream AUGs identified several clusters, including strains of pathogenic Enterovirus alphacoxsackie and E. coxsackiepol. Using ribosome profiling on coxsackievirus CVA13 (E. coxsackiepol), we demonstrate that both upstream AUG codons can be used for translation initiation in infected cells. Moreover, we confirm translation from both upstream AUGs using a reporter system. Mutating the additional upstream AUG in the context of CVA13 did not result in phenotypic changes in immortalized cell lines. However, the wild-type virus outcompeted this mutant in human intestinal organoids and differentiated neuronal systems, representing an advantage in physiologically relevant infection sites. Mutation of the stop codon of the shorter upstream ORF led to dysregulated translation of the other ORFs in the reporter system, suggesting a potential role for the additional uORF in modulating the expression level of the other ORFs. Additionally, we demonstrate regulation of uORF translation in response to stress. These findings reveal the remarkable plasticity of enterovirus IRES-mediated initiation and the competitive advantage of double-upstream-AUG-containing viruses in terminally differentiated intestinal organoids and neuronal systems.

Author summary

Enteroviruses cause over a billion infections annually, leading to serious conditions such as hand, foot, and mouth disease and viral meningitis. While these viruses are known to encode proteins through a single polyprotein and an upstream open reading frame (uORF), the role of additional upstream AUG codons has remained unclear. In this study, we investigated two upstream AUGs (uAUG and uuAUG) in the coxsackievirus CVA13 genome using ribosome profiling and reporter assays. Our results show that both AUGs can initiate translation, with the uuAUG providing a competitive advantage in physiologically relevant models, including human intestinal organoids and neuronal cultures. These findings reveal a remarkable genome plasticity and advance our understanding of how multiple upstream AUGs modulate viral protein expression.

Introduction

The Enterovirus genus belongs to the Picornaviridae family and consists of 13 species and more than 70 serotypes. Human enteroviruses are widespread respiratory and intestinal pathogens and are classified into four species: Enterovirus alphacoxsackie, E. betacoxsackie, E. coxsackiepol and E. deconjuncti (formerly Enterovirus A to D). Virus infections usually begin in peripheral tissues and can progress to the nervous system. Disease phenotype in humans ranges from sub-clinical to acute flaccid paralysis, myocarditis, and meningitis [1–4]. Enteroviruses are non-enveloped viruses with single-stranded positive-sense RNA genomes of ~7.4 kb (Fig 1A). The genome is flanked by 5′ and 3′ untranslated regions (UTRs). The 3′ UTR is polyadenylated, and the 5′ end of the genome is highly structured, containing an internal ribosome entry site (IRES) and a 5′-covalently bound viral protein, VPg (also known as 3B) [2]. All enteroviruses encode a large polyprotein in a single long open reading frame (ppORF). Human E. alphacoxsackie, E. betacoxsackie, and around half of E. coxsackiepol genomes contain an upstream open reading frame (uORF) that partially overlaps the IRES and the ppORF and encodes a small upstream protein (UP) that promotes virus infection in gut epithelial cells [5]. Where present, the translation of the uORF produces a peptide of 56–76 amino acid (aa) residues, with a molecular mass of 6.5–9.0 kDa and isoelectric point (pI) of 8.5–11.2 [5,6].

Fig 1. Analysis of AUG triplets in regions flanking the enterovirus SL-VI AUG.

(A) Schematic representation of the enterovirus genome with indicated features: 5′ and 3′ untranslated regions (UTRs), internal ribosome entry site (IRES), and stem-loop VI (SL-VI). Below are the most frequent non-SL-VI AUG patterns in the 20 nt 5′ and 20 nt 3′ of the SL-VI AUG. AUG configurations present in ≥5 of the 9347 sequences analyzed are shown; see S1–S3 Figs for the complete analysis. (B) Histograms of SL-VI AUG and non-SL-VI AUG ORF lengths. Histogram bins are in increments of 5 codons from 1–5 up to 136–140; the last two bins are for lengths 141–2000 and >2000 codons; ORF lengths in the last bin correspond to cases where the upstream-AUG-initiated ORF is contiguous with the ppORF (often rhinoviruses). ORF lengths are counted for each relevant AUG even if there are multiple in-frame AUGs.

https://doi.org/10.1371/journal.ppat.1013967.g001

The 5′ UTR contains six structured domains I-VI. Domain I has a role in directing negative-strand RNA synthesis, whereas domains II-VI form a type I IRES that facilitates cap-independent ribosomal recruitment [7]. Domain IV is essential in 43S ribosomal preinitiation complex assembly and polyprotein synthesis [8]. The uORF AUG (uAUG) is situated in domain VI of the IRES (Fig 1A) [5]. Mutating the uAUG in the virus context results in a non-viable virus since ppORF translation is entirely IRES-dependent and impossible without the SL-VI uAUG that facilitates initiation complex pre-assembly [9]. This translational co-dependence complicates investigation of how the expression of these two ORFs (ppORF and uORF) is regulated. Using UP knockout (KO) viruses, ribosome profiling, and IRES-containing reporter systems, we previously demonstrated translation of both ORFs during virus infection, overturning the previous dogma that enteroviruses encode a single polyprotein [5].

Here, we identify an interesting feature whereby several enteroviruses contain two upstream AUG codons, both of which are utilized for translation, thus further expanding the translational potential of enteroviruses. We demonstrate the modulation of translation initiation and reading frame usage through the additional upstream AUG. We use several computational and experimental techniques to understand the evolutionary conservation of additional upstream AUGs and their translational potential, regulation, and competitive advantage in terminally differentiated intestinal organoids and neuronal cultures.

Results

Analysis of enterovirus sequences reveals additional upstream AUG codons in the vicinity of the IRES SL-VI AUG

A total of 9347 enterovirus sequences were analyzed for the presence of additional upstream AUGs. Using BLASTCLUST, sequences were grouped into 41 clusters based on an 80% identity threshold in the polyprotein amino acid sequence (see Methods). We found over 3000 sequences to have one or more additional AUGs within 20 nucleotides upstream or downstream of the SL-VI uAUG (Figs 1A and S1–S3). Most of these additional AUG codons were found upstream rather than downstream of the SL-VI AUG (Fig 1A), and additional AUGs were present in both rhinoviruses and enteroviruses (S3 Fig). However, the majority of sequences with an additional upstream AUG fell into the “enterovirus A”, “enterovirus C” and “enterovirus D” clusters (S1 Fig), and in fact, these represented the majority of all sequences for the “enterovirus C” and “enterovirus D” clusters (S3 Fig). We define these additional non-SL-VI AUGs as “alternative upstream AUGs”.
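The window search described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the function name `find_additional_augs`, the 0-based coordinates, and the requirement that a candidate AUG lie entirely within the 20 nt flanks are assumptions made here for clarity.

```python
def find_additional_augs(seq, slvi_aug, window=20):
    """Return offsets (nt, relative to the SL-VI AUG) of other nearby AUGs.

    seq      : genome sequence (RNA or cDNA letters)
    slvi_aug : 0-based index of the A of the SL-VI AUG (assumed to be known)
    window   : flank size in nt on each side of the SL-VI AUG
    """
    rna = seq.upper().replace("T", "U")
    lo = max(0, slvi_aug - window)             # first nt of the 5' flank
    hi = min(len(rna), slvi_aug + 3 + window)  # one past the end of the 3' flank
    return [i - slvi_aug
            for i in range(lo, hi - 2)
            if i != slvi_aug and rna[i:i + 3] == "AUG"]


# Toy sequence (hypothetical): a second AUG sits 10 nt upstream of the SL-VI AUG.
toy = "GGGGGAUGCCCCCCCAUGCCCCC"
print(find_additional_augs(toy, slvi_aug=15))  # [-10]
```

Negative offsets mark alternative AUGs upstream of the SL-VI AUG and positive offsets mark those downstream.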

In rhinoviruses, the SL-VI AUG is positioned close to the ppORF AUG, whereas in enteroviruses the SL-VI AUG is positioned ≥50 nt and often ≥150 nt upstream of the ppORF AUG (S3 Fig), providing space for a protein-coding uORF. As discussed in Lulla et al. (2019) [5], a lengthy SL-VI AUG-initiated uORF is present in the majority of sequences in the “enterovirus A” and “enterovirus B” clusters and a substantial number of sequences in the “enterovirus C” cluster, besides some less-abundantly-sequenced clades such as the “enterovirus E”, “enterovirus G” and “porcine enterovirus 9” clusters (S3 Fig). This is reflected in the peak at ~60–75 codons in a histogram of SL-VI AUG-initiated uORF lengths across all 9347 sequences (Fig 1B, upper panel). In contrast, in most cases the ORF beginning with an alternative upstream AUG codon (where present) was found to be very short (Fig 1B, lower panel). However, there were some cases where the ORF beginning with the alternative upstream AUG codon had a length of ~70 codons (Fig 1B, lower panel). Looking more closely, we found that in 44 and 22 sequences in the “enterovirus C” and “enterovirus A” clusters, respectively, the alternative upstream AUG initiated a uORF with ≥40 codons upstream of the ppORF (S2 and S3 Figs). In some of these sequences (12 “enterovirus C”, 22 “enterovirus A”, and the singleton sequence AF326750), the alternative upstream AUG supplanted the SL-VI AUG as a suitable initiation site for a UP-encoding uORF (according to the Lulla et al. 2019 definition [5], namely an ORF which overlaps the ppORF by at least 1 nt, is not in-frame with the ppORF, and contains at least 150 nt upstream of the ppAUG). Given these findings, we wanted to investigate whether some enteroviruses could indeed utilize alternative upstream AUG codons for translation initiation.
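The three-part definition of a UP-encoding uORF quoted above maps onto three simple coordinate checks. The sketch below is illustrative only: the function name, the 0-based half-open coordinates, and the choice to count the stop codon as part of the ORF are assumptions, not details taken from the original analysis.

```python
def is_up_encoding_uorf(orf_start, orf_end, pp_aug, min_upstream_nt=150):
    """Apply the three criteria to one candidate upstream ORF.

    orf_start : 0-based index of the A of the candidate start codon
    orf_end   : 0-based index one past the last nt of the ORF (assumed to
                include the stop codon)
    pp_aug    : 0-based index of the A of the polyprotein AUG
    """
    overlaps_pporf = orf_end - pp_aug >= 1                      # >= 1 nt overlap
    out_of_frame = (orf_start - pp_aug) % 3 != 0                # not in ppORF frame
    long_enough_upstream = pp_aug - orf_start >= min_upstream_nt
    return overlaps_pporf and out_of_frame and long_enough_upstream


# Hypothetical coordinates, chosen only to exercise each criterion.
print(is_up_encoding_uorf(100, 322, 300))  # True: all three criteria met
print(is_up_encoding_uorf(0, 303, 300))    # False: in frame with the ppORF
print(is_up_encoding_uorf(200, 320, 300))  # False: only 100 nt upstream
print(is_up_encoding_uorf(100, 250, 300))  # False: no overlap with the ppORF
```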

Analysis of E. alphacoxsackie and E. coxsackiepol clades that encode a UP-like protein initiated at an alternative upstream AUG codon

Looking first at the 22 E. alphacoxsackie sequences, besides the singleton sequence AF326750, we found the additional AUG (uuAUG, upstream upstream AUG) to be located 10 nt upstream of the SL-VI uAUG. Translation from the uuAUG in these sequences would yield a protein closely related to the characterized UP [5]. Aside from an altered N-terminus (MVTI in the uORF compared to MAAY in the uuORF), the predicted protein possesses typical UP features, including a size of 8–9 kDa, a transmembrane helix (TMH) domain [10], a conserved WIGHP sequence, and a high isoelectric point (9.5–10.7) (Fig 2A). The canonical uAUG-initiated short ORF in these isolates was truncated by an in-frame stop codon at the tip of SL-VI, resulting in a 4 aa peptide upon translation (Fig 2B). An analysis of metadata associated with these sequences revealed that the viruses were collected from different geographical locations (India, Bangladesh, China, Cameroon, Cambodia, and the Netherlands), mostly isolated from gastrointestinal samples, and often associated with paralysis or myelitis (S1 Table).

Fig 2. E. alphacoxsackie sequences that encode a UP-like protein initiated at an alternative upstream AUG codon.

(A) Nucleotide (left) and uuORF-encoded protein (right) sequences in enterovirus A89/A76/A90/A121/A91, where the uAUG-initiated uORF is truncated and the uuAUG-initiated uuORF can potentially rescue UP expression. Transmembrane helix (TMH) predictions are highlighted in yellow (50–80% confidence) or green (>80% confidence). Two ORFs are indicated in purple (uuORF) and blue (uORF) highlighting. (B) Schematic representation of the EV-A90 (JX390656) IRES dVI region with uORF (blue), uuORF (purple), and ppORF (orange) start and stop codons annotated.

https://doi.org/10.1371/journal.ppat.1013967.g002

The singleton sequence AF326750 (Fig 2A) originated from a baboon stool sample [11]. Its UP sequence and properties were divergent from the other sequences, with a much longer protein (with predicted TMH) and the absence of the conserved WIGHP sequence (Fig 2A).

Looking at the 12 E. coxsackiepol sequences, we also found the additional AUG (uuAUG) to be located 10 nt upstream of the SL-VI uAUG. Translation from the uuAUG in these sequences would yield a protein closely related to the characterized UP [5]. Aside from an altered N-terminus, the predicted protein meets the above-mentioned criteria, albeit with fewer sequences having a predicted TMH (Fig 3). The canonical uAUG ORF in these isolates is truncated by an in-frame stop codon, resulting in a 4–18 aa peptide upon translation (Fig 3).

Fig 3. E. coxsackiepol sequences that potentially encode a UP-like protein from an alternative upstream AUG codon.

Nucleotide (left) and uuORF-encoded protein (right) sequences in enterovirus CVA19/CVA1/C113/C116/CVA22, where the uORF is truncated and the uuORF can potentially rescue UP expression. Transmembrane helix (TMH) predictions are highlighted in orange (20–50% confidence), yellow (50–80% confidence) or green (>80% confidence). Two ORFs are highlighted in purple (uuORF) and blue (uORF). uSTOP* indicates stop codons in the uORF for the last two sequences; the remaining ten have stop codons further downstream, resulting in a 17–18 aa peptide.

https://doi.org/10.1371/journal.ppat.1013967.g003

The existence of a double AUG signature in these two different groups of Enterovirus species provides additional capacity for coding if both AUGs are translationally competent. This may be particularly important for those enterovirus strains where a long uORF is initiated from the uuAUG instead of the uAUG (Figs 2 and 3). Taken together, these analyses suggest alternative UP-coding possibilities, which have been previously overlooked.

Both upstream AUGs can be used for translation initiation in enterovirus CVA13

To test the possibility of translation initiation at uAUG and uuAUG codons during virus infection, we utilized ribosome profiling (Ribo-Seq), a technique for the global footprinting of translating ribosomes. To identify translation initiation sites, we used the translation inhibitor lactimidomycin (LTM), which acts on initiating ribosomes [12]. This approach has proven valuable to detect closely located translation initiation events during RNA virus infection [13]. We used the CVA13 Flores strain of E. coxsackiepol that contains both the SL-VI uAUG and a uuAUG 10 nt upstream, with the uAUG-initiated ORF coding for the UP (Fig 4A). The infection dynamics of this virus followed a classical enterovirus scenario as previously observed for E. alphacoxsackie, E. betacoxsackie and E. coxsackiepol strains [5], so we selected two representative time points corresponding to early (5 hpi) and late (7 hpi) infection (Fig 4B). Profiling of initiating ribosomes revealed two distinct initiation peaks corresponding exactly to the uuAUG and the uAUG (Fig 4C), indicating that translation initiation can occur at either AUG. We conducted a similar experiment without LTM pre-treatment to confirm that this was not an artifact of antibiotic treatment. Interestingly, both the uuORF and the uORF were occupied by translating ribosomes, despite the uuORF only coding for an 8 aa peptide (Fig 4D). The difference between LTM-treated and non-treated conditions is apparent due to decreased non-initiator reads in LTM-treated samples (Fig 4C-D).

Fig 4. Translation initiation in enterovirus CVA13.

(A) Schematic representation of the CVA13 IRES dVI region with the uORF (blue), uuORF (purple) and ppORF (orange) start and stop codons annotated. (B) Analysis of viral protein expression in HeLa cells infected with CVA13. Cells were infected at an MOI of 10, harvested at 0–18 hpi as indicated, and accumulated virus structural protein VP3 was analyzed by western blotting with the anti-VP3 antibody. Observed cytopathic effect (CPE) of virus infection is indicated by (−) absence, (+) <40% CPE, (++) 40–80% CPE, (+++) >80% CPE. (C) Ribosome profiling of CVA13-infected cells at 5 and 7 hpi in the presence of translation initiation inhibitor lactimidomycin. (D) Ribosome profiling of CVA13-infected cells at 5 and 7 hpi without lactimidomycin treatment. (C-D) Ribo-Seq RPF densities (mapping positions of 5′ ends of reads with a +12 nt offset to indicate the approximate P site) in reads per million mapped reads (RPM). Colors indicate the triplet phase of reads relative to the genome start. Since most read 5′ ends map to the first nucleotide of codons (S5D Fig), reads deriving from translation in the uuORF (+1), uORF (+2) and ppORF (0) are expected to be predominantly purple, blue and orange respectively. The amino acid sequences of the predicted proteins/peptides encoded by the three ORFs are displayed underneath the profiles. The highlighted green region indicates the predicted transmembrane domain of the UP protein encoded by the uORF. Full-genome plots and Ribo-Seq quality control plots are provided in S5 Fig.

https://doi.org/10.1371/journal.ppat.1013967.g004

Surprisingly, the uORF and uuORF initiation peaks were higher than the ppORF initiation peak in the LTM-treated samples (Fig 4C), contrary to the ratio of expression levels expected from previous work [5] and subsequent functional analysis (see below). This could potentially result from ligation biases or an artifact of LTM treatment [14]. To investigate the latter possibility, we selected cellular mRNAs with upstream AUGs and compared ribosome occupancy levels on the uAUGs and the main ORF AUGs (mAUGs) (S4 Fig). Due to late infection-associated host shut-off, the 7 hpi LTM-treated libraries had very few host-mapping reads. In the 5 hpi LTM-treated libraries, the mean host uAUG:mAUG occupancy ratio was ~0.31 whereas for the virus this ratio was ~10.8. The analysis comes with many caveats: relatively few host mRNAs with suitable uAUGs, differences in uAUG-mAUG spacing and other 5′ UTR features between host and viral mRNAs, unquantified levels of alternative splice forms, different ribosome loading levels of host and viral transcripts, etc. (see Methods). However, the analysis of host mRNAs does indicate that the high uAUG:mAUG ratio seen for the virus (~35-fold difference from host) is likely not simply an artifact of LTM treatment, but might instead be an IRES-specific effect of the drug whereby the magnitudes of the uORF and uuORF initiation peaks do not reflect their translation levels.
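The ~35-fold figure follows directly from the two occupancy ratios reported for the 5 hpi LTM-treated libraries. The short check below simply reproduces that arithmetic; the helper name `fold_difference` is introduced here for illustration and is not part of the original analysis.

```python
def fold_difference(virus_ratio, host_ratio):
    """Fold difference between viral and host uAUG:mAUG occupancy ratios."""
    return virus_ratio / host_ratio


# Ratios quoted in the text for the 5 hpi LTM-treated libraries.
host_uaug_to_maug = 0.31
virus_uaug_to_maug = 10.8
print(f"{fold_difference(virus_uaug_to_maug, host_uaug_to_maug):.1f}")  # 34.8
```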

We assessed Ribo-Seq quality by plotting histograms of ribosome-protected fragment (RPF) mapping positions relative to annotated initiation and termination sites in host mRNAs (S5A-B Fig), and length and triplet phasing distributions of RPFs mapping to virus or host mRNA coding regions (S5C-D Fig). The read length distribution of virus-mapping reads was sharply peaked at 27–29 nt (S5C Fig). However, the read-length distribution for the host-mapping reads was less sharply peaked, especially at the later time point (7 hpi). As explained in Lulla et al. (2019) [5], we take this to be a consequence of severe virus-induced shut-off of host translation which, for the host, greatly reduces the proportion of bona fide RPFs to contamination at 7 hpi, hence leading to a host-specific reduction in Ribo-Seq quality. To maximize the contribution of bona fide RPFs for both the host and the virus, we restricted all subsequent analyses to 27–29 nt reads only. The majority of virus-mapping read 5′ ends mapped to the first nucleotide of codons (S5D Fig), indicating that the Ribo-Seq datasets were overall of high quality. For the host, even after restricting to 27–29 nt reads, the phasing quality was noticeably degraded (more reads mapping to the 2nd and 3rd positions of codons), especially at 7 hpi. Analysis of host mapping reads confirmed that LTM-treated samples mapped initiating ribosomes (S5A Fig) and non-treated samples mapped elongating ribosomes (S5B Fig). The full-virus-genome translation initiation (S5E Fig) and elongation (S5F Fig) profiles were consistent with previously analyzed enterovirus Ribo-Seq datasets [5,13]. The noisier LTM virus profile at 7 hpi (S5E Fig) could be linked to a restricted ability of apoptotic cells to take up the drug.
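The read-length filter and triplet phasing described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: reads are represented as hypothetical `(five_prime_position, length)` tuples with 0-based coordinates, and the +12 nt P-site offset is taken from the Fig 4 legend. Because 12 is a multiple of 3, the phase of the inferred P site equals the phase of the read 5′ end.

```python
from collections import Counter


def phase_fractions(reads, cds_start, offset=12, lengths=(27, 28, 29)):
    """Fraction of retained reads in each codon phase (0, 1, 2).

    reads     : iterable of (five_prime_pos, read_length) tuples (assumed format)
    cds_start : 0-based index of the first nt of the coding region
    offset    : nt from the read 5' end to the approximate P site
    lengths   : read lengths retained as likely bona fide RPFs
    """
    counts = Counter()
    for five_prime, length in reads:
        if length not in lengths:
            continue                       # discard reads outside 27-29 nt
        p_site = five_prime + offset       # approximate P-site position
        counts[(p_site - cds_start) % 3] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {phase: counts[phase] / total for phase in (0, 1, 2)}


# Toy reads (hypothetical); the 33 nt read is filtered out.
toy_reads = [(88, 28), (91, 27), (94, 29), (95, 28), (97, 33)]
print(phase_fractions(toy_reads, cds_start=100))  # {0: 0.75, 1: 0.25, 2: 0.0}
```

A well-phased library concentrates reads in phase 0, whereas contamination tends to spread reads more evenly across the three phases.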

Therefore, ribosome profiling experiments confirm that both upstream AUGs can be used for translation initiation during enterovirus CVA13 infection. The relative usage of these ORFs can be defined using reporter systems [5].

Evaluation of long uORF and ppORF translation using an IRES reporter system

We previously observed that ribosome profiling may not provide accurate information on relative translation efficiencies for short ORFs due to ribosome pausing and Ribo-Seq library preparation biases [5,13,15]. Therefore, to evaluate the translation efficiency of long uORFs and the ppORF in different representative enterovirus species (Figs 2-3), we employed a previously developed dual-luciferase reporter system [5], where a 2A-FFLuc cassette was placed in the 0, +1 or +2 frame relative to the ppORF (Fig 5A). We chose three different representative virus strains, with differing uuORF/uORF configurations: a UP-coding uORF in the +2 frame with a truncated uuORF in the +1 frame (CVA13); a UP-like-coding uuORF in the +2 frame with a truncated uORF in the 0 frame (CVA1); and a UP-coding uuORF in the +1 frame with a truncated uORF in the +2 frame (EV-A90). The CVA13 sequence was derived from the clinical isolate used for ribosome profiling (Fig 4), whereas the 5′ UTRs of CVA1 and EV-A90 were synthesized in vitro. The CVA1 IRES (JX174177) was derived from a virus isolated from a stool sample (S6 Fig), and the EV-A90 IRES (JX390656) was derived from a sequence from an acute flaccid paralysis patient (Fig 2). The design of reporters included an unmodified WT 5′ sequence until nucleotide 750 (CVA13), 716 (CVA1) or 754 (EV-A90), followed by a 2A/FFLuc reporter in three possible frames, with a stop codon introduced at the end of the FFLuc gene (Figs 5B-D and S6C). Because the location and order of the short peptide-encoding uORF and the potential UP-encoding uORF differ between strains, we call the latter the “longer upstream ORF”. Consistent with the uORF translation levels seen previously in echovirus 7 (4–5% relative to the ppORF) [5], expression of the longer upstream ORF in the three new enterovirus cassettes was in the range 3–6%, depending on cell line and virus strain (Fig 5).
Translation initiation at the AUG codon for the shorter (truncated) upstream ORF would lead to early termination, upstream of the firefly luciferase reporter. Therefore, for the remaining constructs, firefly activity measures background – likely including low-level initiation at various non-AUG sites upstream of the firefly coding sequence (S6C Fig). These background levels were in the range 1–3%. For all three virus strains, translation of the longer upstream ORF was significantly higher than the background level (Fig 5), thus confirming that UP can be translated from either a uAUG or a uuAUG.
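The frame assignments of the three reporters follow from the 10 nt spacing between the two upstream AUGs. The sketch below works through that arithmetic; it assumes that a frame is defined as (AUG position minus ppAUG position) modulo 3, a convention that reproduces the frames stated in the text, and the helper name is hypothetical.

```python
def frame_of_upstream_aug(downstream_frame, spacing_nt=10):
    """Frame (0, 1 or 2 relative to the ppORF) of an AUG located
    spacing_nt upstream of an AUG lying in downstream_frame."""
    return (downstream_frame - spacing_nt) % 3


# uORF (SL-VI AUG) frames as stated in the text for each reporter.
for strain, uorf_frame in [("CVA13", 2), ("CVA1", 0), ("EV-A90", 2)]:
    uuorf_frame = frame_of_upstream_aug(uorf_frame)
    print(f"{strain}: uORF +{uorf_frame}, uuORF +{uuorf_frame}")
```

With a 10 nt spacing, the two upstream AUGs can never share a frame, which is why each reporter places them in two different frames.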

Fig 5. Usage of both upstream AUGs in different enterovirus IRES reporters.

(A) Schematic representation of the modified pSGDluc expression vectors used to measure translation at the polyprotein (ppORF, orange) and two upstream (blue and purple) AUG codons. (B-D) Analysis of IRES activities for ppORF, uORF and uuORF expression relative to cap-dependent expression by dual-luciferase reporter assay in HeLa and HEK293T cells (FFLuc/RLuc) at 8 h post-transfection for CVA13 (B), CVA1 (C) and EV-A90 (D) enterovirus IRESes. IRES activities in each of the three frames were normalized to cap-dependent signal and presented as relative activities in the three frames (means ± SD, n = 3 biologically independent experiments). Each panel contains a schematic representation of the corresponding IRES reporters (left). The full annotated sequence of the region can be found in S6C Fig. Statistical analysis of the difference between the translation efficiency of frames +1 and +2 was conducted using two-tailed t-tests; * p value ≤ 0.05.

https://doi.org/10.1371/journal.ppat.1013967.g005

Functional significance of the uuAUG in enterovirus CVA13

To address the functional significance of a non-UP-encoding uuAUG in the virus context, the enterovirus CVA13 Flores strain (utilized for ribosome profiling, Fig 4) was used to develop a CVA13 reverse genetics construct. This was made by placing the entire CVA13 genome sequence downstream of a T7 promoter (Fig 6A). We confirmed that the recombinant CVA13 virus was stable on passaging and demonstrated growth properties similar to the parental clinical isolate (Fig 6B). A CVA13-uuKO recombinant virus was created with an AUG → GUG substitution to the uuAUG (Fig 6A). This substitution was already present in minor quantities in the clinical CVA13 isolate (S7 Fig) and is predicted to knock down translation of the 8 aa uuORF peptide while preserving the translation of both other frames (uORF and ppORF), similar to what we observed in echovirus 7 (E7) with a single uAUG [5]. We confirmed the expression of UP in all three viruses, suggesting the uuAUG-independent translation of the two other frames (Fig 6C). We used several alternative approaches to probe the functional consequences of preventing uuORF expression. First, the WT and CVA13-uuKO viruses were competed in the highly susceptible HeLa cell line. Both viruses were retained in the population for six passages in four independent experiments, confirming no advantage/disadvantage of the uuKO mutation in HeLa cells (Fig 6D).

Fig 6. Development of the reverse genetics (RG) plasmid for CVA13 and properties of the uuORF knockout virus in HeLa cells.

(A) Schematic representation of the CVA13 infectious clone with the zoomed-in SL-VI/uORF region indicating the uuKO mutation (AUG → GUG). (B) Multistep growth curves of CVA13 viruses in HeLa cells (MOI 0.1). (C) Analysis of VP3 and UP expression in CVA13-infected HeLa cells. Cells were infected at MOI 10, harvested at the indicated time post-infection, and analyzed by western blotting. (D) CVA13 WT and uuKO virus competition was performed in HeLa cells. Cells were infected with WT and uuKO mutant viruses at the indicated proportions in four independent experiments and passaged six times at MOI 0.01. The virus RNA from passages 1 and 6 was isolated, RT-PCR amplified, and sequenced. The proportion of A579(WT)/G579(uuKO) nucleotide abundance was determined for each sample.

https://doi.org/10.1371/journal.ppat.1013967.g006

We then hypothesized that presence of the uuAUG might be important in terminally differentiated cells, such as intestinal epithelia or neurons – natural infection sites for many enteroviruses [4]. Therefore we developed an infection system for CVA13 in differentiated iPSC-derived i3Neurons [16,17] and assessed the replication kinetics of WT and uuKO viruses. We confirmed efficient virus replication in this system but did not observe any differences between the growth properties of the tested viruses (Fig 7A). However, a competition experiment using two different WT[AUG]:uuKO[GUG] ratios (30:70 and 70:30) revealed a shift towards WT virus in the infection mixture when the WT:uuKO ratio was 30:70 (Fig 7B), suggesting an advantage of the uuAUG-containing virus (WT) in this system.

Fig 7. Properties of the uuORF knockout virus in the neuronal and intestinal organoid infection models.

(A) Differentiated iPSC-derived neurons were infected in quadruplicate at MOI 0.1, and the released virus was analyzed by plaque assay in HeLa cells. (B) CVA13 WT and uuKO virus competition was performed in differentiated iPSC-derived neurons. Neurons were infected at MOI 0.1 with WT and uuKO mutant viruses at the indicated proportions in 12 independent experiments and incubated for 6 days with media changes every other day. The virus RNA from input and post-competition samples was isolated, RT-PCR amplified, and sequenced. The proportion of A579(WT)/G579(uuKO) nucleotide abundance was determined for each sample and presented as a doughnut plot (left, mean values) and box and whiskers plot with boxes extending from the 25th to 75th percentiles, and whiskers ranging from the smallest to largest values (right, median values). (C) Monolayers of differentiated cultured intestinal organoids were infected in quadruplicate with CVA13 viruses at MOI 10, aliquots of culture media were collected at the indicated time points, and viral titers were analyzed by plaque assay in HeLa cells. (D) CVA13 WT and uuKO virus competition was performed in differentiated human intestinal organoids. Organoid monolayers were infected at MOI 1 with WT and uuKO mutant viruses at the indicated proportions in 10 independent experiments and incubated for 4 days with media changes after 24 and 48 hpi. The virus RNA from input and post-competition (96 hpi) samples was isolated, RT-PCR amplified, and sequenced. The proportion of A579(WT)/G579(uuKO) nucleotide abundance was determined for each sample and presented as a doughnut plot (left, mean values) and box and whiskers plot with boxes extending from the 25th to 75th percentiles, and whiskers ranging from the smallest to largest values (right, median values). The data in (A) and (C) represent two biologically independent experiments; ns, nonsignificant using nonlinear regression analysis. 
The sequencing of the RT-PCR product derived from virus collected at 96 hpi (neuron) and 72 hpi (organoid) is provided in S7 Fig. Statistical analysis (B, D) was performed using two-tailed t-tests; ** p value ≤ 0.01; * p value ≤ 0.05; ns, nonsignificant.

https://doi.org/10.1371/journal.ppat.1013967.g007

An advantage of WT over uuKO CVA13 was also observed in differentiated human intestinal organoids. The growth of individual viruses was nearly identical for all three viruses (Figs 7C and S7), resulting in productive virus release without any cytopathic effect, suggesting possible persistent infection of CVA13. Due to the limited lifespan of differentiated organoid cultures (4–5 days post-differentiation), assessing this in longer-term conditions was not possible. Interestingly, the more sensitive competition experiment using two different WT[AUG]:uuKO[GUG] ratios (30:70 and 70:30) revealed a shift towards WT virus in the mixture when the WT:uuKO ratio was 30:70 (Fig 7D), recapitulating the results observed in differentiated neurons and confirming an advantage of the uuAUG-containing virus (WT) in human intestinal organoids. No changes were detected in either system when the initial WT:uuKO ratio was set at 70:30, suggesting no competitive pressure when WT is already over-represented in the mixture (Fig 7B and 7D).

Taken together, our results suggest that mutation of the uuAUG has no effect in the HeLa cell line but resulted in an apparent phenotype in competition assays using physiologically relevant systems: terminally differentiated neurons and intestinal organoids. These results echoed the UP phenotype, which was only evident in the organoid system [5], and emphasize the role of IRES elements in differentiated cells/tissues. To investigate the mechanistic role of the uuORF, we decided to assess other elements of the uuORF in a sensitive reporter system.

Mechanistic roles of 5′ and 3′ elements in IRES-dependent translation

To investigate a potential role for the uuAUG in modulating translation of other ORFs, we utilized the above-mentioned CVA13 reporter system. However, we aimed to improve it by including the viral 3′ UTR as it was previously shown that IRES function and its cell type-specific restrictions may be influenced by cis-acting interactions between the IRES and the Z domain of the 3′ UTR [18,19]. In addition, we optimized the reporter assay in human intestinal epithelial cells (HIEC6) as a more relevant model system to evaluate enterovirus-specific translation. The uuKO mutation was introduced in both versions of the reporter construct (with and without 3′ UTR) and for firefly luciferase in each of the three frames (Fig 8A). Interestingly, the relative usage of the functional ppORF (0) and uORF (+2) frames did not change between the original and 3′ UTR-containing reporters; however, the overall translation was increased 5–6 fold (Fig 8B), confirming the importance of the 3′ UTR in IRES-dependent translation. We used both reporters to evaluate the effect of the uuKO mutation in the reporter systems. In agreement with assays in infected HeLa cells (Fig 6), in most tested conditions, we did not observe any differences between WT and uuKO reporters in either cell line (Figs 8C-D and S8). We also added viral RNA to transfection mixtures to mimic the infection condition since viral proteins such as 2A were previously shown to affect IRES- and cap-dependent translation [20]. Overall, adding virus RNA also did not result in differences between WT and uuKO reporters (Figs 8C-D and S8).

Fig 8. Mechanistic roles of 5′ and 3′ elements in IRES-dependent translation.

(A) Schematic representation of the 5′ UTR and 5′-3′ UTR reporters used to measure translation in the three frames. (B) Analysis of IRES activities for CVA13 reporters in the three frames in the 5′ UTR and 5′-3′ UTR reporters in HeLa and intestinal HIEC6 cells. Fold changes between reporter activities are indicated above each pair. (C) Schematic representation of the three CVA13 ORFs and the uuORF knock-out introduced via AUG → GUG mutation. (D) Analysis of IRES activities for CVA13 5′-3′ UTR reporters in the three frames in HeLa and intestinal HIEC6 cells with and without virus RNA at 8 h.p.t. The full set of 5′ UTR and 5′-3′ UTR reporter data can be found in S8 Fig. (E-F) Schematic representation of the CVA1 (E) and EV-A90 (F) ORFs and AUU mutants used to measure translation in three frames in HeLa and HIEC6 cells. Statistical analysis was conducted using two-tailed t-tests; * p value ≤ 0.05; ns, nonsignificant. The full annotated sequence of the region can be found in S10 Fig. (G) Schematic representation of uORF-dependent translation in eukaryotic ATF4 mRNA. (H) Analysis of IRES activities for CVA13 reporters in the ppORF and +1 uORF frames in response to sodium arsenite-mediated stress. The normalized cap-dependent translation corresponds to the Renilla luciferase signal for the indicated reporter. Statistical analysis (E, F, H) was performed using two-tailed t-tests; * p value ≤ 0.05; ns, nonsignificant. (I) Schematic representation of uORF-preferred translation in CVA13 during stress conditions. (E,F,I) The SL-VI structure is shown for the context of the start and stop codon positions; it is unwound during translation [29].

https://doi.org/10.1371/journal.ppat.1013967.g008

Although we did not observe any effect upon mutation of the uuAUG, we also wanted to test whether the position of the uuORF stop codon might affect translation of the other ORFs. The positions of stop codons in short uORFs/uuORFs vary between viruses. Out of 9333 analyzed sequences (S9 Fig), 3975 have a short AUG-to-stop-codon ORF in any frame (where short ORF here means that the AUG is anywhere up to 20 nt 5′ of the SL-VI AUG, and the stop codon is anywhere up to 30 nt 3′ of the SL-VI AUG). Of these, 1545 have a stop codon in-frame with the SL-VI AUG within the 18 nt 3′ of the SL-VI AUG, i.e., they have a SL-VI AUG ORF of less than or equal to 6 sense codons (S9 Fig).

We hypothesized that removing stop codon(s) to turn short ORFs into longer ORFs might affect translation in the other IRES-initiated ORFs. In some viruses, there are two stop codons in SL-VI in the same frame. For example, in CVA13 the second stop codon is situated directly opposite the conserved SL-VI uAUG (Fig 6A), making it difficult to manipulate without breaking the SL-VI structure and IRES function [5,9,21]. However, for the two previously used IRES examples (Fig 5C-D), the short ORF frame has a stop codon that is located in the loop of SL-VI. Therefore, these SL-VI stop codons could be manipulated, similar to a previously used uORF knockout strategy (Loop mutant) [5]. In the first example – the CVA1 IRES – the stop codon for the short SL-VI uAUG-initiated ORF is in the same frame as the ppORF and there are also two additional stop codons present in this frame in the linker region at positions 640 and 661 (Figs 8E and S10A). Mutation of the (0) frame SL-VI stop to an isoleucine codon (AUU) affected the relative expression levels in the (+2) frame (Fig 8E). In the second example – the EV-A90 IRES – mutation of the SL-VI stop codon decreased the (+1) uuORF expression (Figs 8F and S10B), suggesting that the upstream ORF expression can be modulated through the SL-VI stop codon. Additional stop codons are present in the (+2) frame downstream of the EV-A90 SL-VI (positions 644, 689 and 731). Taken together, these results suggest a modulatory role of the stop codon in SL-VI, possibly through ribosome recycling between closely located stop and start codons.

In response to stress, cellular uORFs – such as in the mRNA that encodes activating transcription factor 4 (ATF4) – can act as enhancers of main ORF translation (Fig 8G) or, more precisely, stress counters uORF-mediated repression of main ORF translation [22]. Upon enterovirus infection, host translational shut-off is a well-established process caused by viral protease 2A-driven cleavage of eIF4G [23], where the cleaved N-terminal domain is not required for IRES-dependent translation [24,25]. It is plausible that uORFs in enteroviruses can also contribute to responding to stress and/or regulation of IRES-dependent translation [26,27]. To test this hypothesis, we used 5′-3′ UTR-containing reporters (Fig 8A) in the presence of sodium arsenite, a well-known stress inducer [28]. Our results revealed a significant enhancement of uORF translation, whereas cap-dependent and ppORF translation efficiencies were significantly reduced (Fig 8H) without a decrease in cell viability (S10C Fig). These results can mimic the late stage of infection, when virus infection-induced stress may result in preferential expression of the uORF (Fig 8I). This places our results into an interesting context: UP-specific function is particularly important at late stages of infection, where virus release is facilitated in gut epithelial cells [5], thus suggesting temporal regulation of uORF expression during virus infection.

Taken together, our results reveal mechanistic details of ORF-specific translation in enteroviruses, including (i) 3′-UTR enhancement of IRES-dependent translation (Fig 8B), (ii) a modulatory role of the SL-VI stop codon (Fig 8E-F), and (iii) stress-dependent enhancement of uORF expression (Fig 8H-I).

Discussion

Advances in ribosome profiling coupled with computational methodologies have allowed the identification of novel translated ORFs [30]. RNA viruses have very compact genomes, so effective strategies for gene expression and regulation come with additional challenges, often resulting in overlapping functional elements [31]. We have previously demonstrated that many enteroviruses have an additional upstream translated ORF [5]. However, the absence of a predicted SL-VI uAUG-initiated uORF in many enterovirus sequences made us wonder whether other noncanonical routes of UP expression might exist.

Here, we show that many enteroviruses have two upstream AUG codons, the SL-VI uAUG and an additional AUG (uuAUG). Combining several computational and experimental approaches, we address the significance of the additional upstream AUG in enterovirus genomes. First, using ribosome profiling of CVA13-infected cells, we confirm that both upstream AUG codons can be used for translation initiation (Fig 4).

Regulation via uORF expression is important in cell/tissue-specific translation, particularly in neural tissues [32,33]. Enterovirus infections can lead to CNS complications, such as meningitis, encephalitis, acute flaccid paralysis, and other manifestations. The majority of neurovirulence determinants in enteroviruses have been mapped to IRES elements (reviewed in Lulla and Sridhar, 2024) [4], highlighting the likely role of translation regulation [34]. In addition, IRES elements, particularly uAUGs, are often associated with neurovirulence in mice [35,36], and IRES-related neurotropic properties have been recently described for the mouse-adapted EV-A71 strain [37]. Therefore, one possible explanation for a role of the additional upstream AUG in enteroviruses could be tissue-specific regulation of translation. In line with these predictions, we observe that WT (double-uAUG) CVA13 outcompeted the uuKO virus in both neuronal and intestinal organoid infection models (Fig 7). These findings suggest that the uuORF in CVA13 can modulate translation and/or replication in terminally differentiated infection sites, such as intestinal epithelia and neurons.

Upstream ORFs have been characterized for several viral and eukaryotic mRNAs [5,38–41]. The functional significance and regulatory mechanisms may differ; however, some parallels can be drawn. In coronaviruses, the role of uORFs has been characterized as regulatory but not essential [42]. Interestingly, in line with our results, Wu et al. showed that disruption of a coronavirus uORF enhanced translation of the main ORF in vitro, without detected effects on coronavirus replication. The authors also highlight the regulatory role of the uORF and its potential role in long-term survival during persistent infection [42]. By using a system similar to our bicistronic reporters, it was shown that a uORF modulates L mRNA (polymerase) translation in ebolaviruses in response to cellular stress. In contrast to enteroviruses and coronaviruses, knockout of the ebolavirus uORF led to a one-log decrease in virus titer in several cell lines [43], suggesting stronger dependence on the uORF in this virus. The HIV-1 genome encodes a minimal uORF, consisting of a start and stop codon only, overlapping with the Vpu start site. Mutation of either of these two codons results in a five-fold reduction in Env protein expression [44], further highlighting short ORF-mediated modulation of translation in different virus families.

We further corroborated our findings using several IRES reporter systems (Figs 5 and 8). Interestingly, the additional upstream AUG alone did not provide any advantage to CVA13 (E. coxsackiepol) during infection of susceptible cell lines (HeLa, HIEC6) and when tested in reporter systems. Due to the overlapping nature of functional elements in RNA viruses [31], it was not technically possible to address all scenarios in the context of virus infection and/or IRES reporter systems. Moreover, testing IRES reporters in neuronal cells or intestinal organoids may provide more relevant insight, but it is currently not technically feasible. Nevertheless, we combined several possible approaches and reporter systems derived from three different enteroviruses and found that a truncated uORF affects translation of other ORFs. Interestingly, removing SL-VI stop codons led to dysregulation of long uuORF usage (Fig 8E-F). There are several possible interpretations of these results. It is possible that the short uORF recruits additional ribosomes, which, after translating the short SL-VI ORF, can then be recycled to translate the long uuORF. When the loop stop codon of the truncated uORF is mutated to AUU, translating ribosomes terminate too far downstream to be recycled back to the uuORF AUG codon, so we observe reduced translation in the (+2) frame (Fig 8E) and the (+1) frame (Fig 8F). In analogy to these results, stressed eukaryotic cells can regulate their gene expression via reinitiation after uORF translation [45]. Alternatively, this effect can be IRES context-specific or cell type-specific (Fig 7). For example, the AUU mutant may alter IRES structure and/or IRES trans-activating factor (ITAF) recruitment and subsequently affect the uuORF to uORF initiation ratio. Interestingly, Zika virus uORFs can also modulate main ORF expression, further expanding the uORF-dependent translational modulation [41].

Regulation of mRNA translation by uORFs is well described for several cellular transcripts. One of them, encoding ATF4, contains two uORFs that affect stress-dependent regulation of ATF4 synthesis (Fig 8G) [22,46]. We observe a distinct phenomenon for CVA13 translation where, during stress, the uORF rather than main ORF translation is favored, potentially releasing more of the accessory protein UP at later stages of infection when its function is most apparent [5]. Both uORFs have initiation codons within a poor Kozak context [47], which is likely to be selected in this manner to downregulate the production of accessory protein(s) and facilitate translation of the ppORF. The usage of uORF-specific initiation was previously characterized using an in vitro reconstitution system [48]. The extent to which uAUGs are used correlates with the stability of the SL-VI domain, which may represent another way of translation modulation.

This study highlights the importance of considering multiple upstream AUGs and upstream ORFs when annotating enterovirus genomes. The ability of enteroviruses to encode the UP protein via multiple routes further expands their coding capacities and provides a platform for further investigations into IRES-dependent translation. Furthermore, the cell type-specific preferences demonstrated in this work provide a foundation for further investigations into IRES-mediated pathogenicity and neuro-attenuation mechanisms.

Materials and methods

Bioinformatic analysis of uORF and uuORF characteristics in enterovirus sequences

Enterovirus sequences were retrieved from the National Center for Biotechnology Information (NCBI) nucleotide database on 22 March 2024 by searching with the parameters ‘txid12059[Organism:exp] AND 5000:50000[Sequence Length]’, which yielded 11,791 sequences, of which 30 were NCBI RefSeqs. Sequences with PAT (patent), SYN (synthetic) or CON (constructed) labels in the GenBank Division field were removed, leaving 11,469 sequences with the “VRL” (viral) label and 228 sequences with the ENV (environmental sample) label. Sequence accession OP414044 was discarded due to the first ~1440 nt unexpectedly mostly comprising a perfect inverted repeat of itself. Sequence accessions AB469183 and LC637981 were discarded as, although labelled as “VRL”, they are synthetic constructs with GFP-encoding sequence inserted. An unusual group of recombinant enterovirus-G-like sequences, where the enterovirus structural protein-coding region has been switched for a porcine torovirus papain-like cysteine protease gene and other foreign sequences [49,50] were also discarded. ON457567 was removed because ~230 codons in the 2B-2C region are shifted out of frame, explaining the high divergence of this sequence from other enterovirus sequences.

Next, we aimed to annotate the authentic polyprotein ORF (ppORF) in each sequence. In some sequences (particularly rhinoviruses) the longest AUG-to-stop-codon ORF begins 5′ of the authentic polyprotein initiation AUG codon (ppAUG). To annotate the correct ppAUG, we aligned sequences to the 30 RefSeq polyproteins. First, to check that the RefSeq annotations were themselves correct, we extracted the polyprotein amino acid sequences from the 30 RefSeqs and aligned them with MUSCLE version 3.8.31 [51]. All 30 amino acid sequences aligned perfectly at the N terminus (i.e., no N-terminal alignment gaps), indicating that none had misannotated upstream initiation sites. Next, we identified the longest stop-codon-to-stop-codon ORF in each non-RefSeq sequence. Sequences that lack the ppORF stop codon or that have no ORF ≥ 3000 nt in length were removed. The corresponding amino acid sequences were aligned pairwise with each of the 30 RefSeq polyprotein sequences, again using MUSCLE. For each non-RefSeq sequence, the highest-identity RefSeq was determined. If the polyprotein initial methionine of this RefSeq aligned to a methionine in the non-RefSeq sequence, then that methionine was annotated as the N-terminus of the non-RefSeq polyprotein. The small number of remaining sequences were manually inspected via multiple sequence alignment with the 30 RefSeqs, and all were found to be incomplete sequences that did not extend to the 5′ end of the ppORF; these sequences were therefore discarded. Sequences that were substantially truncated at the polyprotein C-terminus – namely sequences that did not align with the 30 RefSeqs at least up to a perfectly conserved “WH” ~ 40 amino acids from the end of the polyprotein – were also discarded. We annotated the SL-VI region in each sequence by searching the 210 nt region immediately upstream of the ppAUG for the motif UUAUGGU[C/G]ACA, which is a (mostly) conserved sequence around the SL-VI AUG (positions 3–5 of the motif), or slight variations thereof.
Sequences with >10 ambiguous nucleotide codes (for example ‘N’s) in the region from 10 nt upstream of the SL-VI AUG to the ppORF stop, sequences containing KEYWORDS tags UNVERIFIED, STANDARD_DRAFT, VIRUS_LOW_COVERAGE or VIRUS_AMBIGUITY in the GenBank files, and sequences with <160 nt of sequence upstream of the ppAUG, were discarded (for reference, the SL-VI AUG is 155, 152 and 157 nt upstream of the ppAUG in the enterovirus RefSeqs NC_001612, NC_001472 and NC_002058, respectively). At this stage, 9347 enterovirus sequences remained.
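The SL-VI motif search described above can be illustrated with a short Python sketch. This is not the authors' pipeline: the function name `find_slvi_aug` and its arguments are hypothetical, coordinates are assumed to be 0-based, and only the exact motif is matched (the "slight variations thereof" mentioned in the text are not handled).

```python
import re

# Motif quoted in the text, written in the DNA alphabet (U -> T) because
# GenBank records store sequences as DNA. The AUG starts at offset 2 of the motif.
SLVI_MOTIF = re.compile(r"TTATGGT[CG]ACA")

def find_slvi_aug(seq, pp_aug, window=210):
    """Return the 0-based index of the SL-VI AUG, or None if the motif is absent.

    seq     -- genome sequence (DNA or RNA alphabet)
    pp_aug  -- 0-based index of the A of the polyprotein AUG (assumed known)
    window  -- number of nucleotides upstream of the ppAUG to search
    """
    dna = seq.upper().replace("U", "T")
    start = max(0, pp_aug - window)
    # Restrict the search to [start, pp_aug); match.start() is an index into dna.
    match = SLVI_MOTIF.search(dna, start, pp_aug)
    return None if match is None else match.start() + 2
```

With the AUG position in hand, the distance to the ppAUG (for example, 155 nt in NC_001612 according to the text) is simply `pp_aug - find_slvi_aug(seq, pp_aug)`.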

Next, we identified all AUG triplets within the 20 nt 5′ and 20 nt 3′ of the SL-VI AUG and calculated the coordinates of the corresponding ORFs, besides the SL-VI AUG ORF. There were 6137, 3165 and 45 sequences with 0, 1 or 2 AUGs (in addition to the SL-VI AUG) respectively; the SL-VI AUG itself was present in all except four of these sequences with the exceptions being AAG, AGG, AGG and AGG. In Lulla et al. (2019) [5], sequences were defined as having the enterovirus uORF if the ORF beginning with the SL-VI AUG and including the first in-frame stop codon (a) overlapped the ppORF by at least 1 nt, (b) was not in-frame with the ppORF, and (c) contained at least 150 nt upstream of the ppAUG. For each sequence, for the SL-VI AUG and any additional AUG triplets in the 20 nt 5′ and 20 nt 3′ flanking regions, we determined whether the associated ORF met the 2019 uORF criteria. Since there might be situations where a deletion in the uORF region has both reduced the distance between an upstream AUG and the ppAUG below the 150 nt threshold while also changing the reading frame so that UP expression depends on a non-SL-VI AUG, and also to allow for truncated versions of UP, we also investigated less stringent criteria for uORF definition, namely (a) encoded peptide ≥ 40 amino acids, (b) encoded peptide not contiguous with the polyprotein peptide, and (c) distance between the upstream AUG and the ppAUG ≥ 120 nt.
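The two sets of uORF criteria can be expressed as simple predicates. The sketch below is one possible reading of the text, not the authors' code: the function names are hypothetical, sequences are assumed to be uppercase DNA with 0-based coordinates, criterion (c) of the 2019 definition is interpreted as the AUG lying at least 150 nt upstream of the ppAUG, and "not contiguous with the polyprotein" is interpreted as excluding in-frame ORFs that read through into the ppORF.

```python
STOPS = {"TAA", "TAG", "TGA"}

def orf_end(dna, start):
    """Index just past the first in-frame stop codon, or None if there is none."""
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOPS:
            return i + 3
    return None

def meets_2019_criteria(dna, aug, pp_aug):
    """2019-style criteria: overlaps ppORF, out of frame, AUG >= 150 nt upstream."""
    end = orf_end(dna, aug)
    if end is None:
        return False
    overlaps = end > pp_aug                    # (a) at least 1 nt of overlap
    out_of_frame = (pp_aug - aug) % 3 != 0     # (b) not in-frame with the ppORF
    far_enough = pp_aug - aug >= 150           # (c) assumed reading of the text
    return overlaps and out_of_frame and far_enough

def meets_relaxed_criteria(dna, aug, pp_aug):
    """Relaxed criteria: peptide >= 40 aa, not contiguous, AUG >= 120 nt upstream."""
    end = orf_end(dna, aug)
    if end is None:
        return False
    peptide_len = (end - 3 - aug) // 3         # codons before the stop, Met included
    in_frame = (pp_aug - aug) % 3 == 0
    contiguous = in_frame and end > pp_aug     # reads through into the polyprotein
    return peptide_len >= 40 and not contiguous and pp_aug - aug >= 120
```

Under this reading, an ORF whose AUG lies 130 nt upstream of the ppAUG fails the 2019 criteria but can still pass the relaxed ones, which is the deletion scenario the paragraph describes.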

We used the single-linkage BLAST-based clustering algorithm BLASTCLUST [52] with parameters -p T -L 0.90 -b F -S 80 (i.e., ≥ 90% coverage of the shorter sequence, ≥ 80% amino acid identity threshold) to cluster the 9347 ppORF amino acid sequences, which resulted in 41 clusters. In each cluster, we chose a representative sequence for phylogenetic tree generation and tree annotation (S3 Fig), as follows. If a cluster contained one or more NCBI RefSeqs, we used the RefSeq with the numerically lowest accession number. Otherwise, we looked for the ppORF amino acid sequence with the most identical copies (with ties broken arbitrarily) and, of the available corresponding accession numbers, we chose the one coming first alphanumerically. If there were no duplicated ppORF amino acid sequences, we chose the centroid sequence (minimum summed pairwise identity distances from ppORF amino acid sequence i to all other ppORF amino acid sequences j within the cluster). We aligned the 41 representative ppORF amino acid sequences with MUSCLE version 3.8.31 [51] and extracted the RdRp region (N-terminally trimmed to start at INAPSKTK- in NC_002058; this removes a region around the 3CPro-RdRp junction with many alignment gaps). A phylogenetic tree was estimated for the RdRp region using the Bayesian Markov chain Monte Carlo method implemented in MrBayes version 3.2.3, sampling across the default set of fixed amino acid rate matrices, with 5,000,000 generations, discarding the first 25% as burn-in (other parameters were left at defaults). The tree was visualized with FigTree version 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/). Transmembrane helix (TMH) domains and their confidence were predicted with Phobius (https://phobius.sbc.su.se/cgi-bin/predict.pl) [10].
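The centroid rule used as the final fallback for choosing a cluster representative can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, the input sequences are assumed to be already aligned to equal length, gap columns are treated like any other character, and ties are broken by taking the alphanumerically first accession (the text does not specify tie-breaking for this step).

```python
def identity_distance(a, b):
    """1 - fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 1.0 - matches / len(a)

def centroid(seqs):
    """Return the accession with the smallest summed distance to all other members.

    seqs -- dict mapping accession -> aligned amino acid sequence
    """
    def total(acc):
        return sum(identity_distance(seqs[acc], s)
                   for other, s in seqs.items() if other != acc)
    # Iterating over sorted accessions makes min() return the first one on ties.
    return min(sorted(seqs), key=total)
```

For a toy cluster such as `{"A1": "MKTAY", "B2": "MKTAF", "C3": "MRSAF"}`, the summed distances are 0.8, 0.6 and 1.0, so `B2` is selected.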

Cells and viruses

HEK293T cells (human embryonic kidney cell line, ATCC, CRL-3216), HeLa cells (ATCC, CCL-2) and HeLa Ohio cells (derivative of HeLa cell line, ECACC, 84121901) were maintained at 37°C in Dulbecco’s Modified Eagle Medium (DMEM, Lonza) supplemented with 10% fetal bovine serum (FBS), 1 mM L-glutamine, 20 mM HEPES (pH 7.3) and Penicillin/Streptomycin (10,000 U/ml). HIEC6 cells (human intestinal epithelial cell line, ATCC, CRL-3266) were maintained in Opti-MEM (Gibco) containing 20 mM HEPES, 1 mM L-Glutamine, Penicillin/Streptomycin (10,000 U/ml), 10 ng/ml hEGF (Sigma-Aldrich), and 5% FBS. All cells tested negative for mycoplasma. Human coxsackievirus A13 (CVA13) strain Flores (EVAg, Ref-SKU: 014V-03623) was used for ribosome profiling and the development of a reverse genetics construct. Virus stocks were plaque purified and amplified using HeLa Ohio cells, clarified by centrifugation, purified through a 0.22 µm filter, and titrated on HeLa Ohio cells. Virus stocks were verified by sequencing virus-specific reverse transcribed PCR fragments, and used for ribosome profiling experiments.

Plasmids

To create the reverse genetics clone for CVA13, 5′ and 3′ terminal sequences (EVAg reference sequence AF465511.1) were used to amplify full-length genome using Phusion High-Fidelity DNA polymerase (ThermoFisher Scientific). The amplified genome was cloned into vector pBR322 under the T7 promoter. The resulting clone was sequenced and deposited in GenBank (accession number PQ067515). The uuORF-KO mutation (AUG to GUG) was introduced using site-directed mutagenesis and confirmed by sequencing. The resulting plasmids were linearized with EagI prior to T7 RNA transcription.

To evaluate the efficiency of IRES-mediated translation, a previously established pSGDLuc reporter cassette [5] was used to subclone terminal 5′ UTR sequences from CVA13 (751 nt), CVA1 (JX174177, 717 nt), and EV-A90 (JX390656, 751 nt). The cap-dependent Renilla luciferase gene was followed by an enterovirus 5′ UTR sequence and then, in one of the three frames (0, +1, +2), a 2A (StopGo) co-translational separation sequence followed by the firefly luciferase gene. The resulting plasmids were linearized with BamHI prior to T7 RNA transcription. All plasmids were sequenced using the Plasmidsaurus full-plasmid sequencing service.

RNA transcription

RNA transcription was performed using the mMESSAGE mMACHINE T7 transcription kit (Invitrogen). 10 µl transcription reactions were incubated at 37°C for 1 h, and reactions were terminated by treatment with DNase I for 15 min at 37°C. RNAs were further purified using the RNA Clean and Concentrator kit (Zymo Research) and quantified using Nanodrop.

Recovery of CVA13 virus from T7 transcripts

For virus recovery, 10⁷ HeLa Ohio cells were trypsinized, washed with PBS and electroporated with 20 µg T7 RNA in 800 µl PBS pulsed twice at 800 V and 25 µF using a Bio-Rad Gene Pulser Xcell electroporation system. The cell suspension was supplemented with 10% FBS-containing media and incubated at 37°C. After 3 h of incubation and full cell attachment, the media was replaced with serum-free media, and cells were incubated until appearance of CPE. Virus stocks were amplified on HeLa Ohio cells using MOI 0.01, cleared by centrifugation, purified through a 0.22 µm filter, titrated on HeLa Ohio cells, and used for subsequent infections. All mutant viruses were also passaged at least 3 times at low MOI (0.01). The final virus stocks were used for RNA isolation, RT-PCR, and sequencing to confirm the presence of the introduced mutation.

Reporter assay for relative IRES activity

HEK293T, HeLa and HIEC6 cells were transfected in triplicate using the method previously described [5]. Briefly, per reaction, 100 ng of purified T7 transcribed RNA combined with 1 µl Lipofectamine 2000 (Invitrogen) in 10 µl Opti-MEM (Gibco) supplemented with RNaseOUT (Invitrogen; diluted 1:1,000 in Opti-MEM) were added to a suspension of cells at a density of 5 × 10⁴ cells (HeLa) or 1 × 10⁵ cells (HEK293T, HIEC6) per well on a 96-well plate. Once transfected, cells were supplemented with 5% FBS and incubated at 37°C for 8 h. For sodium arsenite-treated samples (Fig 8H), the drug was added at the time of transfection. Cells were lysed in 100 µl Passive Lysis Buffer (Promega) and freeze-thawed. Luciferase activity was measured using the Dual Luciferase Stop & Glo Reporter Assay System (Promega) as per manufacturer’s instructions. IRES-mediated translation was calculated as the ratio of IRES-dependent translation (firefly) to cap-dependent translation (Renilla). Translation in each of the three frames (0, +1, and +2) was then normalized to Frame 0 (corresponding to ppORF translation). Three independent experiments were performed.
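The two-step normalization described above (firefly/Renilla ratio, then scaling to Frame 0) amounts to the following calculation. The function name and the dictionary layout are hypothetical, and the luminescence values in the usage note are invented purely to show the arithmetic.

```python
def relative_ires_activity(firefly, renilla):
    """Firefly/Renilla ratio per frame, expressed relative to frame "0" (ppORF).

    firefly, renilla -- dicts mapping frame label ("0", "+1", "+2") to raw signal
    """
    # Step 1: IRES-dependent (firefly) over cap-dependent (Renilla) signal.
    ratios = {frame: firefly[frame] / renilla[frame] for frame in firefly}
    # Step 2: normalize every frame to the ppORF frame.
    return {frame: ratio / ratios["0"] for frame, ratio in ratios.items()}
```

For example, firefly readings of 1000, 50 and 200 with Renilla readings of 500, 500 and 400 for frames 0, +1 and +2 give ratios of 2.0, 0.1 and 0.5, and hence relative activities of 1.0, 0.05 and 0.25.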

Virus infections and immunoblotting

HeLa Ohio cells at a density of 3.5 × 10⁵ cells per 35 mm plate were infected with CVA13 at an MOI of 10 and incubated at 37°C for 0–18 h. Lysates were analyzed by SDS-PAGE, using standard 12% SDS-PAGE to resolve virus structural proteins and 4–20% Tris-Glycine gel (NuSep) to resolve UP as previously described [5]. Structural proteins were detected using a pan-enterovirus monoclonal antibody (MA5-18206, Thermo Fisher) at 1:1,000 dilution, and cellular β-tubulin was detected using anti-tubulin antibody (ab15568, Abcam) at a 1:500 dilution. A custom rabbit polyclonal antibody raised against pre-TMH UP peptide CYHKANWIGHPVKVR (GenScript) was used at 1:200 dilution to detect CVA13 UP. Immunoblots were imaged on a LI-COR ODYSSEY CLx imager and analyzed using Image Studio version 5.2.

Culturing and infection of iPSC-derived neurons and human intestinal organoids

Preparation of iPSC-derived neurons was performed and validated as previously described [16]. Infection was performed at the indicated MOI in cortical neuron media [16]. After 1 h, the virus inoculum was removed and replaced with 50% fresh/ 50% conditioned media. At the indicated times post-infection, media samples were taken and analyzed using a plaque assay in HeLa Ohio cells. For competition experiments, the media was replaced every other day to stimulate new virus release. Media samples were also used for RNA extraction, RT-PCR, and Sanger sequencing.

Human intestinal organoids were obtained from the duodenum of patients and grown as previously described [5]. Differentiated monolayers cultured in 48-well plates were infected at the indicated MOI, and media samples were taken and analyzed by plaque assay in HeLa Ohio cells. For competition experiments, the media was replaced at 24 and 48 hpi to stimulate new virus release. Media samples were used for RNA extraction, RT-PCR, and Sanger sequencing. The differentiation of organoids was confirmed using RT-qPCR for the indicated cellular transcripts (S7 Fig).

Ribosome profiling and computational analysis of Ribo-Seq data

HeLa Ohio cells were grown on 150-mm dishes to a confluency of 80% and infected with CVA13 at an MOI of 10 to ensure synchronous infection. For lactimidomycin (LTM)-treated cells, 50 mM LTM was added 30 min before the specified time point. At 5 and 7 hpi, LTM-treated and non-treated cells were flash frozen in an ethanol/dry ice bath and lysed in the presence of 0.36 mM cycloheximide (CHX). The lysates were then processed according to previously described Ribo-Seq protocols [5,13]. An Illumina NextSeq Platform was used to sequence the prepared amplicon libraries.

The bioinformatic Ribo-Seq analysis was performed as described previously [5,13,15]. Briefly, adapter sequences were trimmed using the FASTX-Toolkit version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit) and trimmed reads shorter than 25 nt were discarded. Reads were mapped to host (Homo sapiens) and virus RNA (CVA13) using bowtie1 [53] version 0.12.947, with parameters -v 2 --best (i.e., maximum two mismatches, report best match). Mapping was performed in the following order: host rRNA, virus RNA, host RefSeq mRNA, host non-coding RNA, host genome. After calculating the read-length distributions (S5C Fig), only 27–29 nt reads were taken forward for calculating phasing (S5D Fig), and histograms of RPFs mapped to host mRNAs (S5A and S5B Fig) and the viral genome (S5E and S5F Fig). To normalize for library size in S5E and S5F Fig, reads per million mapped reads (RPM) values were calculated using the sum of positive-sense virus and host RefSeq mRNA 27–29 nt reads as the denominator. In Figs 4, S5E and S5F, a +12 nt offset was applied to the RPF 5′ end positions to give the approximate ribosomal P-site positions. To calculate the phasing (S5D Fig) and length (S5C Fig) distributions of host and virus RPFs, only RPFs whose 5′ end (+12 nt offset) mapped between the 13th nucleotide from the beginning and the 18th nucleotide from the end of coding sequences were counted; reads mapping to the dual coding region where the UP ORF overlaps the ppORF were also excluded. Histograms of host RPF 5′ end positions relative to initiation and termination sites (S5A and S5B Fig) were derived from RPFs mapping to RefSeq mRNAs with annotated coding regions ≥450 nt in length and with annotated 5′ and 3′ UTRs ≥ 60 nt in length.
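The P-site offset, phasing and RPM calculations described above can be sketched in a few lines of Python. This is a simplified illustration rather than the published pipeline: the function names and the `(five_prime_pos, length)` read representation are hypothetical, coordinates are assumed to be 0-based, and the "between the 13th nucleotide from the beginning and the 18th nucleotide from the end" window is assumed to be inclusive at both ends.

```python
from collections import Counter

P_SITE_OFFSET = 12             # nt added to the RPF 5' end, as stated in the text
READ_LENGTHS = range(27, 30)   # only 27-29 nt reads are used

def phasing(reads, cds_start, cds_end):
    """Fraction of reads in each codon phase (0, 1, 2) relative to one CDS.

    reads      -- iterable of (five_prime_pos, length) tuples
    cds_start  -- index of the first nucleotide of the start codon
    cds_end    -- index just past the last nucleotide of the CDS
    """
    counts = Counter()
    for five_prime, length in reads:
        if length not in READ_LENGTHS:
            continue
        p_site = five_prime + P_SITE_OFFSET
        # Keep positions from the 13th nt of the CDS to the 18th nt from its end.
        if cds_start + 12 <= p_site <= cds_end - 18:
            counts[(p_site - cds_start) % 3] += 1
    total = sum(counts.values())
    return {phase: (counts[phase] / total if total else 0.0) for phase in range(3)}

def rpm(count, total_mapped):
    """Reads per million mapped reads for a given denominator."""
    return count * 1_000_000 / total_mapped
```

In the study, the RPM denominator is the sum of positive-sense virus and host RefSeq mRNA 27–29 nt reads; here it is simply passed in as `total_mapped`.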

For the analysis of host uAUG:mAUG ribosome occupancy ratios, we first selected a subset of host RefSeq mRNAs that have an upstream AUG codon (uAUG) within the annotated 5′ UTR. Initiation read counts for most host mRNAs were too low to obtain meaningful results for individual transcript species. Thus, to obtain enough host mRNAs to average out sources of variation (shot noise, ligation bias, etc.), we summed over uAUGs at a range of spacings from mAUGs (even though this might affect possible artifacts relating to stacking of preinitiation scanning ribosomes behind an initiating ribosome arrested at the mAUG [14]). We avoided uAUGs very close to the 5′ end of annotated transcripts (at least 12 nt of sequence upstream of a uAUG are required for RPFs to map, and very short 5′ UTRs are also known to promote leaky scanning [47]). While the CVA13 uAUG is 157 nt 5′ of the mAUG, we allowed host uAUGs to be closer to the mAUG in order to increase the number of suitable host mRNAs. Furthermore, we reasoned that using uAUGs positioned closer to mAUGs may be more robust to concerns about alternative transcript isoforms when compared to more distally spaced uAUGs. Thus we selected host mRNAs with exactly one uAUG in the entire annotated 5′ UTR, and where that uAUG was within 200 nt upstream of the mAUG, and not in the first 30 nt of the annotated mRNA. This left 4684 out of an initial 35768 RefSeq mRNAs and ~8% of all host mRNA-mapping RPFs. For both host and virus uAUG and mAUG ribosome occupancy levels, we again used only 27–29 nt reads, and counted only reads whose 5′ end (+12 nt offset) mapped to the A of the relevant AUG codon. For the histograms of host RPF 5′ end positions relative to uAUG and mAUG sites (S4 Fig), in contrast to S5A and S5B Fig, we removed the restriction that mRNAs should have annotated coding regions ≥450 nt in length and annotated 5′ and 3′ UTRs ≥ 60 nt in length.
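As an illustration of the transcript selection and read counting just described, the following Python sketch applies the three stated filters (exactly one uAUG in the 5′ UTR, within 200 nt upstream of the mAUG, not in the first 30 nt) to a single mRNA. The function names, 0-based coordinates and the inclusive reading of "within 200 nt" and "first 30 nt" are assumptions, not the authors' implementation.

```python
def single_uaug(mrna, cds_start, max_dist=200, min_pos=30):
    """Return the 0-based position of the sole uAUG if the mRNA passes the
    filters described in the text, otherwise None.

    mrna: transcript sequence (DNA or RNA alphabet).
    cds_start: 0-based index of the A of the main AUG (mAUG).
    """
    seq = mrna.upper().replace("U", "T")
    utr = seq[:cds_start]
    # All AUG triplets in the annotated 5' UTR, in any reading frame.
    positions = [i for i in range(len(utr) - 2) if utr[i:i + 3] == "ATG"]
    if len(positions) != 1:
        return None            # require exactly one uAUG
    pos = positions[0]
    if pos < min_pos:
        return None            # uAUG lies in the first 30 nt of the mRNA
    if cds_start - pos > max_dist:
        return None            # uAUG is more than 200 nt upstream of the mAUG
    return pos


def count_at_aug(five_prime_ends, aug_pos, offset=12):
    """Count reads whose 5' end plus the 12 nt offset maps to the A of an AUG."""
    return sum(1 for p in five_prime_ends if p + offset == aug_pos)
```

A uAUG:mAUG occupancy ratio would then be the ratio of `count_at_aug` at the two positions, summed over the selected transcripts as in the text.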

Supporting information

S1 Fig. Analysis of AUG triplets in regions flanking the SL-VI AUG in 9347 enterovirus sequences.

Columns show (1) the number of sequences with the given pattern of AUG triplets in the given BLASTCLUST cluster; (2) the positions of AUG triplets in the 20 nt 5′ of the SL-VI AUG; (3) the positions of AUG triplets in the 20 nt 3′ of the SL-VI AUG; and (4) the representative sequence of the relevant BLASTCLUST cluster (as in S3 Fig). Note that different sequences in the same BLASTCLUST cluster may harbour different non-SL-VI AUG configurations, and so the representative sequence does not necessarily contain the displayed AUG configuration. Sequences containing any ambiguous nucleotide codes (e.g., “N”, “R”, etc) in the 20 nt 5′ or 20 nt 3′ of the SL-VI AUG, or sequences with incomplete coverage of this region, were removed. Sequences without any non-SL-VI AUG triplets in this region are also not shown. Note that the smallest distance between the SL-VI AUG and the ppAUG is 21 nt (in some rhinoviruses); thus none of the displayed AUG triplets corresponds to the ppAUG.

https://doi.org/10.1371/journal.ppat.1013967.s001

(DOCX)

S2 Fig. Non-SL-VI AUG ORFs with length at least 40 codons.

Amino acid sequences of all ORFs that start with a non-SL-VI AUG within the 20 nt 5′ and 20 nt 3′ of the SL-VI AUG and that fulfill the following criteria: (a) encoded peptide ≥ 40 amino acids, (b) encoded peptide not contiguous with the polyprotein peptide, and (c) distance between the non-SL-VI AUG and the ppAUG ≥ 120 nt. The sequence WIGHP that is conserved in some enterovirus UP proteins is highlighted in blue. Transmembrane helix (TMH) predictions are highlighted in red (20–50% confidence), yellow (50–80% confidence) or green (>80% confidence); underlined sequences represent predicted N-terminal signal peptides (Phobius predictions). Columns show (1) BLASTCLUST cluster; (2) NCBI accession; (3) ‘Y’ if the non-SL-VI AUG ORF meets the more stringent Lulla et al. (2019) [5] uORF criteria (i.e., if the ORF beginning at the AUG and including the first in-frame stop codon overlaps the ppORF by at least 1 nt; is not in-frame with the ppORF; and contains at least 150 nt upstream of the ppAUG), otherwise ‘N’; and (4) the amino acid sequence of the ORF. For the purpose of cross-referencing with S3 Fig, the ‘representative sequences’ for the relevant clusters are: #1 NC_001612 Enterovirus A, #2 NC_002058 Enterovirus C, #3 NC_001472 Enterovirus B, #25 NC_010415 Enterovirus J, and #40 AF326750 Enterovirus A125. Note that some of these ORFs also have an in-frame SL-VI AUG (e.g., EF015017 – here, the SL-VI AUG corresponds to the methionine at position 5).

https://doi.org/10.1371/journal.ppat.1013967.s002

(DOCX)
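The three uORF criteria quoted in the S2 Fig legend (overlap with the ppORF by at least 1 nt, a different reading frame from the ppORF, and an initiation codon at least 150 nt upstream of the ppAUG) can be expressed as a short check. The sketch below is one possible reading of those criteria; the function names, 0-based coordinates and the decision to reject ORFs with no in-frame stop codon are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_inframe_stop(seq, start):
    """0-based position of the first in-frame stop codon after `start`, or None."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def meets_uorf_criteria(genome, aug_pos, pp_aug_pos, min_upstream=150):
    """Apply the three uORF criteria to an ORF starting at aug_pos.

    aug_pos / pp_aug_pos: 0-based positions of the A of the candidate AUG
    and of the polyprotein AUG (ppAUG).
    """
    seq = genome.upper().replace("U", "T")
    stop = first_inframe_stop(seq, aug_pos)
    if stop is None:
        return False                      # assumption: an open-ended ORF is rejected
    orf_end = stop + 3                    # exclusive end, stop codon included
    overlaps_pporf = orf_end > pp_aug_pos             # overlap of at least 1 nt
    out_of_frame = (pp_aug_pos - aug_pos) % 3 != 0    # not in-frame with the ppORF
    far_enough = pp_aug_pos - aug_pos >= min_upstream
    return overlaps_pporf and out_of_frame and far_enough
```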

S3 Fig. uORF statistics for different clusters of enterovirus sequences.

The ppORF amino acid sequences of 9347 enterovirus sequences were clustered with BLASTCLUST using an 80% identity threshold, a representative sequence was selected from each of the 41 clusters, the ppORF amino acid sequences were aligned with MUSCLE, and a phylogenetic tree (left) was estimated with MrBayes (see Methods). Upstream AUG and upstream ORF statistics were calculated for each cluster. “distance ≥ 120 nt” refers to the distance between the upstream AUG and the ppAUG. “Non-SL-VI AUG uORF complements SL-VI uORF” indicates that the non-SL-VI AUG uORF and the SL-VI AUG uORF share the same termination codon [note, in the one case in the NC_001472 cluster, the non-SL-VI AUG is 3 codons 3′ of the SL-VI AUG and the resulting uORF fails the Lulla et al. (2019) [5] uORF criterion that the uORF initiation codon should be at least 150 nt upstream of the ppAUG]. “Non-SL-VI AUG uORF supplants SL-VI uORF” indicates that the non-SL-VI AUG uORF fulfills the Lulla et al. (2019) [5] uORF criteria, but the ORF beginning with the SL-VI AUG does not.

https://doi.org/10.1371/journal.ppat.1013967.s003

(DOCX)

S4 Fig. Relative RPF occupancy at upstream and main AUG codons.

(A) After first mapping RPFs to host RefSeq mRNAs, those mapping to annotated transcripts with exactly one upstream AUG (uAUG) and where the uAUG was within 200 nt upstream of the main AUG (mAUG) and not in the first 30 nt of the transcript, were selected. Next, RPFs whose 5′ end (+12 nt offset) mapped to the A of the uAUG or mAUG were quantified. Due to host-cell shut-off and the limited number of suitable host mRNAs used, the numbers of uAUG- and mAUG-mapping RPFs were often small, but useful numbers were obtained at least for the 5 h p.i. LTM libraries (marked in red). (B) As for panel A but for the CVA13 virus uAUG and mAUG codons. (C) Data from the 5 h p.i. LTM libraries in panel A, subdivided by uAUG initiation context: all contexts, strong contexts (G at −3 and +4, or A at −3), medium contexts (G at −3 or G at +4), and poor contexts (other contexts). As expected, there is a modest increase in uAUG:mAUG occupancy ratios with increasing strength of the uAUG initiation context. (D) Histograms of the 5′-end mapping positions of RPFs (no +12 nt offset applied) relative to uAUG and mAUG codons, summed over the selected set of host mRNAs. RPFs whose 5′ ends map to the 1st, 2nd or 3rd positions of codons are shown in purple, blue or yellow, respectively. In all panels, only 27–29 nt reads were used.

https://doi.org/10.1371/journal.ppat.1013967.s004

(DOCX)
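The initiation-context classes used in S4 Fig panel C can be written as a small classifier. The following Python sketch follows the definitions in the legend (strong: G at −3 and +4, or A at −3; medium: G at −3 or G at +4; poor: all other contexts), with positions numbered relative to the A of the AUG as +1. The function name and the error raised for AUGs too close to a sequence end are assumptions.

```python
def classify_context(seq, aug_pos):
    """Classify the initiation context of the AUG whose A is at 0-based aug_pos."""
    s = seq.upper().replace("U", "T")
    if aug_pos < 3 or aug_pos + 3 >= len(s):
        raise ValueError("AUG too close to a sequence end to read -3 and +4")
    minus3 = s[aug_pos - 3]   # position -3 (the A of the AUG is +1)
    plus4 = s[aug_pos + 3]    # position +4, immediately after the AUG
    if minus3 == "A" or (minus3 == "G" and plus4 == "G"):
        return "strong"
    if minus3 == "G" or plus4 == "G":
        return "medium"
    return "poor"
```

For example, `classify_context("GCCAUGG", 3)` falls in the strong class because G is present at both −3 and +4.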

S5 Fig. Assessment of ribosome profiling quality for lactimidomycin-treated (LTM) and non-treated (NT) CVA13 libraries.

(A-B) Histograms of the 5′-end mapping positions of RPFs relative to annotated initiation and termination sites, summed over all host mRNAs. Reads whose 5′ ends map to the 1st, 2nd or 3rd positions of codons are shown in purple, blue or yellow, respectively. (C) Length distributions for Ribo-Seq reads mapping to virus (red) or host mRNA (purple) coding regions; upper panels – LTM, lower panels – NT. (D) Phasing of 5′ ends of Ribo-Seq reads that map to the virus or host mRNA coding regions. (E-F) Ribosome profiles for the CVA13 genome at 5 and 7 hpi for cells treated with lactimidomycin (E) or untreated (F). Histograms of the 5′-end mapping positions of RPFs, with a +12 nt offset to map the approximate P-site position, in reads per million mapped reads (RPM), smoothed with a 15-nucleotide running mean filter. For panels A, B, D, E and F, only 27–29 nt reads were used.

https://doi.org/10.1371/journal.ppat.1013967.s005

(DOCX)
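The 15-nucleotide running mean applied to the ribosome profiles in S5 Fig panels E and F can be sketched as follows. The legend does not specify whether the window is centered or how the genome ends are handled, so the centered window and the shortened windows at the edges used here are assumptions.

```python
def running_mean(values, window=15):
    """Centered running mean over per-nucleotide values (e.g., RPM per position).

    Near the ends of the sequence the window is truncated to the positions
    that exist, so the output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```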

S6 Fig. Enterovirus CVA1 sequences that encode a UP-like protein from an alternative upstream AUG codon.

(A) Nucleotide (left) and uuORF-encoded protein (right) sequences in enterovirus CVA1, where the uORF is truncated and the uuORF can potentially rescue UP expression. Transmembrane helix (TMH) predictions are highlighted in yellow (50–80% confidence) or green (>80% confidence). Depending on the frame used, ORFs are highlighted in blue (uuORF), and purple or orange (uORF). (B) Schematic representation of the CVA1 IRES dVI region with the uuORF (blue), uORF and ppORF (orange) start and stop codons annotated. (C) Schematic representation of the CVA13, CVA1, and EV-A90 reporters used to measure translation in three frames (Fig 5B–5D). The 5′ UTR from each of the three viruses was inserted into dual luciferase-expressing constructs (Fig 5A) so that the WT sequence (including start and stop codons) is preserved up to the ppORF start codon. The ppORF sequence is then replaced by firefly luciferase in all three frames.

https://doi.org/10.1371/journal.ppat.1013967.s006

(DOCX)

S7 Fig. Validation of organoid and neuron infection experiments.

(A) The growth of CVA13 viruses in the organoid line derived from patient 2. (B) Representative sequencing chromatograms of RT-PCR products from viruses derived from the final time points in Fig 7A and 7C. (C) Differentiation of human intestinal organoid cultures. Results of qRT-PCR showing fold change in transcript levels in differentiated duodenum organoids (days 1–4) relative to transcript levels in undifferentiated organoids (day 0). The GAPDH transcript was used for normalization. LGR5, a stem cell marker, leucine-rich repeat-containing G-protein coupled receptor 5; ALP, a mature enterocyte marker, alkaline phosphatase; Villin, an epithelial cell marker. Plotted data represent means ± s.d.; n = 3.

https://doi.org/10.1371/journal.ppat.1013967.s007

(DOCX)

S8 Fig. Full set of expression assays performed for 5′UTR and 5′-3′UTR reporters.

Analysis of IRES activities for CVA13 5′-3′ UTR reporters in the three frames in HeLa and intestinal HIEC6 cells, with and without virus RNA, at 8 h.p.t. Statistical analysis was conducted using two-tailed t-tests; * p value ≤ 0.05; ns, not significant.

https://doi.org/10.1371/journal.ppat.1013967.s008

(DOCX)

S9 Fig. Mapping positions of short ORFs in SL-VI region.

A region from 20 nt 5′ of the SL-VI AUG to 30 nt 3′ of the SL-VI AUG was extracted. All AUG codons are shown in the three reading frames. Stop codons (STP) are only shown if they terminate an ORF initiated by one (or more) of the displayed AUGs. Other nucleotides are shown as “n” if they are part of one of these ORFs, otherwise as “.”. Sequences with incomplete coverage of this region, or any ambiguous nucleotide codes (“R”, “N”, etc.) in this region were removed, leaving 9333 sequences. Number of sequences (#seqs) lists the number of sequences with the given configuration of displayed AUGs, stops, and short ORFs.

https://doi.org/10.1371/journal.ppat.1013967.s009

(DOCX)

S10 Fig. Supplementary data for Fig 8.

(A-B) Schematic representation of the CVA1 (A) and EV-A90 (B) reporters and their AUU mutants used to measure translation in three frames (Fig 8E–8F). (C) Cell viability during sodium arsenite (NaAs) treatment (8 hours post-transfection, 8 hours post-treatment). The data are normalized to the untreated transfected control (0 μM), and presented as mean ± SEM (n = 3).

https://doi.org/10.1371/journal.ppat.1013967.s010

(DOCX)

S1 Table. Metadata associated with E. alphacoxsackie species.

https://doi.org/10.1371/journal.ppat.1013967.s011

(DOCX)

S2 Table. Metadata associated with E. coxsackiepol species.

https://doi.org/10.1371/journal.ppat.1013967.s012

(DOCX)

S3 Table. Host and virus read counts for Ribo-Seq samples.

https://doi.org/10.1371/journal.ppat.1013967.s013

(DOCX)

S1 Data. Source data file.

Source data for Figs 5–8.

https://doi.org/10.1371/journal.ppat.1013967.s014

(XLSX)

Acknowledgments

The authors thank members of the Lulla lab for helpful discussions. We thank Ulrich Desselberger for critical reading of the manuscript. We thank Cambridge Genomic Services for high-throughput sequencing. The CVA13 isolate (Flores) was provided by the European Virus Archive goes Global (EVAg) project funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 871029.

References

  1. Pons-Salort M, Parker EPK, Grassly NC. The epidemiology of non-polio enteroviruses: recent advances and outstanding questions. Curr Opin Infect Dis. 2015;28(5):479–87. pmid:26203854
  2. Baggen J, Thibaut HJ, Strating JRPM, van Kuppeveld FJM. The life cycle of non-polio enteroviruses and how to target it. Nat Rev Microbiol. 2018;16(6):368–81. pmid:29626210
  3. Suresh S, Forgie S, Robinson J. Non-polio enterovirus detection with acute flaccid paralysis: a systematic review. J Med Virol. 2018;90(1):3–7. pmid:28857219
  4. Lulla V, Sridhar A. Understanding neurotropic enteric viruses: routes of infection and mechanisms of attenuation. Cell Mol Life Sci. 2024;81(1):413. pmid:39365457
  5. Lulla V, Dinan AM, Hosmillo M, Chaudhry Y, Sherry L, Irigoyen N, et al. An upstream protein-coding region in enteroviruses modulates virus infection in gut epithelial cells. Nat Microbiol. 2019;4(2):280–92. pmid:30478287
  6. Guo H, Li Y, Liu G, Jiang Y, Shen S, Bi R, et al. A second open reading frame in human enterovirus determines viral replication in intestinal epithelial cells. Nat Commun. 2019;10(1):4066. pmid:31492846
  7. Sweeney TR, Abaeva IS, Pestova TV, Hellen CUT. The mechanism of translation initiation on Type 1 picornavirus IRESs. EMBO J. 2014;33(1):76–92. pmid:24357634
  8. Hellen CU, Pestova TV, Wimmer E. Effect of mutations downstream of the internal ribosome entry site on initiation of poliovirus protein synthesis. J Virol. 1994;68(10):6312–22. pmid:8083971
  9. Pelletier J, Flynn ME, Kaplan G, Racaniello V, Sonenberg N. Mutational analysis of upstream AUG codons of poliovirus RNA. J Virol. 1988;62(12):4486–92. pmid:2846865
  10. Käll L, Krogh A, Sonnhammer ELL. Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server. Nucleic Acids Res. 2007;35:W429–32. pmid:17483518
  11. Oberste MS, Maher K, Pallansch MA. Complete genome sequences for nine simian enteroviruses. J Gen Virol. 2007;88(Pt 12):3360–72. pmid:18024906
  12. Lee S, Liu B, Lee S, Huang S-X, Shen B, Qian S-B. Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution. Proc Natl Acad Sci U S A. 2012;109(37):E2424–32. pmid:22927429
  13. Lulla V, Firth AE. A hidden gene in astroviruses encodes a viroporin. Nat Commun. 2020;11(1):4070. pmid:32792502
  14. Andreev DE, O’Connor PBF, Loughran G, Dmitriev SE, Baranov PV, Shatsky IN. Insights into the mechanisms of eukaryotic translation gained with ribosome profiling. Nucleic Acids Res. 2017;45(2):513–26. pmid:27923997
  15. Irigoyen N, Firth AE, Jones JD, Chung BY-W, Siddell SG, Brierley I. High-resolution analysis of coronavirus gene expression by RNA sequencing and ribosome profiling. PLoS Pathog. 2016;12(2):e1005473. pmid:26919232
  16. Ali H, Lulla A, Nicholson AS, Hankinson J, Wignall-Fleming EB, O’Connor RL, et al. Attenuation hotspots in neurotropic human astroviruses. PLoS Biol. 2023;21(7):e3001815. pmid:37459343
  17. Fernandopulle MS, Prestil R, Grunseich C, Wang C, Gan L, Ward ME. Transcription factor-mediated differentiation of human iPSCs into neurons. Curr Protoc Cell Biol. 2018;79(1):e51. pmid:29924488
  18. Zoll J, Heus HA, van Kuppeveld FJM, Melchers WJG. The structure-function relationship of the enterovirus 3’-UTR. Virus Res. 2009;139(2):209–16. pmid:18706945
  19. Florez de Sessions P, Dobrikova E, Gromeier M. Genetic adaptation to untranslated region-mediated enterovirus growth deficits by mutations in the nonstructural proteins 3AB and 3CD. J Virol. 2007;81(16):8396–405. pmid:17537861
  20. Hambidge SJ, Sarnow P. Translational enhancement of the poliovirus 5’ noncoding region mediated by virus-encoded polypeptide 2A. Proc Natl Acad Sci U S A. 1992;89(21):10272–6. pmid:1332040
  21. Pestova TV, Hellen CU, Wimmer E. A conserved AUG triplet in the 5’ nontranslated region of poliovirus can function as an initiation codon in vitro and in vivo. Virology. 1994;204(2):729–37. pmid:7941341
  22. Ait Ghezala H, Jolles B, Salhi S, Castrillo K, Carpentier W, Cagnard N, et al. Translation termination efficiency modulates ATF4 response by regulating ATF4 mRNA translation at 5’ short ORFs. Nucleic Acids Res. 2012;40(19):9557–70. pmid:22904092
  23. Gradi A, Svitkin YV, Imataka H, Sonenberg N. Proteolysis of human eukaryotic translation initiation factor eIF4GII, but not eIF4GI, coincides with the shutoff of host protein synthesis after poliovirus infection. Proc Natl Acad Sci U S A. 1998;95(19):11089–94. pmid:9736694
  24. Pestova TV, Kolupaeva VG, Lomakin IB, Pilipenko EV, Shatsky IN, Agol VI, et al. Molecular mechanisms of translation initiation in eukaryotes. Proc Natl Acad Sci U S A. 2001;98(13):7029–36. pmid:11416183
  25. de Breyne S, Yu Y, Unbehaun A, Pestova TV, Hellen CUT. Direct functional interaction of initiation factor eIF4G with type 1 internal ribosomal entry sites. Proc Natl Acad Sci U S A. 2009;106(23):9197–202. pmid:19470487
  26. Zhang H, Wang Y, Wu X, Tang X, Wu C, Lu J. Determinants of genome-wide distribution and evolution of uORFs in eukaryotes. Nat Commun. 2021;12(1):1076. pmid:33597535
  27. Young SK, Wek RC. Upstream open reading frames differentially regulate gene-specific translation in the integrated stress response. J Biol Chem. 2016;291(33):16927–35. pmid:27358398
  28. Elbirt KK, Whitmarsh AJ, Davis RJ, Bonkovsky HL. Mechanism of sodium arsenite-mediated induction of heme oxygenase-1 in hepatoma cells. Role of mitogen-activated protein kinases. J Biol Chem. 1998;273(15):8922–31. pmid:9535875
  29. Lee K-M, Chen C-J, Shih S-R. Regulation mechanisms of viral IRES-driven translation. Trends Microbiol. 2017;25(7):546–61. pmid:28242053
  30. Finkel Y, Stern-Ginossar N, Schwartz M. Viral short ORFs and their possible functions. Proteomics. 2018;18(10):e1700255. pmid:29150926
  31. Firth AE. Mapping overlapping functional elements embedded within the protein-coding regions of RNA viruses. Nucleic Acids Res. 2014;42(20):12425–39. pmid:25326325
  32. Zhang H, Dou S, He F, Luo J, Wei L, Lu J. Genome-wide maps of ribosomal occupancy provide insights into adaptive evolution and regulatory roles of uORFs during Drosophila development. PLoS Biol. 2018;16(7):e2003903. pmid:30028832
  33. Aspden JL, Eyre-Walker YC, Phillips RJ, Amin U, Mumtaz MAS, Brocard M, et al. Extensive translation of small Open Reading Frames revealed by Poly-Ribo-Seq. Elife. 2014;3:e03528. pmid:25144939
  34. Racaniello VR. One hundred years of poliovirus pathogenesis. Virology. 2006;344(1):9–16. pmid:16364730
  35. Slobodskaya OR, Gmyl AP, Maslova SV, Tolskaya EA, Viktorova EG, Agol VI. Poliovirus neurovirulence correlates with the presence of a cryptic AUG upstream of the initiator codon. Virology. 1996;221(1):141–50. pmid:8661422
  36. La Monica N, Meriam C, Racaniello VR. Mapping of sequences required for mouse neurovirulence of poliovirus type 2 Lansing. J Virol. 1986;57(2):515–25. pmid:3003384
  37. Wu G-H, Lee K-M, Kao C-Y, Shih S-R. The internal ribosome entry site determines the neurotropic potential of enterovirus A71. Microbes Infect. 2023;25(5):105107. pmid:36708870
  38. Lefèvre C, Cook GM, Dinan AM, et al. Zika viruses encode multiple upstream open reading frames in the 5′ viral region with a role in neurotropism. bioRxiv. 2023:112904.
  39. Murphy JC, Harrington EM, Schumann S, Vasconcelos EJR, Mottram TJ, Harper KL, et al. Kaposi’s sarcoma-associated herpesvirus induces specialised ribosomes to efficiently translate viral lytic mRNAs. Nat Commun. 2023;14(1):300. pmid:36653366
  40. Calvo SE, Pagliarini DJ, Mootha VK. Upstream open reading frames cause widespread reduction of protein expression and are polymorphic among humans. Proc Natl Acad Sci U S A. 2009;106(18):7507–12. pmid:19372376
  41. Lefèvre C, Cook GM, Dinan AM, Torii S, Stewart H, Gibbons G, et al. Zika viruses encode 5’ upstream open reading frames affecting infection of human brain cells. Nat Commun. 2024;15(1):8822. pmid:39394194
  42. Wu H-Y, Guan B-J, Su Y-P, Fan Y-H, Brian DA. Reselection of a genomic upstream open reading frame in mouse hepatitis coronavirus 5’-untranslated-region mutants. J Virol. 2014;88(2):846–58. pmid:24173235
  43. Shabman RS, Hoenen T, Groseth A, Jabado O, Binning JM, Amarasinghe GK, et al. An upstream open reading frame modulates ebola virus polymerase translation and virus replication. PLoS Pathog. 2013;9(1):e1003147. pmid:23382680
  44. Krummheuer J, Johnson AT, Hauber I, Kammler S, Anderson JL, Hauber J, et al. A minimal uORF within the HIV-1 vpu leader allows efficient translation initiation at the downstream env AUG. Virology. 2007;363(2):261–71. pmid:17331561
  45. Szamecz B, Rutkai E, Cuchalová L, Munzarová V, Herrmannová A, Nielsen KH, et al. eIF3a cooperates with sequences 5’ of uORF1 to promote resumption of scanning by post-termination ribosomes for reinitiation on GCN4 mRNA. Genes Dev. 2008;22(17):2414–25. pmid:18765792
  46. Vattem KM, Wek RC. Reinitiation involving upstream ORFs regulates ATF4 mRNA translation in mammalian cells. Proc Natl Acad Sci U S A. 2004;101(31):11269–74. pmid:15277680
  47. Kozak M. A short leader sequence impairs the fidelity of initiation by eukaryotic ribosomes. Gene Expr. 1991;1(2):111–5. pmid:1820208
  48. Sweeney TR, Abaeva IS, Pestova TV, Hellen CUT. The mechanism of translation initiation on Type 1 picornavirus IRESs. EMBO J. 2014;33(1):76–92. pmid:24357634
  49. Imai R, Rongduo W, Kaixin L, Borjigin S, Matsumura H, Masuda T, et al. Novel recombinant porcine enterovirus G viruses lacking structural proteins are maintained in pig farms in Japan. J Vet Med Sci. 2023;85(2):252–65. pmid:36543238
  50. Imai R, Nagai M, Oba M, Sakaguchi S, Ujike M, Kimura R, et al. A novel defective recombinant porcine enterovirus G virus carrying a porcine torovirus papain-like cysteine protease gene and a putative anti-apoptosis gene in place of viral structural protein genes. Infect Genet Evol. 2019;75:103975. pmid:31344488
  51. Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004;5:113. pmid:15318951
  52. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10. pmid:2231712
  53. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25. pmid:19261174