Skip to main content
Advertisement
  • Loading metrics

The transcriptional and translational landscape of HCoV-OC43 infection

Abstract

The coronavirus HCoV-OC43 circulates continuously in the human population and is a frequent cause of the common cold. Here, we generated a high-resolution atlas of the transcriptional and translational landscape of OC43 during a time course following infection of human lung fibroblasts. Using ribosome profiling, we quantified the relative expression of the canonical open reading frames (ORFs) and identified previously unannotated ORFs. These included several potential short upstream ORFs and a putative ORF nested inside the M gene. In parallel, we analyzed the cellular response to infection. Endoplasmic reticulum (ER) stress response genes were transcriptionally and translationally induced beginning 12 and 18 hours post infection, respectively. By contrast, conventional antiviral genes mostly remained quiescent. At the same time points, we observed accumulation and increased translation of noncoding transcripts normally targeted by nonsense mediated decay (NMD), suggesting NMD is suppressed during the course of infection. This work provides resources for deeper understanding of OC43 gene expression and the cellular responses during infection.

Author summary

RNA viruses are under constant evolutionary pressure to maintain compact and highly-streamlined genomes. Consequently, these viruses frequently employ unusual gene expression strategies to maximize the coding potential of their relatively small genomes. In this work, we investigated the gene expression of the human coronavirus HCoV-OC43. OC43 generally causes mild common cold symptoms, but the species is closely related to more pathogenic coronaviruses such as SARS-CoV-2. Here, we used ribosome profiling and RNA sequencing to determine the expression kinetics and translation efficiencies of both viral and cellular genes. We uncovered several unexpected features in the OC43 genome, including short open reading frames upstream of major genes, and a putative alternative translation start site within the M gene. In parallel, we surveyed the cellular response to OC43 infection. Genes associated with the endoplasmic reticulum (ER) stress response were strongly induced late in infection. By contrast, conventional antiviral genes remained largely unaltered throughout the course of infection, suggesting OC43 effectively evades or disables the host antiviral response.

Introduction

To date, seven coronaviruses have been found to infect humans and cause respiratory disease, ranging from mild cold to severe pneumonia. These seven coronaviruses are broadly grouped into two distinct lineages: the alphacoronaviruses (HCoV-229E and HCoV-NL63) and betacoronaviruses (HCoV-OC43, HCoV-HKU1, SARS-CoV, MERS-CoV, and SARS-CoV-2). Each virus has a distinct complement of accessory genes and they frequently use different receptors for entry [1]. Despite these differences, the core replication mechanisms are highly conserved [2]. During early infection, the incoming, positive-strand genomic RNA (+)gRNA is the only viral genetic material (S1A Fig). The (+)gRNA is initially translated to generate the large polyproteins ORF1A and ORF1AB, the latter regulated by a −1 frameshift element that allows ribosomes to continue translating past the stop codon at the end of ORF1A. The polyproteins are cleaved by two viral proteases to yield a set of “non-structural proteins” (NSP1 through NSP16), including the RNA processing enzymes and components of the viral RNA-dependent RNA polymerase (RdRp). The viral RdRp uses the (+)gRNA to synthesize a negative-strand intermediate, (−)gRNA, which in turn serves as a template for synthesis of additional (+)gRNAs. In addition, RdRp synthesizes shorter negative-strand subgenomic RNAs, (−)sgRNAs, by prematurely disengaging from the template strand before resuming transcription at a downstream site (S1B Fig). This process, known as discontinuous transcription, is guided by the presence of transcription regulatory sequences (TRSs). The resulting (−)sgRNAs serve as templates for synthesis of plus-strand subgenomic RNAs, (+)sgRNAs. These act as monocistronic mRNAs for translation of one or more accessory proteins and several conserved structural proteins—spike (S), envelope (E), membrane (M), and nucleocapsid (N).

Coronaviruses also encode unconventional polypeptides; these include short upstream ORFs (uORFs) with possible regulatory roles, in-frame internal ORFs (iORFs) generating N-terminally truncated proteins, and out-of-frame iORFs producing novel, alternate proteins [35]. Collectively, these ‘noncanonical’ proteins increase the coding capacity of an otherwise limited genome. One study identified 23 previously unannotated ORFs across the SARS-CoV-2 genome [4], greatly expanding the protein repertoire of the virus. At least one noncanonical protein has been shown to have a clear functional role. The SARS-CoV-2 protein ORF9B, encoded by an out-of-frame iORF nested inside the N gene, antagonizes MAVS-mediated innate immune signaling through its interactions with TOM70 [68]. Notably, ORF9B is conserved throughout Betacoronaviridae, suggesting noncanonical translation is a general feature of coronavirus genomes.

Like all viruses, coronavirus is completely dependent on the cellular translation machinery for the synthesis of viral proteins. Upon infection, SARS-CoV-2 blocks cellular protein synthesis and hijacks the translation apparatus for its own use [911]. At the same time, infected cells attempt to suppress bulk translation while selectively translating antiviral genes [1214]. This tug-of-war battle plays a key role in SARS-CoV-2 infection and pathogenesis, but remains less well-characterized for other coronavirus species [3,5,9,15,16].

To address this gap, we comprehensively analyzed the transcriptional and translational landscape of HCoV-OC43 infection. We identified numerous noncanonical transcripts and several potentially novel ORFs distributed across the viral genome. We also show that infected cells are able to mount a transcriptional and translational response to virus-induced ER stress, but classical antiviral pathways largely remain suppressed. At the same time, infected cells show increased translation of noncoding RNAs normally targeted by nonsense mediated decay.

Results

Experimental design

To better understand the course of coronavirus infection and the host responses, we followed OC43 transcription and translation, together with cellular gene expression over a detailed time course (Fig 1A). To this end, we infected human lung fibroblast MRC-5 cells with OC43 at a multiplicity of infection (MOI) of 7 (Figs 1A, S2A, and S2B). Following a 1hr adsorption, the viral inoculum was removed, and samples were collected at regular intervals out to 30 hours post infection (hpi). Samples were harvested for RNAseq at each timepoint, and for riboseq analysis at most timepoints. In parallel, we ran mock infections using heat-inactivated virus. Virus-infected cells remained intact throughout the time course, but showed a reduced growth rate compared to mock-infected cells (S2C and S2D Fig). Notably, we observed some heterogeneity in the speed with which cells reached a high level of infection (S2C Fig), consistent with previous observations for SARS-CoV-2 [17].

thumbnail
Fig 1. Experimental design and overview of viral transcription.

(a) Overall design of the study. MRC-5 lung fibroblast cells were infected with OC43 virus at 33°C at a MOI of 7. RNA samples were collected prior to infection (−1) and at 1.5, 3, 6, 9, 12, 18, 24, and 30 hours post infection (hpi). Supernatants were also collected to assay for virion release (not shown). At most timepoints, samples were also prepared for cycloheximide (CHX)-based Riboseq. Harringtonine (HAR)-based Riboseq samples were prepared only from uninfected cells and 6 hpi. Alternatively, cells were inoculated with heat-inactivated virus, and RNA samples were analyzed at 3, 6, and 18 hpi. (b) The relative proportion of human and viral RNA throughout the course of infection. H.I. = heat inactivated. (c) Plot showing the expression kinetics of viral gRNA throughout the time course. (d) Estimated virion release per cell per hour quantified using RT-qPCR for the viral genome. Error bars show standard deviation of the mean (n = 3). (e) Genome coverage tracks showing RNAseq reads across the OC43 genome. The numbers to the right indicate the scale in RPM. The structure of the full-length genomic RNA is shown below. A TRS-Body (TRS-B) is present just upstream of each of the seven ORFs in the 3ʹ end of the genome. The leader TRS (TRS-L) is located at the 5ʹ end of the genome. Left: The 5ʹ end of the genome is enlarged for clarity. (f) Coronavirus transcription/replication cycle. The template switch event is represented by the curved grey line, and the resulting junction sites are identified by sequence reads spanning the junction. (g) Schematic representation of the canonical (+)sgRNAs. The leader sequence is shown in black.

https://doi.org/10.1371/journal.ppat.1012831.g001

OC43 replication and transcription kinetics

A summary of the sequencing results across the time course is presented in Fig 1B. At later time points, viral RNA, including both genomic RNA (gRNA) and subgenomic mRNAs (sgRNAs), comprised ~65% of all sequencing reads (Fig 1B). By contrast, cells exposed to heat-inactivated virus showed no evidence of infection. For infected cells, intracellular levels of viral gRNA increased exponentially between 3–12 hpi, before reaching steady-state between 12–18 hpi (Fig 1C). This plateau phase coincided with a sharp increase in viral genomes in the supernatant, suggesting that by 12–18 hpi the synthesis of new viral genomes is offset by export of mature virus particles from the cell. We measured the concentration of virions in the media using RT-qPCR, and, after accounting for the number of cells in the dish, were able to estimate virion production per cell throughout the infection cycle (Fig 1D). Remarkably, the average cell released ~1000 virions per hour between 12 and 24 hpi.

We next examined the distribution of sequencing reads across the viral genome (Fig 1E). At the earliest time point (1.5 hpi; top track), coverage was fairly even, indicating that the viral RNA population mainly comprised full-length gRNA derived from the initial inoculum. By 6 hpi, most reads mapped to the 3ʹ end of the viral genome, consistent with the onset of sgRNA production [5,18]. To quantify the abundance of different sgRNAs, we developed a pipeline to identify transcription junction sites. Coronavirus RdRp undergoes discontinuous transcription, in which ‘jumping’ during (−) strand sgRNA synthesis creates junctions between distant regions of the primary sequence. A second round of transcription then generates the (+)sgRNA complement. Junctions can be discriminated and quantified by sequencing and counting the number of reads spanning individual junctions across (+)sgRNAs [18] (Fig 1F). This approach identified seven major sgRNAs, corresponding to the seven distinct ORFs in the 3ʹ region of the OC43 genome (shown schematically in Fig 1G).

Each of the seven major sgRNAs consisted of a 56–58 nt leader sequence, derived from the 5ʹ end of the genome, fused to a downstream open reading frame. Normalized read counts for individual sgRNAs were highly reproducible between replicates (Fig 2A). The different sgRNAs showed similar expression kinetics and reached plateau between 9–12 hpi (Figs 2B and S3A). As expected, N sgRNA was by far the most abundant, comprising 60–80% of all viral RNA (Fig 2C).

thumbnail
Fig 2. OC43 replication kinetics and subgenomic RNA expression.

(a) Reproducibility of sgRNA mapping for two 30 hpi timepoints. Each dot represents the normalized read counts of a junction in replicates 1 (x-axis) and 2 (y-axis). Noncanonical transcript junctions are shown in grey. (b) Plots showing the expression kinetics of genomic RNA and individual sgRNAs throughout the time course. Lines show the average across 3–4 replicates. (c) Histogram indicating the relative abundance of full-length genomic RNA and the seven sgRNAs at each time point. (d) Sequence context of the transcription junction site for the canonical sgRNA species. Green residues are complementary to the TRS-L, and red residues show mismatches with the TRS-L. Yellow triangles indicate the junction site. (e) The consensus TRS-B sequence. (f) Schematic representation of the 5ʹ UTR regions for each canonical sgRNA. (g) Sashimi plots mapping discontinuous transcription events at 30 hpi. Leader-body canonical junctions are shown in blue, leader-body noncanonical junctions are shown in green, and body-body noncanonical junctions are shown in red.

https://doi.org/10.1371/journal.ppat.1012831.g002

Discontinuous transcription is driven by the presence of TRS motifs, located at the 5ʹ end of the genome (TRS-L) and directly preceding each viral ORF (TRS-Bs) (Fig 1E). Discontinuous transcription is thought to occur only during minus strand synthesis, when the nascent anti-TRS-B element pairs with the complementary sequence at TRS-L in the template strand, facilitating a template-switch by RdRp (S1B Fig) [19]. For this reason, jumping efficiency at each TRS-B is partly determined by how closely it matches TRS-L [2]. Analysis of the canonical OC43 sgRNA junctions identified a consensus TRS motif of AAUCUAAAC (Fig 2D and 2E), consistent with previous in silico predictions for OC43 [20]. For five of the seven sgRNAs, the TRS-B motif was located just upstream of the start codon, resulting in relatively short 5ʹUTRs consisting almost entirely of the 5ʹ leader sequence (Fig 2F). The NS5A and E sgRNAs were different, with junction sites much further upstream of the start codon, 114 and 132 nt respectively. The resulting 5ʹUTRs incorporate short uORFs which show evidence of translation (see below).

Collectively, the seven ‘canonical’ sgRNAs comprised ~95% of all junction-spanning reads (Fig 2G). The remaining reads were divided amongst 265 noncanonical sgRNAs, most of which lacked any substantial coding potential (S3B and S3C Fig) and probably represent transcriptional noise. On average, the noncanonical transcripts had weaker donor sites (i.e., TRS-B motifs) compared to the canonical sgRNAs (S3D Fig), consistent with their much lower accumulation. There were, however, some exceptions; for instance, the TRS-B for the HE gene was weaker than some noncanonical TRS-Bs, suggesting that additional factors contribute to jumping efficiency at particular sites. Interestingly, nearly all (99.8%) sgRNAs transcripts used the correct acceptor site (Fig 2G). This suggests that selection of the single acceptor site is less error-prone than selection of the multiple donor sites.

Quality control for cellular and viral Riboseq

Next, we employed ribosome profiling to characterize the OC43 translatome. We used two different translational inhibitors for our ribosome profiling experiments; cycloheximide (CHX) arrests elongating ribosomes, whereas harringtonine (HAR) stalls ribosomes transitioning between initiation and elongation (Fig 3A and 3B). When combined with runoff of already-elongating ribosomes, HAR treatment can be used to map translation start sites [21]. In both variants, the lysate is treated with RNase I to generate ~26–34 nt mRNA fragments corresponding to the footprint of the translating ribosome, also known as ribosome protected fragments (RPFs) [22]. After library preparation, RPFs can be mapped to the human and viral genomes to identify translation initiation sites (HAR) and evaluate translation efficiency (CHX).

thumbnail
Fig 3. OC43 translation and protein expression.

(a and b) Schematic outline of CHX- and HAR-based ribosome profiling. See text for details. Right: Polysome profiles following treatment with either CHX or HAR. With CHX treatment most ribosomes are found in polysomes, whereas after 10m HAR treatment to specifically block at initiation only monosomes remain. Polysome profiles were generated from uninfected cells. (c) Distribution of CHX RPFs (blue) and HAR RPFs (pink) across the viral N gene at 6 hpi. Total RNA (black) is included as a control. (d) A log-scale representation of CHX RPFs across the entire viral genome. The scale bar below the annotation highlights the regions displayed in (f). (e) Comparison between RNA abundance and translational output for viral and cellular genes. (f) Distribution of HAR RPFs across the sgRNA region of the genome. Prominent peaks are annotated according to their probable function. (g) A close-up view of the HAR RPF peak internal to the M gene. The inferred P-site mapping is shown below. Below: the codon and amino acid sequence for the full length M gene and the putative truncation variant. (h) A close-up view of the HAR RPF peaks across the NS5A and E genes. (i and j) P-site mapping for CHX RPFs across the first NS5A uORF (i) and E uORF (j). Inset: Summed read-frame frequencies across the entire uORF. Error bars show standard deviation of the mean (n = 4). Below: the codon and amino acid sequence for each uORF. (k) Schematic outline of the viral 5ʹUTR-luciferase reporter constructs. Each construct included the indicated viral 5ʹUTR and first 18 nt of the corresponding viral CDS fused to firefly luciferase (FLuc). (l) Bar graph showing the RNA abundance (left) and luciferase activity (right) for each reporter construct. Each FLuc construct was cotransfected with Renilla luciferase (RLuc) for normalization. Error bars show the standard error around the mean (n = 3). Asterisks show statistically significant difference relative to the NS2A reporter: * p < 0.05, ** p < 0.01.

https://doi.org/10.1371/journal.ppat.1012831.g003

Riboseq reads mapping to cellular CDS regions peaked at 31–32 nt (S4A Fig), somewhat longer than the canonical footprint size of 28 nt [22]. Even so, these longer reads showed clear evidence of triplet periodicity (S4B Fig). Inferred locations of the corresponding ribosomal P-sites aligned with the first position within the codon (‘frame 1’), indicating faithful recovery of ribosome occupancy. We also examined the distribution of reads between cellular 5ʹUTR, CDS, and 3ʹUTR regions (S4C and S4D Fig). As expected, CHX riboseq reads mapped predominantly to CDS, whereas HAR riboseq reads spanned the 5ʹUTR and CDS. At later timepoints we saw increased coverage across 3ʹUTRs in CHX. For most genes the effect was modest, but some exceptions are described below. Taken together, these observations indicate that a large proportion of reads mapped to cellular mRNAs in each dataset represent genuine RPFs.

The data for reads mapped to viral RNAs were more complex. At 6 hpi, CHX RPFs showed relatively even coverage across viral ORFs and were largely excluded from the 3ʹUTR (Figs 3C and S5AS5D). In addition, reads mapping to ORF1A displayed clear triplet periodicity (S5E Fig). Unfortunately, this periodicity was lost at subsequent timepoints, and read coverage became increasingly erratic. We observed a large increase in reads mapping to ORF1B (S5B Fig), which should be translated only at low levels, and the viral 3ʹUTR (S5C and S5D Fig), which should not be translated at all. We conclude that many of the virus-mapping reads from 12–30 hpi are not genuine RPFs, and perhaps represent packaging intermediates, as previously suggested for SARS-CoV-2 [4]. We therefore excluded later timepoints from our analysis of viral translation and focused on the 6 hpi timepoint, prior to substantial virion production (Fig 1D).

Quantitative analysis of viral translation

A genome-wide view of viral translation at 6 hpi showed relatively low ribosome density across ORF1AB, and increased density throughout the subgenomic region (Fig 3D), reflecting the high translation rates of subgenomic mRNAs. Surprisingly, RPF coverage over the ORF1A gene was highly uneven (S6A and S6B Fig); ribosome density across the first ~2.7 kb was up to 10-fold higher than the rest of ORF1A. This may reflect either slow elongation or elevated rates of premature translation termination within this region (see Discussion).

As described above, some proportion of ribosomes terminate at the end of ORF1A while the remainder undergo a −1 frameshift and continue translating to the end of ORF1B. To calculate the “frameshift efficiency”, we divided the footprint density in ORF1B by the density in ORF1A (excluding the region of elevated ribosome density across NSP1 and NSP2). We estimate a frameshift efficiency of 56 ± 16%, similar to values measured previously for SARS-CoV-2 and MHV [4,5].

To determine the translation efficiency of viral mRNAs, we compared ribosome density to transcript abundance (Fig 3E). For most viral ORFs, translation rate correlated well with RNA abundance. The main outliers were ORF1A and ORF1B, which showed considerably lower translation than expected based on their RNA abundance. This is presumably because some proportion of full-length genomic RNA is used for replication or packaging, and thus not accessible to the translational machinery.

Noncanonical translation across the viral genome

Next, we used our HAR riboseq library to catalog translation initiation sites throughout the viral genome. As expected, most HAR peaks were centered over the start codons of canonical genes, but we also identified putative noncanonical translation start sites (Figs 3F, S6C, and S6D). Notably, the M gene included a prominent peak over an internal, in-frame AUG (Fig 3F and 3G), suggesting an alternative translation initiation site. Translation initiation from this downstream AUG is expected to yield an N-terminal truncation variant of M lacking the first 48 amino acids (Fig 3G). By counting the number of RPFs associated with each start site, we estimate that the Δ48M variant is translated at 21% the rate of the full-length ORF (S6E Fig).

We also observed peaks coinciding with uORFs in the 5ʹUTRs of the NS5A and E sgRNAs (Fig 3F and 3H). P-site analysis revealed clear evidence of periodicity across the NS5A uORF (Fig 3I), indicating that it is indeed translated. The peak overlapping with the first uORF in the E gene was less clear-cut (Fig 3J). We observed strong periodicity, but only 2 of the 6 codons showed significant read counts, so the evidence for translation is not definitive.

The uORFs from NS5A and E are rather short (11 and 6 codons, respectively) and unlikely to produce functional peptides, but may influence translation of the downstream ORF. To test this hypothesis, we generated reporter constructs in which the 5ʹUTR of each viral gene was fused to Firefly luciferase (Fig 3K). The resulting constructs were transfected into uninfected HEK293A cells alongside a Renilla luciferase control, and the reporter proteins were quantified using a dual luciferase assay. The NS5A and E reporters produced less protein than some other viral 5ʹUTR reporters, but the differences were modest (<2-fold), and may be explained by a reduction in RNA abundance rather than translation, per se (Fig 3L). We also tested versions of the NS5A and E reporters in which the uORFs had been deleted. However, these ΔuORF variants were not translated any more efficiently than their wild type counterparts (Fig 3L). We conclude that neither the NS5A uORF nor the three E uORFs substantially impedes translation of the downstream ORF in uninfected cells. However, we note that translation regulation may be different in the context of infection [3,23,24].

Finally, we also observed ribosome density over a uORF upstream of ORF1AB (S7A Fig). Evidence for translation was not entirely conclusive in Riboseq, but we note that the ORF1AB uORF is highly conserved throughout Betacoronaviridae (S7BS7D Fig), as previously reported [5,25], supporting a functional role. By contrast, the NS5A and E uORFs are apparently unique to OC43 (S7E Fig).

ER-stress response genes are upregulated during infection

We next examined changes in cellular gene expression throughout the course of infection. RNAseq replicates collected for each timepoint showed excellent reproducibility, as assessed by Spearman correlation (S8A Fig) and principal component analysis (PCA) (Fig 4A). These global analyses revealed that most transcriptional changes only occur relatively late in infection, from ~12 hpi onwards (Figs S8A, S8B, 4A, and 4B) when virus production is already extremely high (~1000 virions h−1; Fig 1D).

thumbnail
Fig 4. Changes to the host transcriptome and translatome during infection.

(a) Principal component analysis (PCA) for different replicates and timepoints. (b) Heatmap showing the changes in RNA abundance throughout the time course. The heatmap shows the most significantly upregulated transcripts (by p-value). (c) GO enrichment analysis of the top 300 upregulated genes at 30 hpi. The size of each circle is proportional to the number of genes and the color shows the statistical significance. (d) Heatmap showing the changes in RNA abundance, translation, and translation efficiency (TE) across all genes associated with ER-related GO terms in (c). (e) Changes in mRNA abundance and translation during the infection time course for two ER stress genes. Error bars show the standard deviation around the mean (n = 3–4). Asterisks show statistically significant difference between uninfected and infected samples: * p < 0.05, ** p < 0.01, *** p < 0.001.

https://doi.org/10.1371/journal.ppat.1012831.g004

To quantify changes in individual transcripts, we performed differential gene expression (DEG) analysis (S8C and S8D Fig and S4 Table). By 30 hpi, 14% of the transcriptome was significantly upregulated (p < 0.001 and log2(fold change)>1) and 15% downregulated. To characterize these DEGs, we performed gene ontology (GO) enrichment analysis for the top 300 significantly upregulated genes (Fig 4C). Of the top 15 GO terms, 13 related to the ER stress response (Fig 4C). In general, genes associated with this category fell into two subgroups. The first consisted of chaperones that assist in the maturation of misfolded proteins within the ER (e.g., DNAJB9, DNAJB11, PDIA4, etc.). The second included genes that regulate translation in response to ER dysfunction, including the ER-associated eIF2α kinase EIF2AK3 (PERK) and other regulators (e.g., DNAJC3, PPP1R15B). We also noted increased splicing of the ER-associated transcription factor XBP1 (S9A Fig), a well-established hallmark of ER stress [26]. We conclude that activation of ER stress response genes is a major component of the cell’s transcriptional response to infection.

We next examined whether conventional antiviral genes are upregulated upon infection. Strikingly, neither IFNB1 (Interferon-β) nor interferon stimulated genes (ISGs) [27] were induced (S9B Fig), suggesting OC43 is able to suppress interferon signaling and/or evade detection by cellular pattern recognition receptors. By contrast, several cytokine and chemokine genes were strongly activated (S9C and S9D Fig). Interestingly, a similar increase was also seen after mock infection with heat-inactivated virus, but generally only at 3 and 6 hr (e.g., CXCL3 and IL6). This suggests that cytokine/chemokine genes are initially induced in response to virus-laden inoculum, but sustained expression requires active infection.

Finally, we investigated whether increased transcription was accompanied by increased translation (S9CS9F Fig). Indeed, both ER stress response genes and cytokine/chemokine genes showed elevated translation, albeit with a ~6hr delay compared to the transcriptional induction (Figs 4E and S9C). We conclude that infected cells experience ER stress during infection and mount a transcriptional and translational response to restore homeostasis. In parallel, cells activate expression of some cytokine and chemokine genes, but interferon stimulated genes generally remain suppressed.

Cellular NMD targets show increased translation during infection

While many of the translationally-induced genes were associated with either ER stress response or cytokine production, we also identified numerous other genes (e.g., EIF4A2, TAF1D, and CCNL1) which did not easily fit into any particular functional category (Fig 5A and S6 Table). To understand why these genes were induced upon infection, we examined the distribution of RNA and Riboseq reads across individual transcripts.

thumbnail
Fig 5. Cellular NMD targets are stabilized and translated during infection.

(a) Volcano plot showing translationally upregulated and downregulated mRNAs at 30 hpi. (b) Ribosome density (blue) and RNA abundance (grey) across EIF4A2. Numbers to the right indicate the scale in RPM. Note that EIF4A2 incorporates an unannotated exon (dashed red box) that includes a premature termination codon (PTC). (c) Changes in RNA abundance of the spliced and unspliced isoforms of EIF4A2. Error bars show standard deviation around the mean (n = 3–4). (d) Volcano plot showing translationally upregulated (red) or downregulated (blue) lncRNAs at 30 hpi. (e) Ribosome density (blue) and RNA abundance (grey) across SNHG1. Numbers to the right indicate the scale in RPM. The location of the first termination codon is indicated as PTC. (f) Changes in RNA abundance of the spliced and unspliced isoforms of SNGH1. Error bars show standard deviation around the mean (n = 3–4). Asterisks show statistically significant difference between the spliced and unspliced isoform: * p < 0.05, ** p < 0.01, *** p < 0.001.

https://doi.org/10.1371/journal.ppat.1012831.g005

Intriguingly, the EIF4A2 isoform expressed during infection includes an additional, unannotated exon with a premature termination codon (PTC) (Fig 5B). Typically, transcripts with PTCs are recognized and degraded by nonsense mediated decay (NMD) during the first round of translation [28]. Nonetheless, EIF4A2PTC showed sharply increased translation over the course of infection. We also observed a strong increase in the abundance of spliced EIF4A2PTC RNA (Fig 5C), presumably reflecting protection from NMD. By contrast, the unspliced precursor was either unchanged or slightly decreased. TAF1D and CCNL1 followed similar patterns. The TAF1D isoform expressed during infection includes multiple exons downstream of the stop codon (S10A Fig), and CCNL1 incorporates an additional exon with a PTC (S10B Fig), so both transcripts are predicted NMD substrates. Despite this, both TAF1DPTC and CCNL1PTC showed increased translation and RNA abundance during infection. Moreover, as noted above, many genes showed increased translation across the 3’UTR (S4C Fig), as does the viral genome (S5D Fig) which would normally be expected to elicit an NMD response.

Next, we examined long noncoding RNAs (lncRNAs) that would also be subject to NMD if translated. We identified 76 translationally induced lncRNAs, including several snoRNA host genes (SNHGs) (Fig 5D). SNHGs encode snoRNAs within their introns; after splicing, the mature snoRNAs are processed from the intron, while the spliced host transcript is generally degraded [2931]. About one-half of spliced SNHG species are degraded in the nucleus by the exosome, while the remainder are exported to the cytoplasm and degraded by NMD [32]. Notably, all six of the SNHGs found here were previously identified as NMD substrates [32]. As an example, inspection of SNHG1 revealed that translation was limited to the 5ʹ end of the transcript, corresponding to the first, relatively short open reading frame (Fig 5E). The abundance of the spliced isoform of SNHG1 was sharply increased over the course of infection, whereas the unspliced precursor was either unchanged or slightly decreased (Fig 5F). We conclude that during OC43 infection the translation machinery becomes permissive for NMD substrates, which are translationally induced and stabilized.

Discussion

In this study, we monitored changes in RNA and translation during the course of OC43 infection. This allowed us to (1) define the transcriptional and translational architecture of the OC43 genome, and (2) characterize the cellular response to infection.

The transcriptional and translational architecture of OC43

The OC43 transcriptome consists of a full-length genomic RNA and seven canonical sgRNAs, corresponding to the seven distinct ORFs downstream of ORF1AB (Figs 1 and 2). OC43 is unusual in this regard, as related coronaviruses typically express the NS5A and E genes from a single sgRNA (S7 Fig) [5]. Beyond the seven canonical sgRNAs, we also identified 265 noncanonical sgRNAs, but these were expressed at low levels and generally lacked substantial coding potential.

Ribosome profiling allowed us to identify several well-translated novel ORFs across the viral genome (Fig 3). Notably, the M gene apparently includes an alternative translation start site downstream of the primary AUG. Translation initiation from this distal site is expected to yield an N-terminally truncated isoform lacking the first 48 amino acids (Δ48M). We also identified several uORFs upstream of ORF1AB, NS5A, and E, with the uORF upstream of NS5A showing strong evidence of translation. The functional consequences of this remain to be determined. The NS5A uORF may produce a functional peptide and/or fulfill a regulatory role, perhaps allowing increased translation of the primary ORF when bulk translation is repressed late in infection.

To our surprise, ribosome occupancy across ORF1A was highly uneven: ribosome density at the 5ʹ end of the gene was at least 10-fold higher compared to downstream regions (S6 Fig). Notably, a similar phenomenon was observed for the closely related coronavirus MHV at 1 hpi [5]. This translational gradient could potentially reflect slow elongation, perhaps caused by suboptimal codons, but we failed to detect any difference in codon usage across ORF1A. An alternative explanation is that the decrease in ribosome density is caused by premature translation termination. Many viruses employ premature translation termination as a gene expression strategy, but this usually involves dedicated frameshift elements and occurs at a single, designated site. By contrast, we observed a gradual decrease in ribosome density across a relatively large region (~2.7 kb). A third possibility is that the increased ribosome density at the 5ʹ end reflects collisions between the viral RdRp and ribosomes. A key feature of coronavirus replication is that the (+) strand genome serves as a template for both transcription and translation, in opposite directions. When RdRp and a ribosome both initiate on a single RNA, they will inevitably collide. We predict that collision slows or removes the leading ribosome, causing downstream depletion and/or accumulation of upstream ribosomes. We note that synthesis of the full length (−) gRNA requires transcription through the entire ORF1a/b region, which is actively translated. RdRp may require the ability to displace translating ribosomes, particularly early in infection when a single, ~30 kb (+) gRNA molecule is the only template for both activities.

At timepoints after 6hpi, the quality of ribosome profiling data declined sharply for viral, but not cellular transcripts, probably caused by packaging of the viral genome. When treated with RNase I, partially packaged fragments will presumably yield a range of footprint sizes, overlapping with the 26–34 nt fragments produced by translating ribosomes.

Cellular response to OC43 infection

The host antiviral response was strikingly muted, with major transcriptional changes only apparent from ~12 hpi onwards (Fig 4). By this point, infected cells were already releasing copious quantities of virus, highlighting the extent to which OC43 is able to evade effective detection. Indeed, aside from several cytokine genes, the large majority of canonical antiviral genes (most notably IFNB1 and ISGs) remained quiescent throughout the entire course of infection. This is notably different from the response to SARS-CoV-2 infection, which provokes a significant increase in the transcription and translation of innate immune genes [9,33]. The relative ‘stealthiness’ of OC43 may be attributable to NS2A, an innate immune antagonist which inhibits the OAS-RNaseL dsRNA recognition pathway [16]. Among human coronaviruses, only OC43 encodes the NS2A gene [1].

In contrast, we saw activation of multiple genes linked to ER stress. We anticipate that a major, and perhaps unavoidable, consequence of OC43 infection is activation of an ER stress response. Several highly-abundant coronavirus proteins are matured through the ER, even as the virus siphons off ER membrane to build its envelope [34]. Combined, these activities may overwhelm the ER’s protein folding capacity, leading to the accumulation of misfolded proteins and triggering the unfolded protein response (UPR), a cellular pathway that alleviates ER stress [35]. The UPR has several downstream effects, including the upregulation of chaperone proteins and the inhibition of global protein translation to reduce the load on the ER. Beginning 12 the 26–34 nt fragments 18 hpi, ER stress response genes were both transcriptionally and translationally induced (Fig 4). A recent study of HCoV-229E and MERS infection also reported increased transcription of ER stress response genes, but with unaltered protein expression [36]. Our results indicate that responses to OC43 are different, although we cannot exclude effects from cell lines or experimental conditions.

Work with SARS-CoV-2 has highlighted the importance of Nsp1 in the regulation of both viral and cellular translation [10,11,23,24,3739]. The C-terminal domain of Cov-2 Nsp1 binds and obstructs the mRNA entry channel of the ribosome, thereby suppressing bulk translation. Viral transcripts escape this suppression due to the presence of a stem loop (SL1) present in the CoV-2 leader [23], a common 70 nt RNA element found at the start of both full-length gRNA and individual sgRNAs. OC43 Nsp1 also downregulates translation via its C-terminal domain, but curiously, the OC43 5’ leader sequence does not confer resistance to Nsp1-mediated suppression [40]. This raises the possibility that OC43 overcomes the translational suppression of its own transcripts simply by synthesizing vast amounts of viral RNA. The relatively weak translation efficiency of OC43 transcripts observed here (Fig 3E) is consistent with this idea.

Finally, we observed increased translation and expression of a few cellular NMD substrates (Fig 5), suggesting NMD is suppressed during infection. This may protect viral RNA from degradation, as coronavirus RNAs incorporate various features expected to trigger NMD, including multiple ORFs and exceptionally long 3ʹUTRs [41,42]. Indeed, the mouse coronavirus MHV was reported to suppress NMD in trans [43] an activity critical for viral replication. More recently, the SARS-CoV-2 nucleocapsid protein was shown to bind and inhibit UPF2, a critical NMD factor [44]. Future work is needed to determine whether this activity is broadly conserved throughout the coronavirus family.

Materials and methods

Cells and virus

MRC-5 cells (ATCC Cat#CCL-171) were cultured at 37°C and 5% CO2 in Dulbecco’s modified Eagle medium (Life Technologies Cat#41965039) supplemented with 10% fetal bovine serum (FBS; Sigma Cat#F7524) and 100U/mL penicillin and streptomycin (P/S) (Gibco Cat#15140-122). One to two days prior to infection, MRC-5 cells were transferred to 33°C. HEK293A cells (ThermoFisher Cat#R7507) were used for dual luciferase assays, and cultured in the same medium but supplemented with 1X MEM NEAA (Gibco Cat#11140-035).

An aliquot of HCoV-OC43 was purchased from ATCC (Cat#VR-1558) and ~100µL was used to infect five T-175 flasks of ~80% confluent MRC-5 cells. Four days post infection, the medium supernatant was pooled and then centrifuged to remove cell debris. The virus-laden supernatant was divided into aliquots and frozen at −80°C. The concentration of the viral stock was calculated using qPCR (see below) and infectivity was measured by serial dilution and visualization of cytopathic effect in MRC-5 cells. Prior to the time-course experiments, cells were first washed with serum-free DMEM, and then incubated with serum-free media containing virus at an MOI of 7 for 1h (adsorption step). Subsequently, the viral inoculum was removed, and replaced with standard serum-containing DMEM. Cells were cultured at 33°C and harvested at the appropriate timepoint. Mock infections were performed in the same manner, but the virus stock was first heat-inactivated at 60°C for 10 min. All HCoV-OC43 virus work was conducted in a Category 2 facility in accordance with the biosafety guidelines at the University of Edinburgh.

Plasmids

The viral 5ʹUTR reporter constructs were synthesized using GeneArt and cloned into pLV-SARS-CoV-2 5ʹUTR-Luciferase (Addgene Cat#191480) [37] using NdeI and BsaBI. The hCMV-IE1:Renilla plasmid (Addgene Cat#118066) [45] was used as a control for the dual luciferase assays.

Immunofluorescence

MRC-5 cells were seeded to 4-well 0.5 ml chamber slides (Ibidi Cat#80426) and fixed at 6 hr intervals following infection. Briefly, chamber slides were washed with PBS and incubated with 4% formaldehyde for 15 min. Subsequently, cells were washed with PBS and permeabilized with 0.2% Triton X-100 for 7 min. Cells were incubated with 0.2% fish gelatin blocking solution (Biotium Cat#22010) for 30 min before addition of sheep anti-N antibody (MRC Protein Phosphorylation and Ubiquitylation Unit Cat#DA116) for 1 hr. Finally, chamber slides were washed with PBS and incubated in darkness with an anti-sheep secondary antibody conjugated to Alexa Fluor dyes (Abcam Cat#ab150177) and DAPI at a dilution of 1:5000. Chamber slides were mounted with fluoromount G (EM Sciences Cat#1798425). Images were obtained using a Zeiss Axio Imager equipped with 0.65 NA 40x objective.

qPCR

To measure the concentration of viral genomes in the medium, we first made a plasmid incorporating a small region of ORF1AB for use as a quantification standard. This plasmid was used to generate a qPCR standard curve relating plasmid DNA concentration to the cycle threshold (Ct) value of the ORF1AB fragment. The reaction consisted of 0.5 µL Luna RT (Promega Cat#E3005S), 2.5 µL of 2X reaction mix, 300 nM of primers (S1 Table) targeting ORF1AB, 1.75 µL template, and 0.5 µL water. To measure virion concentration, 20 µL of culture medium was removed from infected cells and RNA was extracted with 1mL of trizol. The RNA was resuspended in 500 µL water, and used as template for a RT-qPCR reaction as described above.

RNAseq

For RNAseq, approximately 7x105 cells growing on 6-well plates were washed with PBS and harvested with Tri Reagent (Sigma Cat#T9424). Total RNA was extracted, and 3 µg of RNA from each sample was treated with 2U Turbo DNase (ThermoFisher Cat#AM2238) for 30 min at 37°C according to the manufacturer’s protocol. The resulting RNA (300 ng) was depleted of ribosomal RNA using the NEBNext rRNA Depletion Kit (NEB Cat#E6310). rRNA-depleted RNA was purified using RNAClean XP beads (Beckman-Coulter Cat#66514), and libraries were prepared using the NEBNext rRNA Depletion Kit for Human/Mouse/Rat (NEB Cat#E6310). Adapter-ligated cDNA was amplified with 11 cycles of PCR to generate libraries ~300 bp in length. Libraries were sequenced with a 100-cycle kit in paired-end mode.

Ribosome profiling

Approximately 1 × 107 cells growing on a 15 cm plate were used for each ribosome profiling experiment. For harringtonine-based profiling, the cells were treated with 8 µg/mL harringtonine (Cat#ab141941) for 10 min. Subsequently, the culture medium was removed, and cells were washed with ice-cold PBS supplemented with 0.1 mg/mL cycloheximide (CHX) (Sigma Cat#C7698). The cells were then resuspended in 1 mL of lysis buffer (10 mM Tris 7.5, 5 mM MgCl2, 100 mM KCl, 1% tritonX-100, 2 mM DTT, protease inhibitors (1 tablet/10 mL) (Pierce Cat#A32955), and 0.1 mg/mL CHX), and lysed by gently passing through a 25G syringe (BD Cat#300600) four times and incubating on ice for ~5 min. Subsequently, the cell lysates were cleared by centrifugation at 1300g/10 min/4°C and frozen at −80°C for later use. Cycloheximide-based profiling was performed as described above, but the pre-treatment with harringtonine was omitted.

Subsequently, the cell lysate was thawed, and approximately 80 µg of RNA in 1 mL volume was digested with 5 U of RNase I (Lucigen Cat#E0067-10D1) for 45 min at 25°C. Ribosomal complexes were separated by centrifugation at 38000 RPM for 2 hr at 4°C in a 14 mL 10–50% sucrose gradient with a buffer consisting of 20 mM HEPES-KOH pH 7.4, 5 mM MgCl2, 100 mM KCl, 2 mM DTT, and 0.1 mg/mL CHX. Centrifugations were performed using the Beckman-Coulter Optima XPN-100 ultracentrifuge with the SW40 rotor and 14mL polyallomer centrifuge tubes (Seton Scientific Cat#5031). The sucrose gradients were separated into 200 µL fractions using the Piston Gradient Fractionator (Biocomp). The fractions corresponding to the monosome peak (usually 15 in total) were combined, and Tri Reagent was used to extract RNA. Approximately 10 µg of total RNA was separated on a 15% TBE-urea gel at 200V for 65 min, and 26–34 nt footprints were excised using 26 nt and 34 nt RNA markers as a guide (S1 Table) [22]. The resulting gel fragments were incubated overnight on a rotating wheel in 400 µL RNA extraction buffer (300 mM NaOAc, pH 5.2; 1 mM EDTA; and 0.25% SDS). Subsequently, the extracted RNA was precipitated with 1.5 µL GlycoBlue (Life Technologies Cat#AM9515) and 500 µL isopropanol. The RNA was pelleted by centrifugation at 20000g/15 min/4°C, washed with 80% ethanol, and resuspended in 12 µL water.

Subsequently, the RNA footprints were treated T4 PNK to generate 5ʹ and 3ʹ ends amenable to linker ligation. The 12 µL RNA was incubated together with 1.5 µL 10X buffer, 0.5 µL T4 PNK (NEB Cat#M0201), and 0.5 µL RNasIN (Promega Cat#N2511) at 37°C/30m. Next, 1.5 µL of 10 mM ATP was added and the reaction was incubated for another 30m. The RNA was purified by standard phenol:chloroform extraction, precipitated in 70% ethanol, and resuspended in 6 µL.

The resulting RNA was used as input for the SMARTer smRNA-Seq Kit for Illumina (Takara Cat# 635030). cDNA libraries were prepared according to the manufacturer’s protocol, except an rRNA depletion step was incorporated after the reverse transcriptase reaction. Briefly, 12 biotinylated oligos were designed to target abundant rRNA fragments produced by RNase I cleavage (S1 Table). Individual 100 µM stocks were prepared, and then combined in equivolume proportions to prepare the subtractive oligo mix. Subsequently, 20 µL cDNA was combined with 2.5 µL subtractive oligo mix and 2.5 µL 20X SSC (3 M NaCl, 0.3 M NaCitrate, pH 7). The resulting reaction was denatured at 100°C for 90s, and then cooled to 37°C at a rate of 0.1°C/s to allow annealing. Subsequently, the reaction mix was combined with 50 µL MyOne Streptavidin C1 Dynabeads (Life Technologies Cat#65001) in 2X B&W buffer (2 M NaCl, 1 mM EDTA, 10 mM Tris 7.5). The resulting reaction was incubated at 37°C for 15m with shaking. The dynabeads were pelleted using a magnetic rack and 40 µL of the supernatant was recovered and transferred to a new tube containing 1.5 µL GlycoBlue, 6 µL of 5 M NaCl, and 60 µL water. The resulting mix was combined with 150 µL isopropanol, and incubated on ice for 30 min. The rRNA-depleted cDNA was pelleted at 20000g/30 min/4°C, washed with 80% ethanol, and resuspended in 20 µL water.

Finally, 5 µL was used as input for a 13-cycle PCR reaction according to the manufacturer’s protocol. The resulting PCR product was purified using the Nucleospin Gel and PCR clean-up kit (Macherey-Nagel Cat#740609.50), and resolved on a 8% polyacrylamide TBE gel (Life Technologies Cat#EC62162BOX). Fragments from ~190–250 bp in length were excised, and incubated overnight on a rotating wheel in 400 µL DNA extraction buffer (300 mM NaCl, 10 mM Tris 8.0, 1 mM EDTA, and 0.01% igepal). Subsequently, the extracted DNA was precipitated as described above for the RNA footprints. Libraries were sequenced using Illumina NextSeq with a 100-cycle kit in single-end mode.

Reporter assays

HEK293A cells were seeded onto 24 well plates and transfected with individual 5ʹUTR reporters together with the Renilla luciferase control the following day. Plasmids were combined in a 1:1 ratio (250 ng each) and transfected using Lipofectamine 2000 according to the manufacturer’s protocol. The following day, cells were harvested using 200 µL Passive Lysis buffer. Firefly and Renilla luciferase production were evaluated using the Dual Luciferase Reporter Assay System (Promega Cat#E1910) and the Multimode Plate Reader (Molecular Devices).

Quantification and statistical analysis

RNAseq analysis

Sequencing reads were aligned to a concatenated human (ENSEMBL hg38-101) and HCoV-OC43 (ATCC_VR_1558) genome using STAR [46]. The resulting bam files were sorted (samtools sort) and used to generate coverage maps with bamCoverage. The coverage maps were normalized based on cellular mRNA content, so the viral genome was excluded from the quantification:

bamCoverage ‐b file.bam ‐‐ignoreForNormalization HCoV_OC43 ‐‐exactScaling ‐‐normalizeUsing CPM ‐‐binSize 1 ‐‐filterRNAstrand forward ‐o file_plus_strand.bw

Read counts per gene were tabulated using featureCounts [47]:

featureCounts file.bam ‐a human_OC43_annotation.gtf ‐C ‐d 50 ‐D 2000 ‐‐countReadPairs ‐s 2 ‐p ‐B ‐O ‐‐extraAttributes gene_biotype,gene_name ‐‐largestOverlap ‐t gene ‐o read_counts.txt

Virus junction counts were based on the number of uniquely mapped reads spanning discontinuous regions of the viral genome. These were identified by extracting reads with gapped alignments from the sam file.

Riboseq analysis

The Takara SMARTer smRNA‐Seq Kit adds 3 nt to the 5′ end of each read, and a short poly(A) tail at the 3′ end. These features were removed using flexbar:

flexbar ‐r file.fastq ‐n 10 ‐a AAAAAAAAAA ‐ao 4 ‐x 3 ‐m 1 ‐ae RIGHT ‐u 2 ‐f i1.8 ‐t filtered.fastq

The remaining reads were aligned to the concatenated human and HCoV-OC43 genome as described above. The coverage maps were normalized based on cellular mRNA content, so the viral genome and chromosomes encoding ribosomal RNA (Chr 1, 21, and 22) were excluded from the quantification:

bamCoverage ‐b file.bam ‐‐ignoreForNormalization 1 21 22 HCoV_OC43 ‐‐exactScaling ‐‐normalizeUsing CPM ‐‐binSize 1 ‐‐filterRNAstrand forward ‐o file_plus_strand.bw

Read counting was performed with featureCounts:

featureCounts file.bam ‐a human_OC43_annotation.gtf ‐C ‐d 50 ‐D 2000 ‐‐countReadPairs ‐s 1 ‐O ‐‐extraAttributes gene_biotype,gene_name ‐‐largestOverlap ‐t gene ‐o read_counts.txt

Basic quality control analyses for cellular mRNAs (S4 and S5 Figs; RPF lengths, frame, and biotypes) were performed with Ribotoolkit [48] (http://rnainformatics.org.cn/RiboToolkit/index.php) using default parameters. For frame and biotype analysis, only 26–34 nt reads were analyzed.

Viral mRNAs were analyzed with Ribocount (https://pythonhosted.org/riboplot/ribocount.html) using a fasta annotation file corresponding to the viral ORFs. A 12 nt offset was used for all RPFs 26–34 nt in length:

ribocount ‐b file.bam ‐f OC43_CDS.fasta ‐l 26,27,28,29,30,31,32,33,34 ‐s 12,12,12,12,12,12,12,12,12 ‐o OC43_RPFs.txt

Differential gene expression

DESeq2 [49] was used to identify differentially expressed genes and produce PCA and volcano plots. As input, we used the top 14000 cellular mRNAs by expression (defined as the average RPM across all timepoints and replicates). Default parameters were used for all analyses. DEGs were defined as having a log2(fold change)>±1 and (adj. p-value)<0.001.

Supporting information

S1 Fig. Related to Introduction.

(a) The viral life cycle. See Introduction for details. (b) Schematic outline of discontinuous transcription. The viral genome incorporates TRS-B motifs upstream of each structural and accessory protein ORF. When RdRp transcribes a TRS-B, it sometimes “jumps” to the TRS-L at the 5ʹ end of the genome. Because the TRS-L and each TRS-B are similar in sequence, the nascent RNA reanneals to the template, and transcription resumes. The relative abundance of each (−) sgRNA is determined by how frequently RdRp jumps at each TRS-B. In the final step, the (−) sgRNAs serve as templates for synthesis of (+) sgRNAs which are subsequently used in translation.

https://doi.org/10.1371/journal.ppat.1012831.s001

(TIFF)

S2 Fig. Experimental design and validation.

Related to Fig 1. (a and b) Preparation of the viral stock. To minimize experimental variability, we prepared a single batch of virus for the duration of the study. (a) Viral genomes were quantified using RT-qPCR against a region within ORF1A. (b) Infectious particles were quantified using limiting dilution and assessment of cytopathic effect (CPE) under the experimental conditions used in the study. The resulting particle:infection ratio was 7:1. (c) Immunofluorescence microscopy showing infected cells throughout the time course. Cells were stained with DAPI to visualize cell nuclei and decorated with an antibody against the viral N protein to identify infected cells. (d) Cell growth rates following infection with OC43 or mock-infection with heat-inactivated (h and i) virus. Error bars show the standard deviation around the mean (n = 3).

https://doi.org/10.1371/journal.ppat.1012831.s002

(TIFF)

S3 Fig. OC43 noncanonical sgRNA expression.

Related to Fig 2. (a) Plots showing the expression kinetics of genomic RNA and individual sgRNAs throughout the time course. Error bars show standard deviation of the mean (n = 4). (b) Tree diagram showing the structure of the 50 most abundant noncanonical sgRNAs. (c) Structure and protein coding potential of the eight most abundant noncanonical sgRNAs. In every case, the initial ORF is no longer than 16 amino acids. The numbers indicate the donor-acceptor coordinates. (d) Plot showing the hybridization potential between donor sites (i.e., TRS-Bs) and the TRS-L acceptor site. Transcripts are sorted by abundance. Complementary bases are displayed in orange (canonical junctions) or blue (noncanonical junctions), and noncomplementary bases are shown in white.

https://doi.org/10.1371/journal.ppat.1012831.s003

(TIFF)

S4 Fig. Riboseq quality control metrics for cellular genes.

Related to Fig 3. (a) Length distribution of RPFs mapping to cellular (green) and viral (blue) CDS regions following gel-based selection for 26–34 nt fragments. (b) Proportion of cellular-mapping RPFs associated with each reading frame. (c) Distribution of Riboseq reads between 5ʹUTR, CDS, and 3ʹUTR regions. (d) Riboseq read densities across the EEF2 gene for a representative replicate. Total RNA (black), CHX RPFs (blue), HAR RPFs (pink). The numbers to the right of the plot show the scale in RPM for each track. The annotation for EEF2 is shown below. Grey rectangles show 5ʹ and 3ʹUTR regions, black rectangles show CDS, and black lines show introns.

https://doi.org/10.1371/journal.ppat.1012831.s004

(TIFF)

S5 Fig. Riboseq quality control metrics for viral genes.

Related to Fig 3. (a) Schematic showing the structure of the full-length viral gRNA. The scale bars below the annotation highlight the regions displayed in panels (b) and (c). (b and c) Ribosome density across sections of the viral genome. (d) The relative proportion of viral-mapping RPFs which map to the 3ʹUTR. Error bars show standard deviation of the mean (n = 3). (e) Proportion of viral-mapping RPFs associated with each reading frame across ORF1A. Error bars show standard deviation of the mean (n = 3).

https://doi.org/10.1371/journal.ppat.1012831.s005

(TIFF)

S6 Fig. OC43 translation.

Related to Fig 3. (a) Ribosome density at 6 hpi across a section of ORF1A. Four independent replicates of CHX Riboseq are shown (blue). Total RNA determined by RNAseq is shown as a comparison (black). Codon optimality scores are shown below (green). (b) Ribosome density across ORF1A. The red baseline shows the average ribosome density across the latter half of ORF1A. (c) Distribution of HAR RPFs around the start codons of each canonical viral ORF. (d) Density of HAR RPFs at 6 hpi across the subgenomic region of the viral genome. Two independent replicates are shown. The asterisk marks a likely sequencing artifact. (e) Bar chart showing the relative proportion of RPFs associated with the primary start codon for M versus the downstream initiation site. Error bars show the standard deviation around the mean (n = 2).

https://doi.org/10.1371/journal.ppat.1012831.s006

(TIFF)

S7 Fig. Viral uORF translation.

Related to Fig 3. (a) P-site mapping for CHX RPFs across the ORF1AB uORF. Below: the codon and amino acid sequence for the ORF1AB uORF. (b and d) Schematic showing the conserved architecture (b), peptide sequences (c), and sequence context (d) of the ORF1AB uORF across the betacoronavirus A lineage. (e) Genomic organization of the NS5A and E genes across various representatives of the betacoronavirus A lineage. The architectures for MHV [5] and OC43 were determined experimentally, and HKU24 and HKU1 were inferred based on the presence or absence or TRS-B motifs.

https://doi.org/10.1371/journal.ppat.1012831.s007

(TIFF)

S8 Fig. Changes to the host transcriptome during infection.

Related to Fig 4. (a) Matrix showing the pairwise Spearman correlation coefficients for human transcripts between different replicates and timepoints. The red squares outline each set of replicates. (b) Heatmap showing the changes in RNA abundance throughout the time course. The heatmap shows the most significantly decreased transcripts (by p-value). (c) Bar chart showing the number of differentially expressed genes (DEGs) at each timepoint. (d) Volcano plots showing statistically significantly upregulated (red) and downregulated (blue) genes at various timepoints after infection.

https://doi.org/10.1371/journal.ppat.1012831.s008

(TIFF)

S9 Fig. Antiviral gene expression during infection.

Related to Fig 4. (a) Bar graph showing the relative proportion of spliced and unspliced XBP1 throughout the time course. (b) Box-and-whisker plot showing the expression of interferon stimulated genes (ISGs) in response to infection. The box shows the 25th, median, and 75th percentile, while the whiskers show the 10th and 90th percentiles. (c) Transcriptional (left) and translational (center) induction of cytokines and chemokines in response to infection. The transcriptional response to mock infection with heat-inactivated (HI) virus is also shown (right). (d) Changes in mRNA abundance and translation during the infection time course for two cytokine genes. Error bars show the standard deviation around the mean (n = 3–4). (e) Histogram showing the change in translation efficiency (TE) following infection. The 3000 most-expressed genes are included in the analysis. (f) GO analysis for genes showing an increase in TE (log2≥1).

https://doi.org/10.1371/journal.ppat.1012831.s009

(TIFF)

S10 Fig. NMD substrates are translationally derepressed during infection.

Related to Fig 5. (a and b) Ribosome density (blue) and RNA abundance (grey) across TAF1D (a) and CCNL1 (b). Numbers to the right indicate the scale in RPM. Note that TAF1D includes eight additional exons beyond the normal stop codon, while CCNL1 incorporates an unannotated exon that includes a PTC. Both features are highlighted with a dashed red box.

https://doi.org/10.1371/journal.ppat.1012831.s010

(TIFF)

S1 Table. DNA and RNA oligonucleotides used in this study.

https://doi.org/10.1371/journal.ppat.1012831.s011

(XLSX)

S3 Table. mRNA expression measurements.

The table shows mRNA expression (RPKM) for 14000 transcripts throughout the course of OC43 infection.

https://doi.org/10.1371/journal.ppat.1012831.s013

(XLSX)

S4 Table. Differential gene expression analysis (RNA expression).

The table shows differentially expressed genes at each timepoint following infection. Each tab shows a different timepoint compared to the uninfected control.

https://doi.org/10.1371/journal.ppat.1012831.s014

(XLSX)

S5 Table. Ribosome density measurements.

The table shows ribosome density for 5000 transcripts throughout the course of OC43 infection.

https://doi.org/10.1371/journal.ppat.1012831.s015

(XLSX)

S6 Table. Differential gene expression analysis (RNA translation).

The table shows differentially expressed genes at each timepoint following infection. Each tab shows a different timepoint compared to the uninfected control.

https://doi.org/10.1371/journal.ppat.1012831.s016

(XLSX)

Acknowledgments

We thank Richard Clark and Angie Fawkes from the Edinburgh Wellcome Clinical Research Facility for sequencing services, Shaun Webb from the WCB bioinformatics core facility for maintaining the servers we used for processing sequencing data, and Eric Schirmer for helpful advice. We gratefully acknowledge technical assistance from the Edinburgh Protein Production Facility (EPPF), and the Proteomic and Bioinformatics cores.

References

  1. 1. Forni D, Cagliani R, Clerici M, Sironi M. Molecular evolution of human coronavirus genomes. Trends Microbiol. 2017;25(1):35–48. pmid:27743750 PMCID: PMC7111218
  2. 2. Hartenian E, Nandakumar D, Lari A, Ly M, Tucker JM, Glaunsinger BA. The molecular virology of coronaviruses. J Biol Chem. 2020;295(37):12910–34. pmid:32661197 PMCID: PMC7489918
  3. 3. Kim D, Kim S, Park J, Chang HR, Chang J, Ahn J, et al. A high-resolution temporal atlas of the SARS-CoV-2 translatome and transcriptome. Nat Commun. 2021;12(1):5120. pmid:34433827 PMCID: PMC8387416
  4. 4. Finkel Y, Mizrahi O, Nachshon A, Weingarten-Gabbay S, Morgenstern D, Yahalom-Ronen Y, et al. The coding capacity of SARS-CoV-2. Nature. 2021;589(7840):125–30. pmid:32906143
  5. 5. Irigoyen N, Firth AE, Jones JD, Chung BY-W, Siddell SG, Brierley I. High-resolution analysis of coronavirus gene expression by RNA sequencing and ribosome profiling. PLoS Pathog. 2016;12(2):e1005473. pmid:26919232 PMCID: PMC4769073
  6. 6. Gao X, Zhu K, Qin B, Olieric V, Wang M, Cui S. Crystal structure of SARS-CoV-2 Orf9b in complex with human TOM70 suggests unusual virus-host interactions. Nat Commun. 2021;12(1):2843. pmid:33990585; PMCID: PMC8121815
  7. 7. Thorne LG, Bouhaddou M, Reuschl A-K, Zuliani-Alvarez L, Polacco B, Pelin A, et al. Evolution of enhanced innate immune evasion by SARS-CoV-2. Nature. 2022;602(7897):487–95. pmid:34942634 PMCID: PMC8850198
  8. 8. Jiang H-W, Zhang H-N, Meng Q-F, Xie J, Li Y, Chen H, et al. SARS-CoV-2 Orf9b suppresses type I interferon responses by targeting TOM70. Cell Mol Immunol. 2020;17(9):998–1000. pmid:32728199 PMCID: PMC7387808
  9. 9. Finkel Y, Gluck A, Nachshon A, Winkler R, Fisher T, Rozman B, et al. SARS-CoV-2 uses a multipronged strategy to impede host protein synthesis. Nature. 2021;594(7862):240–5. pmid:33979833
  10. 10. Thoms M, Buschauer R, Ameismeier M, Koepke L, Denk T, Hirschenberger M, et al. Structural basis for translational shutdown and immune evasion by the Nsp1 protein of SARS-CoV-2. Science. 2020;369(6508):1249–55. pmid:32680882 PMCID: PMC7402621
  11. 11. Schubert K, Karousis ED, Jomaa A, Scaiola A, Echeverria B, Gurzeler L-A, et al. SARS-CoV-2 Nsp1 binds the ribosomal mRNA channel to inhibit translation. Nat Struct Mol Biol. 2020;27(10):959–66. pmid:32908316
  12. 12. Li Y, Renner DM, Comar CE, Whelan JN, Reyes HM, Cardenas-Diaz FL, et al. SARS-CoV-2 induces double-stranded RNA-mediated innate immune responses in respiratory epithelial-derived cells and cardiomyocytes. Proc Natl Acad Sci U S A. 2021;118(16):e2022643118. pmid:33811184 PMCID: PMC8072330
  13. 13. Aloise C, Schipper JG, van Vliet A, Oymans J, Donselaar T, Hurdiss DL, et al. SARS-CoV-2 nucleocapsid protein inhibits the PKR-mediated integrated stress response through RNA-binding domain N2b. PLoS Pathog. 2023;19(8):e1011582. pmid:37607209 PMCID: PMC10473545
  14. 14. Zheng Z-Q, Wang S-Y, Xu Z-S, Fu Y-Z, Wang Y-Y. SARS-CoV-2 nucleocapsid protein impairs stress granule formation to promote viral replication. Cell Discov. 2021;7(1):38. pmid:34035218 PMCID: PMC8147577
  15. 15. Rabouw HH, Langereis MA, Knaap RCM, Dalebout TJ, Canton J, Sola I, et al. Middle east respiratory coronavirus accessory protein 4a inhibits PKR-mediated antiviral stress responses. PLoS Pathog. 2016;12(10):e1005982. pmid:27783669 PMCID: PMC5081173
  16. 16. Zhao L, Jha BK, Wu A, Elliott R, Ziebuhr J, Gorbalenya AE, et al. Antagonism of the interferon-induced OAS-RNase L pathway by murine coronavirus ns2 protein is required for virus replication and liver pathology. Cell Host Microbe. 2012;11(6):607–16. pmid:22704621 PMCID: PMC3377938
  17. 17. Lee JY, Wing PAC, Gala DS, Noerenberg M, Järvelin AI, Titlow J, et al. Absolute quantitation of individual SARS-CoV-2 RNA molecules provides a new paradigm for infection dynamics and variant differences. Elife. 2022;11:e74153. pmid:35049501 PMCID: PMC8776252
  18. 18. Kim D, Lee J-Y, Yang J-S, Kim JW, Kim VN, Chang H. The architecture of SARS-CoV-2 transcriptome. Cell. 2020;181(4):914–921.e10. pmid:32330414 PMCID: PMC7179501
  19. 19. Sola I, Almazán F, Zúñiga S, Enjuanes L. Continuous and discontinuous RNA synthesis in coronaviruses. Annu Rev Virol. 2015;2(1):265–88. pmid:26958916 PMCID: PMC6025776
  20. 20. Lau SKP, Lee P, Tsang AKL, Yip CCY, Tse H, Lee RA, et al. Molecular epidemiology of human coronavirus OC43 reveals evolution of different genotypes over time and recent emergence of a novel genotype due to natural recombination. J Virol. 2011;85(21):11325–37. pmid:21849456 PMCID: PMC3194943
  21. 21. Ingolia NT, Brar GA, Rouskin S, McGeachy AM, Weissman JS. The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat Protoc. 2012;7(8):1534–50. pmid:22836135 PMCID: PMC3535016
  22. 22. McGlincy NJ, Ingolia NT. Transcriptome-wide measurement of translation by ribosome profiling. Methods. 2017;126:112–29. pmid:28579404 PMCID: PMC5582988
  23. 23. Slobodin B, Sehrawat U, Lev A, Hayat D, Zuckerman B, Fraticelli D, et al. Cap-independent translation and a precisely located RNA sequence enable SARS-CoV-2 to control host translation and escape anti-viral response. Nucleic Acids Res. 2022;50(14):8080–92. pmid:35849342 PMCID: PMC9371909
  24. 24. Aviner R, Lidsky PV, Xiao Y, Tassetto M, Kim D, Zhang L, et al. SARS-CoV-2 Nsp1 cooperates with initiation factors EIF1 and 1A to selectively enhance translation of viral RNA. PLoS Pathog. 2024;20(2):e1011535. pmid:38335237 PMCID: PMC10903962
  25. 25. Wu H-Y, Guan B-J, Su Y-P, Fan Y-H, Brian DA. Reselection of a genomic upstream open reading frame in mouse hepatitis coronavirus 5’-untranslated-region mutants. J Virol. 2014;88(2):846–58. pmid:24173235 PMCID: PMC3911626
  26. 26. Yoshida H, Matsui T, Yamamoto A, Okada T, Mori K. XBP1 mRNA is induced by ATF6 and spliced by IRE1 in response to ER stress to produce a highly active transcription factor. Cell. 2001;107(7):881–91. pmid:11779464
  27. 27. Schoggins JW, Wilson SJ, Panis M, Murphy MY, Jones CT, Bieniasz P, et al. A diverse range of gene products are effectors of the type I interferon antiviral response. Nature. 2011;472(7344):481–5. pmid:21478870 PMCID: PMC3409588
  28. 28. Schweingruber C, Rufener SC, Zünd D, Yamashita A, Mühlemann O. Nonsense-mediated mRNA decay - mechanisms of substrate mRNA recognition and degradation in mammalian cells. Biochim Biophys Acta. 2013;1829(6–7):612–23. pmid:23435113
  29. 29. Kiss T, Filipowicz W. Exonucleolytic processing of small nucleolar RNAs from pre-mRNA introns. Genes Dev. 1995;9(11):1411–24. pmid:7797080
  30. 30. Tycowski KT, Shu MD, Steitz JA. A mammalian gene with introns instead of exons generating stable RNA products. Nature. 1996;379(6564):464–6. pmid:8559254
  31. 31. Lykke-Andersen S, Chen Y, Ardal BR, Lilje B, Waage J, Sandelin A, et al. Human nonsense-mediated RNA decay initiates widely by endonucleolysis and targets snoRNA host genes. Genes Dev. 2014;28(22):2498–517. pmid:25403180 PMCID: PMC4233243
  32. 32. Bresson SM, Hunter OV, Hunter AC, Conrad NK. Canonical Poly(A) polymerase activity promotes the decay of a wide variety of mammalian nuclear RNAs. PLoS Genet. 2015;11(10):e1005610. pmid:26484760 PMCID: PMC4618350
  33. 33. Puray-Chavez M, Lee N, Tenneti K, Wang Y, Vuong HR, Liu Y, et al. The translational landscape of SARS-CoV-2-infected cells reveals suppression of innate immune genes. mBio. 2022;13(3):e0081522. pmid:35604092 PMCID: PMC9239271
  34. 34. V’kovski P, Kratzel A, Steiner S, Stalder H, Thiel V. Coronavirus biology and replication: implications for SARS-CoV-2. Nat Rev Microbiol. 2021;19(3):155–70. pmid:33116300 PMCID: PMC7592455
  35. 35. Hetz C, Papa FR. The unfolded protein response and cell fate control. Mol Cell. 2018;69(2):169–81. pmid:29107536
  36. 36. Shaban MS, Müller C, Mayr-Buro C, Weiser H, Meier-Soelch J, Albert BV, et al. Multi-level inhibition of coronavirus replication by chemical ER stress. Nat Commun. 2021;12(1):5536. pmid:34545074 PMCID: PMC8452654
  37. 37. Vora SM, Fontana P, Mao T, Leger V, Zhang Y, Fu T-M, et al. Targeting stem-loop 1 of the SARS-CoV-2 5’ UTR to suppress viral translation and Nsp1 evasion. Proc Natl Acad Sci U S A. 2022;119(9):e2117198119. pmid:35149555 PMCID: PMC8892331
  38. 38. Bujanic L, Shevchuk O, von Kügelgen N, Kalinina A, Ludwik K, Koppstein D, et al. The key features of SARS-CoV-2 leader and NSP1 required for viral escape of NSP1-mediated repression. RNA. 2022;28(5):766–79. pmid:35232816 PMCID: PMC9014875
  39. 39. Lapointe CP, Grosely R, Johnson AG, Wang J, Fernández IS, Puglisi JD. Dynamic competition between SARS-CoV-2 NSP1 and mRNA on the human ribosome inhibits translation initiation. Proc Natl Acad Sci U S A. 2021;118(6):e2017715118. pmid:33479166 PMCID: PMC8017934
  40. 40. Maurina SF, O’Sullivan JP, Sharma G, Pineda Rodriguez DC, MacFadden A, Cendali F, et al. An evolutionarily conserved strategy for ribosome binding and host translation inhibition by β-coronavirus non-structural protein 1. J Mol Biol. 2023;435(20):168259. pmid:37660941 PMCID: PMC10543557
  41. 41. Bühler M, Steiner S, Mohn F, Paillusson A, Mühlemann O. EJC-independent degradation of nonsense immunoglobulin-mu mRNA depends on 3’ UTR length. Nat Struct Mol Biol. 2006;13(5):462–4. pmid:16622410
  42. 42. Hurt JA, Robertson AD, Burge CB. Global analyses of UPF1 binding and function reveal expanded scope of nonsense-mediated mRNA decay. Genome Res. 2013;23(10):1636–50. pmid:23766421 PMCID: PMC3787261
  43. 43. Wada M, Lokugamage KG, Nakagawa K, Narayanan K, Makino S. Interplay between coronavirus, a cytoplasmic RNA virus, and nonsense-mediated mRNA decay pathway. Proc Natl Acad Sci U S A. 2018;115(43):E10157–66. pmid:30297408 PMCID: PMC6205489
  44. 44. Mallick M, Boehm V, Xue G, Blackstone M, Gehring NH, Chakrabarti S. Modulation of UPF1 catalytic activity upon interaction of SARS-CoV-2 Nucleocapsid protein with factors involved in nonsense mediated-mRNA decay. Nucleic Acids Res. 2024;52(21):13325–39. pmid:39360627
  45. 45. Sarrion-Perdigones A, Chang L, Gonzalez Y, Gallego-Flores T, Young DW, Venken KJT. Examining multiple cellular pathways at once using multiplex hextuple luciferase assaying. Nat Commun. 2019;10(1):5710. pmid:31836712 PMCID: PMC6911020
  46. 46. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21. pmid:23104886 PMCID: PMC3530905
  47. 47. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30(7):923–30. pmid:24227677
  48. 48. Liu Q, Shvarts T, Sliz P, Gregory RI. RiboToolkit: an integrated platform for analysis and annotation of ribosome profiling data to decode mRNA translation at codon resolution. Nucleic Acids Res. 2020;48(W1):W218–29. pmid:32427338 PMCID: PMC7319539
  49. 49. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. pmid:25516281 PMCID: PMC4302049