Genotype-dependent and non-gradient patterns of RSV gene expression

Respiratory syncytial virus (RSV) is a nonsegmented negative-strand (NNS) RNA virus and a leading cause of severe lower respiratory tract illness in infants and the elderly. Transcription of the ten RSV genes proceeds sequentially from the 3' promoter and requires conserved gene start (GS) and gene end (GE) signals. Previous studies using the prototypical GA1 genotype Long and A2 strains have indicated a gradient of gene transcription. However, recent reports show data that appear inconsistent with a gradient. To better understand RSV transcriptional regulation, mRNA abundances from five RSV genes were measured by quantitative real-time PCR (qPCR) in three cell lines and cotton rats infected with virus isolates belonging to four different genotypes (GA1, ON, GB1, BA). Relative mRNA levels reached steady-state between four and 24 hours post-infection. Steady-state patterns were genotype-specific and non-gradient, where mRNA levels from the G (attachment) gene exceeded those from the more promoter-proximal N (nucleocapsid) gene across isolates. Transcript stabilities could not account for the non-gradient patterns observed, indicating that relative mRNA levels more strongly reflect transcription than decay. While the GS signal sequences were highly conserved, their alignment with N protein in the helical ribonucleocapsid, i.e., N-phase, was variable, suggesting polymerase recognition of GS signal conformation affects transcription initiation. The effect of GS N-phase on transcription efficiency was tested using dicistronic minigenomes. Ratios of minigenome gene expression showed a switch-like dependence on N-phase with a period of seven nucleotides. Our results indicate that RSV gene expression is in part sculpted by polymerases that initiate transcription with a probability dependent on GS signal N-phase.

Author Summary

RSV is a major viral pathogen that causes significant morbidity and mortality, especially in young children. Shortly after RSV enters a host cell, transcription from its nonsegmented negative-strand (NNS) RNA genome starts at the 3' promoter and proceeds sequentially. Transcriptional attenuation is thought to occur at each gene junction, resulting in a gradient of gene expression. However, recent studies showing non-gradient levels of RSV mRNA suggest that transcriptional regulation may have additional mechanisms. We show using RSV isolates belonging to four different genotypes that gene expression is genotype-dependent and one gene (the G or attachment gene) is consistently more highly expressed than an upstream neighbor. We hypothesize that variable alignment of highly conserved gene start (GS) signals with nucleoprotein (i.e., variable GS N-phase) can affect transcription and give rise to non-gradient patterns of gene expression. We show using dicistronic RSV minigenomes wherein the reporter genes differ only in the N-phase of one GS signal that GS N-phase affects gene expression. Our results suggest the existence of a novel mechanism of transcriptional regulation that might play a role in other NNS RNA viruses.


Respiratory syncytial virus (RSV) can infect individuals repeatedly and is the most common pathogen associated with severe lower respiratory tract disease in children worldwide [1-5]. Numerous host-related and environmental risk factors for severe disease are known [6-8], while viral factors are less clear.

RSV is a nonsegmented negative-strand (NNS) RNA virus classified into two major subgroups, A and B, largely distinguished by antigenic differences in the attachment or G protein [9,10]. The two subgroups are estimated to have diverged from an ancestral strain over 300 years ago [11] and have evolved into multiple co-circulating genotypes [11-15].

The RNA genome of RSV is embedded in interlinking and helix-forming subunits of nucleocapsid (N) protein, together forming the ribonucleocapsid (RNP) complex [16,17]. Viral mRNAs are not encapsidated [17,18]. Formation of the RNP complex requires high concentrations of N protein and a 5' terminal dinucleotide AC synthesized by the polymerase independently of template [18,19]. Each subunit of N protein binds a seven-nucleotide stretch of RNA via contacts with the sugar phosphate backbone, causing the RNA to adopt a conformation with a distinct configuration of solvent-exposed and buried nucleobases [16,17]. Exposed nucleobases can presumably interact directly with viral polymerases bound to the RNP complex [16]. Moreover, the alignment of the viral RNA to the N protein within the RNP (N-phase) will determine its pattern of exposed and buried bases. The effects of N-phase on promoter recognition have been explored in RSV and some paramyxoviruses [18,20-23].
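
The idea of N-phase can be made concrete with a small sketch. It assumes, purely for illustration, that encapsidation starts at nucleotide 1 of a chosen genome terminus and that every N subunit covers exactly seven nucleotides, as stated above. The function names and the 1-to-7 numbering of slots are hypothetical conventions, not the notation used in this study.

```python
N_FOOTPRINT = 7  # nucleotides bound per N subunit, as stated in the text


def n_phase(position: int, footprint: int = N_FOOTPRINT) -> int:
    """Return the slot (1..footprint) that a nucleotide occupies within its N subunit.

    `position` is a 1-based index counted from the terminus where
    encapsidation is assumed to begin (an illustrative assumption).
    """
    if position < 1:
        raise ValueError("position must be a positive 1-based index")
    return (position - 1) % footprint + 1


def phase_shift(position_a: int, position_b: int, footprint: int = N_FOOTPRINT) -> int:
    """Return how far two positions are offset in N-phase (0..footprint-1)."""
    return (position_b - position_a) % footprint


# Positions exactly seven nucleotides apart occupy the same slot.
same_slot = n_phase(3) == n_phase(10)
```

Under this convention, moving a signal by any multiple of seven nucleotides leaves its N-phase unchanged, whereas moving it by one to six nucleotides places its bases in different slots of the N subunit.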
N-phase affects RNA synthesis by paramyxoviral RNA polymerases but, in RSV, promoter recognition is strongly determined by the proximity of the promoter sequence to the 3' terminus of the genome; replication is abolished if the core promoter starts six or more nucleotides from the 3' end [23]. Unlike its effects on promoter recognition and replication, the effects of N-phase on transcription are unexplored.

Transcription in RSV and other NNS viruses is sequential, with genes transcribed in their order of occurrence from the 3' promoter of the genome [18,24-29]. Each of the ten genes of RSV contains essential gene start (GS) and gene end (GE) signals flanking the open reading frame (ORF) [30-32]. Transcription is initiated at the GS signal which also serves as a capping signal on the 5' end of the nascent mRNA [18,33,34]. The polymerase then enters elongation mode until it reaches a GE signal, where the mRNA is polyadenylated and released [18,30]. Two genes overlap at the 5' end of the RSV genome. The GE signal of matrix 2 (M2) occurs downstream of the GS signal of the large polymerase (L) gene. The polymerase must return from the M2 GE signal for full-length L mRNA to be made [35], suggesting that transcribing polymerases scan the RSV genome bidirectionally for a new GS signal after terminating transcription. Indeed, scanning polymerase dynamics may be a universal feature of NNS virus transcription [18,36-38].

By homology with other NNS viruses, it is widely assumed that transcription in RSV follows a gradient, where the extent to which a gene is transcribed falls with its distance from the 3' promoter [29,39,40]. Earlier studies reported data consistent with a gradient [39,41,42]; however, recent studies show mRNA abundances that peak at the G gene, which is located in the middle of the genome [40,43]. We recently reported the G gene to be the most abundant in clinical samples obtained from RSV/A- and RSV/B-infected infants [44]. Thus, existing data suggest that patterns of RSV gene expression are more variable than has been assumed.

Here we explored the natural diversity of patterns of RSV gene expression by using qPCR to measure mRNA abundances of five different RSV genes (NS1, NS2, N, G, F) from isolates that we sequenced belonging to both subgroups and four different

Oligonucleotide standards of known concentration were used to convert cycle threshold (CT) values measured by real-time PCR for mRNA targets (Fig 1A) to mRNA abundances. Twenty oligonucleotide standards and sets of reagents (primers and probe) were needed to quantify 20 mRNA targets (five genes in four isolates). All reagents gave rise to a similar range of CT values for standards at equal concentrations (Fig 1B).
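
Converting CT values to abundances with standards of known concentration is commonly done with a standard curve. The sketch below shows one generic way to do this, assuming CT is linear in log10 of the standard concentration. The dilution series, the CT values, and the function names are hypothetical, and the exact fitting procedure used in this study is not described in the text above.

```python
import math


def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of CT = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ct_to_abundance(ct, slope, intercept):
    """Invert the standard curve to estimate template abundance from a CT value."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series (copies per reaction) and CT values.
standards = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_cts = [33.2, 29.9, 26.6, 23.3, 20.0]

slope, intercept = fit_standard_curve(standards, standard_cts)
estimated_copies = ct_to_abundance(26.6, slope, intercept)
```

Because each target has its own primers and probe, a separate curve would be fit for each of the 20 targets in a scheme like this one.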

All four sets of steady-state mRNA levels were non-gradient, with levels of G mRNA exceeding levels of N mRNA (Fig 3). Steady-state mRNA levels also showed both subgroup- and genotype-specific differences (Fig 3). Between subgroups, relative levels of NS1 and NS2 were most different (Fig 3), with the two being similar in RSV/A, and with NS1 levels exceeding NS2 by a factor of ~5 in RSV/B (Fig 3). Within RSV/A, the level of NS1 exceeded NS2 in the GA1 isolate, and was matched by NS2 in the ON isolate (Fig 3). In RSV/B, the level of G mRNA exceeded N in the BA isolate (~5-fold greater) more than it did in the GB1 isolate (~2-fold greater) (Fig 3). Furthermore, genotype-specific steady-state mRNA levels were comparable in A549, Vero, and HEp-2 cell lines (Fig 4A).

We explored whether relative mRNA levels might change in the context of a fully immunocompetent host. A pair of cotton rats was infected with each virus isolate and both lung lavage (LL) and nasal wash (NW) samples were collected at four days post-infection (pi). Relative mRNA levels were genotype-specific and similar in cotton rat LL and NW samples, and comparable to those measured in vitro (Fig 4B).
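
What "non-gradient" means for a set of measurements can be illustrated with a short sketch. It assumes that relative levels are each gene's share of the summed abundance of the five measured genes, which is one possible normalization and not necessarily the one used here, and that a gradient means levels never increase with distance from the 3' promoter. The abundances are invented for illustration.

```python
# The five measured genes in their 3'-to-5' order on the RSV genome.
GENE_ORDER = ["NS1", "NS2", "N", "G", "F"]


def relative_levels(abundances):
    """Express each gene's abundance as a fraction of the five-gene total."""
    total = sum(abundances[gene] for gene in GENE_ORDER)
    return {gene: abundances[gene] / total for gene in GENE_ORDER}


def is_gradient(abundances):
    """Return True if levels never increase moving away from the 3' promoter."""
    levels = [abundances[gene] for gene in GENE_ORDER]
    return all(upstream >= downstream for upstream, downstream in zip(levels, levels[1:]))


# Hypothetical mRNA copy numbers in which G exceeds the promoter-proximal N.
example = {"NS1": 500.0, "NS2": 450.0, "N": 100.0, "G": 300.0, "F": 80.0}

fractions = relative_levels(example)
gradient_like = is_gradient(example)
```

In this invented data set the check fails at the N-to-G step, which is the same feature reported for all four isolates.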

The observed divergence from a transcription gradient could be the result of differential stability of the RSV mRNAs. Therefore, we measured transcript stabilities by blocking transcription using the RSV RNA-dependent RNA polymerase (RdRp) inhibitor GS-5734 then monitoring mRNA levels by qPCR over time. Decay was measured for all five mRNAs from each of the four isolates in HEp-2 cells (Fig 5A). Exponential decay functions were fit to the data and half-lives were calculated from the decay constants.
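
The relationship between a fitted decay constant and a half-life can be sketched as follows. This example fits a single-exponential decay by linear regression on log-transformed abundances, which is a simple stand-in and may differ from the fitting method used in this study. The time points and abundances are synthetic, and the function names are hypothetical.

```python
import math


def fit_decay_constant(times_hr, abundances):
    """Estimate k in A(t) = A0 * exp(-k * t) by least squares on ln(A) versus t."""
    ys = [math.log(a) for a in abundances]
    n = len(times_hr)
    mean_t = sum(times_hr) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in times_hr)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_hr, ys))
    return -sty / stt  # the slope of ln(A) versus t is -k


def half_life(decay_constant):
    """Convert a first-order decay constant (per hour) to a half-life in hours."""
    return math.log(2) / decay_constant


# Synthetic time course generated with a 16-hour half-life and no noise.
times = [0.0, 6.0, 12.0, 24.0]
true_k = math.log(2) / 16.0
levels = [1000.0 * math.exp(-true_k * t) for t in times]

k = fit_decay_constant(times, levels)
t_half = half_life(k)
```

With noisy data, a weighted or nonlinear fit may be preferable, since the log transform changes how measurement error is weighted across time points.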

Half-lives ranged from 10 to 27 hours with a mean of 16 ± 5 hours (Fig 5B). Distributions of mRNA stabilities varied among the isolates, with GA1 having the greatest uniformity and lowest mean (12 ± 1 hours) (Fig 5A). Gene expression patterns were estimated by correcting measured mRNA abundances for degradation and recalculating relative mRNA levels (mRNA expressed = measured mRNA # × e^(decay constant × 24 hr)). Estimated levels of gene expression remained non-gradient; thus, differential mRNA stabilities do not account for the non-gradient patterns observed (Fig 5C). These data indicate that relative mRNA levels 1) are more strongly shaped by gene expression than decay and 2) can safely be interpreted to reflect levels of gene expression.

Whole genome sequences of the four RSV isolates were obtained by next-generation sequencing and analyzed for differences in GS signals that might help explain the non-gradient gene expression patterns observed. GS signals were highly conserved, with a single U to C substitution at position ten of the G gene GS signal (Fig 6A).

We analyzed GS signal sequences for their alignment with N protein, as the alignment of a GS signal with bound N protein will affect its conformation and determine its configuration of solvent-exposed and buried nucleobases [16]. The alignment of a GS signal with N protein (N-phase) might therefore affect interactions with scanning

proxy as the exact 5' terminus of each RSV genotype is not known. Thus, the estimated GS signal N-phase will differ from the actual N-phase if the nucleotide length beyond the end of the L GE signal is not equal to an integer multiple of seven. However, every GS signal N-phase within a genotype would be uniformly affected, making estimated intra-genotype differences equal to actual intra-genotype differences. GS signal N-phase was highly variable, making it a potential source of the variation observed in patterns of gene expression (Fig 6B).

two states with low, and one state with intermediate activity (Fig 7C). Ratios increased by as much as 50% relative to the minimum measured (Fig 7C). Thus, the N-phase of the Firefly luciferase GS signal affected the relative level of gene expression and, by inference, transcription initiation (Fig 7C). Furthermore, ratios of luciferase activity were consistent with a periodicity of seven nucleotides (Fig 7C).

We observed genotype-dependent and non-gradient patterns of RSV gene expression. We hypothesize that non-gradient patterns require a mechanism to alter the

oligonucleotide standards, and support the accuracy of our approach to measuring viral mRNA abundances.

A gene expression gradient has been widely assumed for RSV, but supporting data come from a modest number of studies and are largely restricted to laboratory-adapted isolates (Long and A2) from the prototypic GA1 genotype of subgroup A. The first measurements were made by Collins and Wertz (1983) using an A2 strain in HEp-2 cells [28,42,49]. They discovered the gene order of RSV and found it was

Our minigenome data suggest that polymerases preferentially initiate transcription at GS signals with certain solvent-exposed nucleobases (3C and 10U of the RSV GS signal). What accounts for this preference, and what events follow GS signal recognition and lead to either transcription initiation or continued scanning, are unknown. It is interesting that the U to C substitution in position ten of the G gene GS signal has been shown to result in less, not more, transcription [30]. Thus, additional factor(s) beyond GS signal N-phase may account for over-expression of the G gene. It is worth stating that transcription initiation, being a molecular event, must be stochastic. RSV transcription is therefore sequential but likely not obligatorily sequential. A relative excess of G gene mRNA can occur from polymerases, more often than not, failing to initiate transcription at the N gene before initiating at the G gene. It is also possible that the N gene is usually expressed before the G gene, but G mRNA accumulates more