Genome-wide transposon mutagenesis of paramyxoviruses reveals constraints on genomic plasticity

The antigenic and genomic stability of paramyxoviruses remains a mystery. Here, we evaluate the genetic plasticity of Sendai virus (SeV) and mumps virus (MuV), sialic acid-using paramyxoviruses that infect mammals from two Paramyxoviridae subfamilies (Orthoparamyxovirinae and Rubulavirinae). We performed saturating whole-genome transposon insertional mutagenesis, and identified important commonalities: disordered regions in the N and P genes near the 3' genomic end were more tolerant to insertional disruptions; but the envelope glycoproteins were not, highlighting structural constraints that contribute to the restricted antigenic drift in paramyxoviruses. Nonetheless, when we applied our strategy to a fusion-defective Newcastle disease virus (Avulavirinae subfamily), we could select for F-revertants and other insertants in the 5' end of the genome. Our genome-wide interrogation of representative paramyxovirus genomes from all three Paramyxoviridae subfamilies provides a family-wide context in which to explore specific variations within and among paramyxovirus genera and species.


The Paramyxoviridae family encompasses a diverse and ever-expanding range of mammalian pathogens, including such familiar human viruses as measles (MeV), mumps (MuV), parainfluenza, and henipaviruses (1). Paramyxoviruses (PMVs) are negative-sense, single-stranded RNA viruses with genes coding for at least six major proteins: nucleocapsid (N), phosphoprotein (P), matrix (M), fusion glycoprotein (F), receptor binding protein (RBP, formerly designated variously as HN, H, or G), and the large protein (L) that possesses RNA-dependent RNA polymerase (RdRp) activity (2). In addition, different virus species encode a host of accessory proteins from the P gene. Others have additional less well-characterized genes (e.g. for small-hydrophobic (SH) proteins in some orthorubulaviruses). Despite persisting in human populations for centuries, individual PMVs show a remarkable lack of antigenic variability within the common envelope glycoproteins (F and HN/H/G), and often cross-react to antibodies raised against closely-related viruses (3-5). For example, the MeV and MuV strains used in the MMR vaccines have not changed in the last 40 years, and yet are still protective against current field isolates (6). Indeed, a MuV-like virus isolated from bats (7) is cross-neutralized by mumps-vaccinated human sera (8), and the latest ICTV classification considers this bat mumps virus as a strain of MuV rather than a new Orthorubulavirus species (2). This is in contrast to the well-known propensity for antigenic drift of influenza virus, another negative-sense RNA virus, in response to various pressures including host populations' adaptive immune responses (9).

Our lab previously examined the overall genetic plasticity of a vaccine-strain MeV through whole-genome transposon insertional mutagenesis (10), and found that unlike influenza virus (11), MeV did not tolerate insertional changes in its surface glycoproteins, F and H.
MeV also demonstrated a greater overall intolerance for mutagenesis throughout its genome, concomitant with the known increased genetic stability of PMVs (3, 12). However, MeV-H utilizes protein receptors to mediate virus entry (13), while a wide range of other PMVs use sialic acids (SAs) to facilitate entry (14), like influenza does. Thus, we sought to learn whether divergent SA-using PMVs would demonstrate a tolerance in their attachment glycoproteins that
insertion-tolerant regions within the PMV genomes that can be exploited for engineering recombinant PMVs.

Sendai virus broadly tolerates insertions in the 3' end of the genome
In order to identify which regions of the SeV genome can best accommodate insertional mutations, we utilized a Mu-transposon insertional mutagenesis strategy (Fig. 1A) to introduce 15 nucleotide (nt) insertions throughout the SeV genome, which is equivalent to 5 amino acid (aa) insertions if the transposon lands in an open reading frame (ORF). The insertional mutagenesis library approach is a more disruptive approach to interrogating the PMV genome, in comparison to single-nucleotide mutagenesis approaches that are more suited to interrogating single genes. However, insertants induce a severe selective pressure on the virus, which is helpful for whole-genome interrogation. Briefly, we first generated a SeV genomic plasmid with an extra 3-nt stop codon added at the end of the EGFP reporter gene (SeV 6n+3). Since PMV genomes follow the "rule-of-six", where the entire genome length must be an exact multiple of six (6n) in order to replicate well (15), SeV 6n+3 should be rescued inefficiently and replicate even less so on its own (Fig. 1B and data not shown). The same is true for MuV, whose genome we also interrogated similarly in this study (Fig. 1C). Since our intention was to understand the tolerance of each of the virus' native genetic regions for insertions, we then excluded the reporter gene from all downstream analyses. We subjected this SeV 6n+3 plasmid ('parental 6n+3' in Fig. 1A) to Mu-transposon mutagenesis using optimized conditions to achieve saturating mutagenesis, ultimately leaving random 15-nt insertions across the genome. Transposon-mutated genomic plasmid libraries are therefore 6n+18 (Fig. 1A), restoring the "rule-of-six", and should be more competent for rescue and replication (Fig. 1B, C).
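The length bookkeeping behind this design can be sketched in a few lines of Python. This is an illustrative sketch only: the 3-nt stop codon and 15-nt scar come from the text, while the function names and the placeholder genome length are hypothetical and do not correspond to any actual SeV or MuV construct.

```python
# Illustrative sketch of the rule-of-six arithmetic described above.
# EXTRA_STOP_NT and SCAR_NT are taken from the text; everything else is hypothetical.

EXTRA_STOP_NT = 3   # extra stop codon added after the reporter gene
SCAR_NT = 15        # transposon scar left behind after transposon removal


def obeys_rule_of_six(genome_length: int) -> bool:
    """Return True when the genome length is an exact multiple of six."""
    return genome_length % 6 == 0


def library_lengths(rule_of_six_length: int) -> dict:
    """Lengths of the starting (6n), parental (6n+3) and insertant (6n+18) genomes."""
    parental = rule_of_six_length + EXTRA_STOP_NT
    insertant = parental + SCAR_NT
    return {
        "6n": rule_of_six_length,
        "parental 6n+3": parental,
        "insertant 6n+18": insertant,
    }


# Placeholder 6n length (an assumption, not a real genome length).
HYPOTHETICAL_6N_LENGTH = 6 * 2700

lengths = library_lengths(HYPOTHETICAL_6N_LENGTH)
for name, length in lengths.items():
    print(f"{name}: {length} nt, rule of six satisfied: {obeys_rule_of_six(length)}")

# A 15-nt insertion inside an ORF adds 15 / 3 = 5 codons.
print("amino acids added per ORF insertion:", SCAR_NT // 3)
```

Only genomes that received the scar return to a multiple of six, which is why untransposed 6n+3 genomes are expected to be disfavored during rescue.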
In addition, any genomes that failed to receive the transposon remain 6n+3 in length; these genomes cannot be rescued well, and ultimately will not be represented in our sequencing analyses because they lack the transposon sequence. Importantly, the 15-nt transposon 'scar' itself is designed to be translatable in all three reading frames. The SeV 6n+18 insertional library was rescued in BSRT7 cells with rescue events (~3 x 10^5) equal to approximately 19-fold coverage of the genome (Supplementary Table S1). Rescued virus from the supernatant was passaged twice in biological triplicates at a low multiplicity of infection (moi) on Vero cells, which should screen for the replicative fitness of any given insertion. We chose Vero cells, which lack interferon signaling, as a neutral background to remove confounding antiviral selective pressures in our experiments. In particular, the P gene encodes the phosphoprotein, an essential co-factor for the viral transcriptase and replicase, but also encodes accessory proteins to mitigate the host interferon response. In the absence of interferon signaling, insertants that disrupt accessory proteins are not selected against and we can better explore the structural and functional plasticity of the phosphoprotein itself. Thereafter, the original library plasmid pool (input), supernatant from the rescue (P0), and passages (P1, P2) were deep-sequenced and evaluated for the presence of the 15-nt insertion.

In the input plasmid pool, insertions were found at 65.9% of nucleotide sites, and ultimately targeted ~93% of amino acids throughout the genome (Supplementary Table S1: Library metrics). Importantly, insertions were distributed evenly across the genome in the plasmid pool (Supplementary Fig. S1), demonstrating no bias in the input SeV 6n+18 library.

We mapped insertion coverage from P0, P1, and P2 onto the SeV genome (Fig.
2A-C) and observed a clear purifying selection upon passaging, as expected. At P2, we observed a clear enrichment for insertions in the N and P genes, located at the 3' end of the genome, and a depletion of insertions elsewhere in the genome. The magnitude and location of insertional enrichment are relatively reproducible between each of the three replicates seen in Fig. 2D. The consistent preference for insertions in the 3' end of the genome, and particularly the non-coding region between N and P, is also clear. To determine if there were other broad patterns to insertant enrichment, we then analyzed the change in insertional frequency from P0 to P2 in the 5' UTR, ORF, and 3' UTR of each viral gene (Fig. 2E). We observed significant enrichment of the 5'UTR-P over its cognate ORF; similar trends can be seen with the other ORFs and their cognate UTRs when they are well-represented at P2. HN and L genes have much smaller UTRs and drastically fewer insertants by P2, making interpretation of their relative enrichments difficult. We also specifically observed that the 5' UTR of the N gene and 3' UTR of the L gene both show reduced insertions relative to their neighboring regions, indicative of the stringency of the additional roles these regions play as the 3' leader and 5' trailer sequences of the virus, required for viral genomic amplification.

Selected SeV insertants have a different fitness hierarchy in focused competition assays
To evaluate the validity of the results from our pooled rescue and passaging experiments, we re-generated a subset of the most-highly represented SeV 6n+18 single insertants (see Supplementary Table S2). At times, our choices were dictated by cloning successes as some insertants were inexplicably refractory to cloning.
To begin, we chose two insertants from each of the most-enriched regions (N-ORF, 3'-UTR-N, 5'-UTR-P, and P-ORF); generally, we chose highly-represented insertants from the original library, but we also chose insertants that are more evenly distributed across those regions of interest. Since some insertants did not rescue (Fig. 3A, discussed below) we included an additional N-ORF-1405 insertant to our panel to maintain representation of highly-enriched regions. Finally, we added the most-enriched insertant for each of the remaining M, F, HN, and L genes, for a total of thirteen insertants (Fig. 3A). We rescued each insertant and amplified it separately, monitoring viral replication kinetics (Fig. 3B, C) and peak titer (Fig. 3A). Two of the insertants (HN-ORF and L-ORF) could not produce infectious virus upon rescue with our highly-efficient reverse genetics system (Fig. 3A, Supplementary Table S2), and a further three insertants (N-ORF-1684, M-ORF, and F-ORF) produced peak titers that were too low for downstream applications (Fig. 3A), indicating that these insertants likely relied on other genomes in the original pool to complement their defects in replication. The remaining eight insertants grew well relative to wild-type SeV, and produced sufficient virus for use in our downstream assay.

Next, we sought to evaluate the fitness hierarchy of the insertants. Using a multiplex competition assay that more closely reflects the selective pressures in the passaging library screen, we evaluated if the insertants' relative abundance in this assay would correlate with either their peak titer or representation in the original library. So, the eight insertants that could be rescued and replicated well, four each from the N and P genes (Fig. 3B-C), were pooled at equal titers and used to infect Vero cells at a total moi of 0.01.
We then monitored the fitness of the selected insertants in this competitive outgrowth assay by measuring their expansion over two passages, using MinION long-read sequencing (Fig. 3D, E). To better understand the biological meaning of our library output (i.e., insert abundance at P2) and which assays predicted each other's outcomes, we ranked our selected insertants relative to each other in each assay. We found that peak titer was tightly correlated with performance in the competitive outgrowth assay (Fig. 3F), but that ranking from the library screen did not predict either downstream measure of relative fitness (Fig. 3G). This suggests that the SeV 6n+18 screening library output was predictive of insertant viability (i.e. whether it is replication-competent) but not necessarily of relative fitness compared to other genomes. Thereafter, to determine if the complex epistatic interactions we observed in SeV apply to other sialic acid-using paramyxoviruses, we next turned our analysis to the orthorubulavirus MuV.

Mumps virus is less tolerant to insertional mutagenesis
Using the same rule-of-six-based strategy as we did with SeV (Fig. 1A, C), we generated a MuV 6n+3 parental genome with which to carry out whole-genome transposon insertional mutagenesis and produced a MuV 6n+18 saturating library of insertants. This library was rescued in BSRT7 cells to ensure a minimum of ten-fold coverage of the genome (>1.6 x 10^5 rescue events), then passaged twice on Vero cells in biological triplicates as was done for the SeV 6n+18 insertional mutagenesis library. Coverage metrics from deep sequencing are found in Supplementary Table S1  MuV and SeV, respectively, at P0), and an unexpectedly low titer of P0 rescued virus (3.2x10^2 iu/mL vs. 2.2x10^5 iu/mL for SeV). These initial results suggest that the MuV genome has a much lower overall tolerance for insertions.
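The fold-coverage figures quoted for both libraries can be approximated with a short calculation. This is a rough sketch under stated assumptions: coverage is taken here as rescue events divided by the number of possible insertion positions (one per nucleotide), and the genome length of 15,384 nt is an assumed round figure for a paramyxovirus genome of roughly 15.4 kb, not a value given in the text. The helper name is hypothetical.

```python
# Back-of-the-envelope fold coverage, assuming one possible insertion
# position per genome nucleotide. ASSUMED_GENOME_NT is an assumption;
# the reporter cassette is ignored, as it is in the downstream analyses.

ASSUMED_GENOME_NT = 15_384


def fold_coverage(rescue_events: float, genome_length_nt: int) -> float:
    """Average number of independent rescue events per insertion position."""
    return rescue_events / genome_length_nt


sev_coverage = fold_coverage(3e5, ASSUMED_GENOME_NT)    # ~3 x 10^5 events (SeV)
muv_coverage = fold_coverage(1.6e5, ASSUMED_GENOME_NT)  # >1.6 x 10^5 events (MuV)

print(f"SeV library: about {sev_coverage:.1f}-fold coverage")
print(f"MuV library: at least about {muv_coverage:.1f}-fold coverage")
```

Under these assumptions the results land close to the approximately 19-fold and ten-fold coverage reported for the SeV and MuV libraries, respectively.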

Mumps virus tolerates insertions in the 3' end of its genome
Due to the low titer from rescue (P0), P1 was carried out at an moi of 0.0001. After two passages, the MuV 6n+18 library also showed evidence of purifying selection (Fig. 4A-C). We observed some enrichment for insertions at the 3' end of the genome (N and V/P genes) as was seen with SeV, and a surprising secondary peak in the F coding region (Fig. 4C). Stream graphs of each passage replicate (Fig. 4D) show that in comparison with SeV, there was more variability in the regions of the MuV 6n+18 library that were enriched between replicates. This may be a function of the reduced overall coverage of the library, which may allow for stochastic rescue and amplification of viruses that pass a certain viability threshold. Finally, in comparing the specific gene regions that permitted insertions (Fig. 4E), we observed a preference for only certain UTRs, such as the 5' UTRs of V/P and SH, over gene coding regions.
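One simple way to express the per-region change in insertional frequency between P0 and P2 is a log2 fold change of each region's share of insertion-bearing reads. The sketch below is illustrative only: the read counts are invented, the pseudocount is an assumption added to avoid dividing by zero, and the function names are hypothetical. It is not the analysis pipeline used for Fig. 2E or Fig. 4E.

```python
import math

# Hypothetical insertion-bearing read counts per region (invented numbers).
P0_COUNTS = {"N-ORF": 400, "P-5'UTR": 100, "F-ORF": 300, "L-ORF": 1200}
P2_COUNTS = {"N-ORF": 900, "P-5'UTR": 700, "F-ORF": 20, "L-ORF": 30}


def region_fractions(counts: dict, pseudocount: float = 1.0) -> dict:
    """Share of insertion reads per region, with a pseudocount (an assumption)."""
    adjusted = {region: count + pseudocount for region, count in counts.items()}
    total = sum(adjusted.values())
    return {region: value / total for region, value in adjusted.items()}


def log2_enrichment(before: dict, after: dict) -> dict:
    """log2(after share / before share); positive values indicate enrichment."""
    f_before = region_fractions(before)
    f_after = region_fractions(after)
    return {region: math.log2(f_after[region] / f_before[region]) for region in before}


enrichment = log2_enrichment(P0_COUNTS, P2_COUNTS)
for region, value in sorted(enrichment.items(), key=lambda item: -item[1]):
    label = "enriched" if value > 0 else "depleted"
    print(f"{region}: log2 fold change {value:+.2f} ({label})")
```

Because the shares are normalized within each passage, a region can appear enriched simply because other regions were depleted, so such values describe relative rather than absolute tolerance.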

Mumps virus competitive outgrowth assay reflects threshold viability of enriched insertants in library screen
In order to assess the validity of the MuV 6n+18 library results, we chose the most-highly enriched insertants overall from the library (Supplementary Table S3), representing the N and V/P regions of the genome, as well as the F coding region. Once again, we also included the highest represented insertant from each of the remaining genes (M, HN, and L) in an attempt to more evenly evaluate the library, and rescued each of these insertants. As with SeV, we were unable to rescue the insertants in M, HN, and L (Supplementary Table S3, Fig. 5A), but we were surprised to find that the F insertants also could not be rescued to produce infectious virus. It is likely that these insertants became enriched in the context of the library at P2 by relying on either compensatory mutations or complementation by other genomes. Precedent for the latter is demonstrated by the G264R MeV-F mutant: in the context of an adversely tagged MeV-H where neither wild-type nor G264R MeV-F resulted in syncytia, only viruses with diploid genomes independently bearing the wild-type and G264R MeV-F are able to form syncytia (16).

Once rescued, we evaluated the remaining successful insertants for growth kinetics and peak titers (Fig. 5A-C), noting that overall these insertants grew well relative to wild-type MuV. We then pooled six of the insertants at equal titers, and infected Vero cells with a total moi of 0.01 in a competitive outgrowth assay. Because the sequencing resolution afforded by the Oxford Nanopore MinION cannot consistently distinguish between insertants P-5'UTR-1976 and -1977, which are shifted by only a single nucleotide, we selected P-5'UTR-1977 for use in the competitive outgrowth assay as it was best-represented in the original library screening (Supplementary Table S3). Over two passages, we evaluated the distribution of the insertants (Fig.
5D, E), and unlike SeV, observed a clear dominance of the N coding region insertants, particularly N-ORF-1781, over all the other insertants. The other insertants were only found at 4-40 reads out of the ~1,000 reads in each sample. N-ORF-1386 is a distant second, but still clearly dominant over the other clones. These two insertants also showed the highest peak titer in Fig. 5A-C, along with the excluded P-5'UTR-1976. Evaluating the three assays for correlation by ranking, we determined that peak titer and competitive outgrowth were best correlated (Fig. 5F), while neither correlated well with the original library (Fig. 5G), similar to what we observed with SeV. Cumulatively, this indicates that while the library screen was valuable for identifying viable insertants, it was not predictive of relative fitness in downstream assays, whereas fitness in one downstream assay predicts relative fitness in another reasonably well.
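A comparison of assays by ranking, as described for both SeV and MuV, can be illustrated with a Spearman rank correlation. The text does not state which statistic was used, so Spearman's rho is an assumption here, and the six sets of measurements below are invented purely to show the calculation. The sketch uses the textbook formula for data without ties.

```python
# Spearman rank correlation without ties: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
# All measurements below are hypothetical; they are not values from this study.


def ranks(values: list) -> list:
    """Rank values from highest (rank 1) to lowest, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x: list, y: list) -> float:
    """Spearman's rho for two equal-length lists with no tied values."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Six hypothetical insertants measured in three assays.
peak_titer_log10 = [8.1, 7.9, 7.2, 6.8, 6.5, 6.0]
outgrowth_fraction = [0.70, 0.20, 0.04, 0.03, 0.02, 0.01]
library_reads_p2 = [120, 900, 300, 1500, 50, 700]

titer_vs_outgrowth = spearman_rho(peak_titer_log10, outgrowth_fraction)
titer_vs_library = spearman_rho(peak_titer_log10, library_reads_p2)

print(f"peak titer vs competitive outgrowth: rho = {titer_vs_outgrowth:.2f}")
print(f"peak titer vs library abundance:     rho = {titer_vs_library:.2f}")
```

In this invented data set the two clonal assays rank the insertants identically while library abundance is nearly uncorrelated with either, mirroring the qualitative pattern reported above.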

Fusion-defective Newcastle disease virus allows access to novel fitness landscapes
In order to (1) test the selective power of our transposon mutagenesis experimental set-up, and (2) determine if the consistent enrichment of insertants in the 3' end of the genome is a technical artifact of our system, we created a fusion-incompetent NDV by changing a naturally occurring NotI site in the fusion peptide of our NDV genome (see schematic for NDV Fmut in Fig. 6A). Recall that our transposon mutagenesis screen requires that the plasmid encoding the viral genome be free of NotI sites. We hypothesized that the vast majority of insertants in this fusion-defective (NDV Fmut) genomic background would not be viable and could not be rescued unless (i) the insertant(s) directly compensated for the fusion-peptide mutation (F A138T), and/or (ii) the insertants occurred on a fusion-revertant genome.
Our NDV Fmut genomic plasmid library had a serendipitous skew in insertions towards the 5' end of the genome that was not caused by sequencing bias (compare Fig. 6B to 6C). We also observed that this skew was maintained in the P0 rescue population (Fig. 6D), which reflects the likelihood that a wide range of NDV insertants were competent for genome amplification and budding. This suggests that NDV, like SeV, has a high overall capacity for genetic plasticity. However, the P0 infectious titer was extremely low at 10 iu/mL (Supplementary Table S4). This is expected since the vast majority of rescue events from the NDV Fmut genomic library would result in the production of non-infectious virion particles. To further increase the selection pressure by genetic "bottlenecking", NDV Fmut (6n+18) P1 was carried out at an extremely low moi (<10^-5). We observed a clear response to the bottleneck selective pressure upon subsequent passaging (P1 and P2, Fig. 6E and F), where the capacity for productive entry and fusion is essential for viral fitness, replication, and eventual amplification under the conditions examined.

NDV Fmut background selects for compensatory insertants in fusion protein
Remarkably, when we analyzed the insertant enrichment over two passages (Fig. 6D-F, Supplementary Table S5), we found that a vast majority of insertants were clustered around nts 11867-11877 in the L gene, with a subset of F insertants clustered around nt 5383 in NDV-F. This unusual distribution demonstrates that our experimental system is not inherently biased towards selection of insertants in the 3' end of paramyxovirus genomes. As we had with our previous libraries, we recreated individual selected insertants and attempted rescue (Supplementary Table S5), but found that only the F-insertants were competent for virus spread while still maintaining the original F A138T fusion-inactivating mutation in the NDV Fmut genome (e.g. F-ORF-5367 and F-ORF-5384, Supplementary Table S5). These insertants correspond to the hinge region between domains III and I in the fusion protein, which has been implicated in fusion regulation (17). Thus, we identified insertants that specifically compensated for the alteration of the highly conserved A138 residue in the F-protein fusion peptide.

Input transposon distribution drives selection of L-insertants associated with Fmut-revertants
In addition to insertants that directly compensated for our fusion-peptide mutation, our hypothesis predicts that enrichment of other apparently viable insertants should occur on fusion-revertant genomes. Any such insertants should also follow the frequency distribution of the original input library. For example, since the transposon coverage of the input library was skewed towards the 5' end of the NDV Fmut genome by approximately ten-fold relative to the 3' end of the genome (Fig. 6C), then any potential NDV-Fmut-reversion and/or compensatory point mutations should also be more likely to occur in accordance with the probability distribution associated with 5'-skewed insertants. This is relevant when examining the highly-represented L-insertants. We noted that these insertants' genomes were replication competent in the rescue cells, but did not produce infectious virus particles (Supplementary Table S5). When we re-examined the deep sequencing results in toto from NDV P2, we further identified high-abundance single nucleotide mutations in the NDV structural genes (M, F, and HN, the latter now formally designated as RBP (2)). Since L-ORF-11872 constituted such a high proportion of insertants in P2 ( Table S5). Thus, reversion of the F A138T point mutation was most likely responsible for the L-insertant's relative fitness within the NDV library.

In order to explore the genetic plasticity of sialic acid-using paramyxoviruses, we generated saturating transposon insertional mutagenesis libraries of SeV and MuV, and then rescued and passaged these libraries to select for relatively fit insertants. We found that both SeV (Fig. 2) and MuV (Fig. 4) tolerated insertions in the N and P genes, and especially in their untranslated regions. When we rescued selected clones of the most highly-enriched insertants, we found that overall capacity for rescue correlated

Broadly, the data from these libraries correlate well with our earlier work on insertional mutagenesis of MeV. Our original intent in adding SeV and MuV libraries to our insertional mutagenesis repertoire was to identify whether usage of sialic acid would permit greater tolerance to structural change (11) or whether we would still observe significant constraint on PMV glycoproteins that could explain their well-known lack of antigenic drift (3) as we saw with MeV (10). We found that SeV and MuV do not show increased tolerance for insertions in their glycoproteins. This is in contrast to the genetic and structural plasticity observed in the HA (hemagglutinin) glycoproteins of sialic acid-using influenza viruses (11, 18), indicating that receptor usage does not determine tolerance to insertional mutagenesis. Even when we disabled NDV fusion as a means of forcing change in the virus' structural region, we observed a strong selection for reversion mutants, and only a lesser accumulation of compensatory insertants. When we tested the viability of individual insertants in these regions with SeV and MuV, we found that these viruses were incompetent for virus amplification. These observations are consistent with the monoserotypic nature of most PMV species (3-6, 8), and with careful analysis of evolutionary constraints on PMV fusion proteins (19).
ORF.
We therefore hypothesize that highly-expressed genes like N and P can tolerate some dysregulation without significant negative effects, whereas the intolerance to dysregulation of less-abundant genes like F, H/HN, and L may be an indicator of how stringently-regulated they are. Our studies suggest that the regulatory functions of these intergenic regions in PMVs (21-24) should be systematically explored in their appropriate genomic contexts, which is now possible using robust and efficient reverse genetics systems.

In SeV and MuV, these enriched UTRs near the 3' end of the genome are co-incidentally near the eGFP reporter gene, which is between N and P. Nonetheless, this likely reflects an aspect of virus biology rather than a simple proximity to the reporter gene since we also previously observed an increased tolerance for insertants in the 3' end of the MeV genome (10). In addition, the eGFP reporter in our NDV reporter genome is located between P and M, but we do not see enrichment for insertion in those untranslated regions.
We also observed an enrichment for insertions in the coding regions of N and P in our fusion-competent libraries. Paramyxoviral P genes code for multiple accessory proteins in different frames by use of alternative translation start sites (C proteins) and by mRNA editing (V, W), and these proteins are largely involved in blocking host antiviral sensing and response. Despite the constraint of coding in multiple frames, we observed enrichment for insertions in all our fusion-competent libraries at the 5' (N-terminus) end of the P gene, the region common to all the ORFs. However, the C, V, and W proteins of PMVs bear unstructured regions and are highly variable between virus species and genera (1, 25, 26), while the N-terminus of P is specifically understood to be intrinsically disordered (27). Together, this may explain the unexpected insertional tolerance in P. N insertions are also found primarily in the unstructured C-terminus "Ntail" region, which has already been shown to accommodate insertions and deletions with limited negative impact on MeV in tissue culture (28, 29). We thus propose that our transposon mutagenesis enriches for insertants in such unstructured regions of proteins, since specific functional elements will remain accessible regardless of upstream and downstream insertions.
be far more disordered, and in fact an unstructured region near CR-VI has been shown to tolerate both large epitope tags and a complete break of the polymerase into two separate ORFs, as long as they are brought back into contact by artificial domains (30). While we cannot determine causation within the library setting, follow-up failed attempts to rescue the L-insertants alone without revertant F mutations do not suggest that these insertants specifically potentiated acquisition of point mutations; i.e. we can find no evidence that the enriched L-insertants are more error-prone polymerases.
Thus, a much more detailed structural analysis and mutagenesis exploration of NDV-L, as well as other PMV polymerases, will be required to understand what determines this region's specific tolerance to insertion. It is interesting to note that since this region was not predicted by structural analyses, a functional insertional mutagenesis assay does still have information to offer for designing sites for tagging viral proteins or inserting novel tandem ORFs and fusion proteins.

Within this body of work, we compared fitness in the screening libraries with individual insertant clonal fitness, and competitive fitness in the more-contained competitive outgrowth assays. We found that relative clonal fitness (as measured by growth curves and peak titer) correlated well with relative fitness in competitive outgrowth assays, particularly for SeV. However, in the context of the library setting, the number of insertant reads at P2 appears to be more affected by complex epistatic factors. The ranked frequency of insertant reads at P2 did not always match their clonal or competitive fitness in more controlled assays. Altogether, the evidence suggests that the transposon library approach, without the more careful downstream analyses shown in our studies, is best suited to dissecting viable vs non-viable viral genomes in our assay setting, rather than predicting the relative fitness of individual insertants.

By placing our NDV Fmut library under a unique form of selective pressure, we drove enrichment of insertants in otherwise-intolerant regions of the genome: F, a structural protein, and L, the viral polymerase. Thus, we have demonstrated the power of our strategy to reveal not only the genomic plasticity of paramyxovirus genomes, but also the ability to use our methods to design arbitrarily selective screening campaigns to interrogate paramyxovirus biology.
Specifically, we envision leveraging our efficient and robust reverse genetics systems to design and execute selection strategies that can be used to interrogate the fitness landscapes of individual genes that were previously not accessible using conventional paramyxovirus passaging and selection. Furthermore, we have shown the viability of designing strategies to select for mutants in fitness landscapes that are otherwise not easily accessible during the normal course of paramyxovirus evolution.

In toto, we found common regions of tolerance and intolerance for insertions in PMV genomes, specifically identifying tolerance to dysregulation of highly-expressed genes. We further noted that there are structural constraints on changing PMV antigenicity, and that unstructured regions in the N, P, and accessory proteins permit insertional mutagenesis. We demonstrated that this highly-disruptive whole-genome insertional mutagenesis library approach could be informative for paramyxoviruses placed under unique selective pressures: not only such genetic pressures as we used here, but also perturbations like interferon treatment or amplification in susceptible host animals. Overall, the combined commonalities and differences in these paramyxovirus mutagenesis libraries provide a broader family-wide context in which to understand specific variations within PMV genera or species.

Experimental Design.
Our transposon mutagenesis strategy for whole genome interrogation of paramyxoviruses is outlined in Fig. 1 and the accompanying text. It takes advantage of our efficient and robust reverse genetics system (31) and leverages the rule-of-six (15), the latter being a unique feature of paramyxovirus replication. Whole genome insertional mutagenesis libraries were generated for three paramyxoviruses described below, and these libraries were rescued (P0) and passaged (P1, P2) in tissue culture to identify genetic regions that were relatively tolerant to insertion for downstream characterization.

Research Objectives: Through this study we sought (i) to identify genetic regions or determinants of plasticity common to sialic acid-using paramyxoviruses, (ii
genes that enable trypsin independent growth (35). The MuV JL5-EGFP strain is derived from the JL5 vaccine strain. NDV-eGFP was based on LaSota strain with mutations in its cleavage site to be cleaved by urokinase-type plasminogen activator (uPA) (36). In order to attenuate viral genomes that lack the transposon insertion, we introduced an extra 3-nt stop codon after the reporter gene in each construct, rendering the viral genome 6n+3 nucleotides; these are indicated as SeV 6n+3, MuV 6n+3, and NDV 6n+3. We also abolished NotI restriction sites in each virus' plasmid in order to facilitate transposon removal. All plasmid modifications were performed using overlap PCR mutagenesis with InFusion cloning (Takara Biosciences, USA). Viral genome and support plasmids were maintained in chemically-competent Stbl2 E. coli cells (ThermoFisher Scientific, USA) with growth at 30°C.
Supplementary Table S6 contains the primer sequences for generating all recombinant insertant plasmid genomes.
Nucleotide positions in the genome are numbered without the eGFP transcriptional unit, and the insertant position refers to the nucleotide after which the transposon sequence begins. Insertants are named by genetic region and nucleotide position.

Transposon-mediated mutagenesis.
The Mutation Generation System (ThermoFisher Scientific, USA) was used to randomly insert transposons into the 6n+3 genomic plasmids using a modified protocol. An in vitro transposon insertion reaction was performed on approximately 850 ng per viral genome plasmid (40 ng DNA per kb of plasmid) of 6n+3 genomic plasmids, which were dialyzed twice for 30 min in 1 L ddH2O and then transformed into ElectroSHOX cells (Bioline USA, discontinued). Following transformation, the cells were plated on 20 x 15 cm plates with LB agar containing ampicillin (MilliporeSigma, USA) and kanamycin (ThermoFisher Scientific, USA) (selecting for plasmid transformants and transposon insertion, respectively) and allowed to grow for ~18 hours at 30°C. The bacterial colonies were scraped from the agar with PBS and pelleted, and DNA was extracted from the pooled colonies using a PureLink HiPure maxiprep kit (ThermoFisher Scientific, USA). 30 µg of transposon-containing genomic plasmid was digested with NotI-HF (New England Biolabs, USA) for 3 hours to remove the transposon body. The restricted plasmid was then gel purified using the Qiaex II kit (Qiagen, USA), and 500 ng of the DNA was re-ligated at 25°C for 30 minutes using T4 DNA Ligase (New England Biolabs, USA) and heat-inactivated at 65°C for 10 minutes. The entire ligation mixture was dialyzed for 20 min in 1 L ddH2O, and then transformed into ElectroSHOX cells and plated on 20 x 15 cm plates containing ampicillin only.
After ~18 hours' growth at 30°C, the colonies containing 6n+3 viral genomes with the transposon scar (6n+18) were again scraped from the plates into PBS, and the viral 6n+18 genome DNA was again extracted using the HiPure maxiprep kit.

Rescue of recombinant viruses (P0) from cDNA.
Recombinant viruses were rescued as described in Beaty et al. (2017). 4x10^5 BSR T7/5 cells per well were seeded in 6-well plates. The following day, DNA and Lipofectamine LTX / PLUS reagent (ThermoFisher Scientific, USA) were combined as indicated in Supplementary Table S7 in OptiMEM (ThermoFisher Scientific, USA) with gentle mixing by pipetting only. After incubation at room temperature for 30 minutes, the DNA:lipofectamine mixture was added dropwise onto cells. Separate transfection reactions were set up for each rescue well. Transfected cells were incubated at 37°C for 8-10 days, until the level of infection reached 100% as determined by observation of GFP-positive cells by microscopy. Supernatant was collected from rescue cells, pooled, and clarified by centrifugation. Clarified supernatants were stored at -80°C.

Analysis of relative rescue efficiency. SeV-WT, SeV-parental (6n+3), and SeV-library (6n+18) genomes were rescued as described in detail above. Two days post-rescue, cells were collected with PBS + 50 mM EDTA, pelleted, and re-suspended in 2% paraformaldehyde for fixation. After 15 minutes, cells were pelleted again and re-suspended in PBS + 2 mM EDTA + 2% FBS. Cells were assayed by flow cytometry on a BD FACSCantoII with BD FACSDiva v6.0, and evaluated for GFP-positivity in the Blue-1 channel, relative to un-transfected cells. 5x10^5 events were collected for each sample, the equivalent of one full well of a 6-well plate. The WT, parental (6n+3), and library (6n+18) genomes for MuV were evaluated the same way.

Titering viral supernatants.
Titrations of SeV, MuV, and NDV stocks were performed on Vero cells in a 96-well format, with individual infection events (infectious units, iu/mL) identified by GFP fluorescence at 24 hours post-infection using an Acumen plate reader (TTP Labtech, USA).

Passaging virus for SeV and MuV library screen.
5.2x10^6 Vero cells in a 15 cm dish were infected at an MOI of 0.01 for each passage and replicate, with the exception of passage 1 in MuV. We adopted an MOI of 0.0003 (5120 iu/dish) for passage 1 (P1) of MuV, because the P0 titer was very low. Thereafter, infection was monitored by microscopy, and supernatants were collected when the level of infection reached 100%, as determined by observation of GFP-positive infected cells, at 8-10 days post-infection (dpi). Supernatant was collected from rescue cells, pooled, and clarified by centrifugation. Clarified supernatants were stored at -80°C. Screening experiments were performed in independent triplicate.

Passaging virus for NDV library screen.
BSRT7 cells were infected with NDV using the same strategy and parameters as with SeV and MuV above. Since infectious titers from P0 (rescue) of NDV were very low, P1 was carried out at a very low MOI (<0.0001) and required several additional days to reach confluence post-infection.

RT-PCR and Illumina sequencing for library screen.
SeV, MuV, and NDV RNA was extracted from thawed supernatant using the QIAamp viral RNA extraction kit (Qiagen, USA). Genomic RNA was then amplified in six equal-sized segments using overlapping primer sets (Supplementary [...] samples were sequenced on a HiSeq2000 using 100-nt single-end reads in Rapid Run mode. Analysis of the transposon insertions was performed as previously described (10).

Sequencing analysis of library screen.
Identification of the transposon insertions was carried out as in Heaton et al. (11).
Briefly, reads containing the transposon scar sequence TGCGGCCGCA were extracted from the total sequencing data. The scar sequence was then deleted, leaving a 5-nt duplication at the site of insertion. These sequences were then aligned to the viral reference sequences with bowtie2, and the processed SAM files were used to identify the position of insertion in each read.

Data analysis and insertant selection from library screens.
Although transposon coverage was overall >10-fold for both SeV and MuV libraries, individual nucleotide positions did not always receive an insertion. Thus, all analyses were carried out using a 100-nt sliding window to prevent division by 0. Additionally, raw insertant counts in P1 and P2 are likely to be biased by varying transposon abundance in the input and rescue (P0) libraries, so we normalized P1 and P2 reads by the number of insertants in P0, and presented these passages as triplicate average percent reads over P0 (Figs. 2A-C and 4A-C).

To identify the most highly-enriched individual insertants in the library, we first identified the 40 insertants with the highest overall raw read count at P2 from each library. We then divided these by normalized P0 reads, and eliminated any insertants whose relative abundance drastically decreased over passage (average P2/P0 < 30%) in order to account for variability of coverage in P0. From those remaining, we showed the top 20 insertants for SeV (Supplementary Table S2), and the insertants that showed an average of 1 or more reads in MuV (Supplementary Table S3). Individual insertants from these lists were selected for downstream characterization as described in Results.

Insertant rescue and growth curves.
Individual insertant viruses were rescued in BSR T7/5 cells as described above, amplified once in Vero cells, and titered as above.
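As a rough illustration of the read processing and sliding-window normalization described under "Sequencing analysis of library screen" and "Data analysis and insertant selection from library screens" above, the two steps can be sketched as follows. This is a minimal sketch rather than the pipeline actually used: the function names are hypothetical, the bowtie2 alignment step is omitted, per-position counts are assumed to be plain lists, and returning None for a window with no P0 insertions is our assumption.

```python
from typing import List, Optional, Tuple

SCAR = "TGCGGCCGCA"  # transposon scar sequence named in the text


def strip_scar(read: str) -> Optional[Tuple[str, int]]:
    """Delete the first scar occurrence from a read.

    Returns (read without the scar, index where the scar began),
    or None when the read carries no scar.
    """
    idx = read.find(SCAR)
    if idx == -1:
        return None
    return read[:idx] + read[idx + len(SCAR):], idx


def window_sums(counts: List[int], window: int = 100) -> List[int]:
    """Sum of counts in every window of `window` positions (stride 1)."""
    if len(counts) < window:
        return []
    total = sum(counts[:window])
    sums = [total]
    for i in range(window, len(counts)):
        total += counts[i] - counts[i - window]  # slide right by one position
        sums.append(total)
    return sums


def percent_over_p0(p0: List[int], px: List[int],
                    window: int = 100) -> List[Optional[float]]:
    """Windowed passage counts as a percentage of windowed P0 counts.

    A window with no P0 insertions yields None (an assumption of this sketch).
    """
    result = []
    for base, later in zip(window_sums(p0, window), window_sums(px, window)):
        result.append(100.0 * later / base if base else None)
    return result
```

In this sketch the window is applied before the division, so a position with no insertion in P0 contributes to its neighbours' windows instead of producing a zero denominator on its own.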
2x10^5 Vero cells per well in a 12-well dish were infected at an MOI of 0.01 for 2 h, followed by replacement with fresh medium. Samples were collected daily for titration, with complete media exchange.

Competitive outgrowth assay.

Because individual insertant viruses demonstrated different growth characteristics that could render our standard titration assay (described above) inaccurate, we titered the individual insertants by focus-forming assay prior to combining them for a competitive outgrowth assay. 2x10^5 Vero cells per well in 12-well plates were inoculated with serial 10-fold dilutions of insertants for 2 hours. Cells were washed once with PBS, and the medium was replaced with a methylcellulose overlay (1% methylcellulose in DMEM plus 2% FBS) to prevent establishment of secondary foci. At 7 dpi (SeV) or 4 dpi (MuV), eGFP-positive infectious foci were counted manually using a Nikon Eclipse TE300 inverted fluorescent microscope (Melville, NY, USA).

Competitive outgrowth assays were carried out in independent biological triplicates: equal infectious units, as defined by the focus-forming assay above, of 8 SeV insertants or 6 MuV insertants were mixed, creating the P0 mixture. The titer of each of these mixtures was then re-quantified by iu as described above.

Heat-map comparison of the abundance of select insertants in P2 from our NGS data (left, library P2), and peak titers of highly-represented insertants that were selected for individual confirmation as a recombinant parental virus (6n+3) bearing that particular insertion (+15) (right, Peak titer). Peak titers are indicated from three independent growth curves (Replicates 1, 2, 3). Black blocks in the heat map indicate insertants that failed to produce detectable virus in rescue and so could not be used for any further replicates (indicated by following white blocks). The color-intensity scale for the heat maps comparing the relative abundance of insertants in P2 (avg counts), and the peak titers of selected insertants described above